In reply to @giu

Yeah, I don’t have a good intuition for the dataset size required, but my guess is a lot less than TTS, maybe similar to ASR given it's a one-to-many problem, so I was thinking there is enough public speech data (>10k hrs) plus non-speech data that can be mixed together to synthesize training data.

In reply to @kn

*one-to-one problem
