Roger Brown

ms13d

Recent Activity

commented on an article 2 days ago

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

liked a Space 2 days ago

srinivasbilla/llasa-8b-tts

liked a Space 2 days ago

srinivasbilla/llasa-3b-tts

ms13d's activity

commented on The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about... 2 days ago

Yesterday's GitHub update was great!!

But I'm having a problem. The Hugging Face Spaces you host generate very natural audio that stays very close to the given reference audio.
But when I installed the GitHub version, the results were a little different: the speech is a bit faster, it doesn't respect the given reference audio, and it sounds too robotic. It also takes 3-4 generations to get a perfect audio (not because of the problems above, but because the audio gets morphed into non-verbal sounds).
Is there any custom configuration you used in the Hugging Face Spaces?
My config is this:
max_length=2048,
top_p=1,
temperature=0.8
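
For context on what these two sampling settings do, here is a minimal sketch, assuming standard temperature scaling followed by nucleus (top-p) filtering. The function name `nucleus_filter` and the example logits are hypothetical and are not taken from the Llasa repository or the Spaces code.

```python
import math


def nucleus_filter(logits, temperature=0.8, top_p=1.0):
    """Return renormalized probabilities kept after temperature scaling
    and top-p filtering, keyed by token index (illustrative sketch only)."""
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk tokens from most to least likely until the kept mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize over the surviving tokens.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


# Hypothetical logits for a four-token vocabulary.
example_logits = [2.0, 1.0, 0.5, -1.0]
full = nucleus_filter(example_logits, temperature=0.8, top_p=1.0)
narrow = nucleus_filter(example_logits, temperature=0.8, top_p=0.5)
```

With `top_p=1` no token is filtered out, so in the config above only `temperature` shapes the sampling distribution. Any difference from the Spaces output would then have to come from other settings or from preprocessing, which this sketch does not cover.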