What model are you using? Chatterbox's output quality tends to degrade on longer passages of speech.
Also, are you using the Lab or the Projects interface to generate speech? For text longer than a few sentences, you should create a project so VCP can chunk the text properly.
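
To give a rough idea of what chunking means here: I don't know how VCP splits text internally, so the snippet below is only a generic sketch of sentence-based chunking, not VCP's actual code. The function name `chunk_text` and the `max_chars` limit are made up for illustration.

```python
import re


def chunk_text(text: str, max_chars: int = 300) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    Hypothetical helper for illustration only; not part of VCP or Chatterbox.
    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars or not current:
            # Still fits, or this is the first sentence of a new chunk.
            current = candidate
        else:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be synthesized separately and the audio joined afterwards, which keeps any single generation short enough that the model stays stable.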