Upgrading Kokoro: Natural TTS for Short Utterances

Community article, published November 22, 2024

Kokoro has just been upgraded. The new version sounds noticeably more natural on short utterances while keeping parity on long sentences.

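Previously, when you asked Kokoro to say "Hello!" in Sarah's voice (af_sarah), the output audio had an unnatural breath sound. That was with the default post-processing already applied: (1) trimming both ends, and (2) noise reduction with noisereduce.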

Now, the same voice on the same text sounds **much better**. In addition, we no longer import noisereduce, because the output sounds about the same with or without it.
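As a rough illustration of the first post-processing step mentioned above (trimming both ends), here is a minimal sketch in plain Python. The function name `trim_silence`, the amplitude threshold, and the sample values are all assumptions made for illustration. This is not Kokoro's actual implementation, which the article does not show.

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples quieter than `threshold`.

    `samples` is a list of floats in [-1.0, 1.0]. Both the name and the
    threshold value are hypothetical and chosen only for this sketch.
    """
    start = 0
    end = len(samples)
    # Advance past near-silent samples at the beginning.
    while start < end and abs(samples[start]) < threshold:
        start += 1
    # Walk back past near-silent samples at the end.
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


# Toy waveform: quiet padding on both sides of a short burst.
audio = [0.0, 0.001, 0.2, -0.5, 0.0, 0.3, 0.002, 0.0]
trimmed = trim_silence(audio)
print(trimmed)  # [0.2, -0.5, 0.0, 0.3]
```

Quiet samples inside the utterance (the `0.0` in the middle) are kept, since only the two ends are trimmed. A real pipeline would more likely operate on NumPy arrays and use an energy-based threshold rather than per-sample amplitude.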

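Let's check for parity on long sentences. The model was already quite good at these, so we want to make sure that, at the very least, nothing has regressed.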

This morning, The Information published an article titled "A Complex New Age of Face Tech". The first sentence reads: "In September, Instagram unveiled a splashy new feature called Teen Accounts, an effort by Meta Platforms, the app’s owner, to show it’s better protecting young people with stricter privacy and safety settings."

Before