[index-tts] "Hey! where's the bathroom?!" is read as "w here"

2025-10-30 104 views
https://github.com/user-attachments/assets/d03a2494-7564-4856-92e7-108d2a012f4c

Answers

You can check the tokenizer output. For the sentence above it should be ['▁', 'HEY', '!', '▁WHERE', "'", 'S', '▁THE', '▁', 'BA', 'TH', 'RO', 'O', 'M', '?', '!'], so under normal circumstances there should be no problem.
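One way to read a token list like this is to join the pieces back into text and confirm that "WHERE" survives as a single word. The sketch below assumes the pieces follow the SentencePiece convention in which "▁" (U+2581) marks a space; the helper name `detokenize` is hypothetical and is not part of index-tts.

```python
SPACE_MARK = "\u2581"  # "▁", assumed to be a SentencePiece-style whitespace marker


def detokenize(pieces):
    """Join subword pieces and turn whitespace markers back into spaces."""
    return "".join(pieces).replace(SPACE_MARK, " ").strip()


# The expected tokenizer output quoted in the answer above.
pieces = [
    SPACE_MARK, "HEY", "!", SPACE_MARK + "WHERE", "'", "S",
    SPACE_MARK + "THE", SPACE_MARK, "BA", "TH", "RO", "O", "M", "?", "!",
]

text = detokenize(pieces)
print(text)
# If the tokenizer input had already been damaged, "W HERE" would show up here.
print("W HERE" in text)
```

If the rejoined text already contains "W HERE", the split happened before tokenization, which points at the text normalization step rather than the tokenizer.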

gen 2.wav.zip

@index-tts After testing, the bug turns out to be in the WeTextProcessing package.

# English text normalizer from WeTextProcessing
from tn.english.normalizer import Normalizer as NormalizerEn
en_normalizer = NormalizerEn(overwrite_cache=False)

replaced_text = "Hey! where's the bathroom?!"
result = en_normalizer.normalize(replaced_text)
print(result)
Output: Hey! w here is the bathroom?!  Wrong! The "w" is split off from "where".

[How to fix it?] Switch to the wetext package, which gives the correct result.

# Same normalization, using the wetext package instead
from wetext import Normalizer
en_normalizer = Normalizer(lang="en", operator="tn")

replaced_text = "Hey! where's the bathroom?!"
result = en_normalizer.normalize(replaced_text)
print(result)
Output: Hey! where's the bathroom?!  Correct!
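To catch this kind of regression without listening to the audio, the two outputs above can be compared against the input with a small heuristic. This is only a sketch: `stray_single_letters` is a hypothetical helper, not part of wetext or WeTextProcessing, and it simply flags single-letter words that appear after normalization but were not standalone words in the input.

```python
import re

WORD = re.compile(r"[A-Za-z]+")


def stray_single_letters(original, normalized):
    """Return single-letter words in `normalized` that are not words in `original`."""
    original_words = {w.lower() for w in WORD.findall(original)}
    return [
        w for w in WORD.findall(normalized)
        if len(w) == 1 and w.lower() not in original_words
    ]


source = "Hey! where's the bathroom?!"
# Outputs copied from the two runs reported in this thread.
print(stray_single_letters(source, "Hey! w here is the bathroom?!"))
print(stray_single_letters(source, "Hey! where's the bathroom?!"))
```

A non-empty result suggests the normalizer broke a word apart. The check is deliberately crude and can produce false positives when normalization legitimately introduces a one-letter word such as "a" or "I".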

@juntaosun Thanks, that worked!