https://github.com/user-attachments/assets/d03a2494-7564-4856-92e7-108d2a012f4c
You could check the tokenizer output. For the sentence above, it should be ['▁', 'HEY', '!', '▁WHERE', "'", 'S', '▁THE', '▁', 'BA', 'TH', 'RO', 'O', 'M', '?', '!']. Normally there should be no problem at this step.
gen 2.wav.zip
@index-tts In my testing, the bug is in the WeTextProcessing package.
from tn.english.normalizer import Normalizer as NormalizerEn
en_normalizer = NormalizerEn(overwrite_cache=False)
replaced_text = "Hey! where's the bathroom?!"
result = en_normalizer.normalize(replaced_text)
print(result)
Output: Hey! w here is the bathroom?! This is wrong: the "w" has been split off from "where".
[How to fix it?] Switching to the wetext package gives the correct result.
from wetext import Normalizer
en_normalizer = Normalizer(lang="en", operator="tn")
replaced_text = "Hey! where's the bathroom?!"
result = en_normalizer.normalize(replaced_text)
print(result)
Output: Hey! where's the bathroom?! This is correct!
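If it helps anyone verifying their own setup, here is a rough sketch of a sanity check for this kind of split. It uses only the standard library and is not part of either package; the helper name `spurious_splits` and its heuristic (flag single-letter words that appear after normalization but not in the input) are my own assumptions, so it will miss other kinds of normalization errors.

```python
import re
from typing import List

# Matches runs of ASCII letters and apostrophes, e.g. "where's".
_WORD = re.compile(r"[A-Za-z']+")


def spurious_splits(original: str, normalized: str) -> List[str]:
    """Return single-letter words that appear only after normalization.

    Hypothetical helper: a stray one-letter token such as "w" that was not
    a word in the input suggests the normalizer broke a word apart.
    """
    original_words = {w.lower() for w in _WORD.findall(original)}
    return [
        w
        for w in _WORD.findall(normalized)
        if len(w) == 1 and w.lower() not in original_words
    ]


text = "Hey! where's the bathroom?!"
# The two outputs reported in this thread:
print(spurious_splits(text, "Hey! w here is the bathroom?!"))
print(spurious_splits(text, "Hey! where's the bathroom?!"))
```

The first call flags the stray "w" and the second returns an empty list, so you can run your own normalizer's output through it before passing the text on to the tokenizer.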
@juntaosun Thanks, that worked!