I've tried different reference audios and target texts, and the character "啊" is never pronounced. Replacing it with "阿", or with the pinyin a1 or a5, sounds very unnatural too. How can this be fixed?
Could you share the reference audio and the text to be synthesized?
I ran into this problem too, in index-tts\indextts\utils\front.py. It is related to these two libraries, and I don't know whether there is an alternative; in certain cases tokenization drops the "啊" character:

```python
def load(self):
    print(os.path.join(os.path.dirname(os.path.abspath(__file__)), ".."))
    # sys.path.append(model_dir)
    import platform
    if self.zh_normalizer is not None and self.en_normalizer is not None:
        return
    if platform.system() == "Darwin":
        # macOS: the wetext backend
        from wetext import Normalizer
        self.zh_normalizer = Normalizer(remove_erhua=False, lang="zh", operator="tn")
        self.en_normalizer = Normalizer(lang="en", operator="tn")
    else:
        # other platforms: the WeTextProcessing backend (the `tn` package)
        from tn.chinese.normalizer import Normalizer as NormalizerZh
        from tn.english.normalizer import Normalizer as NormalizerEn
        self.zh_normalizer = NormalizerZh(remove_interjections=False, remove_erhua=False, overwrite_cache=False)
        self.en_normalizer = NormalizerEn(overwrite_cache=False)
```
> Could you share the reference audio and the text to be synthesized?
Any Chinese reference audio and target text will do; as long as the text contains "啊", it is not pronounced. For example: "这太好了啊。"
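For what it's worth, here is a minimal repro sketch of the normalization step in isolation (my assumption of how to trigger it, using the `tn` package from WeTextProcessing; since overwrite_cache=False reuses the pre-compiled rule cache shipped with the package, passing remove_interjections=False does not take effect):

```python
# Repro sketch (assumption): run the Chinese normalizer directly on the
# example sentence. The pre-built rule cache still deletes "啊".
from tn.chinese.normalizer import Normalizer

normalizer = Normalizer(remove_interjections=False, remove_erhua=False, overwrite_cache=False)
print(normalizer.normalize("这太好了啊。"))  # "啊" is stripped from the output
```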
same problem
"啊" is on WeTextProcessing's blacklist: https://github.com/wenet-e2e/WeTextProcessing/blob/7fda17b30de7ab1015c8ce56139c075fcb0f7262/tn/chinese/data/default/blacklist.tsv#L1-L2
Temporary workaround: give zh_normalizer a separate cache_dir and pass remove_interjections=False:

```python
self.zh_normalizer = NormalizerZh(cache_dir=files("indextts"), remove_interjections=False, remove_erhua=False, overwrite_cache=False)
```

This makes it regenerate the rule files.
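In context, a sketch of the patched constructor call (assuming Python 3.9+ so that `files` can be imported from importlib.resources, and that NormalizerZh accepts the path it returns as cache_dir):

```python
# Sketch of the workaround, under the assumptions above.
from importlib.resources import files
from tn.chinese.normalizer import Normalizer as NormalizerZh

zh_normalizer = NormalizerZh(
    cache_dir=files("indextts"),   # steer the rule cache away from the package's pre-built files
    remove_interjections=False,    # keep interjections such as "啊"
    remove_erhua=False,
    overwrite_cache=False,
)
print(zh_normalizer.normalize("这太好了啊。"))  # "啊" should now survive normalization
```

On the first run the rule files are recompiled into the new cache_dir, which is presumably why the flag change takes effect even with overwrite_cache=False; later runs reuse that cache.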