[2noise/ChatTTS] Synthesized speech quality issues

2025-11-10 330 views
6

I set a seed that gives a female voice:

torch.manual_seed(-3.355)

Laughter is also turned off:

params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_5][laugh_0][break_0]',
)

The full script is as follows:

import ChatTTS
import torch
import torchaudio

import cn2an
import sys

chat = ChatTTS.Chat()
chat.load(compile=True) # Set to True for better performance
torch._dynamo.config.suppress_errors = True

# Note: torch.manual_seed expects an int; a float such as -3.355 is converted with int(), i.e. -3.
torch.manual_seed(-3.355)
#torch.manual_seed(948323456)
rand_spk = chat.sample_random_speaker()

params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_emb = rand_spk, # add sampled speaker 
    temperature = .3,   # using custom temperature
    top_P = 0.7,        # top P decode
    top_K = 20,         # top K decode
    prompt = '[speed_5]'
)

###################################
# For sentence level manual control.

# use oral_(0-9), laugh_(0-2), break_(0-7) 
# to generate special token in text to synthesize.
params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_5][laugh_0][break_0]',
)

texts = ["PUT YOUR 1st TEXT HERE", "PUT YOUR 2nd TEXT HERE"]
wavs = chat.infer(
    texts,
    params_refine_text=params_refine_text,
    params_infer_code=params_infer_code,
)

###################################
generatePath = sys.argv[1]
fileId = sys.argv[2]
generateFilePath = generatePath + "/" + fileId + ".wav"
print(generateFilePath)
inputTxt = sys.argv[3]
transformText = cn2an.transform(inputTxt, "an2cn").replace("KV","千伏")
# transformText = inputTxt.replace("KV","千伏")
transformText_coding = transformText.encode('utf-8').decode('utf-8')
print(transformText_coding)
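# Note: with skip_refine_text=True the refine-text stage does not run,
# so params_refine_text (including [laugh_0]) appears to have no effect on this call.
wavs = chat.infer(transformText_coding + "[uv_break]", skip_refine_text=True, params_refine_text=params_refine_text, params_infer_code=params_infer_code)
# Save the first waveform to the output path built above.
# Depending on the ChatTTS version, wavs[0] may already be 2-D; in that case drop unsqueeze(0).
torchaudio.save(generateFilePath, torch.from_numpy(wavs[0]).unsqueeze(0), 24000, bits_per_sample=16)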

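Problem: I generate 100 or more audio files on average (each text is within roughly 30 characters, and each clip is under 10 seconds). A few of the files have the following problems: laughter in the voice, a male voice appearing in one segment, or strange, unintelligible sounds. I hope this can be resolved.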

Answers

4

This is running on a Kylin (银河麒麟) server.

9

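ChatTTS does not aim for a strict, exact mapping from text to speech; it aims for natural, conversational delivery. In addition, high-frequency noise was deliberately added during training to prevent abuse, so these problems are to be expected. If you need rigorous TTS, consider a commercial TTS service such as Microsoft's.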
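
Since occasional bad samples are to be expected, one pragmatic workaround is to regenerate a clip when it fails a check. Below is a minimal sketch of that idea, not something taken from the ChatTTS project. `synthesize` is a hypothetical wrapper you would write yourself (for example, one that calls `torch.manual_seed(seed)` and then `chat.infer` with the same `spk_emb`), and the duration check is only an assumed stand-in for whatever quality check fits your data.

```python
from typing import Callable, Optional, Sequence

SAMPLE_RATE = 24000  # same sample rate the script passes to torchaudio.save


def duration_seconds(wav: Sequence[float], sample_rate: int = SAMPLE_RATE) -> float:
    """Length of a mono waveform in seconds."""
    return len(wav) / sample_rate


def synthesize_with_retry(
    synthesize: Callable[[str, int], Sequence[float]],  # hypothetical: (text, seed) -> waveform
    text: str,
    is_acceptable: Callable[[Sequence[float]], bool],   # hypothetical quality check
    seeds: Sequence[int],
) -> Optional[Sequence[float]]:
    """Try each seed in order and return the first waveform that passes the check.

    Returns None if no attempt passes, so the caller can log the text for review.
    """
    for seed in seeds:
        wav = synthesize(text, seed)
        if is_acceptable(wav):
            return wav
    return None


def max_ten_seconds(wav: Sequence[float]) -> bool:
    """Example check (an assumption): short texts should yield clips of at most 10 s."""
    return duration_seconds(wav) <= 10.0
```

A duration check like this cannot detect laughter or a change of voice on its own; it only illustrates where a stronger check (manual review, or an automatic one you trust) would plug in.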

4

OK, thanks for the answer.