I set a seed for a female voice:
torch.manual_seed(-3.355)
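A side note on the seed value, as a minimal sketch: this assumes `torch.manual_seed` coerces its argument with `int()`, which is my understanding of recent PyTorch versions but is not checked against every release. Under that assumption a fractional seed such as -3.355 is truncated toward zero, so it would select the same random stream as -3. `coerce_seed` is a hypothetical helper used only for illustration; it is not part of PyTorch or ChatTTS.

```python
def coerce_seed(seed) -> int:
    """Mimic the int() coercion assumed to happen inside torch.manual_seed."""
    # int() truncates toward zero, so -3.355 and -3.9 both become -3.
    return int(seed)


if __name__ == "__main__":
    for raw in (-3.355, -3.9, -3):
        print(raw, "->", coerce_seed(raw))
```

If the assumption holds, distinct fractional seeds in the same integer interval would not give distinct voices, so an explicit integer seed is easier to reason about.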
Laughter is also turned off:
params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_5][laugh_0][break_0]',
)
The full script is as follows:
import ChatTTS
import torch
import torchaudio
import cn2an
import sys
chat = ChatTTS.Chat()
chat.load(compile=True) # Set to True for better performance
torch._dynamo.config.suppress_errors = True
torch.manual_seed(-3.355)
#torch.manual_seed(948323456)
rand_spk = chat.sample_random_speaker()
params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_emb=rand_spk,  # add sampled speaker
    temperature=0.3,   # using custom temperature
    top_P=0.7,         # top P decode
    top_K=20,          # top K decode
    prompt='[speed_5]',
)
###################################
# For sentence level manual control.
# use oral_(0-9), laugh_(0-2), break_(0-7)
# to generate special token in text to synthesize.
params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_5][laugh_0][break_0]',
)
texts = ["PUT YOUR 1st TEXT HERE", "PUT YOUR 2nd TEXT HERE"]
wavs = chat.infer(
    texts,
    params_refine_text=params_refine_text,
    params_infer_code=params_infer_code,
)
###################################
generatePath = sys.argv[1]
fileId = sys.argv[2]
generateFilePath = generatePath + "/" + fileId + ".wav"
print(generateFilePath)
inputTxt = sys.argv[3]
transformText = cn2an.transform(inputTxt, "an2cn").replace("KV","千伏")
# transformText = inputTxt.replace("KV","千伏")
transformText_coding = transformText.encode('utf-8').decode('utf-8')
print(transformText_coding)
wavs = chat.infer(transformText_coding+"[uv_break]", skip_refine_text=True, params_refine_text=params_refine_text, params_infer_code=params_infer_code)
torchaudio.save("/data/wav", torch.from_numpy(wavs[0]), 24000, bits_per_sample=16)
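The command-line handling in the script above could be sketched as follows. This is only an illustration under the assumption that the script always receives exactly three positional arguments (output directory, file ID, input text). `build_output_path` and `parse_args` are hypothetical helper names, not part of ChatTTS, and the usage string is made up.

```python
from pathlib import PurePosixPath


def build_output_path(generate_path: str, file_id: str) -> str:
    """Join the output directory and file ID into '<dir>/<id>.wav'."""
    # Reject empty IDs and IDs containing a path separator.
    if not file_id or "/" in file_id:
        raise ValueError(f"invalid file id: {file_id!r}")
    return str(PurePosixPath(generate_path) / f"{file_id}.wav")


def parse_args(argv: list[str]) -> tuple[str, str]:
    """Return (output_file_path, input_text) after checking the argument count."""
    if len(argv) != 4:
        raise SystemExit("usage: script.py <output_dir> <file_id> <text>")
    _, generate_path, file_id, input_text = argv
    return build_output_path(generate_path, file_id), input_text
```

With these helpers, `parse_args(sys.argv)` would yield the same path string as the `generatePath + "/" + fileId + ".wav"` concatenation, while failing early on a missing argument.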
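Problem: on average, for every 100 or more audio files generated (each text is about 30 characters or fewer, and each clip is under 10 seconds), a few files show one of the following issues: laughter in the voice, a male voice appearing in one segment, or strange, unintelligible sounds. I hope this can be resolved.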