In the webui I only uploaded the sample audio and the sample text; all other parameters were left at their defaults (refine enabled, audio temperature 0.3, top_P 0.7, top_K 20; DVAE and the rest unchanged).
Input Text
四川美食确实以辣闻名 (in English: "Sichuan cuisine is indeed famous for being spicy")
Sample Text
The following is a conversation with the founding members of the Cursor team, Michael Truel, Swale Asif, Arvid Lundmark, and Aman Sanger.
Sample Audio
lex_ai_cursor_team-00.00.00.000-00.00.09.945.mp3.zip
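For anyone trying to reproduce this, the settings reported above can be written down as a small record. This is only a sketch: the key names are illustrative and are not identifiers from the ChatTTS code base or the webui, and the values are copied from this report.

```python
import json

# Settings as reported in this issue. Key names are hypothetical labels,
# not ChatTTS or webui parameter names.
repro_settings = {
    "refine_text": True,
    "audio_temperature": 0.3,
    "top_P": 0.7,
    "top_K": 20,
    "input_text": "四川美食确实以辣闻名",
    "sample_text": (
        "The following is a conversation with the founding members of the "
        "Cursor team, Michael Truel, Swale Asif, Arvid Lundmark, and Aman Sanger."
    ),
    "sample_audio": "lex_ai_cursor_team-00.00.00.000-00.00.09.945.mp3.zip",
}


def dump_settings(settings: dict) -> str:
    """Serialize the settings to JSON, keeping non-ASCII text readable."""
    return json.dumps(settings, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    print(dump_settings(repro_settings))
```

Pasting the printed JSON into a follow-up comment keeps the exact values together with the log.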
Terminal output:
* Running on local URL: http://0.0.0.0:8080
To create a public link, set `share=True` in `launch()`.
C:\Users\xxx\anaconda3\envs\chattts\Lib\site-packages\gradio\blocks.py:1746: UserWarning: A function returned too many output values (needed: 0, returned: 1). Ignoring extra values.
Output components:
[]
Output values returned:
[None]
warnings.warn(
text: 0%| | 0/384(max) [00:00, ?it/s]C:\Users\xxx\anaconda3\envs\chattts\Lib\site-packages\transformers\models\llama\modeling_llama.py:655: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
attn_output = torch.nn.functional.scaled_dot_product_attention(
text: 0%|▏ | 1/384(max) [00:00, 2.85it/s]We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and will be removed in v4.47. Please convert your cache or use an appropriate `Cache` class (https://huggingface.co/docs/transformers/kv_cache#legacy-cache-format)
text: 4%|██▊ | 14/384(max) [00:00, 15.51it/s]
code: 0%|▏ | 4/2048(max) [00:00, 19.47it/s]