The full log is as follows:
/index-tts main > uv run webui.py --model_dir checkpoints --fp16 19:33:38
>> GPT weights restored from: checkpoints/gpt.pth
[2025-09-09 19:33:59,644] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-09-09 19:34:00,925] [INFO] [logging.py:107:log_dist] [Rank -1] [TorchCheckpointEngine] Initialized with serialization = False
GPT2InferenceModel has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
- If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
- If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
- If you are not the owner of the model architecture class, please contact the model code owner to update it.
[2025-09-09 19:34:00,931] [INFO] [logging.py:107:log_dist] [Rank -1] DeepSpeed info: version=0.17.1, git-hash=unknown, git-branch=unknown
[2025-09-09 19:34:00,931] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2025-09-09 19:34:00,931] [INFO] [logging.py:107:log_dist] [Rank -1] [TorchCheckpointEngine] Initialized with serialization = False
[2025-09-09 19:34:00,931] [INFO] [logging.py:107:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
[2025-09-09 19:34:00,969] [INFO] [logging.py:107:log_dist] [Rank -1] DeepSpeed-Inference config: {'layer_id': 0, 'hidden_size': 1280, 'intermediate_size': 5120, 'heads': 20, 'num_hidden_layers': -1, 'dtype': torch.float16, 'pre_layer_norm': True, 'norm_type': <NormType.LayerNorm: 1>, 'local_rank': -1, 'stochastic_mode': False, 'epsilon': 1e-05, 'mp_size': 1, 'scale_attention': True, 'triangular_masking': True, 'local_attention': False, 'window_size': 1, 'rotary_dim': -1, 'rotate_half': False, 'rotate_every_two': True, 'return_tuple': True, 'mlp_after_attn': True, 'mlp_act_func_type': <ActivationFuncType.GELU: 1>, 'training_mp_size': 1, 'bigscience_bloom': False, 'max_out_tokens': 1024, 'min_out_tokens': 1, 'scale_attn_by_inverse_layer_idx': False, 'enable_qkv_quantization': False, 'use_mup': False, 'return_single_tuple': False, 'set_empty_params': False, 'transposed_mode': False, 'use_triton': False, 'triton_autotune': False, 'num_kv': -1, 'rope_theta': 10000, 'invert_mask': True}
W0909 19:34:00.982000 6163 .venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py:2425] TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
W0909 19:34:00.982000 6163 .venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py:2425] If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'] to specific architectures.
ninja: no work to do.
Time to load transformer_inference op: 0.023575544357299805 seconds
W0909 19:34:01.017000 6163 .venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py:2425] TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
W0909 19:34:01.017000 6163 .venv/lib/python3.10/site-packages/torch/utils/cpp_extension.py:2425] If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'] to specific architectures.
ninja: no work to do.
>> Preload custom CUDA kernel for BigVGAN <module 'anti_alias_activation_cuda' from '/home/workibear/Desktop/Python/index-tts/indextts/BigVGAN/alias_free_activation/cuda/build/anti_alias_activation_cuda.so'>
>> semantic_codec weights restored from: /home/workibear/.cache/huggingface/hub/models--amphion--MaskGCT/snapshots/265c6cef07625665d0c28d2faafb1415562379dc/semantic_codec/model.safetensors
cfm loaded
length_regulator loaded
gpt_layer loaded
>> s2mel weights restored from: checkpoints/s2mel.pth
>> campplus_model weights restored from: /home/workibear/.cache/huggingface/hub/models--funasr--campplus/snapshots/fb71fe990cbf6031ae6987a2d76fe64f94377b7e/campplus_cn_common.bin
Loading weights from nvidia/bigvgan_v2_22khz_80band_256x
Removing weight norm...
>> bigvgan weights restored from: nvidia/bigvgan_v2_22khz_80band_256x
2025-09-09 19:34:12,276 WETEXT INFO found existing fst: /home/workibear/Desktop/Python/index-tts/indextts/utils/tagger_cache/zh_tn_tagger.fst
2025-09-09 19:34:12,277 WETEXT INFO /home/workibear/Desktop/Python/index-tts/indextts/utils/tagger_cache/zh_tn_verbalizer.fst
2025-09-09 19:34:12,277 WETEXT INFO skip building fst for zh_normalizer ...
2025-09-09 19:34:12,432 WETEXT INFO found existing fst: /home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/tn/en_tn_tagger.fst
2025-09-09 19:34:12,432 WETEXT INFO /home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/tn/en_tn_verbalizer.fst
2025-09-09 19:34:12,432 WETEXT INFO skip building fst for en_normalizer ...
>> TextNormalizer loaded
>> bpe model loaded from: checkpoints/bpe.model
* Running on local URL: http://0.0.0.0:7860
* To create a public link, set `share=True` in `launch()`.
Emo control mode:0,vec:None
>> start inference...
Traceback (most recent call last):
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/gradio/queueing.py", line 667, in process_events
response = await route_utils.call_process_api(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/gradio/route_utils.py", line 349, in call_process_api
output = await app.get_blocks().process_api(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 2274, in process_api
result = await self.call_function(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/gradio/blocks.py", line 1781, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2476, in run_sync_in_worker_thread
return await future
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/gradio/utils.py", line 915, in wrapper
response = f(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/webui.py", line 142, in gen_single
output = tts.infer(spk_audio_prompt=prompt, text=text,
File "/home/workibear/Desktop/Python/index-tts/indextts/infer_v2.py", line 340, in infer
spk_cond_emb = self.get_emb(input_features, attention_mask)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/indextts/infer_v2.py", line 197, in get_emb
vq_emb = self.semantic_model(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 1027, in forward
encoder_outputs = self.encoder(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 533, in forward
layer_outputs = layer(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 441, in forward
hidden_states, attn_weigts = self.self_attn(
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
File "/home/workibear/Desktop/Python/index-tts/.venv/lib/python3.10/site-packages/transformers/models/wav2vec2_bert/modeling_wav2vec2_bert.py", line 319, in forward
scores = scores + (relative_position_attn_weights / math.sqrt(self.head_size))
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 714.00 MiB. GPU 0 has a total capacity of 11.60 GiB of which 253.75 MiB is free. Including non-PyTorch memory, this process has 11.33 GiB memory in use. Of the allocated memory 10.44 GiB is allocated by PyTorch, and 684.31 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
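
The error message above suggests setting `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` to reduce fragmentation. A minimal sketch of trying that from Python follows, assuming the variable is set before `torch` is first imported (for example at the very top of `webui.py`). The helper name `with_expandable_segments` is hypothetical and not part of index-tts or PyTorch. With only 253.75 MiB free and 684.31 MiB reserved but unallocated against a 714.00 MiB request, this may or may not be enough on its own.

```python
import os


def with_expandable_segments(existing):
    """Return an allocator config string that includes expandable_segments:True.

    `existing` is the current value of PYTORCH_CUDA_ALLOC_CONF (or None).
    Other comma-separated options already present are preserved.
    """
    options = [opt.strip() for opt in (existing or "").split(",") if opt.strip()]
    # Drop any previous expandable_segments setting so it is not duplicated.
    options = [opt for opt in options if not opt.startswith("expandable_segments:")]
    options.append("expandable_segments:True")
    return ",".join(options)


# Must run before the first `import torch` for the allocator to pick it up.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = with_expandable_segments(
    os.environ.get("PYTORCH_CUDA_ALLOC_CONF")
)
```

The same setting can be passed on the command line by prefixing the original launch command with `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True`.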