[index-tts]Failed to load custom CUDA kernel for BigVGAN

5

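At first I got the following error:

GPT weights restored from: checkpoints/gpt.pth
DeepSpeed加载失败，回退到标准推理: No module named 'deepspeed'
Failed to load custom CUDA kernel for BigVGAN. Falling back to torch.

(The DeepSpeed line reads "DeepSpeed failed to load, falling back to standard inference".)

It went away after installing deepspeed: pip install deepspeed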
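
The "No module named 'deepspeed'" part of that log only means the package could not be imported by the interpreter that runs webui.py. A minimal sketch for checking this from the same environment follows; the helper name `module_available` is made up for illustration and is not part of index-tts.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for some broken or half-installed packages.
        return False


if __name__ == "__main__":
    print("deepspeed importable:", module_available("deepspeed"))
```

Run it with the same launcher you use for the web UI (for example the same virtual environment), otherwise the result may describe a different Python installation.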

AaronPanXiaoFeng

5

Replying to the post above (the same error, fixed there with pip install deepspeed):

I'm on Windows with CUDA 12.8 (cuda128), and the compilation did not succeed.

Jandown

5

On Linux you need to install ninja-build.

syaofox

5

I ran uv pip install deepspeed with uv, and that did not help either. It hangs at:

uv run webui.py
>> GPT weights restored from: checkpoints/gpt.pth
[2025-05-22 11:08:28,577] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-05-22 11:08:33,501] [INFO] [logging.py:107:log_dist] [Rank -1] DeepSpeed info: version=0.16.8, git-hash=unknown, git-branch=unknown
[2025-05-22 11:08:33,501] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2025-05-22 11:08:33,501] [INFO] [logging.py:107:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1

I have to press Ctrl+C manually to cancel it before it falls back to torch:

> git pull
Already up to date.
> uv run webui.py
>> GPT weights restored from: checkpoints/gpt.pth
[2025-05-22 11:08:28,577] [INFO] [real_accelerator.py:239:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-05-22 11:08:33,501] [INFO] [logging.py:107:log_dist] [Rank -1] DeepSpeed info: version=0.16.8, git-hash=unknown, git-branch=unknown
[2025-05-22 11:08:33,501] [WARNING] [config_utils.py:70:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2025-05-22 11:08:33,501] [INFO] [logging.py:107:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
^C>> Failed to load custom CUDA kernel for BigVGAN. Falling back to torch.
Removing weight norm...
>> bigvgan weights restored from: checkpoints/bigvgan_generator.pth
2025-05-22 11:09:38,518 WETEXT INFO found existing fst: /mnt/data/workspace/ai/tts/index-tts/indextts/utils/tagger_cache/zh_tn_tagger.fst
2025-05-22 11:09:38,518 WETEXT INFO                     /mnt/data/workspace/ai/tts/index-tts/indextts/utils/tagger_cache/zh_tn_verbalizer.fst
2025-05-22 11:09:38,518 WETEXT INFO skip building fst for zh_normalizer ...
2025-05-22 11:09:38,756 WETEXT INFO found existing fst: /mnt/data/workspace/ai/tts/index-tts/.venv/lib/python3.10/site-packages/tn/en_tn_tagger.fst
2025-05-22 11:09:38,757 WETEXT INFO                     /mnt/data/workspace/ai/tts/index-tts/.venv/lib/python3.10/site-packages/tn/en_tn_verbalizer.fst
2025-05-22 11:09:38,757 WETEXT INFO skip building fst for en_normalizer ...
>> TextNormalizer loaded
>> bpe model loaded from: checkpoints/bpe.model
* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.

tonewww

1

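[!IMPORTANT] This kernel is only used to speed up BigVGAN inference. If it fails to load, nothing else is affected and the message can be ignored.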

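If you still want to try, set up the environment with the steps below:

Confirm that PyTorch is compatible with your CUDA driver version: https://pytorch.org/get-started/locally/
Confirm that the CUDA toolchain is installed correctly (run nvcc -V in the environment). If it is not, see: https://developer.nvidia.com/cuda-toolkit-archive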

> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:30:10_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0
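
To compare the toolkit version against what PyTorch was built for, it can help to pull the release number out of the nvcc -V output programmatically. The following is a rough sketch; the function names `parse_nvcc_release` and `installed_nvcc_release` are illustrative assumptions, not part of any tool mentioned here.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(output: str) -> Optional[str]:
    """Extract the 'release X.Y' number from `nvcc -V` output, if present."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_nvcc_release() -> Optional[str]:
    """Run `nvcc -V` when nvcc is on PATH and return its release number."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "-V"], capture_output=True, text=True, check=False)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    print("nvcc release:", installed_nvcc_release())
```

The value can then be compared by eye with `torch.version.cuda` in the same environment.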

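You need a compiler that is compatible with your CUDA toolkit version, and it must be discoverable from the environment that index-tts runs in.
- Windows users: install Microsoft Build Tools for Visual Studio https://visualstudio.microsoft.com/zh-hans/visual-cpp-build-tools/ After installation, set the environment variables CC and CXX to the path of cl.exe, for example: C:\Program Files\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.34.31933\bin\Hostx64\x64\cl.exe
- Linux users: confirm that the gcc in the current environment is compatible with nvcc. You can use conda to install a compatible gcc version.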
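
As a quick sanity check of the points above, the sketch below reports which build tools the current environment can actually see. It is only an illustration: `find_toolchain` is a hypothetical helper, and it assumes the host compiler is either cl (Windows) or gcc (Linux).

```python
import os
import shutil


def find_toolchain() -> dict:
    """Collect where (if anywhere) the build tools are visible."""
    return {
        "nvcc": shutil.which("nvcc"),
        "CC": os.environ.get("CC"),
        "CXX": os.environ.get("CXX"),
        # Assumption: cl.exe on Windows, gcc on Linux.
        "host_compiler": shutil.which("cl") or shutil.which("gcc"),
    }


if __name__ == "__main__":
    for name, location in find_toolchain().items():
        print(f"{name}: {location or 'not found'}")
```

Any entry printed as "not found" is a candidate reason for the kernel build failing, although other causes are possible.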

Test whether the environment works:

> git clone git@github.com:NVIDIA/cuda-samples.git --depth=1
> cd cuda-samples
> nvcc -I.\Common Samples\1_Utilities\deviceQuery\deviceQuery.cpp -O3 -o deviceQuery.exe
> deviceQuery.exe
deviceQuery.exe Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce GTX 970"
...
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.6, CUDA Runtime Version = 12.4, NumDevs = 1
Result = PASS

> nvcc -I.\Common Samples\1_Utilities\bandwidthTest\bandwidthTest.cu -O3 -o bandwidthTest.exe
> bandwidthTest.exe

[CUDA Bandwidth Test] - Starting...
Running on...

Device 0: NVIDIA GeForce GTX 970
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes)        Bandwidth(GB/s)
32000000                     12.7

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes)        Bandwidth(GB/s)
32000000                     12.7

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes)        Bandwidth(GB/s)
32000000                     142.2

Result = PASS

yrom

4

With "Failed to load custom CUDA kernel for BigVGAN. Falling back to torch.", it seems the GPU cannot be used at all. Even though nvidia-smi looks normal, only the CPU is used.
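
Since nvidia-smi only reports on the driver, one thing worth ruling out here is a CPU-only PyTorch build. The sketch below prints what PyTorch itself sees; `cuda_status` is a hypothetical helper name, and whether this explains the behavior described above is an assumption, not something confirmed in this thread.

```python
def cuda_status() -> dict:
    """Report whether the installed PyTorch can use a CUDA device."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    status = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means a CPU-only PyTorch build.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if status["cuda_available"]:
        status["device_name"] = torch.cuda.get_device_name(0)
    return status


if __name__ == "__main__":
    for key, value in cuda_status().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False while nvidia-smi works, reinstalling a CUDA-enabled PyTorch build is probably the first thing to try.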

robin12jbj

5

On Linux, run sudo apt update and then sudo apt install ninja-build, or run pip install ninja --user inside the virtual environment.
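
To confirm that ninja is actually reachable after either install method, something like the following can be run inside the same environment. This is a small sketch; `ninja_version` is an illustrative helper, not an existing API.

```python
import shutil
import subprocess
from typing import Optional


def ninja_version() -> Optional[str]:
    """Return the version printed by `ninja --version`, or None if ninja is not on PATH."""
    path = shutil.which("ninja")
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True, check=False)
    return result.stdout.strip() or None


if __name__ == "__main__":
    print("ninja:", ninja_version() or "not found on PATH")
```

PyTorch also provides `torch.utils.cpp_extension.is_ninja_available()`, which performs a similar check from the extension builder's point of view.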

robin12jbj
