[index-tts] Mac users run into installation problems and all kinds of errors

8

https://github.com/fire3/index-tts

I tried this on a base-model Mac Mini M4. Modifying indextts/infer.py to add MPS support is all it takes.

It is slow, but it works.
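
For readers wondering what "adding MPS support" might look like, here is a minimal sketch of device selection. It is not taken from the fork above; the function names `choose_device`, `wants_fp16`, and `detect_device` are hypothetical, and restricting half precision to CUDA is an assumption made for this sketch, not something the project is confirmed to do.

```python
def choose_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Pick a backend by priority: CUDA first, then Apple MPS, then CPU."""
    if cuda_ok:
        return "cuda:0"
    if mps_ok:
        return "mps"
    return "cpu"


def wants_fp16(device: str) -> bool:
    """Assumption for this sketch: only enable half precision on CUDA."""
    return device.startswith("cuda")


def detect_device() -> str:
    """Ask PyTorch which backends are available (requires torch installed)."""
    import torch  # imported lazily so the helpers above work without torch

    return choose_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
```

On an Apple Silicon Mac with a recent PyTorch build, `detect_device()` would be expected to return `"mps"`, and the model and its input tensors would then be moved to that device instead of a hard-coded `"cuda"`.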

fire3

9

Thanks a lot, I'll give it a try.

departurechen

0

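Running on a Mac still has problems. All models downloaded successfully, and I modified infer.py with Claude's help. The startup interface shows up, but speech synthesis fails and keeps reporting: "Synthesis failed, possibly because a substitute model was used. Please make sure all model files exist."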

departurechen

6

Hit an error while running on a MacBook (M4 Pro):

THERE IS A VEHICLE ARRIVING IN DOCK NUMBER SEVEN?
tensor([[10242, 10219, 10209, 11702, 10201, 11374, 11884, 10443, 10264, 10218,
         10246, 11163, 10440, 10495, 10301]], device='mps:0',
       dtype=torch.int32)
text_tokens shape: torch.Size([1, 15]), text_tokens type: torch.int32
['▁THERE', '▁IS', '▁A', '▁VEHICLE', '▁', 'AR', 'RI', 'V', 'ING', '▁IN', '▁DO', 'CK', '▁NUMBER', '▁SEVEN', '?']
tensor([15], device='mps:0', dtype=torch.int32)
===================
/Users/guest/dev/ai/repos/index-tts/indextts/infer.py:356: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  with torch.cuda.amp.autocast(enabled=self.dtype is not None, dtype=self.dtype):
/Users/guest/.pyenv/versions/venv310/lib/python3.10/site-packages/torch/amp/autocast_mode.py:266: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn(
Traceback (most recent call last):
  File "/Users/guest/dev/ai/repos/index-tts/indextts/infer.py", line 446, in <module>
    tts.infer(audio_prompt=prompt_wav, text=text, output_path="gen.wav")
  File "/Users/guest/dev/ai/repos/index-tts/indextts/infer.py", line 357, in infer
    codes = self.gpt.inference_speech(auto_conditioning, text_tokens,
  File "/Users/guest/dev/ai/repos/index-tts/indextts/gpt/model.py", line 598, in inference_speech
    speech_conditioning_latent = self.get_conditioning(speech_conditioning_latent, cond_mel_lengths)
  File "/Users/guest/dev/ai/repos/index-tts/indextts/gpt/model.py", line 496, in get_conditioning
    speech_conditioning_input, mask = self.conditioning_encoder(speech_conditioning_input.transpose(1, 2),
  File "/Users/guest/.pyenv/versions/venv310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/guest/.pyenv/versions/venv310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/guest/dev/ai/repos/index-tts/indextts/gpt/conformer_encoder.py", line 426, in forward
    xs, pos_emb, masks = self.embed(xs, masks)
  File "/Users/guest/.pyenv/versions/venv310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
...
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
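
A possible reading of this log: the warning above says autocast was disabled because CUDA is not available, so a float32 input likely reached half-precision weights without being converted. One hedged workaround is to cast the input to the module's own dtype before the call. The helper below, `cast_to_module_dtype`, is a hypothetical name and is not part of index-tts; it relies only on `Module.parameters()` and `Tensor.to()`.

```python
def cast_to_module_dtype(module, x):
    """Return x moved to the device and dtype of module's first parameter.

    Intended for a torch.nn.Module and a torch.Tensor. torch is not imported
    here because only the .parameters() and .to() methods are used.
    """
    param = next(module.parameters())
    return x.to(device=param.device, dtype=param.dtype)
```

In the traceback above, this could be applied to `speech_conditioning_input` before it is passed to `self.conditioning_encoder(...)`. An alternative is to keep the whole model in float32 on MPS, which avoids the mismatch at the cost of more memory.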

zlace0x

3

Try https://github.com/index-tts/index-tts/pull/78, which already supports MPS.

yrom
