The contents of wll.py are as follows:
# Test - GPU
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).half().cuda()
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
Why am I getting a CPU error? Something is off with the small model. I'm clearly running it on the GPU. Only THUDM/chatglm-6b works correctly, but I don't have enough RAM for that one. I can't even test the small model... this is rough.
(mygpt) D:\dzkj\chatGlmBase>python wll.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torchvision\io\image.py:11: UserWarning: Failed to load image Python extension: Could not find module 'D:\Anaconda3\envs\mygpt\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
warn(f"Failed to load image Python extension: {e}")
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
File "wll.py", line 55, in 
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).half().cuda()
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\transformers\models\auto\auto_factory.py", line 459, in from_pretrained
return model_class.from_pretrained(
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\transformers\modeling_utils.py", line 2362, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 928, in __init__
self.lm_head = skip_init(
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\utils\init.py", line 51, in skip_init
return module_cls(*args, **kwargs).to_empty(device=final_device)
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\modules\module.py", line 780, in to_empty
return self._apply(lambda t: torch.empty_like(t, device=device))
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\modules\module.py", line 593, in _apply
param_applied = fn(param)
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\modules\module.py", line 780, in 
return self._apply(lambda t: torch.empty_like(t, device=device))
RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:76] data. DefaultCPUAllocator: not enough memory: you tried to allocate 1233125376 bytes.
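One observation on that number: the traceback shows the failure happens inside skip_init / to_empty, i.e. the lm_head weight is being materialized in CPU RAM before .cuda() ever runs, so system memory, not the GPU, is what runs out. The 1233125376 bytes match an fp16 lm_head weight exactly, assuming the vocab_size (150528) and hidden_size (4096) from ChatGLM-6B's published config. A quick sanity check:

```python
# The failing allocation in DefaultCPUAllocator matches the fp16 lm_head weight.
# vocab_size and hidden_size are assumptions taken from ChatGLM-6B's public config.
vocab_size = 150528
hidden_size = 4096
fp16_bytes = 2  # half precision, 2 bytes per parameter

lm_head_bytes = vocab_size * hidden_size * fp16_bytes
print(lm_head_bytes)  # 1233125376 -- the exact number in the RuntimeError
```

So even the int4-quantized checkpoint briefly needs over a gigabyte of free CPU RAM for this one layer during loading; freeing system memory (or loading with more RAM available) is what this particular error is asking for.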
