The contents of wll.py are as follows:
# Test - GPU
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).half().cuda()
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
Why am I getting a CPU error? Something is off with the small model. I'm clearly running it on the GPU. Only THUDM/chatglm-6b works correctly, but I don't have enough RAM for that one. I can't even test the small model... this is rough.
(mygpt) D:\dzkj\chatGlmBase>python wll.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torchvision\io\image.py:11: UserWarning: Failed to load image Python extension: Could not find module 'D:\Anaconda3\envs\mygpt\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
warn(f"Failed to load image Python extension: {e}")
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
File "wll.py", line 55, in 
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4-qe", trust_remote_code=True).half().cuda()
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\transformers\models\auto\auto_factory.py", line 459, in from_pretrained
return model_class.from_pretrained(
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\transformers\modeling_utils.py", line 2362, in from_pretrained
model = cls(config, *model_args, **model_kwargs)
File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 928, in __init__
self.lm_head = skip_init(
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\utils\init.py", line 51, in skip_init
return module_cls(*args, **kwargs).to_empty(device=final_device)
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\modules\module.py", line 780, in to_empty
return self._apply(lambda t: torch.empty_like(t, device=device))
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\modules\module.py", line 593, in _apply
param_applied = fn(param)
File "C:\ProgramData\Anaconda3\envs\mygpt\lib\site-packages\torch\nn\modules\module.py", line 780, in 
return self._apply(lambda t: torch.empty_like(t, device=device))
RuntimeError: [enforce fail at ..\c10\core\CPUAllocator.cpp:76] data. DefaultCPUAllocator: not enough memory: you tried to allocate 1233125376 bytes.
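One observation on that number: the traceback shows the failure happens inside skip_init / to_empty, i.e. the lm_head weight is being materialized in CPU RAM before .cuda() ever runs, so system memory, not the GPU, is what runs out. The 1233125376 bytes match an fp16 lm_head weight exactly, assuming the vocab_size (150528) and hidden_size (4096) from ChatGLM-6B's published config. A quick sanity check:

```python
# The failing allocation in DefaultCPUAllocator matches the fp16 lm_head weight.
# vocab_size and hidden_size are assumptions taken from ChatGLM-6B's public config.
vocab_size = 150528
hidden_size = 4096
fp16_bytes = 2  # half precision, 2 bytes per parameter

lm_head_bytes = vocab_size * hidden_size * fp16_bytes
print(lm_head_bytes)  # 1233125376 -- the exact number in the RuntimeError
```

So even the int4-quantized checkpoint briefly needs over a gigabyte of free CPU RAM for this one layer during loading; freeing system memory (or loading with more RAM available) is what this particular error is asking for.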
