[Notes] LMDeploy Learning Notes and Issues Encountered
LMDeploy is a Python library for compressing, deploying, and serving large language models (LLMs) and vision-language models (VLMs). It ships two core inference engines: the TurboMind engine, written in C++ and CUDA and focused on inference performance, and the PyTorch engine, written in pure Python to lower the barrier for developers.
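As an aside, the engine choice is also exposed in the Python API. A minimal sketch, reusing the model path from later in these notes:

```python
# Minimal sketch: picking an inference engine via lmdeploy's pipeline API.
# The model path is just an example; point it at your own local model.
from lmdeploy import pipeline, TurbomindEngineConfig, PytorchEngineConfig

# TurboMind engine (C++/CUDA, performance-oriented)
pipe = pipeline("/mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct",
                backend_config=TurbomindEngineConfig())

# PyTorch engine (pure Python, easier to hack on):
# pipe = pipeline("/mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct",
#                 backend_config=PytorchEngineConfig())

print(pipe(["Briefly introduce LMDeploy."]))
```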
LMDeploy supports deploying LLMs and VLMs on both Linux and Windows and requires CUDA 11.3 or newer. It is compatible with the following NVIDIA GPUs:
- Volta (sm70): V100
- Turing (sm75): 20 series, T4
- Ampere (sm80, sm86): 30 series, A10, A16, A30, A100
- Ada Lovelace (sm89): 40 series
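If you are unsure which architecture your card is, its compute capability can be read off with PyTorch (this assumes a CUDA-enabled torch build):

```python
import torch

# e.g. prints "NVIDIA A10" and (8, 6) for an sm86 Ampere card
print(torch.cuda.get_device_name(0))
print(torch.cuda.get_device_capability(0))
```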
In my experience, LMDeploy makes better use of GPU memory than vLLM.
```
nvitop  # monitor GPU memory usage
```
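Some of that frugality is tunable: TurboMind's `cache_max_entry_count` sets the fraction of free GPU memory handed to the KV cache. A minimal sketch, with 0.5 as an arbitrary example value:

```python
# Sketch: cap the KV-cache memory fraction for the TurboMind engine.
# cache_max_entry_count is the ratio of free GPU memory used for KV cache;
# 0.5 below is just an example, not a recommendation.
from lmdeploy import pipeline, TurbomindEngineConfig

pipe = pipeline("/mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct",
                backend_config=TurbomindEngineConfig(cache_max_entry_count=0.5))
```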
Install lmdeploy in a clean conda environment (Python 3.8 - 3.12).
1. Installation
**Python 3.12 is currently not recommended on Linux** (installation fails, see Error 1 below), yet the same version installs fine on Windows, which is puzzling. On Windows, however, the torch that gets installed has no CUDA support, so the server fails at startup; that error is covered in section 6.
```
conda create -n lmdeploy python=3.12 -y
conda activate lmdeploy
pip install lmdeploy
```
2. Error 1
During pip install lmdeploy, downloading fire-0.7.0.tar.gz fails: this version of fire ships as a setup.py-based sdist, which cannot build under Python 3.12 because setuptools/distutils is not available in the build environment.
```
(lmdeploy) root@dsw-942822-5c5dcbf687-85ktw:/mnt/workspace/Anaconda3/envs# pip install lmdeploy
Collecting lmdeploy
  Downloading lmdeploy-0.7.2.post1-cp312-cp312-manylinux2014_x86_64.whl.metadata (17 kB)
Collecting accelerate>=0.29.3 (from lmdeploy)
  Downloading accelerate-1.5.2-py3-none-any.whl.metadata (19 kB)
Collecting einops (from lmdeploy)
  Downloading einops-0.8.1-py3-none-any.whl.metadata (13 kB)
Collecting fastapi (from lmdeploy)
  Downloading fastapi-0.115.11-py3-none-any.whl.metadata (27 kB)
Collecting fire (from lmdeploy)
  Downloading fire-0.7.0.tar.gz (87 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [3 lines of output]
      /mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.12/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
      warnings.warn(
      ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
```
Python 3.8 - 3.11 is recommended instead:
```
conda create -n lmdeploy python=3.11 -y
conda activate lmdeploy
pip install lmdeploy
```
No more errors this time.
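A quick sanity check that the install really worked:

```python
import lmdeploy
print(lmdeploy.__version__)  # should print the installed version, e.g. 0.7.x
```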
3. Startup
Pass the absolute path of the model you downloaded (on Linux):
```
lmdeploy serve api_server /mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct
```
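By default the server listens on port 23333 and exposes an OpenAI-compatible API; port and model name can be overridden with `--server-port` and `--model-name`. If you prefer staying in Python, lmdeploy also exports a serve helper; a hedged sketch, where the keyword names below are my assumption mirroring the CLI flags (double-check against your installed version):

```python
# Hedged sketch: launching the api_server from Python instead of the CLI.
# server_name/server_port are assumed to mirror the CLI flags.
from lmdeploy import serve

serve("/mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct",
      server_name="0.0.0.0",
      server_port=23333)
```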
4. Error 2
Startup fails with the following error:
```
(lmdeploy) root@dsw-942822-5c5dcbf687-85ktw:/mnt/workspace/Anaconda3/envs# lmdeploy serve api_server /mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct
Traceback (most recent call last):
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
             ^^^^^
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/entrypoint.py", line 14, in run
    SubCliServe.add_parsers()
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/serve.py", line 361, in add_parsers
    SubCliServe.add_parser_api_server()
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/serve.py", line 142, in add_parser_api_server
    ArgumentHelper.tool_call_parser(parser_group)
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/utils.py", line 375, in tool_call_parser
    from lmdeploy.serve.openai.tool_parser import ToolParserManager
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/serve/openai/tool_parser/__init__.py", line 2, in <module>
    from .internlm2_parser import Internlm2ToolParser
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/serve/openai/tool_parser/internlm2_parser.py", line 6, in <module>
    import partial_json_parser
ModuleNotFoundError: No module named 'partial_json_parser'
```
Cause: the partial_json_parser module is missing. It is one of lmdeploy's dependencies, but it does not always get installed automatically. The fix:
1. Install the missing dependency:
```
pip install partial-json-parser
```
2. Re-run the lmdeploy serve command:
```
lmdeploy serve api_server /mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct
```
This time the server starts without errors.
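Before wiring up a client, a quick check that the server is actually answering, via the OpenAI-compatible model-list endpoint (default port 23333):

```python
# Sanity check: list the models served by the running api_server.
import urllib.request

with urllib.request.urlopen("http://localhost:23333/v1/models") as resp:
    print(resp.read().decode())
```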
If the openai package is not installed yet, remember to install it:
```
pip install openai
```
5. Code test (on Linux)
```python
# Multi-turn chat
from openai import OpenAI

# Define the multi-turn chat loop
def run_chat_session():
    # Initialize the client
    client = OpenAI(base_url="http://localhost:23333/v1/", api_key="123456")
    # Initialize the chat history
    chat_history = []
    # Start the chat loop
    while True:
        # Read user input
        user_input = input("User: ")
        if user_input.lower() == "exit":
            print("Exiting chat.")
            break
        # Update the chat history (append the user message)
        chat_history.append({"role": "user", "content": user_input})
        # Ask the model for a reply
        try:
            chat_completion = client.chat.completions.create(
                messages=chat_history,
                model="/mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct",
            )
            # Grab the latest reply
            model_response = chat_completion.choices[0]
            print("AI:", model_response.message.content)
            # Update the chat history (append the assistant reply)
            chat_history.append({"role": "assistant", "content": model_response.message.content})
        except Exception as e:
            print("Error:", e)
            break

if __name__ == '__main__':
    run_chat_session()
```
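A variation worth knowing: the same endpoint supports streaming, so the reply can be printed token by token instead of waiting for the full message. A minimal sketch using the same base_url and model path:

```python
# Streaming variant: print tokens as they arrive from the api_server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:23333/v1/", api_key="123456")
stream = client.chat.completions.create(
    model="/mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct",
    messages=[{"role": "user", "content": "Introduce yourself briefly."}],
    stream=True,  # server sends incremental deltas
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```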
6. On Windows, the installed torch has no CUDA support, so startup fails with the error below
```
(lmdeploy) PS C:\Users\fengxinzi> lmdeploy serve api_server "D:\Program Files\python\PycharmProjects\AiStudyProject\demo06\models\Qwen\Qwen2___5-0___5B-Instruct"
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\envs\lmdeploy\Scripts\lmdeploy.exe\__main__.py", line 7, in <module>
  File "D:\envs\lmdeploy\Lib\site-packages\lmdeploy\cli\entrypoint.py", line 39, in run
    args.run(args)
  File "D:\envs\lmdeploy\Lib\site-packages\lmdeploy\cli\serve.py", line 283, in api_server
    else get_max_batch_size(args.device)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\lmdeploy\utils.py", line 338, in get_max_batch_size
    device_name = torch.cuda.get_device_name(0).lower()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\torch\cuda\__init__.py", line 493, in get_device_name
    return get_device_properties(device).name
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\torch\cuda\__init__.py", line 523, in get_device_properties
    _lazy_init()  # will define _get_device_properties
    ^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\torch\cuda\__init__.py", line 310, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
```
conda list also confirms there is no CUDA-enabled torch in the environment.
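The quickest confirmation from Python itself:

```python
import torch

# False on a CPU-only build; should turn True once a cu118/cu121 wheel is installed
print(torch.cuda.is_available())
print(torch.version.cuda)  # None on CPU-only builds
```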
The error means this PyTorch build was compiled without CUDA support, so we need to install a CUDA-enabled build (CUDA 11.8 or newer):
```
# pick one of the two
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
After the install, start the server again; no more errors.
With that, the client code can connect and everything runs.