
[Notes] LMDeploy learning notes and problems encountered

LMDeploy is a Python library for compressing, deploying, and serving large language models (LLMs) and vision-language models (VLMs). Its core inference engines are the TurboMind engine and the PyTorch engine. The former is written in C++ and CUDA and focuses on inference performance; the latter is pure Python and aims to lower the barrier to entry for developers.

LMDeploy supports deploying LLMs and VLMs on Linux and Windows, with a minimum CUDA version of 11.3. It is compatible with the following NVIDIA GPUs:

  • Volta (sm70): V100
  • Turing (sm75): 20 series, T4
  • Ampere (sm80, sm86): 30 series, A10, A16, A30, A100
  • Ada Lovelace (sm89): 40 series
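The compute-capability list above can be turned into a quick lookup. A minimal sketch (the set literal just restates the `sm` numbers listed above; the function name is my own):

```python
# Compute capabilities LMDeploy supports, per the compatibility list above.
SUPPORTED_SM = {70, 75, 80, 86, 89}

def is_supported_gpu(sm: int) -> bool:
    """Return True if a GPU with compute capability `sm` is in the supported list."""
    return sm in SUPPORTED_SM
```

For example, a V100 (sm70) passes the check, while a Pascal-era P100 (sm60) does not.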

In my experience, LMDeploy's GPU memory optimization is better than vLLM's.

nvitop  # check GPU memory usage

Install lmdeploy in a clean conda environment (Python 3.8 – 3.12):

1. Installation

**Python 3.12 is currently not recommended on Linux**, although, oddly, the same install works without errors on Windows. Note, however, that the torch installed on Windows does not come with CUDA, so startup fails; that error message is shown below in section 6.

conda create -n lmdeploy python=3.12 -y
conda activate lmdeploy
pip install lmdeploy

2. Error 1

During pip install lmdeploy, downloading fire-0.7.0.tar.gz fails: this version of fire is incompatible with Python 3.12, so its metadata build step errors out.

(lmdeploy) root@dsw-942822-5c5dcbf687-85ktw:/mnt/workspace/Anaconda3/envs# pip install lmdeploy
Collecting lmdeploy
  Downloading lmdeploy-0.7.2.post1-cp312-cp312-manylinux2014_x86_64.whl.metadata (17 kB)
Collecting accelerate>=0.29.3 (from lmdeploy)
  Downloading accelerate-1.5.2-py3-none-any.whl.metadata (19 kB)
Collecting einops (from lmdeploy)
  Downloading einops-0.8.1-py3-none-any.whl.metadata (13 kB)
Collecting fastapi (from lmdeploy)
  Downloading fastapi-0.115.11-py3-none-any.whl.metadata (27 kB)
Collecting fire (from lmdeploy)
  Downloading fire-0.7.0.tar.gz (87 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [3 lines of output]
      /mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.12/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
        warnings.warn(
      ERROR: Can not execute `setup.py` since setuptools is not available in the build environment.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Python 3.8 – 3.11 is recommended instead:

conda create -n lmdeploy python=3.11 -y
conda activate lmdeploy
pip install lmdeploy

The installation now completes without errors.

3. Startup

Start the server with the absolute path of the model downloaded on Linux:

lmdeploy serve api_server /mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct
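By default, `api_server` listens on port 23333 and exposes an OpenAI-compatible API under `/v1` (the port can be overridden at launch with `--server-port`). A small sketch for building endpoint URLs against it (the helper name is my own):

```python
def api_server_url(host="localhost", port=23333, path="/v1/models"):
    """Build an endpoint URL for lmdeploy's api_server.

    Port 23333 is LMDeploy's default.
    """
    return f"http://{host}:{port}{path}"
```

With the server running, fetching `api_server_url()` (e.g. with curl or a browser) lists the models the server is serving.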

4. Error 2

The following error appears during startup:

(lmdeploy) root@dsw-942822-5c5dcbf687-85ktw:/mnt/workspace/Anaconda3/envs# lmdeploy serve api_server /mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct
Traceback (most recent call last):
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
             ^^^^^
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/entrypoint.py", line 14, in run
    SubCliServe.add_parsers()
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/serve.py", line 361, in add_parsers
    SubCliServe.add_parser_api_server()
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/serve.py", line 142, in add_parser_api_server
    ArgumentHelper.tool_call_parser(parser_group)
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/cli/utils.py", line 375, in tool_call_parser
    from lmdeploy.serve.openai.tool_parser import ToolParserManager
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/serve/openai/tool_parser/__init__.py", line 2, in <module>
    from .internlm2_parser import Internlm2ToolParser
  File "/mnt/workspace/Anaconda3/envs/lmdeploy/lib/python3.11/site-packages/lmdeploy/serve/openai/tool_parser/internlm2_parser.py", line 6, in <module>
    import partial_json_parser
ModuleNotFoundError: No module named 'partial_json_parser'

Cause: the partial_json_parser module is missing. It is one of lmdeploy's dependencies, but it may not be installed automatically. The fix:


1. Install the missing dependency:
pip install partial-json-parser
2. Re-run the lmdeploy serve command:
lmdeploy serve api_server /mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct
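Before relaunching, you can confirm a dependency like this is importable with the standard-library `importlib.util.find_spec`; a small sketch (the helper name is my own):

```python
import importlib.util

def is_installed(module_name):
    """Return True if `module_name` can be imported in the current environment."""
    return importlib.util.find_spec(module_name) is not None

# After the pip install above, is_installed("partial_json_parser") should be True.
```

Note the pip package is named partial-json-parser (hyphens), while the importable module is partial_json_parser (underscores).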

Relaunching now works without errors.
If openai is not installed yet, remember to install it:

pip install openai

5. Code test (on Linux)

# Multi-turn chat client
from openai import OpenAI

def run_chat_session():
    # Initialize the client (the api_key can be any non-empty string for a local server)
    client = OpenAI(base_url="http://localhost:23333/v1/", api_key="123456")
    # Conversation history
    chat_history = []
    # Chat loop
    while True:
        user_input = input("User: ")
        if user_input.lower() == "exit":
            print("Exiting the chat.")
            break
        # Append the user message to the history
        chat_history.append({"role": "user", "content": user_input})
        try:
            # Ask the model
            chat_completion = client.chat.completions.create(
                messages=chat_history,
                model="/mnt/workspace/llm/Qwen/Qwen2.5-0.5B-Instruct",
            )
            # Read the latest reply
            model_response = chat_completion.choices[0]
            print("AI:", model_response.message.content)
            # Append the assistant reply to the history
            chat_history.append({"role": "assistant", "content": model_response.message.content})
        except Exception as e:
            print("Error:", e)
            break

if __name__ == '__main__':
    run_chat_session()

6. On Windows, the installed torch does not come with CUDA, so startup fails with the error below

(lmdeploy) PS C:\Users\fengxinzi> lmdeploy serve api_server "D:\Program Files\python\PycharmProjects\AiStudyProject\demo06\models\Qwen\Qwen2___5-0___5B-Instruct"
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "D:\envs\lmdeploy\Scripts\lmdeploy.exe\__main__.py", line 7, in <module>
  File "D:\envs\lmdeploy\Lib\site-packages\lmdeploy\cli\entrypoint.py", line 39, in run
    args.run(args)
  File "D:\envs\lmdeploy\Lib\site-packages\lmdeploy\cli\serve.py", line 283, in api_server
    else get_max_batch_size(args.device)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\lmdeploy\utils.py", line 338, in get_max_batch_size
    device_name = torch.cuda.get_device_name(0).lower()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\torch\cuda\__init__.py", line 493, in get_device_name
    return get_device_properties(device).name
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\torch\cuda\__init__.py", line 523, in get_device_properties
    _lazy_init()  # will define _get_device_properties
    ^^^^^^^^^^^^
  File "D:\envs\lmdeploy\Lib\site-packages\torch\cuda\__init__.py", line 310, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

conda list also confirms that no CUDA packages are installed.

The error means this PyTorch build was compiled without CUDA support.
We therefore need to reinstall torch with CUDA enabled, version 11.8 at minimum:

# Pick one of the two
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
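The only difference between the two commands is the CUDA tag in the wheel index URL (cu118 vs. cu121). A hypothetical helper that makes the mapping explicit, assuming the "cuXYZ" naming convention seen in the URLs above:

```python
def cuda_index_url(cuda_version):
    """Map a CUDA version string like "11.8" to the matching PyTorch wheel index URL.

    Hypothetical helper: it just applies the "cuXYZ" tag convention
    visible in the two commands above.
    """
    tag = "cu" + cuda_version.replace(".", "")
    return f"https://download.pytorch.org/whl/{tag}"
```

After reinstalling, `python -c "import torch; print(torch.cuda.is_available())"` should print True before you relaunch the server.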

After installing, the server starts again without errors.
The chat client code above can now connect, and everything runs.

