How to Generate Text with the Ollama API
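Introduction
Generate text
Text generation examples
Load a model
Unload a model
Introduction
Ollama provides a RESTful API that lets developers interact with the Ollama service over HTTP. The API covers all of Ollama's core functionality, including model management, execution, and monitoring. This article shows how to call that API to generate text.
Generate text
Endpoint: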
POST /api/generate
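This endpoint generates a response for a given prompt using the specified model. It is a streaming endpoint, so by default the reply arrives as a series of response objects; set stream to false to receive a single reply instead. The final response object includes statistics and other data about the request.
1. Parameters
Basic parameters:
- model (required): the model name
- prompt: the prompt to generate a response for (if it is omitted, no text is generated and the model is only loaded into memory, as described under "Load a model")
- suffix: the text that follows the model's response
- images (optional): a list of Base64-encoded images (for multimodal models such as LLaVA)
Advanced parameters (optional):
- format: the format to return the response in. The value can be json or a JSON schema
- options: additional model parameters listed in the Modelfile documentation, such as temperature
- system: the system message (overrides what is defined in the Modelfile)
- template: the prompt template to use (overrides what is defined in the Modelfile)
- stream: if false, the response is returned as a single response object rather than a stream of objects
- raw: if true, no formatting is applied to the prompt. Use raw when the request already contains a fully templated prompt
- keep_alive: how long the model stays loaded in memory after the request (default: 5 minutes)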
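The parameters above map directly onto the JSON body of the request. The following is a minimal Python sketch using only the standard library, assuming a local server at the same address as the curl examples in this article. The helper names `build_generate_request` and `send` are illustrative and not part of Ollama.

```python
import json
import urllib.request

# Assumed address of a local Ollama server, as in the curl examples.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt=None, **extra):
    """Build (but do not send) a POST request for /api/generate.

    `extra` carries optional parameters such as stream, system, format
    or options. Parameters that are not passed are left out of the body.
    """
    payload = {"model": model}
    if prompt is not None:
        payload["prompt"] = prompt
    payload.update(extra)
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and parse the body (for stream=False replies only)."""
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


# Build a request without contacting the server and inspect its body.
req = build_generate_request(
    "llama3.2",
    "Why is the sky blue?",
    stream=False,
    system="You are a concise assistant.",
    options={"temperature": 0},
)
body = json.loads(req.data)
print(body)
```

Calling `send(req)` would perform the actual HTTP call; it is kept separate so the request body can be inspected without a running server.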
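2. Structured outputs and JSON mode
Structured outputs: provide a JSON schema in the format parameter, and the model generates a response that conforms to that schema. See the structured outputs example below.
JSON mode: set the format parameter to json to enable JSON mode, which makes the response a valid JSON object. See the JSON mode example below.
Text generation examples
1. Generate request (streaming)
Request: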
curl http://localhost:11434/api/generate -d '{"model": "deepseek-r1:32b","prompt": "你是谁?"
}'
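Note: if the model does not exist locally, the API does not pull it from the library automatically the way the command line does; it simply returns an error saying the model was not found.
Response:
This returns a stream of JSON objects. One of them looks like this: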
{"model": "deepseek-r1:32b","created_at": "2025-02-27T07:01:05.3153434Z","response": "我是","done": false
}
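A client has to stitch the streamed fragments back together. The sketch below assumes the stream is delivered as one JSON object per line and that each object carries a `response` fragment and a `done` flag, as in the object shown above. The function name `collect_stream` and the trimmed sample lines are illustrative.

```python
import json


def collect_stream(lines):
    """Join the `response` fragments of a streamed reply.

    `lines` is any iterable of JSON strings, one object per line, for
    example the lines read from the HTTP response body. Returns the full
    text and the final object (the one whose `done` is true), or None as
    the second value if the stream ended before a final object arrived.
    """
    parts = []
    final = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between objects
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            final = chunk
            break
    return "".join(parts), final


# Two trimmed sample objects standing in for a real stream.
sample = [
    '{"model": "deepseek-r1:32b", "response": "我是", "done": false}',
    '{"model": "deepseek-r1:32b", "response": "", "done": true, "done_reason": "stop"}',
]
text, final = collect_stream(sample)
print(text, final["done_reason"])
```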
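The final response in the stream also includes additional data about the generation:
- total_duration: total time spent on the request, including model loading (in nanoseconds)
- load_duration: time spent loading the model (in nanoseconds)
- prompt_eval_count: number of tokens in the prompt
- prompt_eval_duration: time spent evaluating the prompt (in nanoseconds)
- eval_count: number of tokens in the response
- eval_duration: time spent generating the response (in nanoseconds)
- context: an encoding of the conversation used in this response; send it in the next request to keep the conversational memory
- response: empty if the response was streamed; if not streamed, it contains the full response
Once the output is complete, done becomes true and the fields above are included. To compute the generation speed in tokens per second, divide eval_count by eval_duration and multiply by 10^9, since eval_duration is in nanoseconds. The final JSON object looks like this: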
{"model": "deepseek-r1:32b","created_at": "2025-02-27T07:01:06.4779939Z","response": "","done": true,"done_reason": "stop","context": [151644,105043,100165,11319,151645,151648,271,151649,198,198,111308,6313,104198,67071,105538,102217,30918,50984,9909,33464,39350,7552,73218,100013,9370,100168,110498,33464,39350,12,49,16,1773,29524,87026,110117,99885,86119,3837,105351,99739,35946,111079,113445,100364,1773],"total_duration": 63783363300,"load_duration": 62248853200,"prompt_eval_count": 6,"prompt_eval_duration": 78000000,"eval_count": 40,"eval_duration": 1451000000
}
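As a worked example of the tokens-per-second formula, the sketch below plugs in the timing fields from the final object shown above. The function name `tokens_per_second` is illustrative.

```python
def tokens_per_second(final):
    """Generation speed from the final response object.

    eval_duration is reported in nanoseconds, hence the factor 1e9.
    """
    return final["eval_count"] / final["eval_duration"] * 1e9


# Timing fields copied from the final object in the streaming example.
final = {
    "total_duration": 63783363300,
    "load_duration": 62248853200,
    "eval_count": 40,
    "eval_duration": 1451000000,
}
speed = tokens_per_second(final)
load_share = final["load_duration"] / final["total_duration"]
print(f"{speed:.1f} tokens/s, {load_share:.0%} of the time spent loading")
```

In this sample almost all of the total time went into loading the model, which is typical for the first request after a model has been unloaded.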
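2. Generate request (non-streaming)
Request:
With streaming turned off, the complete response arrives in a single reply. The request looks like this: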
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","prompt": "Why is the sky blue?","stream": false
}'
Response:
Because stream is set to false in the request, the response is a single JSON object:
{"model": "llama3.2","created_at": "2025-02-27T07:42:46.982222Z","response": "The sky appears blue because of a phenomenon called Rayleigh scattering, ..., making shorter wavelengths like blue and violet more visible to our eyes.","done": true,"done_reason": "stop","context": [1, 2, 3],"total_duration": 10079066100,"load_duration": 8257234300,"prompt_eval_count": 31,"prompt_eval_duration": 73000000,"eval_count": 285,"eval_duration": 1747000000
}
3. Generate request (with suffix)
Request:
curl http://localhost:11434/api/generate -d '{"model": "codellama:code","prompt": "def compute_gcd(a, b):","suffix": " return result","options": {"temperature": 0},"stream": false
}'
Response:
{"model": "codellama:code","created_at": "2025-02-27T07:46:10.1774118Z","response": "\n if a == 0:\n return b\n else:\n return compute_gcd(b % a, a)\n\ndef compute_lcm(a, b):\n result = (a * b) / compute_gcd(a, b)\n <EOT>","done": true,"done_reason": "stop","context": [...],"total_duration": 12949560100,"load_duration": 12393509100,"prompt_eval_count": 17,"prompt_eval_duration": 57000000,"eval_count": 64,"eval_duration": 498000000
}
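Since suffix is the text that follows the model's response, the completed code is the prompt, then the generated middle part, then the suffix. The sketch below assumes that a trailing `<EOT>` marker like the one in the sample response should be dropped before joining. The helper name `assemble_infill` is illustrative, and other models may use a different marker or none at all.

```python
EOT_MARKER = "<EOT>"  # end-of-text marker seen in the sample response


def assemble_infill(prompt, response, suffix):
    """Place the generated middle part between the prompt and the suffix."""
    middle = response
    marker_at = middle.find(EOT_MARKER)
    if marker_at != -1:
        # Drop the marker and anything after it.
        middle = middle[:marker_at]
    return prompt + middle + suffix


code = assemble_infill(
    "def compute_gcd(a, b):",
    "\n    if a == 0:\n        return b\n    result = compute_gcd(b % a, a)\n<EOT>",
    "    return result",
)
print(code)
```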
4. Generate request (structured outputs)
Request:
curl -X POST http://localhost:11434/api/generate -H "Content-Type: application/json" -d '{"model": "llama3.2","prompt": "Ollama is 22 years old and is busy saving the world. Respond using JSON","stream": false,"format": {"type": "object","properties": {"age": {"type": "integer"},"available": {"type": "boolean"}},"required": ["age","available"]}
}'
Response:
{"model": "llama3.2","created_at": "2025-02-27T07:49:37.5080349Z","response": "{\"age\": 22, \"available\": false}","done": true,"done_reason": "stop","context": [...],"total_duration": 8505815800,"load_duration": 8227758700,"prompt_eval_count": 43,"prompt_eval_duration": 54000000,"eval_count": 16,"eval_duration": 221000000
}
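The `response` field comes back as a JSON-encoded string, so the client still has to decode it. The sketch below decodes it and checks that the keys listed under `required` in the schema are present. This is only a presence check, not full JSON Schema validation, and the helper name `parse_structured` is illustrative.

```python
import json

# The schema sent in the format parameter of the request above.
schema = {
    "type": "object",
    "properties": {"age": {"type": "integer"}, "available": {"type": "boolean"}},
    "required": ["age", "available"],
}


def parse_structured(reply, schema):
    """Decode reply["response"] and check that required keys are present."""
    data = json.loads(reply["response"])
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data


# Trimmed copy of the reply shown above.
reply = {"response": '{"age": 22, "available": false}', "done": True}
person = parse_structured(reply, schema)
print(person["age"], person["available"])
```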
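5. Generate request (JSON mode)
When format is set to json, the output is a complete JSON object. Setting the parameter is not enough on its own: it is also important to tell the model in the prompt to reply in JSON, otherwise it may generate large amounts of whitespace.
Request: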
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","prompt": "What color is the sky at different times of the day? Respond using JSON","format": "json","stream": false
}'
Response:
{"model": "llama3.2","created_at": "2025-02-27T07:51:55.0642636Z","response": "{\"times\": [\n {\"time\": \"Morning\", \"sky_color\": \"Light Blue\"},\n {\"time\": \"Day\", \"sky_color\": \"Blue\"},\n {\"time\": \"Afternoon\", \"sky_color\": \"Pale Blue to Grey\"},\n {\"time\": \"Evening\", \"sky_color\": \"Orange to Pink\"}\n], \"additional_info\": {\n \"Explanation\": \"The color of the sky changes throughout the day due to the changing angle of sunlight and atmospheric conditions.\"\n}}","done": true,"done_reason": "stop","context": [...],"total_duration": 660195300,"load_duration": 19656100,"prompt_eval_count": 40,"prompt_eval_duration": 5000000,"eval_count": 101,"eval_duration": 634000000
}
The response field is itself a JSON-encoded string. Unescaped and parsed, it has the following structure:
"response": {"times": [{"time": "Morning", "sky_color": "Light Blue"},{"time": "Day", "sky_color": "Blue"},{"time": "Afternoon", "sky_color": "Pale Blue to Grey"},{"time": "Evening", "sky_color": "Orange to Pink"}], "additional_info": {"Explanation": "The color of the sky changes throughout the day due to the changing angle of sunlight and atmospheric conditions."}}
6. Generate request (with images)
Multimodal models such as LLaVA or BakLLaVA accept images. Pass them in the images parameter as a list of Base64-encoded images.
Request:
curl http://localhost:11434/api/generate -d '{"model": "llava","prompt":"What is in this picture?","stream": false,"images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUj
VUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM
3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
}'
Response:
{"model": "llava","created_at": "2025-02-27T08:45:21.5525754Z","response": " The image shows an animated character that appears to be a cute, cartoonish animal with two antennae on its head and one large ear. The character has a simplistic design with big eyes and a small mouth, giving it a happy or surprised expression. It seems to be waving with both arms raised. The character is styled in a playful manner that is typical for cute animal mascots often found in various forms of media, such as games, stickers, and graphic art. ","done": true,"done_reason": "stop","context": [...],"total_duration": 14909229400,"load_duration": 13603557400,"prompt_eval_count": 594,"prompt_eval_duration": 439000000,"eval_count": 103,"eval_duration": 865000000
}
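The long string in the request is simply the image file's bytes encoded as Base64. The sketch below shows one way to produce such a string in Python. The helper names are illustrative, and the demo encodes only the 8-byte PNG file signature rather than a real image.

```python
import base64


def encode_image_bytes(raw):
    """Return the Base64 text form of raw image bytes."""
    return base64.b64encode(raw).decode("ascii")


def encode_image_file(path):
    """Read an image file from disk and return its Base64 text form."""
    with open(path, "rb") as handle:
        return encode_image_bytes(handle.read())


# Demo: the 8-byte PNG signature stands in for a complete image.
demo = encode_image_bytes(b"\x89PNG\r\n\x1a\n")
payload = {
    "model": "llava",
    "prompt": "What is in this picture?",
    "stream": False,
    "images": [demo],
}
print(demo)
```

For a real request, `encode_image_file("picture.png")` (a hypothetical path) would replace the demo value.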
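7. Generate request (raw mode)
Sometimes you want to bypass the templating system and supply the full prompt yourself. In that case, use raw mode to disable templating. Note that raw mode does not return a context.
Request: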
curl http://localhost:11434/api/generate -d '{"model": "mistral","prompt": "[INST] why is the sky blue? [/INST]","raw": true,"stream": false
}'
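With raw set to true, the client is responsible for applying the prompt format itself. The sketch below wraps a user message in the `[INST] ... [/INST]` markers used in the request above. The helper name `mistral_instruct_prompt` is illustrative, and the marker format is taken from this example only; other models expect different templates.

```python
def mistral_instruct_prompt(user_message):
    """Wrap a message in the [INST] ... [/INST] markers from the example."""
    return f"[INST] {user_message} [/INST]"


payload = {
    "model": "mistral",
    "prompt": mistral_instruct_prompt("why is the sky blue?"),
    "raw": True,  # no template is applied to the prompt
    "stream": False,
}
print(payload["prompt"])
```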
8. Generate request (reproducible outputs)
To get the same output for the same request, set a seed value.
Request:
curl http://localhost:11434/api/generate -d '{"model": "mistral","prompt": "Why is the sky blue?","options": {"seed": 123},"stream": false
}'
Response:
First response:
{"model": "mistral","created_at": "2025-02-27T08:40:43.0961712Z","response": " The sky appears blue due to a process called Rayleigh scattering. As sunlight reaches Earth, it is made up of different wavelengths of light. Short wavelengths, such as violet and blue, are scattered in all directions more than longer wavelengths like red, orange, and yellow because they interact more with molecules in the atmosphere. While we see the sun as white, it contains all colors. When sunlight enters Earth's atmosphere, the shorter wavelengths (blue and violet) get scattered in every direction. However, our eyes are more sensitive to blue light, and we perceive the sky as blue rather than violet. At sunrise or sunset, the sky can appear red or orange because at these times, light has to pass through a thicker layer of the atmosphere, allowing longer wavelengths like red and orange to be scattered in our direction.","done": true,"done_reason": "stop","context": [...],"total_duration": 1397722900,"load_duration": 3039000,"prompt_eval_count": 11,"prompt_eval_duration": 4000000,"eval_count": 183,"eval_duration": 1389000000
}
Second response with the same seed (the generated text is identical; only the timestamp and the timing fields differ):
{"model": "mistral","created_at": "2025-02-27T08:42:36.5417829Z","response": " The sky appears blue due to a process called Rayleigh scattering. As sunlight reaches Earth, it is made up of different wavelengths of light. Short wavelengths, such as violet and blue, are scattered in all directions more than longer wavelengths like red, orange, and yellow because they interact more with molecules in the atmosphere. While we see the sun as white, it contains all colors. When sunlight enters Earth's atmosphere, the shorter wavelengths (blue and violet) get scattered in every direction. However, our eyes are more sensitive to blue light, and we perceive the sky as blue rather than violet. At sunrise or sunset, the sky can appear red or orange because at these times, light has to pass through a thicker layer of the atmosphere, allowing longer wavelengths like red and orange to be scattered in our direction.","done": true,"done_reason": "stop","context": [...],"total_duration": 1400445300,"load_duration": 4049800,"prompt_eval_count": 11,"prompt_eval_duration": 2000000,"eval_count": 183,"eval_duration": 1394000000
}
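To check reproducibility programmatically, two replies can be compared after the fields that naturally change between runs are dropped. The sketch below does this. The set of volatile fields is an assumption based on the two responses shown above, and the sample dictionaries are trimmed stand-ins for them.

```python
# Fields that differ between runs even when the generated text is identical.
VOLATILE_FIELDS = {
    "created_at",
    "total_duration",
    "load_duration",
    "prompt_eval_duration",
    "eval_duration",
}


def stable_part(reply):
    """Return the reply without timestamps and timing fields."""
    return {key: value for key, value in reply.items() if key not in VOLATILE_FIELDS}


def same_output(first, second):
    """True if two replies match in everything except the volatile fields."""
    return stable_part(first) == stable_part(second)


first = {
    "model": "mistral",
    "created_at": "2025-02-27T08:40:43.0961712Z",
    "response": " The sky appears blue due to a process called Rayleigh scattering.",
    "eval_count": 183,
    "eval_duration": 1389000000,
}
second = dict(first, created_at="2025-02-27T08:42:36.5417829Z", eval_duration=1394000000)
print(same_output(first, second))
```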
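9. Generate request (with options)
To set model parameters at request time instead of in the Modelfile, use options. The example below sets every available option; to set only some of them, include just those in options.
Request: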
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","prompt": "Why is the sky blue?","stream": false,"options": {"num_keep": 5,"seed": 42,"num_predict": 100,"top_k": 20,"top_p": 0.9,"min_p": 0.0,"typical_p": 0.7,"repeat_last_n": 33,"temperature": 0.8,"repeat_penalty": 1.2,"presence_penalty": 1.5,"frequency_penalty": 1.0,"mirostat": 1,"mirostat_tau": 0.8,"mirostat_eta": 0.6,"penalize_newline": true,"stop": ["\n", "user:"],"numa": false,"num_ctx": 1024,"num_batch": 2,"num_gpu": 1,"main_gpu": 0,"low_vram": false,"vocab_only": false,"use_mmap": true,"use_mlock": false,"num_thread": 8}
}'
Response:
{"model": "llama3.2","created_at": "2025-02-27T08:01:58.8596643Z","response": "The sky appears blue because of a phenomenon called scattering, which occurs when sunlight interacts with the tiny molecules of gases in the Earth's atmosphere. Here's a simplified explanation:","done": true,"done_reason": "stop","context": [...],"total_duration": 9641145600,"load_duration": 7649268900,"prompt_eval_count": 31,"prompt_eval_duration": 817000000,"eval_count": 34,"eval_duration": 1174000000
}
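In practice a request usually sets only a few options. The sketch below builds an options object containing just the values that were given. The helper name `pick_options` is illustrative, and the option names are taken from the full request above.

```python
def pick_options(**settings):
    """Keep only the options that were given a value (None means unset)."""
    return {name: value for name, value in settings.items() if value is not None}


payload = {
    "model": "llama3.2",
    "prompt": "Why is the sky blue?",
    "stream": False,
    # top_k is left unset, so it does not appear in the request body.
    "options": pick_options(temperature=0.8, seed=42, top_k=None, num_ctx=1024),
}
print(payload["options"])
```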
Load a model
To load a model without generating anything, omit the prompt parameter. The model is then loaded straight into memory.
Request:
curl http://localhost:11434/api/generate -d '{"model": "llama3.2"
}'
Response:
{"model": "llama3.2","created_at": "2025-02-27T08:04:29.3483927Z","response": "","done": true,"done_reason": "load"
}
Unload a model
To unload a model from memory, send the same request as for loading and additionally set keep_alive to 0.
Request:
curl http://localhost:11434/api/generate -d '{"model": "llama3.2","keep_alive": 0
}'
Response:
{"model": "llama3.2","created_at": "2025-02-27T08:05:12.4781816Z","response": "","done": true,"done_reason": "unload"
}
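The load and unload requests differ only in the keep_alive field, and the replies can be told apart by done_reason. The sketch below builds both request bodies and interprets the reply. The helper names are illustrative, and the done_reason values are the ones shown in the two responses above.

```python
def load_payload(model):
    """Request body that loads a model: no prompt is sent."""
    return {"model": model}


def unload_payload(model):
    """Request body that unloads a model: no prompt, keep_alive set to 0."""
    return {"model": model, "keep_alive": 0}


def describe(reply):
    """Summarise a load/unload reply based on its done_reason field."""
    reason = reply.get("done_reason")
    if reason == "load":
        return f"{reply['model']} is loaded"
    if reason == "unload":
        return f"{reply['model']} was unloaded"
    return f"unexpected done_reason: {reason}"


# Trimmed copy of the unload reply shown above.
reply = {"model": "llama3.2", "response": "", "done": True, "done_reason": "unload"}
print(describe(reply))
```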