1# API
2
3> Note: Ollama's API docs are moving to https://docs.ollama.com/api
4
5## Endpoints
6
7- [Generate a completion](#generate-a-completion)
8- [Generate a chat completion](#generate-a-chat-completion)
9- [Create a Model](#create-a-model)
10- [List Local Models](#list-local-models)
11- [Show Model Information](#show-model-information)
12- [Copy a Model](#copy-a-model)
13- [Delete a Model](#delete-a-model)
14- [Pull a Model](#pull-a-model)
15- [Push a Model](#push-a-model)
16- [Generate Embeddings](#generate-embeddings)
17- [List Running Models](#list-running-models)
18- [Version](#version)
19- [Experimental: Image Generation](#image-generation-experimental)
20
21## Conventions
22
23### Model names
24
25Model names follow a `model:tag` format, where `model` can have an optional namespace such as `example/model`. Some examples are `orca-mini:3b-q8_0` and `llama3:70b`. The tag is optional and, if not provided, will default to `latest`. The tag is used to identify a specific version.
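
As a rough illustration of the naming rule, the sketch below splits a name into its model and tag parts and falls back to `latest` when no tag is given. The function name `parse_model_name` is hypothetical and not part of Ollama. The check for a `/` after the last colon is an added assumption, so that a colon belonging to a `host:port` prefix is not mistaken for a tag separator.

```python
def parse_model_name(name: str) -> tuple[str, str]:
    """Split a model name into (model, tag), defaulting the tag to "latest"."""
    # Treat the last colon as the tag separator, unless what follows it
    # contains a "/" (assumed here to mean the colon is part of the path).
    model, sep, tag = name.rpartition(":")
    if not sep or "/" in tag:
        return name, "latest"
    return model, tag or "latest"


print(parse_model_name("llama3:70b"))     # ('llama3', '70b')
print(parse_model_name("example/model"))  # ('example/model', 'latest')
```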
26
27### Durations
28
29All durations are returned in nanoseconds.
30
31### Streaming responses
32
33Certain endpoints stream responses as JSON objects. Streaming can be disabled by providing `{"stream": false}` for these endpoints.
34
35## Generate a completion
36
37```
38POST /api/generate
39```
40
41Generate a response for a given prompt with a provided model. This is a streaming endpoint, so there will be a series of responses. The final response object will include statistics and additional data from the request.
42
43### Parameters
44
45- `model`: (required) the [model name](#model-names)
46- `prompt`: the prompt to generate a response for
47- `suffix`: the text after the model response
48- `images`: (optional) a list of base64-encoded images (for multimodal models such as `llava`)
49- `think`: (for thinking models) should the model think before responding?
50
51Advanced parameters (optional):
52
53- `format`: the format to return a response in. Format can be `json` or a JSON schema
54- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.mdx#valid-parameters-and-values) such as `temperature`
- `system`: system message to use (overrides what is defined in the `Modelfile`)
56- `template`: the prompt template to use (overrides what is defined in the `Modelfile`)
57- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
58- `raw`: if `true` no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API
59- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)
- `context` (deprecated): the context parameter returned from a previous request to `/api/generate`; this can be used to keep a short conversational memory
61
62Experimental image generation parameters (for image generation models only):
63
64> [!WARNING]
65> These parameters are experimental and may change in future versions.
66
67- `width`: width of the generated image in pixels
68- `height`: height of the generated image in pixels
69- `steps`: number of diffusion steps
70
71#### Structured outputs
72
73Structured outputs are supported by providing a JSON schema in the `format` parameter. The model will generate a response that matches the schema. See the [structured outputs](#request-structured-outputs) example below.
74
75#### JSON mode
76
77Enable JSON mode by setting the `format` parameter to `json`. This will structure the response as a valid JSON object. See the JSON mode [example](#request-json-mode) below.
78
79> [!IMPORTANT]
> It's important to instruct the model to use JSON in the `prompt`. Otherwise, the model may generate large amounts of whitespace.
81
82### Examples
83
84#### Generate request (Streaming)
85
86##### Request
87
88```shell
89curl http://localhost:11434/api/generate -d '{
90 "model": "llama3.2",
91 "prompt": "Why is the sky blue?"
92}'
93```
94
95##### Response
96
97A stream of JSON objects is returned:
98
99```json
100{
101 "model": "llama3.2",
102 "created_at": "2023-08-04T08:52:19.385406455-07:00",
103 "response": "The",
104 "done": false
105}
106```
107
108The final response in the stream also includes additional data about the generation:
109
- `total_duration`: total time in nanoseconds spent handling the request
- `load_duration`: time in nanoseconds spent loading the model
- `prompt_eval_count`: number of tokens in the prompt
- `prompt_eval_duration`: time in nanoseconds spent evaluating the prompt
- `eval_count`: number of tokens in the response
- `eval_duration`: time in nanoseconds spent generating the response
- `context`: an encoding of the conversation used in this response; it can be sent in the next request to keep a conversational memory
- `response`: empty if the response was streamed; otherwise it contains the full response

To calculate how fast the response is generated in tokens per second (tokens/s), compute `eval_count / eval_duration * 10^9`.
120
121```json
122{
123 "model": "llama3.2",
124 "created_at": "2023-08-04T19:22:45.499127Z",
125 "response": "",
126 "done": true,
127 "context": [1, 2, 3],
128 "total_duration": 10706818083,
129 "load_duration": 6338219291,
130 "prompt_eval_count": 26,
131 "prompt_eval_duration": 130079000,
132 "eval_count": 259,
133 "eval_duration": 4232710000
134}
135```
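
The streaming flow and the tokens-per-second formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an official client: it assumes each streamed object arrives on its own line, the helper names `collect_stream` and `tokens_per_second` are hypothetical, and the sample lines are shortened versions of the objects shown above.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> tuple[str, dict]:
    """Join the `response` fragments and return them with the final object."""
    parts = []
    final = {}
    for line in lines:
        if not line.strip():
            continue  # skip empty lines between objects
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            final = chunk  # the last object carries the statistics
            break
    return "".join(parts), final


def tokens_per_second(final: dict) -> float:
    """eval_count / eval_duration * 10^9 (durations are in nanoseconds)."""
    return final["eval_count"] / final["eval_duration"] * 1e9


sample = [
    '{"model": "llama3.2", "response": "The", "done": false}',
    '{"model": "llama3.2", "response": " sky", "done": false}',
    '{"model": "llama3.2", "response": "", "done": true,'
    ' "eval_count": 259, "eval_duration": 4232710000}',
]
text, final = collect_stream(sample)
print(text)                                # The sky
print(round(tokens_per_second(final), 2))  # 61.19
```

In a real client the `lines` argument would be the lines of the HTTP response body read as they arrive.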
136
137#### Request (No streaming)
138
139##### Request
140
141A response can be received in one reply when streaming is off.
142
143```shell
144curl http://localhost:11434/api/generate -d '{
145 "model": "llama3.2",
146 "prompt": "Why is the sky blue?",
147 "stream": false
148}'
149```
150
151##### Response
152
153If `stream` is set to `false`, the response will be a single JSON object:
154
155```json
156{
157 "model": "llama3.2",
158 "created_at": "2023-08-04T19:22:45.499127Z",
159 "response": "The sky is blue because it is the color of the sky.",
160 "done": true,
161 "context": [1, 2, 3],
162 "total_duration": 5043500667,
163 "load_duration": 5025959,
164 "prompt_eval_count": 26,
165 "prompt_eval_duration": 325953000,
166 "eval_count": 290,
167 "eval_duration": 4709213000
168}
169```
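
For readers who prefer Python to `curl`, the same non-streaming call might look like the sketch below. It uses only the standard library; `build_generate_request` and `send` are hypothetical helper names, and the host is assumed to be the local default used in the examples above. Nothing is sent until `send` is called.

```python
import json
import urllib.request


def build_generate_request(model: str, prompt: str,
                           host: str = "http://localhost:11434") -> urllib.request.Request:
    """Prepare a POST to /api/generate with streaming turned off."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the single JSON object in the reply."""
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


request = build_generate_request("llama3.2", "Why is the sky blue?")
print(request.get_method(), request.full_url)
# With a server running locally: print(send(request)["response"])
```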
170
171#### Request (with suffix)
172
173##### Request
174
175```shell
176curl http://localhost:11434/api/generate -d '{
177 "model": "codellama:code",
178 "prompt": "def compute_gcd(a, b):",
179 "suffix": " return result",
180 "options": {
181 "temperature": 0
182 },
183 "stream": false
184}'
185```
186
187##### Response
188
189```json5
190{
191 "model": "codellama:code",
192 "created_at": "2024-07-22T20:47:51.147561Z",
193 "response": "\n if a == 0:\n return b\n else:\n return compute_gcd(b % a, a)\n\ndef compute_lcm(a, b):\n result = (a * b) / compute_gcd(a, b)\n",
194 "done": true,
195 "done_reason": "stop",
196 "context": [...],
197 "total_duration": 1162761250,
198 "load_duration": 6683708,
199 "prompt_eval_count": 17,
200 "prompt_eval_duration": 201222000,
201 "eval_count": 63,
202 "eval_duration": 953997000
203}
204```
205
206#### Request (Structured outputs)
207
208##### Request
209
210```shell
211curl -X POST http://localhost:11434/api/generate -H "Content-Type: application/json" -d '{
212 "model": "llama3.1:8b",
213 "prompt": "Ollama is 22 years old and is busy saving the world. Respond using JSON",
214 "stream": false,
215 "format": {
216 "type": "object",
217 "properties": {
218 "age": {
219 "type": "integer"
220 },
221 "available": {
222 "type": "boolean"
223 }
224 },
225 "required": [
226 "age",
227 "available"
228 ]
229 }
230}'
231```
232
233##### Response
234
235```json
236{
237 "model": "llama3.1:8b",
238 "created_at": "2024-12-06T00:48:09.983619Z",
239 "response": "{\n \"age\": 22,\n \"available\": true\n}",
240 "done": true,
241 "done_reason": "stop",
242 "context": [1, 2, 3],
243 "total_duration": 1075509083,
244 "load_duration": 567678166,
245 "prompt_eval_count": 28,
246 "prompt_eval_duration": 236000000,
247 "eval_count": 16,
248 "eval_duration": 269000000
249}
250```
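
Because `response` is itself a JSON string, a client has to decode it a second time before using the values. The sketch below shows one way to do that and to confirm that the keys listed under `required` are present. The helper name `parse_structured_response` is hypothetical, and the check is deliberately minimal: it is not a full JSON Schema validation.

```python
import json


def parse_structured_response(body: dict, schema: dict) -> dict:
    """Decode the `response` string and check the schema's required keys."""
    data = json.loads(body["response"])
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data


schema = {
    "type": "object",
    "properties": {"age": {"type": "integer"}, "available": {"type": "boolean"}},
    "required": ["age", "available"],
}
# Abbreviated copy of the response object shown above.
body = {"response": "{\n  \"age\": 22,\n  \"available\": true\n}", "done": True}

print(parse_structured_response(body, schema))  # {'age': 22, 'available': True}
```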
251
252#### Request (JSON mode)
253
254> [!IMPORTANT]
255> When `format` is set to `json`, the output will always be a well-formed JSON object. It's important to also instruct the model to respond in JSON.
256
257##### Request
258
259```shell
260curl http://localhost:11434/api/generate -d '{
261 "model": "llama3.2",
262 "prompt": "What color is the sky at different times of the day? Respond using JSON",
263 "format": "json",
264 "stream": false
265}'
266```
267
268##### Response
269
270```json
271{
272 "model": "llama3.2",
273 "created_at": "2023-11-09T21:07:55.186497Z",
274 "response": "{\n\"morning\": {\n\"color\": \"blue\"\n},\n\"noon\": {\n\"color\": \"blue-gray\"\n},\n\"afternoon\": {\n\"color\": \"warm gray\"\n},\n\"evening\": {\n\"color\": \"orange\"\n}\n}\n",
275 "done": true,
276 "context": [1, 2, 3],
277 "total_duration": 4648158584,
278 "load_duration": 4071084,
279 "prompt_eval_count": 36,
280 "prompt_eval_duration": 439038000,
281 "eval_count": 180,
282 "eval_duration": 4196918000
283}
284```
285
286The value of `response` will be a string containing JSON similar to:
287
288```json
289{
290 "morning": {
291 "color": "blue"
292 },
293 "noon": {
294 "color": "blue-gray"
295 },
296 "afternoon": {
297 "color": "warm gray"
298 },
299 "evening": {
300 "color": "orange"
301 }
302}
303```
304
305#### Request (with images)
306
307To submit images to multimodal models such as `llava` or `bakllava`, provide a list of base64-encoded `images`:
308
##### Request
310
311```shell
312curl http://localhost:11434/api/generate -d '{
313 "model": "llava",
314 "prompt":"What is in this picture?",
315 "stream": false,
316 "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4j
DSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwK
PqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
317}'
318```
319
##### Response
321
322```json
323{
324 "model": "llava",
325 "created_at": "2023-11-03T15:36:02.583064Z",
326 "response": "A happy cartoon character, which is cute and cheerful.",
327 "done": true,
328 "context": [1, 2, 3],
329 "total_duration": 2938432250,
330 "load_duration": 2559292,
331 "prompt_eval_count": 1,
332 "prompt_eval_duration": 2195557000,
333 "eval_count": 44,
334 "eval_duration": 736432000
335}
336```
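
Preparing the `images` list usually means base64-encoding raw file bytes. The following sketch shows that step with the standard library; `build_image_request` is a hypothetical helper, and the eight-byte PNG signature stands in for a real image file so the example stays short.

```python
import base64
import json


def build_image_request(model: str, prompt: str, images: list[bytes]) -> str:
    """Return a JSON body whose `images` entries are base64-encoded strings."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "images": [base64.b64encode(raw).decode("ascii") for raw in images],
    }
    return json.dumps(payload)


# The first eight bytes of every PNG file, used here as stand-in image data.
png_signature = b"\x89PNG\r\n\x1a\n"
body = build_image_request("llava", "What is in this picture?", [png_signature])
print(json.loads(body)["images"][0])  # iVBORw0KGgo=
```

With a real file, the bytes would typically come from reading it in binary mode, for example `open(path, "rb").read()`.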
337
338#### Request (Raw Mode)
339
340In some cases, you may wish to bypass the templating system and provide a full prompt. In this case, you can use the `raw` parameter to disable templating. Also note that raw mode will not return a context.
341
342##### Request
343
344```shell
345curl http://localhost:11434/api/generate -d '{
346 "model": "mistral",
347 "prompt": "[INST] why is the sky blue? [/INST]",
348 "raw": true,
349 "stream": false
350}'
351```
352
353#### Request (Reproducible outputs)
354
355For reproducible outputs, set `seed` to a number:
356
357##### Request
358
359```shell
360curl http://localhost:11434/api/generate -d '{
361 "model": "mistral",
362 "prompt": "Why is the sky blue?",
363 "options": {
364 "seed": 123
365 }
366}'
367```
368
369##### Response
370
371```json
372{
373 "model": "mistral",
374 "created_at": "2023-11-03T15:36:02.583064Z",
375 "response": " The sky appears blue because of a phenomenon called Rayleigh scattering.",
376 "done": true,
377 "total_duration": 8493852375,
378 "load_duration": 6589624375,
379 "prompt_eval_count": 14,
380 "prompt_eval_duration": 119039000,
381 "eval_count": 110,
382 "eval_duration": 1779061000
383}
384```
385
386#### Generate request (With options)
387
388If you want to set custom options for the model at runtime rather than in the Modelfile, you can do so with the `options` parameter. This example sets every available option, but you can set any of them individually and omit the ones you do not want to override.
389
390##### Request
391
392```shell
393curl http://localhost:11434/api/generate -d '{
394 "model": "llama3.2",
395 "prompt": "Why is the sky blue?",
396 "stream": false,
397 "options": {
398 "num_keep": 5,
399 "seed": 42,
400 "num_predict": 100,
401 "top_k": 20,
402 "top_p": 0.9,
403 "min_p": 0.0,
404 "typical_p": 0.7,
405 "repeat_last_n": 33,
406 "temperature": 0.8,
407 "repeat_penalty": 1.2,
408 "presence_penalty": 1.5,
409 "frequency_penalty": 1.0,
410 "penalize_newline": true,
411 "stop": ["\n", "user:"],
412 "numa": false,
413 "num_ctx": 1024,
414 "num_batch": 2,
415 "num_gpu": 1,
416 "main_gpu": 0,
417 "use_mmap": true,
418 "num_thread": 8
419 }
420}'
421```
422
423##### Response
424
425```json
426{
427 "model": "llama3.2",
428 "created_at": "2023-08-04T19:22:45.499127Z",
429 "response": "The sky is blue because it is the color of the sky.",
430 "done": true,
431 "context": [1, 2, 3],
432 "total_duration": 4935886791,
433 "load_duration": 534986708,
434 "prompt_eval_count": 26,
435 "prompt_eval_duration": 107345000,
436 "eval_count": 237,
437 "eval_duration": 4289432000
438}
439```
440
441#### Load a model
442
443If an empty prompt is provided, the model will be loaded into memory.
444
445##### Request
446
447```shell
448curl http://localhost:11434/api/generate -d '{
449 "model": "llama3.2"
450}'
451```
452
453##### Response
454
455A single JSON object is returned:
456
457```json
458{
459 "model": "llama3.2",
460 "created_at": "2023-12-18T19:52:07.071755Z",
461 "response": "",
462 "done": true
463}
464```
465
466#### Unload a model
467
468If an empty prompt is provided and the `keep_alive` parameter is set to `0`, a model will be unloaded from memory.
469
470##### Request
471
472```shell
473curl http://localhost:11434/api/generate -d '{
474 "model": "llama3.2",
475 "keep_alive": 0
476}'
477```
478
479##### Response
480
481A single JSON object is returned:
482
483```json
484{
485 "model": "llama3.2",
486 "created_at": "2024-09-12T03:54:03.516566Z",
487 "response": "",
488 "done": true,
489 "done_reason": "unload"
490}
491```
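
The load and unload requests differ only in whether `keep_alive` is present. A small sketch of building both bodies follows; `residency_request` is a hypothetical helper name, not part of any Ollama library.

```python
import json
from typing import Optional, Union


def residency_request(model: str,
                      keep_alive: Optional[Union[int, str]] = None) -> str:
    """Build a prompt-less /api/generate body that loads or unloads a model."""
    payload = {"model": model}
    if keep_alive is not None:
        payload["keep_alive"] = keep_alive  # 0 asks the server to unload
    return json.dumps(payload)


print(residency_request("llama3.2"))                # {"model": "llama3.2"}
print(residency_request("llama3.2", keep_alive=0))  # {"model": "llama3.2", "keep_alive": 0}
```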
492
493## Generate a chat completion
494
495```
496POST /api/chat
497```
498
499Generate the next message in a chat with a provided model. This is a streaming endpoint, so there will be a series of responses. Streaming can be disabled using `"stream": false`. The final response object will include statistics and additional data from the request.
500
501### Parameters
502
503- `model`: (required) the [model name](#model-names)
- `messages`: the messages of the chat; this can be used to keep a chat memory
505- `tools`: list of tools in JSON for the model to use if supported
506- `think`: (for thinking models) should the model think before responding?
507
508The `message` object has the following fields:
509
510- `role`: the role of the message, either `system`, `user`, `assistant`, or `tool`
511- `content`: the content of the message
512- `thinking`: (for thinking models) the model's thinking process
513- `images` (optional): a list of images to include in the message (for multimodal models such as `llava`)
514- `tool_calls` (optional): a list of tools in JSON that the model wants to use
- `tool_name` (optional): the name of the tool that was executed, so the model knows which tool produced the result
516
517Advanced parameters (optional):
518
519- `format`: the format to return a response in. Format can be `json` or a JSON schema.
520- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.mdx#valid-parameters-and-values) such as `temperature`
521- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
522- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)
523
524### Tool calling
525
526Tool calling is supported by providing a list of tools in the `tools` parameter. The model will generate a response that includes a list of tool calls. See the [Chat request (Streaming with tools)](#chat-request-streaming-with-tools) example below.
527
528Models can also explain the result of the tool call in the response. See the [Chat request (With history, with tools)](#chat-request-with-history-with-tools) example below.
529
530[See models with tool calling capabilities](https://ollama.com/search?c=tool).
531
532### Structured outputs
533
534Structured outputs are supported by providing a JSON schema in the `format` parameter. The model will generate a response that matches the schema. See the [Chat request (Structured outputs)](#chat-request-structured-outputs) example below.
535
536### Examples
537
538#### Chat request (Streaming)
539
540##### Request
541
542Send a chat message with a streaming response.
543
544```shell
545curl http://localhost:11434/api/chat -d '{
546 "model": "llama3.2",
547 "messages": [
548 {
549 "role": "user",
550 "content": "why is the sky blue?"
551 }
552 ]
553}'
554```
555
556##### Response
557
558A stream of JSON objects is returned:
559
560```json
561{
562 "model": "llama3.2",
563 "created_at": "2023-08-04T08:52:19.385406455-07:00",
564 "message": {
565 "role": "assistant",
566 "content": "The",
567 "images": null
568 },
569 "done": false
570}
571```
572
573Final response:
574
575```json
576{
577 "model": "llama3.2",
578 "created_at": "2023-08-04T19:22:45.499127Z",
579 "message": {
580 "role": "assistant",
581 "content": ""
582 },
583 "done": true,
584 "total_duration": 4883583458,
585 "load_duration": 1334875,
586 "prompt_eval_count": 26,
587 "prompt_eval_duration": 342546000,
588 "eval_count": 282,
589 "eval_duration": 4535599000
590}
591```
592
593#### Chat request (Streaming with tools)
594
595##### Request
596
597```shell
598curl http://localhost:11434/api/chat -d '{
599 "model": "llama3.2",
600 "messages": [
601 {
602 "role": "user",
603 "content": "what is the weather in tokyo?"
604 }
605 ],
606 "tools": [
607 {
608 "type": "function",
609 "function": {
610 "name": "get_weather",
611 "description": "Get the weather in a given city",
612 "parameters": {
613 "type": "object",
614 "properties": {
615 "city": {
616 "type": "string",
617 "description": "The city to get the weather for"
618 }
619 },
620 "required": ["city"]
621 }
622 }
623 }
624 ],
625 "stream": true
626}'
627```
628
629##### Response
630
631A stream of JSON objects is returned:
632
633```json
634{
635 "model": "llama3.2",
636 "created_at": "2025-07-07T20:22:19.184789Z",
637 "message": {
638 "role": "assistant",
639 "content": "",
640 "tool_calls": [
641 {
642 "function": {
643 "name": "get_weather",
644 "arguments": {
645 "city": "Tokyo"
646 }
647 }
648 }
649 ]
650 },
651 "done": false
652}
653```
654
655Final response:
656
657```json
658{
659 "model": "llama3.2",
660 "created_at": "2025-07-07T20:22:19.19314Z",
661 "message": {
662 "role": "assistant",
663 "content": ""
664 },
665 "done_reason": "stop",
666 "done": true,
667 "total_duration": 182242375,
668 "load_duration": 41295167,
669 "prompt_eval_count": 169,
670 "prompt_eval_duration": 24573166,
671 "eval_count": 15,
672 "eval_duration": 115959084
673}
674```
675
676#### Chat request (No streaming)
677
678##### Request
679
680```shell
681curl http://localhost:11434/api/chat -d '{
682 "model": "llama3.2",
683 "messages": [
684 {
685 "role": "user",
686 "content": "why is the sky blue?"
687 }
688 ],
689 "stream": false
690}'
691```
692
693##### Response
694
695```json
696{
697 "model": "llama3.2",
698 "created_at": "2023-12-12T14:13:43.416799Z",
699 "message": {
700 "role": "assistant",
701 "content": "Hello! How are you today?"
702 },
703 "done": true,
704 "total_duration": 5191566416,
705 "load_duration": 2154458,
706 "prompt_eval_count": 26,
707 "prompt_eval_duration": 383809000,
708 "eval_count": 298,
709 "eval_duration": 4799921000
710}
711```
712
713#### Chat request (No streaming, with tools)
714
715##### Request
716
717```shell
718curl http://localhost:11434/api/chat -d '{
719 "model": "llama3.2",
720 "messages": [
721 {
722 "role": "user",
723 "content": "what is the weather in tokyo?"
724 }
725 ],
726 "tools": [
727 {
728 "type": "function",
729 "function": {
730 "name": "get_weather",
731 "description": "Get the weather in a given city",
732 "parameters": {
733 "type": "object",
734 "properties": {
735 "city": {
736 "type": "string",
737 "description": "The city to get the weather for"
738 }
739 },
740 "required": ["city"]
741 }
742 }
743 }
744 ],
745 "stream": false
746}'
747```
748
749##### Response
750
751```json
752{
753 "model": "llama3.2",
754 "created_at": "2025-07-07T20:32:53.844124Z",
755 "message": {
756 "role": "assistant",
757 "content": "",
758 "tool_calls": [
759 {
760 "function": {
761 "name": "get_weather",
762 "arguments": {
763 "city": "Tokyo"
764 }
765 }
766 }
767 ]
768 },
769 "done_reason": "stop",
770 "done": true,
771 "total_duration": 3244883583,
772 "load_duration": 2969184542,
773 "prompt_eval_count": 169,
774 "prompt_eval_duration": 141656333,
775 "eval_count": 18,
776 "eval_duration": 133293625
777}
778```
779
780#### Chat request (Structured outputs)
781
782##### Request
783
784```shell
785curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
786 "model": "llama3.1",
787 "messages": [{"role": "user", "content": "Ollama is 22 years old and busy saving the world. Return a JSON object with the age and availability."}],
788 "stream": false,
789 "format": {
790 "type": "object",
791 "properties": {
792 "age": {
793 "type": "integer"
794 },
795 "available": {
796 "type": "boolean"
797 }
798 },
799 "required": [
800 "age",
801 "available"
802 ]
803 },
804 "options": {
805 "temperature": 0
806 }
807}'
808```
809
810##### Response
811
812```json
813{
814 "model": "llama3.1",
815 "created_at": "2024-12-06T00:46:58.265747Z",
816 "message": {
817 "role": "assistant",
818 "content": "{\"age\": 22, \"available\": false}"
819 },
820 "done_reason": "stop",
821 "done": true,
822 "total_duration": 2254970291,
823 "load_duration": 574751416,
824 "prompt_eval_count": 34,
825 "prompt_eval_duration": 1502000000,
826 "eval_count": 12,
827 "eval_duration": 175000000
828}
829```
830
831#### Chat request (With History)
832
833Send a chat message with a conversation history. You can use this same approach to start the conversation using multi-shot or chain-of-thought prompting.
834
835##### Request
836
837```shell
838curl http://localhost:11434/api/chat -d '{
839 "model": "llama3.2",
840 "messages": [
841 {
842 "role": "user",
843 "content": "why is the sky blue?"
844 },
845 {
846 "role": "assistant",
847 "content": "due to rayleigh scattering."
848 },
849 {
850 "role": "user",
851 "content": "how is that different than mie scattering?"
852 }
853 ]
854}'
855```
856
857##### Response
858
859A stream of JSON objects is returned:
860
861```json
862{
863 "model": "llama3.2",
864 "created_at": "2023-08-04T08:52:19.385406455-07:00",
865 "message": {
866 "role": "assistant",
867 "content": "The"
868 },
869 "done": false
870}
871```
872
873Final response:
874
875```json
876{
877 "model": "llama3.2",
878 "created_at": "2023-08-04T19:22:45.499127Z",
879 "done": true,
880 "total_duration": 8113331500,
881 "load_duration": 6396458,
882 "prompt_eval_count": 61,
883 "prompt_eval_duration": 398801000,
884 "eval_count": 468,
885 "eval_duration": 7701267000
886}
887```
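
Keeping a chat memory amounts to appending each assistant reply to `messages` before sending the next user turn. The sketch below merges streamed chunks into one message and extends the history. It assumes one JSON object per line, and `collect_chat_stream` is a hypothetical helper name; the sample stream is invented for illustration.

```python
import json
from typing import Iterable


def collect_chat_stream(lines: Iterable[str]) -> dict:
    """Merge streamed chat chunks into a single assistant message."""
    content = []
    for line in lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        # The final object may omit `message`, so fall back to an empty one.
        content.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return {"role": "assistant", "content": "".join(content)}


history = [{"role": "user", "content": "why is the sky blue?"}]
stream = [
    '{"message": {"role": "assistant", "content": "due to"}, "done": false}',
    '{"message": {"role": "assistant", "content": " rayleigh scattering."}, "done": false}',
    '{"done": true, "eval_count": 5}',
]

reply = collect_chat_stream(stream)
history.append(reply)
history.append({"role": "user", "content": "how is that different than mie scattering?"})

print(reply["content"])  # due to rayleigh scattering.
print(len(history))      # 3
```

The resulting `history` list is what would be sent as `messages` in the next request.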
888
889#### Chat request (With history, with tools)
890
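##### Request

The `//` lines in the request body below are annotations for the reader. JSON does not allow comments, so remove them before sending the request.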
892
893```shell
894curl http://localhost:11434/api/chat -d '{
895 "model": "llama3.2",
896 "messages": [
897 {
898 "role": "user",
899 "content": "what is the weather in Toronto?"
900 },
901 // the message from the model appended to history
902 {
903 "role": "assistant",
904 "content": "",
905 "tool_calls": [
906 {
907 "function": {
908 "name": "get_weather",
909 "arguments": {
910 "city": "Toronto"
911 }
912 }
913 }
914 ]
915 },
916 // the tool call result appended to history
917 {
918 "role": "tool",
919 "content": "11 degrees celsius",
920 "tool_name": "get_weather"
921 }
922 ],
923 "stream": false,
924 "tools": [
925 {
926 "type": "function",
927 "function": {
928 "name": "get_weather",
929 "description": "Get the weather in a given city",
930 "parameters": {
931 "type": "object",
932 "properties": {
933 "city": {
934 "type": "string",
935 "description": "The city to get the weather for"
936 }
937 },
938 "required": ["city"]
939 }
940 }
941 }
942 ]
943}'
944```
945
946##### Response
947
948```json
949{
950 "model": "llama3.2",
951 "created_at": "2025-07-07T20:43:37.688511Z",
952 "message": {
953 "role": "assistant",
    "content": "The current temperature in Toronto is 11°C."
955 },
956 "done_reason": "stop",
957 "done": true,
958 "total_duration": 890771750,
959 "load_duration": 707634750,
960 "prompt_eval_count": 94,
961 "prompt_eval_duration": 91703208,
962 "eval_count": 11,
963 "eval_duration": 90282125
964}
965```
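
The round trip in this example (the model requests a tool, the client runs it, and the result goes back as a `tool` message) can be sketched as follows. Everything here is illustrative: `get_weather` is a hypothetical stand-in that returns a fixed string, and `TOOLS` and `run_tool_calls` are invented names rather than part of any Ollama library.

```python
from typing import Callable


def get_weather(city: str) -> str:
    """Hypothetical stand-in tool; a real one would query a weather service."""
    return "11 degrees celsius" if city == "Toronto" else "unknown"


# Maps tool names advertised in `tools` to the local functions that run them.
TOOLS: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Run each requested tool and return the `tool` messages to append."""
    results = []
    for call in assistant_message.get("tool_calls", []):
        function = call["function"]
        name = function["name"]
        handler = TOOLS.get(name)
        if handler is None:
            output = f"error: unknown tool {name!r}"
        else:
            output = handler(**function["arguments"])
        results.append({"role": "tool", "content": output, "tool_name": name})
    return results


assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"function": {"name": "get_weather", "arguments": {"city": "Toronto"}}}
    ],
}
print(run_tool_calls(assistant_message))
# [{'role': 'tool', 'content': '11 degrees celsius', 'tool_name': 'get_weather'}]
```

Appending `assistant_message` and the returned messages to `messages` gives the history shape used in the request above.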
966
967#### Chat request (with images)
968
969##### Request
970
971Send a chat message with images. The images should be provided as an array, with the individual images encoded in Base64.
972
973```shell
974curl http://localhost:11434/api/chat -d '{
975 "model": "llava",
976 "messages": [
977 {
978 "role": "user",
979 "content": "what is in this image?",
980 "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4j
DSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwK
PqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
    }
  ]
}'
```

##### Response

```json
{
  "model": "llava",
  "created_at": "2023-12-13T22:42:50.203334Z",
  "message": {
    "role": "assistant",
    "content": " The image features a cute, little pig with an angry facial expression. It's wearing a heart on its shirt and is waving in the air. This scene appears to be part of a drawing or sketching project.",
    "images": null
  },
  "done": true,
  "total_duration": 1668506709,
  "load_duration": 1986209,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 359682000,
  "eval_count": 83,
  "eval_duration": 1303285000
}
```
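
The duration fields in a final response are reported in nanoseconds, so generation speed has to be derived from them. The following is a minimal Python sketch of that arithmetic; the helper name `tokens_per_second` is hypothetical and not part of the Ollama API.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def tokens_per_second(response: dict) -> float:
    """Generation speed derived from `eval_count` and `eval_duration` (nanoseconds)."""
    seconds = response["eval_duration"] / NANOSECONDS_PER_SECOND
    if seconds == 0:
        return 0.0  # avoid dividing by zero when no time was recorded
    return response["eval_count"] / seconds


# Statistics copied from the example response above.
stats = {"eval_count": 83, "eval_duration": 1303285000}
speed = tokens_per_second(stats)
```

The same division applied to `prompt_eval_count` and `prompt_eval_duration` would give the prompt processing speed.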

#### Chat request (Reproducible outputs)

##### Request

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "options": {
    "seed": 101,
    "temperature": 0
  }
}'
```

##### Response

```json
{
  "model": "llama3.2",
  "created_at": "2023-12-12T14:13:43.416799Z",
  "message": {
    "role": "assistant",
    "content": "Hello! How are you today?"
  },
  "done": true,
  "total_duration": 5191566416,
  "load_duration": 2154458,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 383809000,
  "eval_count": 298,
  "eval_duration": 4799921000
}
```

#### Chat request (with tools)

##### Request

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather today in Paris?"
    }
  ],
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The location to get the weather for, e.g. San Francisco, CA"
            },
            "format": {
              "type": "string",
              "description": "The format to return the weather in, e.g. celsius or fahrenheit",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location", "format"]
        }
      }
    }
  ]
}'
```

##### Response

```json
{
  "model": "llama3.2",
  "created_at": "2024-07-22T20:33:28.123648Z",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "get_current_weather",
          "arguments": {
            "format": "celsius",
            "location": "Paris, FR"
          }
        }
      }
    ]
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 885095291,
  "load_duration": 3753500,
  "prompt_eval_count": 122,
  "prompt_eval_duration": 328493000,
  "eval_count": 33,
  "eval_duration": 552222000
}
```
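
The response above only names the function and its arguments; running it is left to the client. One way to do that in Python is sketched below. The registry, the placeholder `get_current_weather` implementation, and the shape of the returned `tool` messages are assumptions made for illustration, not something this document prescribes.

```python
def get_current_weather(location: str, format: str) -> str:
    """Hypothetical local implementation of the tool declared in the request."""
    # Placeholder data, not a real weather lookup. The parameter is called
    # `format` because that is the argument name the model sends back.
    return f"22 degrees {format} in {location}"


# Maps tool names from the request's `tools` list to local callables.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}


def run_tool_calls(response: dict) -> list[dict]:
    """Execute each entry in `message.tool_calls` and return `tool` role messages."""
    results = []
    for call in response["message"].get("tool_calls") or []:
        function = call["function"]
        handler = TOOL_REGISTRY.get(function["name"])
        if handler is None:
            content = f"unknown tool: {function['name']}"
        else:
            content = handler(**function["arguments"])
        results.append({"role": "tool", "content": content})
    return results


# Trimmed copy of the example response above.
example = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "function": {
                    "name": "get_current_weather",
                    "arguments": {"format": "celsius", "location": "Paris, FR"},
                }
            }
        ],
    }
}
tool_messages = run_tool_calls(example)
```

A client would typically append the assistant message and the resulting `tool` messages to `messages` and send a follow-up chat request so the model can compose its final answer.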

#### Load a model

If the messages array is empty, the model will be loaded into memory.

##### Request

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": []
}'
```

##### Response

```json
{
  "model": "llama3.2",
  "created_at": "2024-09-12T21:17:29.110811Z",
  "message": {
    "role": "assistant",
    "content": ""
  },
  "done_reason": "load",
  "done": true
}
```

#### Unload a model

If the messages array is empty and the `keep_alive` parameter is set to `0`, a model will be unloaded from memory.

##### Request

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [],
  "keep_alive": 0
}'
```

##### Response

A single JSON object is returned:

```json
{
  "model": "llama3.2",
  "created_at": "2024-09-12T21:33:17.547535Z",
  "message": {
    "role": "assistant",
    "content": ""
  },
  "done_reason": "unload",
  "done": true
}
```
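
The load and unload requests differ only in the `keep_alive` field, which the Python sketch below makes explicit. The helper names (`load_payload`, `unload_payload`, `post_chat`) are hypothetical, and `post_chat` assumes a server listening at `http://localhost:11434` as in the curl examples.

```python
import json
import urllib.request


def load_payload(model: str) -> dict:
    """Body that asks the server to load `model` without generating anything."""
    return {"model": model, "messages": []}


def unload_payload(model: str) -> dict:
    """Same as `load_payload`, plus `keep_alive: 0` to unload the model."""
    return {"model": model, "messages": [], "keep_alive": 0}


def post_chat(payload: dict, base_url: str = "http://localhost:11434") -> dict:
    """POST a load or unload body to /api/chat and decode the single JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())


# The encoded bytes are what would be sent as the request body.
body = json.dumps(unload_payload("llama3.2")).encode("utf-8")
```

Calling `post_chat(unload_payload("llama3.2"))` against a running server should return an object whose `done_reason` is `"unload"`, matching the response shown above.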

## Create a Model

```
POST /api/create
```

Create a model from:

- another model;
- a safetensors directory; or
- a GGUF file.

If you are creating a model from a safetensors directory or from a GGUF file, you must [create a blob](#push-a-blob) for each of the files and then use the file name and SHA256 digest associated with each blob in the `files` field.

### Parameters

- `model`: name of the model to create
- `from`: (optional) name of an existing model to create the new model from
- `files`: (optional) a dictionary of file names to SHA256 digests of blobs to create the model from
- `adapters`: (optional) a dictionary of file names to SHA256 digests of blobs for LoRA adapters
- `template`: (optional) the prompt template for the model
- `license`: (optional) a string or list of strings containing the license or licenses for the model
- `system`: (optional) a string containing the system prompt for the model
- `parameters`: (optional) a dictionary of parameters for the model (see [Modelfile](./modelfile.mdx#valid-parameters-and-values) for a list of parameters)
- `messages`: (optional) a list of message objects used to create a conversation
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects
- `quantize`: (optional) quantize a non-quantized (e.g. float16) model

#### Quantization types

| Type   | Recommended |
| ------ | :---------: |
| q4_K_M |     \*      |
| q4_K_S |             |
| q8_0   |     \*      |

### Examples

#### Create a new model

Create a new model from an existing model.

##### Request

```shell
curl http://localhost:11434/api/create -d '{
  "model": "mario",
  "from": "llama3.2",
  "system": "You are Mario from Super Mario Bros."
}'
```

##### Response

A stream of JSON objects is returned:

```json
{"status":"reading model metadata"}
{"status":"creating system layer"}
{"status":"using already created layer sha256:22f7f8ef5f4c791c1b03d7eb414399294764d7cc82c7e94aa81a1feb80a983a2"}
{"status":"using already created layer sha256:8c17c2ebb0ea011be9981cc3922db8ca8fa61e828c5d3f44cb6ae342bf80460b"}
{"status":"using already created layer sha256:7c23fb36d80141c4ab8cdbb61ee4790102ebd2bf7aeff414453177d4f2110e5d"}
{"status":"using already created layer sha256:2e0493f67d0c8c9c68a8aeacdf6a38a2151cb3c4c1d42accf296e19810527988"}
{"status":"using already created layer sha256:2759286baa875dc22de5394b4a925701b1896a7e3f8e53275c36f75a877a82c9"}
{"status":"writing layer sha256:df30045fe90f0d750db82a058109cecd6d4de9c90a3d75b19c09e5f64580bb42"}
{"status":"writing layer sha256:f18a68eb09bf925bb1b669490407c1b1251c5db98dc4d3d81f3088498ea55690"}
{"status":"writing manifest"}
{"status":"success"}
```
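
Each line of the stream above is a separate JSON object, so a client can decode it line by line. The Python sketch below shows one way to do that; the helper names are hypothetical, and treating an object with an `error` key as a failure is an assumption about how the server reports problems mid-stream.

```python
import json
from typing import Iterable, Iterator


def iter_status(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded object per non-empty line of a streamed response."""
    for raw in lines:
        line = raw.strip()
        if line:
            yield json.loads(line)


def create_succeeded(lines: Iterable[bytes]) -> bool:
    """True if the stream ends with a `success` status and contains no error."""
    last = None
    for last in iter_status(lines):
        if "error" in last:  # assumed error shape, see the note above
            return False
    return last is not None and last.get("status") == "success"


# A shortened copy of the stream shown above.
sample = [
    b'{"status":"reading model metadata"}\n',
    b'{"status":"writing manifest"}\n',
    b'{"status":"success"}\n',
]
ok = create_succeeded(sample)
```

The object returned by `urllib.request.urlopen` can be iterated line by line, so it could be passed to `create_succeeded` directly in place of `sample`.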

#### Quantize a model

Quantize a non-quantized model.

##### Request

```shell
curl http://localhost:11434/api/create -d '{
  "model": "llama3.2:quantized",
  "from": "llama3.2:3b-instruct-fp16",
  "quantize": "q4_K_M"
}'
```

##### Response

A stream of JSON objects is returned:

```json
{"status":"quantizing F16 model to Q4_K_M","digest":"0","total":6433687776,"completed":12302}
{"status":"quantizing F16 model to Q4_K_M","digest":"0","total":6433687776,"completed":6433687552}
{"status":"verifying conversion"}
{"status":"creating new layer sha256:fb7f4f211b89c6c4928ff4ddb73db9f9c0cfca3e000c3e40d6cf27ddc6ca72eb"}
{"status":"using existing layer sha256:966de95ca8a62200913e3f8bfbf84c8494536f1b94b49166851e76644e966396"}
{"status":"using existing layer sha256:fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d"}
{"status":"using existing layer sha256:a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd"}
{"status":"using existing layer sha256:56bb8bd477a519ffa694fc449c2413c6f0e1d3b1c88fa7e3c9d88d3ae49d4dcb"}
{"status":"writing manifest"}
{"status":"success"}
```

#### Create a model from GGUF

Create a model from a GGUF file. The `files` parameter should be filled out with the file name and SHA256 digest of the GGUF file you wish to use. Use [/api/blobs/:digest](#push-a-blob) to push the GGUF file to the server before calling this API.

##### Request

```shell
curl http://localhost:11434/api/create -d '{
  "model": "my-gguf-model",
  "files": {
    "test.gguf": "sha256:432f310a77f4650a88d0fd59ecdd7cebed8d684bafea53cbff0473542964f0c3"
  }
}'
```

##### Response

A stream of JSON objects is returned:

```json
{"status":"parsing GGUF"}
{"status":"using existing layer sha256:432f310a77f4650a88d0fd59ecdd7cebed8d684bafea53cbff0473542964f0c3"}
{"status":"writing manifest"}
{"status":"success"}
```

#### Create a model from a Safetensors directory

The `files` parameter should be a dictionary that maps the name of each file in the safetensors model to its SHA256 digest. Use [/api/blobs/:digest](#push-a-blob) to first push each of the files to the server before calling this API. Files will remain in the cache until the Ollama server is restarted.

##### Request

```shell
curl http://localhost:11434/api/create -d '{
  "model": "fred",
  "files": {
    "config.json": "sha256:dd3443e529fb2290423a0c65c2d633e67b419d273f170259e27297219828e389",
    "generation_config.json": "sha256:88effbb63300dbbc7390143fbbdd9d9fa50587b37e8bfd16c8c90d4970a74a36",
    "special_tokens_map.json": "sha256:b7455f0e8f00539108837bfa586c4fbf424e31f8717819a6798be74bef813d05",
    "tokenizer.json": "sha256:bbc1904d35169c542dffbe1f7589a5994ec7426d9e5b609d07bab876f32e97ab",
    "tokenizer_config.json": "sha256:24e8a6dc2547164b7002e3125f10b415105644fcf02bf9ad8b674c87b1eaaed6",
    "model.safetensors": "sha256:1ff795ff6a07e6a68085d206fb84417da2f083f68391c2843cd2b8ac6df8538f"
  }
}'
```

##### Response

A stream of JSON objects is returned:

```json
{"status":"converting model"}
{"status":"creating new layer sha256:05ca5b813af4a53d2c2922933936e398958855c44ee534858fcfd830940618b6"}
{"status":"using autodetected template llama3-instruct"}
{"status":"using existing layer sha256:56bb8bd477a519ffa694fc449c2413c6f0e1d3b1c88fa7e3c9d88d3ae49d4dcb"}
{"status":"writing manifest"}
{"status":"success"}
```

## Check if a Blob Exists

```
HEAD /api/blobs/:digest
```

Checks that the file blob (Binary Large Object) used when creating a model exists on the server. This checks your Ollama server, not ollama.com.

### Query Parameters

- `digest`: the SHA256 digest of the blob

### Examples

#### Request

```shell
curl -I http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```

#### Response

Returns 200 OK if the blob exists, or 404 Not Found if it does not.

## Push a Blob

```
POST /api/blobs/:digest
```

Push a file to the Ollama server to create a "blob" (Binary Large Object).

### Query Parameters

- `digest`: the expected SHA256 digest of the file

### Examples

#### Request

```shell
curl -T model.gguf -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```

#### Response

Returns 201 Created if the blob was successfully created, or 400 Bad Request if the digest does not match the expected value.
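
Both blob endpoints need the SHA256 digest of the file before any request is made. The Python sketch below computes it with the standard library and builds the matching URL; the helper names and the default base URL are assumptions taken from the curl examples, not part of the API.

```python
import hashlib


def sha256_digest(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the file's digest in the `sha256:<hex>` form used in the examples."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large model files are never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return f"sha256:{hasher.hexdigest()}"


def blob_url(digest: str, base_url: str = "http://localhost:11434") -> str:
    """Build the URL used for both the HEAD (exists) and POST (push) requests."""
    return f"{base_url}/api/blobs/{digest}"
```

The string returned by `sha256_digest` is also the value to place next to the file name in the `files` dictionary of a create request.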

## List Local Models

```
GET /api/tags
```

List models that are available locally.

### Examples

#### Request

```shell
curl http://localhost:11434/api/tags
```

#### Response

A single JSON object will be returned.

```json
{
  "models": [
    {
      "name": "deepseek-r1:latest",
      "model": "deepseek-r1:latest",
      "modified_at": "2025-05-10T08:06:48.639712648-07:00",
      "size": 4683075271,
      "digest": "0a8c266910232fd3291e71e5ba1e058cc5af9d411192cf88b6d30e92b6e73163",
      "details": {
        "parent_model": "",
        "format": "gguf",
        "family": "qwen2",
        "families": ["qwen2"],
        "parameter_size": "7.6B",
        "quantization_level": "Q4_K_M"
      }
    },
    {
      "name": "llama3.2:latest",
      "model": "llama3.2:latest",
      "modified_at": "2025-05-04T17:37:44.706015396-07:00",
      "size": 2019393189,
      "digest": "a80c4f17acd55265feec403c7aef86be0c25983ab279d83f3bcd3abbcb5b8b72",
      "details": {
        "parent_model": "",
        "format": "gguf",
        "family": "llama",
        "families": ["llama"],
        "parameter_size": "3.2B",
        "quantization_level": "Q4_K_M"
      }
    }
  ]
}
```
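
As a small illustration of reading this response, the Python sketch below lists each model with its size in decimal gigabytes, largest first. The function name is hypothetical, and treating `size` as a byte count is an assumption based on the values shown.

```python
def summarize_models(tags_response: dict) -> list[tuple[str, float]]:
    """Return (name, size in decimal GB) pairs, largest model first."""
    rows = [
        (entry["name"], round(entry["size"] / 1e9, 2))
        for entry in tags_response.get("models", [])
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Only the fields used here are copied from the example response above.
sample = {
    "models": [
        {"name": "llama3.2:latest", "size": 2019393189},
        {"name": "deepseek-r1:latest", "size": 4683075271},
    ]
}
summary = summarize_models(sample)
```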

## Show Model Information

```
POST /api/show
```

Show information about a model, including details, modelfile, template, parameters, license, and system prompt.

### Parameters

- `model`: name of the model to show
- `verbose`: (optional) if set to `true`, returns full data for verbose response fields

### Examples

#### Request

```shell
curl http://localhost:11434/api/show -d '{
  "model": "llava"
}'
```

#### Response

```json5
{
  modelfile: '# Modelfile generated by "ollama show"\n# To build a new Modelfile based on this one, replace the FROM line with:\n# FROM llava:latest\n\nFROM /Users/matt/.ollama/models/blobs/sha256:200765e1283640ffbd013184bf496e261032fa75b99498a9613be4e94d63ad52\nTEMPLATE """{{ .System }}\nUSER: {{ .Prompt }}\nASSISTANT: """\nPARAMETER num_ctx 4096\nPARAMETER stop "\u003c/s\u003e"\nPARAMETER stop "USER:"\nPARAMETER stop "ASSISTANT:"',
  parameters: 'num_keep 24\nstop "<|start_header_id|>"\nstop "<|end_header_id|>"\nstop "<|eot_id|>"',
  template: "{{ if .System }}<|start_header_id|>system<|end_header_id|>\n\n{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>\n\n{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>\n\n{{ .Response }}<|eot_id|>",
  details: {
    parent_model: "",
    format: "gguf",
    family: "llama",
    families: ["llama"],
    parameter_size: "8.0B",
    quantization_level: "Q4_0",
  },
  model_info: {
    "general.architecture": "llama",
    "general.file_type": 2,
    "general.parameter_count": 8030261248,
    "general.quantization_version": 2,
    "llama.attention.head_count": 32,
    "llama.attention.head_count_kv": 8,
    "llama.attention.layer_norm_rms_epsilon": 0.00001,
    "llama.block_count": 32,
    "llama.context_length": 8192,
    "llama.embedding_length": 4096,
    "llama.feed_forward_length": 14336,
    "llama.rope.dimension_count": 128,
    "llama.rope.freq_base": 500000,
    "llama.vocab_size": 128256,
    "tokenizer.ggml.bos_token_id": 128000,
    "tokenizer.ggml.eos_token_id": 128009,
    "tokenizer.ggml.merges": [], // populates if `verbose=true`
    "tokenizer.ggml.model": "gpt2",
    "tokenizer.ggml.pre": "llama-bpe",
    "tokenizer.ggml.token_type": [], // populates if `verbose=true`
    "tokenizer.ggml.tokens": [], // populates if `verbose=true`
  },
  capabilities: ["completion", "vision"],
}
```
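
A client might use this response to decide what a model can do before sending a request, for example checking for `vision` before attaching images. The Python sketch below is one possible approach; both helper names are hypothetical, and looking up the context length under `<architecture>.context_length` is an assumption based on the key names in the example.

```python
from typing import Optional


def has_capability(show_response: dict, capability: str) -> bool:
    """Check the `capabilities` list returned by /api/show."""
    return capability in show_response.get("capabilities", [])


def context_length(show_response: dict) -> Optional[int]:
    """Read `<architecture>.context_length` from `model_info`, if present."""
    info = show_response.get("model_info", {})
    architecture = info.get("general.architecture")
    return info.get(f"{architecture}.context_length")


# Only the fields used here are copied from the example response above.
sample = {
    "model_info": {"general.architecture": "llama", "llama.context_length": 8192},
    "capabilities": ["completion", "vision"],
}
can_see_images = has_capability(sample, "vision")
```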

## Copy a Model

```
POST /api/copy
```

Copy a model. Creates a model with another name from an existing model.

### Examples

#### Request

```shell
curl http://localhost:11434/api/copy -d '{
  "source": "llama3.2",
  "destination": "llama3-backup"
}'
```

#### Response

Returns a 200 OK if successful, or a 404 Not Found if the source model doesn't exist.

## Delete a Model

```
DELETE /api/delete
```

Delete a model and its data.

### Parameters

- `model`: model name to delete

### Examples

#### Request

```shell
curl -X DELETE http://localhost:11434/api/delete -d '{
  "model": "llama3:13b"
}'
```

#### Response

Returns a 200 OK if successful, or a 404 Not Found if the model to be deleted doesn't exist.

## Pull a Model

```
POST /api/pull
```

Download a model from the Ollama library. Cancelled pulls are resumed from where they left off, and multiple calls will share the same download progress.

### Parameters

- `model`: name of the model to pull
- `insecure`: (optional) allow insecure connections to the library. Only use this if you are pulling from your own library during development.
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects

### Examples

#### Request

```shell
curl http://localhost:11434/api/pull -d '{
  "model": "llama3.2"
}'
```

#### Response

If `stream` is not specified, or set to `true`, a stream of JSON objects is returned:

The first object is the manifest:

```json
{
  "status": "pulling manifest"
}
```

Then there is a series of downloading responses. Until any of the downloads has completed, the `completed` key may not be included. The number of files to be downloaded depends on the number of layers specified in the manifest.

```json
{
  "status": "pulling digestname",
  "digest": "digestname",
  "total": 2142590208,
  "completed": 241970
}
```

After all the files are downloaded, the final responses are:

```json
{
  "status": "verifying sha256 digest"
}
{
  "status": "writing manifest"
}
{
  "status": "removing any unused layers"
}
{
  "status": "success"
}
```

If `stream` is set to `false`, then the response is a single JSON object:

```json
{
  "status": "success"
}
```
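
When streaming, the per-layer `total` and `completed` values can be combined into a single progress figure. The Python sketch below is one way to do that under two assumptions: each layer is identified by its `digest`, and a missing `completed` key means nothing has been downloaded for that layer yet. The function name is hypothetical.

```python
def overall_progress(updates: list[dict]) -> float:
    """Fraction (0.0 to 1.0) downloaded across all layers seen so far."""
    latest = {}  # digest -> (completed, total) from the most recent update
    for update in updates:
        digest = update.get("digest")
        if digest is None or "total" not in update:
            continue  # status-only objects such as "pulling manifest"
        latest[digest] = (update.get("completed", 0), update["total"])
    total = sum(size for _, size in latest.values())
    if total == 0:
        return 0.0
    return sum(done for done, _ in latest.values()) / total


# Made-up updates for two layers, "a" and "b", in the shape shown above.
updates = [
    {"status": "pulling manifest"},
    {"status": "pulling a", "digest": "a", "total": 100},
    {"status": "pulling a", "digest": "a", "total": 100, "completed": 50},
    {"status": "pulling b", "digest": "b", "total": 300, "completed": 150},
]
fraction = overall_progress(updates)
```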

## Push a Model

```
POST /api/push
```

Upload a model to a model library. Requires registering for ollama.ai and adding a public key first.

### Parameters

- `model`: name of the model to push in the form of `<namespace>/<model>:<tag>`
- `insecure`: (optional) allow insecure connections to the library. Only use this if you are pushing to your library during development.
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects

### Examples

#### Request

```shell
curl http://localhost:11434/api/push -d '{
  "model": "mattw/pygmalion:latest"
}'
```

#### Response

If `stream` is not specified, or set to `true`, a stream of JSON objects is returned:

```json
{ "status": "retrieving manifest" }
```

and then:

```json
{
  "status": "starting upload",
  "digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab",
  "total": 1928429856
}
```

Then there is a series of uploading responses:

```json
{
  "status": "starting upload",
  "digest": "sha256:bc07c81de745696fdf5afca05e065818a8149fb0c77266fb584d9b2cba3711ab",
  "total": 1928429856
}
```

Finally, when the upload is complete:

```json
{"status":"pushing manifest"}
{"status":"success"}
```

If `stream` is set to `false`, then the response is a single JSON object:

```json
{ "status": "success" }
```

## Generate Embeddings

```
POST /api/embed
```

Generate embeddings from a model.

### Parameters

- `model`: name of model to generate embeddings from
- `input`: text or list of text to generate embeddings for

Advanced parameters:

- `truncate`: truncates the end of each input to fit within context length. Returns an error if `false` and the context length is exceeded. Defaults to `true`
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.mdx#valid-parameters-and-values) such as `temperature`
- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)
- `dimensions`: number of dimensions for the embedding

### Examples

#### Request

```shell
curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": "Why is the sky blue?"
}'
```

#### Response

```json
{
  "model": "all-minilm",
  "embeddings": [
    [
      0.010071029, -0.0017594862, 0.05007221, 0.04692972, 0.054916814,
      0.008599704, 0.105441414, -0.025878139, 0.12958129, 0.031952348
    ]
  ],
  "total_duration": 14143917,
  "load_duration": 1019500,
  "prompt_eval_count": 8
}
```

#### Request (Multiple input)

```shell
curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": ["Why is the sky blue?", "Why is the grass green?"]
}'
```

#### Response

```json
{
  "model": "all-minilm",
  "embeddings": [
    [
      0.010071029, -0.0017594862, 0.05007221, 0.04692972, 0.054916814,
      0.008599704, 0.105441414, -0.025878139, 0.12958129, 0.031952348
    ],
    [
      -0.0098027075, 0.06042469, 0.025257962, -0.006364387, 0.07272725,
      0.017194884, 0.09032035, -0.051705178, 0.09951512, 0.09072481
    ]
  ]
}
```
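
A common use of a multi-input response is comparing the returned vectors. The Python sketch below computes cosine similarity with the standard library only; the function name is hypothetical, and the two vectors are just the first five values of each example embedding above, so the resulting number is illustrative rather than meaningful.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# First five values of each embedding in the example response above.
first = [0.010071029, -0.0017594862, 0.05007221, 0.04692972, 0.054916814]
second = [-0.0098027075, 0.06042469, 0.025257962, -0.006364387, 0.07272725]
similarity = cosine_similarity(first, second)
```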

## List Running Models

```
GET /api/ps
```

List models that are currently loaded into memory.

### Examples

#### Request

```shell
curl http://localhost:11434/api/ps
```

#### Response

A single JSON object will be returned.

```json
{
  "models": [
    {
      "name": "mistral:latest",
      "model": "mistral:latest",
      "size": 5137025024,
      "digest": "2ae6f6dd7a3dd734790bbbf58b8909a606e0e7e97e94b7604e0aa7ae4490e6d8",
      "details": {
        "parent_model": "",
        "format": "gguf",
        "family": "llama",
        "families": ["llama"],
        "parameter_size": "7.2B",
        "quantization_level": "Q4_0"
      },
      "expires_at": "2024-06-04T14:38:31.83753-07:00",
      "size_vram": 5137025024
    }
  ]
}
```
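
Comparing `size_vram` with `size` gives a rough idea of how much of a loaded model sits in GPU memory. The Python sketch below assumes that `size_vram` is the portion of `size` held in VRAM, which this section does not state explicitly; the function name is hypothetical.

```python
def vram_fraction(model_entry: dict) -> float:
    """Share of the loaded model held in VRAM, from 0.0 to 1.0 (assumed meaning)."""
    size = model_entry.get("size", 0)
    if size == 0:
        return 0.0
    return model_entry.get("size_vram", 0) / size


# Values copied from the example response above.
entry = {"name": "mistral:latest", "size": 5137025024, "size_vram": 5137025024}
fraction_in_vram = vram_fraction(entry)
```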

## Generate Embedding

> Note: this endpoint has been superseded by `/api/embed`

```
POST /api/embeddings
```

Generate embeddings from a model.

### Parameters

- `model`: name of model to generate embeddings from
- `prompt`: text to generate embeddings for

Advanced parameters:

- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.mdx#valid-parameters-and-values) such as `temperature`
- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)

### Examples

#### Request

```shell
curl http://localhost:11434/api/embeddings -d '{
  "model": "all-minilm",
  "prompt": "Here is an article about llamas..."
}'
```

#### Response

```json
{
  "embedding": [
    0.5670403838157654, 0.009260174818336964, 0.23178744316101074,
    -0.2916173040866852, -0.8924556970596313, 0.8785552978515625,
    -0.34576427936553955, 0.5742510557174683, -0.04222835972905159,
    -0.137906014919281
  ]
}
```

## Version

```
GET /api/version
```

Retrieve the Ollama version.

### Examples

#### Request

```shell
curl http://localhost:11434/api/version
```

#### Response

```json
{
  "version": "0.5.1"
}
```
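
A client that depends on newer features may want to compare this value against a minimum. The Python sketch below assumes the version is a dotted list of integers, optionally followed by a `-suffix` that is ignored; that format is an assumption, and both helper names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn "0.5.1" into (0, 5, 1); anything after a "-" is dropped (assumed format)."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def at_least(version_response: dict, minimum: str) -> bool:
    """True if the server version is greater than or equal to `minimum`."""
    return parse_version(version_response["version"]) >= parse_version(minimum)


# Value copied from the example response above.
new_enough = at_least({"version": "0.5.1"}, "0.4.0")
```

Comparing integer tuples avoids the string-comparison trap where `"0.10.0"` sorts before `"0.5.1"`.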

## Experimental Features

### Image Generation (Experimental)

> [!WARNING]
> Image generation is experimental and may change in future versions.

Image generation is now supported through the standard `/api/generate` endpoint when using image generation models. The API automatically detects when an image generation model is being used.

See the [Generate a completion](#generate-a-completion) section for the full API documentation. The experimental image generation parameters (`width`, `height`, `steps`) are documented there.

#### Example

##### Request

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "x/z-image-turbo",
  "prompt": "a sunset over mountains",
  "width": 1024,
  "height": 768
}'
```

##### Response (streaming)

Progress updates during generation:

```json
{
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:00.000000Z",
  "completed": 5,
  "total": 20,
  "done": false
}
```

##### Final Response

```json
{
  "model": "x/z-image-turbo",
  "created_at": "2024-01-15T10:30:15.000000Z",
  "image": "iVBORw0KGgoAAAANSUhEUg...",
  "done": true,
  "done_reason": "stop",
  "total_duration": 15000000000,
  "load_duration": 2000000000
}
```