Using Ollama from the command line
Using Ollama from the command line lets you take advantage of LLMs from within your programs and shells. This guide assumes that Ollama is installed with Docker.
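If the container is not running yet, one minimal way to start it with the official image (CPU-only; add --gpus=all for NVIDIA GPU support) is shown below. The container and volume names match the examples that follow.
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama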
Access Ollama via CLI
Check which models are installed.
$ docker exec -it ollama ollama list
NAME               ID              SIZE      MODIFIED
llama3.1:latest    42182419e950    4.7 GB    3 weeks ago
phi3:latest        4f2222927938    2.2 GB    3 weeks ago
A model that is not installed will be downloaded automatically the first time you run it, but you can also pull it explicitly.
$ docker exec -it ollama ollama pull llama3.2
pulling manifest
pulling dde5aa3fc5ff... 100% ▕████████████████████████████████████████████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕████████████████████████████████████████████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████████████████████████████████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████████████████████████████████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕████████████████████████████████████████████████████████▏ 96 B
pulling 34bb5ab01051... 100% ▕████████████████████████████████████████████████████████▏ 561 B
verifying sha256 digest
writing manifest
success
Run ollama from the command line.
$ docker exec ollama ollama run "llama3.2" "what is AI? Explain in one sentence."
Artificial Intelligence (AI) refers to the development of computer systems that can perform tasks typically requiring human intelligence, such as learning, problem-solving, and decision-making, using algorithms and data.
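Because the answer is written to standard output, this command drops straight into shell scripts and pipelines. A minimal sketch, where notes.txt and the prompt are just placeholders:
$ SUMMARY=$(docker exec ollama ollama run "llama3.2" "Summarize in one sentence: $(cat notes.txt)")
$ echo "$SUMMARY"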
Access Ollama via API
Confirm that the API is accessible.
$ curl -X POST http://localhost:11434/api/generate -d '{
"model":"llama3.2",
"prompt": "Hi?"
}'
The output looks like this:
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.55612798Z","response":"How","done":false}
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.569218685Z","response":" can","done":false}
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.581301676Z","response":" I","done":false}
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.592935427Z","response":" assist","done":false}
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.605632629Z","response":" you","done":false}
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.618116021Z","response":" today","done":false}
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.631825046Z","response":"?","done":false}
{"model":"llama3.2","created_at":"2024-11-29T11:39:40.64372069Z","response":"","done":true,"done_reason":"stop","context":[128006,9125,128007,271,38766,1303,33025,2696,25,6790,220,2366,18,271,128009,128006,882,128007,271,13347,30,128009,128006,78191,128007,271,4438,649,358,7945,499,3432,30],"total_duration":126834961,"load_duration":12344583,"prompt_eval_count":27,"prompt_eval_duration":7000000,"eval_count":8,"eval_duration":106000000}
If you see errors like the one below, replace localhost with the IP address of the Ollama Docker container (from another container on the same Docker network, you can also use http://ollama:11434).
$ curl -X POST http://localhost:11434/api/generate -d '{
"model":"llama3.2",
"prompt": "Hi?"
}'
curl: (7) Failed to connect to localhost port 11434 after 0 ms: Connection refused
You can check the IP address of the Ollama container:
$ docker exec -it ollama hostname -i
172.18.0.2
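As an alternative, docker inspect can print the container's IP address directly, without the carriage return that the -it TTY adds:
$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' ollama
172.18.0.2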
Replace localhost with the IP address and strip the trailing newline characters:
$ curl -X POST http://`docker exec -it ollama hostname -i | tr -d '\n\r'`:11434/api/generate -d '{
"model":"llama3.2",
"prompt": "Hi?"
}'
To get the answer as plain text, disable streaming and extract the response field with jq.
$ curl -s -X POST http://`docker exec -it ollama hostname -i | tr -d '\n\r'`:11434/api/generate -d '{
"model": "llama3.2",
"prompt":"Hi?",
"stream": false
}' | jq -r ".response"
The response is now plain text, which is ideal for programmatic access:
How can I assist you today?
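For use in scripts, the whole call can be wrapped in a small shell function. This is only a sketch: the name ask is arbitrary, jq builds the JSON body so the prompt is escaped safely, and localhost may need to be replaced as described above.
ask() {
  jq -n --arg prompt "$1" '{model: "llama3.2", prompt: $prompt, stream: false}' \
    | curl -s -X POST http://localhost:11434/api/generate -d @- \
    | jq -r ".response"
}

$ ask "What is AI? Explain in one sentence."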
Accessing Ollama on another machine
By default, Ollama listens on 127.0.0.1:11434, so access is restricted to the local machine. To allow external access, set the environment variables OLLAMA_HOST and OLLAMA_ORIGINS:
OLLAMA_HOST=0.0.0.0
OLLAMA_ORIGINS=192.168.0.*
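On a native (non-Docker) installation, one way to apply these settings is to export them in the environment of the server process, for example when starting it manually:
$ OLLAMA_HOST=0.0.0.0 OLLAMA_ORIGINS="192.168.0.*" ollama serve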
If Ollama runs in Docker, the container port can instead be mapped to the host. For example, if the host IP is 192.168.0.123, the API can then be reached at 192.168.0.123:11434.
services:
  ollama:
    volumes:
      - ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:${OLLAMA_DOCKER_TAG-latest}
    ports:
      - "11434:11434"
    environment:
      - 'OLLAMA_HOST=0.0.0.0:11434'
    deploy:
      resources:
        reservations:
          devices:
            - driver: ${OLLAMA_GPU_DRIVER-nvidia}
              count: ${OLLAMA_GPU_COUNT-1}
              capabilities:
                - gpu
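With this configuration running, the API should be reachable from other machines on the LAN. For example, from another host (reusing the example host IP 192.168.0.123 and jq):
$ curl -s -X POST http://192.168.0.123:11434/api/generate -d '{
"model": "llama3.2",
"prompt": "Hi?",
"stream": false
}' | jq -r ".response"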