Luis Ch

Python Developer

Run LLM models locally

Intro

This time we are going to see how to run LLM[1] models locally. For this we are going to use Ollama[2], a tool that lets you run LLM models locally, and Docker, to run Ollama in a container.

If you have not yet interacted with artificial intelligence models, you can try one anonymously from the DuckDuckGo[3] website.

DuckDuckGo AI Chat

Why locally?

This is how I normally use AI models like ChatGPT[4], as it gives me several advantages. The first is that it usually works without an internet connection: within limits, but it works. Also, due to my location, I do not have access to OpenAI[5] services, so I cannot use ChatGPT; even if I had credentials, I would need to connect through a VPN[6], which adds complexity to using the service. And finally, it offers privacy, since my queries are processed locally rather than on a third party's machines.

Configure Ollama

For this example we will need Docker or Podman installed. Depending on your operating system, the steps may vary. Here are the steps from the Linux terminal.

1. Download the docker image for Ollama

The Docker image can be found from their blog, and we download it with the following command:

docker pull ollama/ollama

2. Create container from docker image

According to the documentation, you can create the container with the following command:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
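Once the container is up, Ollama serves an HTTP API on port 11434 (the port we just published). As a quick sanity check, here is a minimal sketch using only the Python standard library to confirm the server is reachable; the host, port, and timeout are assumptions matching the docker run command above:

```python
# Sanity check: confirm the Ollama container is serving its HTTP API.
# Assumes the container was started with -p 11434:11434 as above.
from urllib.request import urlopen


def ollama_base_url(host: str = "127.0.0.1", port: int = 11434) -> str:
    """Build the base URL for a local Ollama server."""
    return f"http://{host}:{port}"


def is_ollama_up(base_url: str) -> bool:
    """Return True if the Ollama root endpoint answers with HTTP 200."""
    try:
        with urlopen(base_url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, DNS failure, etc.
        return False


if __name__ == "__main__":
    print(is_ollama_up(ollama_base_url()))
```

If the container is running, the check prints True; otherwise it returns False instead of raising.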

3. Run ollama from the container

First we run bash inside the container; this gives us access to its resources, specifically the ollama binary, which we need to download and run models.

sudo docker exec -it ollama bash

Running a model with ollama run downloads it the first time. Once the download is complete, we can access the model from an interface, which we would have to create and configure separately, or from the terminal itself, as we can see below.

root@f5ed90aecd2b:/# ollama run llama3.2:3b
>>> hello
Hello! How can I assist you today?

>>> 
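Besides the terminal, the same model can be queried programmatically through Ollama's REST API (POST /api/generate on port 11434). Here is a minimal sketch with the standard library, assuming the container from step 2 is running and llama3.2:3b has already been downloaded:

```python
# Query a local Ollama model over its REST API instead of the terminal.
# Assumes the server from step 2 is running and llama3.2:3b is pulled.
import json
from urllib.request import Request, urlopen


def build_generate_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(payload: dict, base_url: str = "http://127.0.0.1:11434") -> str:
    """Send the payload and return the model's text response."""
    req = Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires the running container):
# print(generate(build_generate_request("llama3.2:3b", "hello")))
```

Setting "stream" to False makes Ollama return one complete JSON object instead of a stream of partial responses, which keeps the client code simple.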

Configure web interface

For this example we are going to configure a web interface (Web UI) to interact with the models we have installed.

1. Download the docker image for Open Web UI

Download the image and create the container with the following command:

docker run -d --network=host -v open-webui:/app/backend/data -e PORT=3000 -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

2. Access from the browser and create a user

Then we open the browser and go to localhost:3000. We will see a welcome message with instructions to create our first user.

3. Select a model

And finally, we reach this interface, where we can choose among the models we have downloaded and start interacting with them.

Ollama from Open WebUI
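The list of downloaded models that the web interface shows comes from Ollama's /api/tags endpoint, which you can also query yourself. A sketch, again assuming the local server on the default port:

```python
# List locally installed Ollama models via the /api/tags endpoint.
# Assumes the Ollama container from earlier is running on port 11434.
import json
from urllib.request import urlopen


def parse_model_names(tags_response: dict) -> list[str]:
    """Extract model names from an /api/tags JSON response."""
    return [m["name"] for m in tags_response.get("models", [])]


def list_models(base_url: str = "http://127.0.0.1:11434") -> list[str]:
    """Fetch and return the names of all downloaded models."""
    with urlopen(f"{base_url}/api/tags") as resp:
        return parse_model_names(json.loads(resp.read()))


# Usage (requires the running container):
# print(list_models())
```

This is handy for scripting, for example checking whether a model is already downloaded before calling ollama run.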

Want to run AI models locally without the hassle?

If you’re looking for privacy, unrestricted geographic access, or simply prefer to avoid relying on internet connections, I can help you set up custom solutions using tools like Ollama and Docker. From installation and optimization to creating intuitive interfaces, I develop systems tailored to your technical needs so you can make the most of AI in your own environment. Ready to take your experience with language models to the next level? Contact me and let’s make it happen.


[1] What is a LLM?

[2] Ollama official website

[3] DuckDuckGo Chat and Official Announcement

[4] ChatGPT official website

[5] OpenAI official website

[6] What is a VPN?

Tags: #llm #ai #privacy