Local AI on Raspberry Pi 5 with Ollama: Your private AI server at home

A few months ago I came across something that really caught my attention: the possibility of having my own “ChatGPT” running at home, without sending data anywhere, using only a Raspberry Pi 5. Sounds too good to be true, right?

Well, it turns out that with Ollama and a Pi 5 it’s perfectly possible to set up a local AI server that works surprisingly well. Let me tell you my experience and how you can do it too.

What is Ollama and why did I like it so much?

Ollama is an open-source tool that lets you run large language models (LLMs) directly on your own machine, with no dependence on external services. What I like most is that all your data stays at home: nothing sensitive is ever sent to a remote server.

The Raspberry Pi 5, especially the 8GB RAM version, turns out to be the perfect companion for this type of project. It consumes little energy, is inexpensive, and on top of that you can leave it running 24/7 without problems.

The advantages I value most

  • Total privacy: Everything is processed locally
  • No internet dependency: Once configured, it works offline
  • Minimal cost: No subscriptions or usage fees
  • Complete personalization: You can choose exactly which models to use

What you need to get started

The setup is quite simple:

  • A Raspberry Pi 5 (I strongly recommend the 8GB version)
  • Sufficient storage - some models take up several GB
  • Raspberry Pi OS Bookworm 64-bit
  • Internet connection for initial installation
  • A little patience for the initial configuration

Important: Make sure to use the 64-bit version of the operating system. Ollama only ships 64-bit ARM builds, so it won’t run on a 32-bit OS.

Step-by-step installation

The installation is much simpler than I expected. Ollama provides a script that automates the entire process:

# Update the system
sudo apt update && sudo apt upgrade

# Install curl if you don't have it
sudo apt install curl

# Download and install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Verify the installation
ollama --version

And that’s it. Seriously, it’s that simple.

Choosing the right model

Here comes the interesting part: choosing which “brain” you want for your AI. I’ve tested several, so let me share my experience:

TinyLlama - The sprinter

ollama run tinyllama

It’s the lightest (1.1B parameters) and fastest. Perfect for initial tests and basic chatbots. The responses aren’t the most elaborate, but the speed is impressive.

Phi3 - The balanced one

ollama run phi3

Developed by Microsoft, it offers a good balance between speed and response quality. It’s my favorite option for daily use on the Pi 5.

Llama3 - The brainiac

ollama run llama3

It’s the most advanced, but also the most demanding. The responses are excellent, but you need patience. Only recommended if you have the 8GB version and don’t mind waiting a bit longer.

Deepseek-R1 - The specialist

ollama run deepseek-r1:1.5b

It comes in different sizes. The 1.5B version works well on the Pi 5 and is quite competent.

My recommendation: start with Phi3. It’s the best compromise between functionality and performance.
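If you want to automate that choice, here’s a minimal Python sketch of the trade-off described above. The RAM figures and the `pick_model` helper are my own rough assumptions, not official requirements:

```python
# Approximate RAM (GB) needed to run each model comfortably on a Pi 5.
# These figures are rough personal estimates, not official requirements.
MODEL_RAM_GB = {
    "tinyllama": 2,         # 1.1B parameters, the sprinter
    "phi3": 4,              # the balanced one, my daily driver
    "deepseek-r1:1.5b": 3,  # small but competent
    "llama3": 6,            # needs the 8GB Pi and some patience
}

def pick_model(available_ram_gb: float) -> str:
    """Return the most capable model that fits in the given free RAM."""
    candidates = [m for m, need in MODEL_RAM_GB.items() if need <= available_ram_gb]
    if not candidates:
        return "tinyllama"  # fall back to the lightest option
    # Treat a higher RAM requirement as a proxy for capability
    return max(candidates, key=lambda m: MODEL_RAM_GB[m])
```

With 8 GB free it suggests llama3; with around 4 GB it lands on phi3, matching my recommendation above.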

Beyond the terminal

Once you have Ollama running, you can take it to the next level by installing a web interface. There are several options available, but personally I like using Docker to keep everything organized:

# If you don't have Docker installed
curl -sSL https://get.docker.com | sh
sudo usermod -aG docker $USER

# After logging out and back in, you can run a WebUI
# (several projects on GitHub target the Pi 5 specifically)

With a web interface, you can access your AI from any device on your local network. It’s much more comfortable.

The API that opens a world of possibilities

What really excited me about Ollama is its integrated HTTP API. You can make queries programmatically:

curl http://localhost:11434/api/generate \
  -d '{
    "model": "phi3",
    "prompt": "What is the capital of Australia?",
    "stream": false
  }'

This opens a bunch of possibilities: automation, integration with other systems, creating custom bots… The options are endless.
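The same call works from any language. Here’s a minimal Python sketch using only the standard library, assuming Ollama is listening on its default port 11434; `build_payload` and `ask` are names I made up for this example:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_payload(model: str, prompt: str) -> bytes:
    """Build the JSON body for a non-streaming /api/generate request."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With "stream": false the API returns one JSON object
        # whose "response" field holds the full answer
        return json.loads(resp.read())["response"]

# Example (needs a running Ollama server with the model pulled):
# print(ask("phi3", "What is the capital of Australia?"))
```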

Real use cases I’ve tried

Offline personal assistant

Perfect for quick queries without sending data out of the house.

Document analysis

You can process and analyze texts locally, ideal for sensitive information.

Task automation

Combined with scripts, you can automate email responses, text classification, etc.
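As an illustration of the classification idea, here’s a sketch against the local API. The labels, prompt wording, and the `classify` helper are all hypothetical choices of mine, and actually running `classify` requires an Ollama server with the model pulled:

```python
import json
import urllib.request

LABELS = ["urgent", "normal", "spam"]  # hypothetical categories for incoming mail

def build_prompt(text: str) -> str:
    """Wrap a message in a constrained classification prompt."""
    return (
        "Classify the following message as exactly one of: "
        + ", ".join(LABELS) + ".\n"
        "Answer with the label only.\n\nMessage: " + text
    )

def parse_label(response: str) -> str:
    """Pull a known label out of the model's reply, defaulting to 'normal'."""
    reply = response.strip().lower()
    for label in LABELS:
        if label in reply:
            return label
    return "normal"

def classify(text: str, model: str = "phi3") -> str:
    """Ask the local Ollama server to classify a message (needs a running server)."""
    body = json.dumps(
        {"model": model, "prompt": build_prompt(text), "stream": False}
    ).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return parse_label(json.loads(resp.read())["response"])
```

Constraining the model to answer with the label only, and parsing defensively anyway, keeps small models usable for this kind of task.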

Educational experiments

Excellent for learning about AI without additional costs.

Practical optimization tips

Monitor RAM usage: If you notice slowness, try smaller models.

Use fast storage: A good microSD or, better yet, an external SSD makes a real difference.

Control temperature: The Pi 5 can heat up with heavy models. A fan doesn’t hurt.

Update regularly: Both Ollama and the models update frequently with improvements.
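To follow the RAM-monitoring tip from a script, you can read `MemAvailable` from `/proc/meminfo` on Linux. A small sketch (the parsing function is pure, so it works anywhere; the helper name is my own):

```python
def parse_available_gb(meminfo_text: str) -> float:
    """Extract MemAvailable (reported in kB) from /proc/meminfo content, in GB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kb = int(line.split()[1])
            return kb / (1024 * 1024)
    raise ValueError("MemAvailable not found")

# On the Pi itself:
# with open("/proc/meminfo") as f:
#     print(f"{parse_available_gb(f.read()):.1f} GB available")
```

If the number drops toward zero while a model is loaded, that’s your cue to try a smaller one.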

Common problems I encountered

The system runs out of memory

Solution: Switch to a smaller model or close other applications.

Very slow responses

Solution: This is normal with large models. Be patient, or switch to a lighter model.

Architecture error

Solution: Verify you’re using Raspberry Pi OS 64-bit.

My experience after several months

I’ve been using this setup for several months and I’m genuinely impressed. It’s not as fast as ChatGPT, but for many use cases it’s perfectly adequate. And the peace of mind of knowing my data doesn’t leave home is priceless.

The energy consumption is minimal, so I have it running 24/7. When I need to make a quick query or analyze a document, I simply open the web interface from any device in the house.

Is it worth it?

For me, absolutely yes. If you value privacy, like experimenting with technology, or just want to have your own AI server without depending on third parties, this combination is perfect.

Don’t expect miracles in terms of speed, but you can count on a solid and very satisfying experience. And best of all: it’s yours, completely.

Next steps

Once you have everything running, I recommend exploring:

  • Integration with LangChain for more complex workflows
  • Creating custom bots using the API
  • Home task automation
  • Experimenting with different models based on your needs

The Ollama community is very active, and constantly new models and improvements appear. It’s an exciting time to experiment with local AI.

Do you dare to set up your own AI server? If you do, I’d love to know how it goes. And if you have any questions, you know where to find me.


Have you tried Ollama on your Raspberry Pi? Which models work best for you? Share your experience in the comments.