Recently I’ve been looking into LLMs (Large Language Models like ChatGPT) as productivity tools for knowledge workers, and finding that OpenAI products (like the new customisable GPTs) lack some security and privacy features that many organisations require (especially here in Europe). So I’ve been going down the rabbit hole of LLM options, and found Mistral (a French open-source LLM). Mistral can be run on your own server for free, which sidesteps many privacy and security issues (i.e. you control the hardware and the data, rather than some anonymous offshore server farm).
And this made me wonder: could I run Mistral on the RPi 5? It’s already possible to run Mistral on an Apple Silicon Mac, or on a Windows or Linux PC (more info). But is it possible to run an LLM on the RPi 5? That seems kind of crazy, since the RPi is so underpowered compared to a typical PC. I mean, there’s no GPU, right?
Short answer: Yes, you can run an LLM on the RPi 5! (But you have to choose the right model, and it is slow…)
How? Install Ollama (a system for hosting LLMs locally) on the RPi 5, download and install a model (choose from the list here, depending on your requirements), run it, then try some prompts.
# install for Linux
curl https://ollama.ai/install.sh | sh
# get and install the model (orca-mini works on the 4GB RPi 5)
# this will download and set up the model, then run it
ollama run orca-mini
# try a prompt (takes from 13 to 100 seconds to start “typing”)
>>> what is a bird
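You don’t have to use the interactive chat, either: Ollama will also take a prompt directly on the command line, which is handy if you want to time a full run with a stopwatch. A quick sketch (assuming orca-mini is installed as above):
# one-shot prompt: prints the answer, then exits
ollama run orca-mini "what is a bird"
# list the models you’ve downloaded, with their size on disk
ollama list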
The time to answer was quite variable: the fastest response came right after a reboot, and it went downhill from there. You can tell the RPi 5 is working hard, since the fan spins up frequently. This is probably a memory issue: with too little RAM for the model plus the OS, the system thrashes, swapping memory on and off the slow SD card.
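If you want to confirm the swapping theory, standard Linux tools will show it; a minimal sketch (run these in a second terminal while the model is answering):
# show RAM and swap usage
free -h
# report memory, swap-in/swap-out (the si/so columns), and CPU once per second
vmstat 1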
I tried a few other models after the success of orca-mini. Unfortunately these (mistral, mistrallite, llama2) did not work: each failed while loading or running with the same problem (Error: llama runner process has terminated).
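On Linux the install script sets Ollama up as a systemd service, so the service logs are the place to look for why the runner process died (my guess: the kernel killed it for running out of memory). A quick check:
# follow the Ollama service logs while reproducing the error
journalctl -u ollama -f
# see whether the kernel’s OOM killer was involved (may need sudo)
dmesg | grep -i -E 'killed process|out of memory'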
I’m guessing this is again because my Pi has only 4GB of memory. I’ll have to get an 8GB model and try it out to see whether these larger models will work, and whether the extra RAM speeds anything up. Stay tuned!
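In the meantime, one thing worth trying on a 4GB Pi is a more aggressively quantised variant of these models, which trades some answer quality for a smaller memory footprint. The exact tag below is illustrative; check each model’s page in the Ollama library for the tags that actually exist:
# a smaller 2-bit quantisation of mistral may fit in 4GB (tag name is an assumption; verify on the library page)
ollama run mistral:7b-instruct-q2_K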