Installing Local Large Language Models with Ollama on Linux

In today's digital world, Large Language Models (LLMs) are playing an increasingly important role. However, many companies are faced with the challenge of keeping their sensitive data secure while using powerful AI models. One solution: installing local Large Language Models. In this article, we will show you how to set up a local LLM on a Linux server using the Ollama software. This allows you to take advantage of powerful language models without sending sensitive data to the cloud.

Why local LLMs?

Many companies prefer local LLMs to maintain control over their data. Cloud-based solutions such as Microsoft Azure or AWS offer immense computing power, but data sovereignty often remains a critical point. Local installations make it possible to process highly sensitive information internally while exploiting the power of modern language models. To install Ollama on a Linux server you need:

  • Linux distribution: any Linux distribution will work; we use Ubuntu Server 24.04 LTS
  • Nvidia graphics card: a powerful card such as the Nvidia RTX A5000 provides the computing power that LLMs need
  • Docker: required to run Open WebUI as a Docker container

Step 1: Install Ollama

Ollama enables the management and use of local LLMs. Installation is straightforward (the install script requests root privileges itself when needed):

    curl -fsSL https://ollama.com/install.sh | sh

After installation, you should restart the server to ensure that all kernel components are loaded correctly.
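
After the reboot, you can verify that Ollama is up again. A minimal check, assuming the default systemd service and API port (11434) that the installer configures:

    systemctl status ollama
    curl http://127.0.0.1:11434

If everything is running, the second command responds with "Ollama is running".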

Step 2: Check the Nvidia graphics card

Use nvidia-smi to monitor the status of your graphics card:

    nvidia-smi -l 1
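
The -l 1 flag refreshes the output every second. For a one-off snapshot of the card and its memory usage, a query like the following prints a compact summary (the field list is an example and can be adapted):

    nvidia-smi --query-gpu=name,memory.total,memory.used,utilization.gpu --format=csv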


Step 3: Install Docker

Docker is required to run Open WebUI. Installation instructions can be found in the official Docker documentation.
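
On Ubuntu, one quick option is Docker's official convenience script (a sketch for test systems; for production, the packages from Docker's apt repository are usually preferable):

    # Download and run Docker's convenience install script
    curl -fsSL https://get.docker.com | sh
    # Optional: run docker without sudo (takes effect after logging in again)
    sudo usermod -aG docker $USER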

Step 4: Start Open WebUI

Start Open WebUI as a Docker container to use the user interface for your LLMs:

    docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
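
Because of --network=host, the container shares the host's network stack: Open WebUI reaches the Ollama API on 127.0.0.1:11434 and serves its interface on port 8080. To confirm the container is running, inspect its status and logs:

    docker ps --filter name=open-webui
    docker logs -f open-webui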


Step 5: Access the user interface

Access Open WebUI in your browser by entering the IP address of your server, e.g.:

    http://192.168.0.5:8080

On first access, create an admin account; you need it to install models and query the language models.


Step 6: Install and use models

Models can be installed either directly via the command line or via the WebUI. For example:

    ollama pull llama3

The WebUI provides a user-friendly way to manage models and query them, similar to ChatGPT.
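
A few Ollama subcommands cover day-to-day model management on the command line (model names are just examples):

    # List locally installed models
    ollama list
    # Chat with a model interactively
    ollama run llama3
    # Remove a model that is no longer needed
    ollama rm llama3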


Tips for hardware optimization

Powerful hardware is crucial for the productive use of LLMs. Graphics cards with plenty of VRAM, combined with server boards that offer sufficient RAM and enough PCIe slots for additional GPUs, are ideal for running demanding models efficiently. The size of the models you can run depends largely on the number and performance of the graphics cards.
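
Whether a model actually fits into GPU memory or is partially offloaded to the CPU can be checked while the model is loaded; ollama ps lists the loaded models along with their size and the GPU/CPU processor split:

    ollama ps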

Example configurations

Entry-level configuration (for Llama 7B and simple applications)

  • CPU: AMD Ryzen 9 or Intel i9
  • GPU: NVIDIA RTX 3060 with 12GB VRAM
  • RAM: 32 GB
  • Storage: 1 TB NVMe SSD

 

Advanced configuration (for Llama 13B to 30B)

  • CPU: AMD Threadripper or Intel Xeon
  • GPU: NVIDIA RTX 3090 or A6000 with at least 24 GB VRAM
  • RAM: 64-128 GB
  • Storage: 2 TB NVMe SSD

 

High-end configuration (for Llama 65B and demanding applications)

  • CPU: Dual AMD EPYC or Intel Xeon
  • GPU: NVIDIA A100 or H100 (40 GB or more VRAM) or a cluster of multiple GPUs
  • RAM: 128 GB or more
  • Storage: 4 TB NVMe SSD


Conclusion

With Ollama, you can efficiently run local LLMs on your Linux server while retaining full control over your data. This solution is particularly suitable for companies that process sensitive information and still want to take advantage of modern language models.
