Running Generative AI Models with Ollama and Open WebUI Using DeployHQ

AI, Devops & Infrastructure, News, Open Source, Python, and Tutorials

Artificial Intelligence, particularly Generative AI, is rapidly evolving and becoming more accessible to everyday users. With large language models (LLMs) such as GPT and LLaMA making waves, the desire to run these models locally on personal hardware is growing. This article will provide a simple guide on setting up Ollama—a tool for running LLMs locally—on machines with and without a GPU, and implementing automated deployments using DeployHQ.

What is Ollama?

Ollama is a platform that allows users to run LLMs locally without relying on cloud-based services. It is designed to be user-friendly and supports a variety of models. By running models locally, users can ensure greater privacy, reduce latency, and maintain control over their data.

Setting Up Deployment Pipeline with DeployHQ

1. DeployHQ Configuration

  1. Create a DeployHQ account and new project
  2. Connect your Git repository containing Ollama and Open WebUI configurations
  3. Configure deployment settings using a Shell Server (SSH):
# Required settings on DeployHQ:
environments:
  production:
    branch: main
    server:
      type: ssh # Shell Server
      host: your-server-ip
      user: deploy
      path: /opt/ollama-deployment

pre_deploy:
  # "|| true" keeps the first deployment from failing before these services exist
  - command: "systemctl stop ollama || true"
  - command: "podman stop open-webui || true"
  # Remove the old container so post_deploy can re-create it under the same name
  - command: "podman rm open-webui || true"

deploy:
  - command: "curl -fsSL https://ollama.com/install.sh | sh"
  - command: "podman pull ghcr.io/open-webui/open-webui:main"

post_deploy:
  - command: "systemctl start ollama"
  - command: "podman run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main"

2. Server Preparation

Ensure your server has:

  • SSH access configured for DeployHQ
  • Podman installed on your server
  • Sufficient permissions for the deploy user (a sudoers sketch follows below)
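
If DeployHQ connects as a non-root user, that user needs the right to stop and start the ollama service. One option (a sketch, assuming the user is literally named deploy and the service commands are run through sudo) is a dedicated sudoers entry; rootless Podman needs no extra privileges:

# /etc/sudoers.d/deploy (edit with: visudo -f /etc/sudoers.d/deploy)
deploy ALL=(root) NOPASSWD: /usr/bin/systemctl start ollama, /usr/bin/systemctl stop ollama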

Prerequisites

Before proceeding, ensure you have the following:

  • A system running Fedora or a compatible Linux distribution
  • Podman installed (for OpenWebUI deployment)
  • Sufficient disk space for storing models

For machines with GPUs:

  • NVIDIA GPU with CUDA support (for faster performance)
  • Properly installed and configured NVIDIA drivers
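
You can confirm the GPU and drivers are visible before installing anything else:

# Should list your GPU, driver version and supported CUDA version
nvidia-smi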

Installing Ollama

Ollama can be installed with a one-line command:

curl -fsSL https://ollama.com/install.sh | sh

Verify installation:

ollama --version
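
With Ollama installed, you can pull and chat with a first model to confirm everything works. The model name below is just an example; any model from the Ollama library will do:

# Download a model (several GB, so check your disk space first)
ollama pull llama3

# Start an interactive chat session in the terminal
ollama run llama3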

Deployment Process

1. Initial Setup:

# On your local machine
git init
git add deployhq.yml
git commit -m "Initial deployment configuration"
# Assumes a remote named "origin" already points at the repository connected to DeployHQ
git push -u origin main

2. Configure Deployment Keys:

  • Add DeployHQ's SSH key to your server's authorized_keys (see the snippet after this list)
  • Set up any necessary environment variables in DeployHQ
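
Adding the public key shown in your DeployHQ project to the deploy user's account is a one-off step on the server (the key below is a placeholder):

# Run as the deploy user; paste the key provided by DeployHQ
mkdir -p ~/.ssh && chmod 700 ~/.ssh
echo "ssh-ed25519 AAAA... deployhq" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys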

3. First Deployment:

  • Trigger manual deployment through DeployHQ dashboard
  • Monitor deployment logs for any issues

Deploying Open WebUI with Podman

For users who prefer a graphical interface, Open WebUI provides an excellent solution.

1. Running the Open WebUI Container

podman run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Note: the Ollama API is exposed by default at 127.0.0.1:11434; the container uses host networking so it can reach it at that address.
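
Once the container is up, you can check that it started cleanly:

# List running containers and follow Open WebUI's startup logs
podman ps
podman logs -f open-webui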

2. Accessing Open WebUI

Access the interface at http://localhost:8080 locally, or via your server's public IP. If you have a domain, it is often nicer to serve Open WebUI from a dedicated subdomain behind a reverse proxy.
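
For example, a minimal nginx reverse proxy for such a subdomain might look like this (the domain is a placeholder, and the WebSocket headers are needed for the chat interface):

# /etc/nginx/conf.d/open-webui.conf
server {
    listen 80;
    server_name chat.example.com;   # placeholder subdomain

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}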

Managing Updates

When updating Ollama or Open WebUI:

  1. Update your configuration in the Git repository
  2. Push changes to trigger automatic deployment
  3. DeployHQ will:
    • Stop existing services
    • Deploy updates
    • Restart services
    • Verify deployment success
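
If you ever need to update Open WebUI by hand, outside of a DeployHQ deployment, the equivalent steps look roughly like this (container and volume names match the run command used earlier):

# Pull the latest image, replace the container, keep the data volume
podman pull ghcr.io/open-webui/open-webui:main
podman stop open-webui && podman rm open-webui
podman run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main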

Rollback Procedure

If issues occur:

  1. Access DeployHQ dashboard
  2. Select previous successful deployment
  3. Click "Rollback"
  4. Monitor rollback process

Using LLaVA for Image Analysis

1. Pulling the LLaVA Model

ollama pull llava

2. Running LLaVA

Once the model is pulled, select llava in the Open WebUI interface and upload images for analysis.
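
If you prefer the command line, you can also send an image straight to the Ollama API. A minimal sketch (the image path is a placeholder):

# Base64-encode a local image and ask LLaVA to describe it
IMG=$(base64 -w0 photo.jpg)
curl -s http://127.0.0.1:11434/api/generate -d "{
  \"model\": \"llava\",
  \"prompt\": \"Describe this image.\",
  \"images\": [\"$IMG\"],
  \"stream\": false
}"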

Monitoring Deployments

DeployHQ provides:

  • Deployment logs
  • Status notifications
  • Deployment history
  • Performance metrics

Best Practices

  1. Always test changes in a staging environment
  2. Use deployment hooks for custom actions
  3. Implement health checks
  4. Monitor system resources during deployments
  5. Maintain backup configurations

Other Use Cases

  • OCR: Extract text from images and scanned documents
  • Multimodal Interaction: Combine text and image inputs
  • Visual QA: Answer specific questions about images

Conclusion

This setup provides a robust solution for running generative AI models locally while maintaining professional deployment practices through DeployHQ. The combination of local AI capabilities and automated deployments creates an efficient, secure, and maintainable environment for both development and production use.

Happy experimenting!

A little bit about the author

Facundo is the CTO at DeployHQ. He oversees our software engineering team by day and, in his free time, enjoys hobbies such as cycling, spending time in nature, and the company of Bono 🐶
