Running Generative AI Models with Ollama and Open WebUI Using DeployHQ

AI, Devops & Infrastructure, News, Open Source, Python, and Tutorials

Artificial Intelligence, particularly Generative AI, is rapidly evolving and becoming more accessible to everyday users. With large language models (LLMs) such as GPT and LLaMA making waves, the desire to run these models locally on personal hardware is growing. This article will provide a simple guide on setting up Ollama—a tool for running LLMs locally—on machines with and without a GPU, and implementing automated deployments using DeployHQ.

What is Ollama?

Ollama is a platform that allows users to run LLMs locally without relying on cloud-based services. It is designed to be user-friendly and supports a variety of models. By running models locally, users can ensure greater privacy, reduce latency, and maintain control over their data.

Setting Up Deployment Pipeline with DeployHQ

1. DeployHQ Configuration

  1. Create a DeployHQ account and new project
  2. Connect your Git repository containing Ollama and Open WebUI configurations
  3. Configure deployment settings using a Shell Server (SSH):
# Required settings on DeployHQ:
environments:
  production:
    branch: main
    server:
      type: ssh # Shell Server
      host: your-server-ip
      user: deploy
      path: /opt/ollama-deployment

pre_deploy:
  # "|| true" keeps the first deployment from failing before these services exist
  - command: "systemctl stop ollama || true"
  - command: "podman stop open-webui || true"
  # Remove the old container so post_deploy can re-create it under the same name
  - command: "podman rm open-webui || true"

deploy:
  - command: "curl -fsSL https://ollama.com/install.sh | sh"
  - command: "podman pull ghcr.io/open-webui/open-webui:main"

post_deploy:
  - command: "systemctl start ollama"
  - command: "podman run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main"

2. Server Preparation

Ensure your server has:

  • SSH access configured for DeployHQ
  • Podman installed on your server
  • Sufficient permissions for the deploy user (a sudoers sketch follows below)
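
If DeployHQ connects as a non-root user, that user needs the right to stop and start the ollama service. One option (a sketch, assuming the user is literally named deploy and the service commands are run through sudo) is a dedicated sudoers entry; rootless Podman needs no extra privileges:

# /etc/sudoers.d/deploy (edit with: visudo -f /etc/sudoers.d/deploy)
deploy ALL=(root) NOPASSWD: /usr/bin/systemctl start ollama, /usr/bin/systemctl stop ollama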

Prerequisites

Before proceeding, ensure you have the following:

  • A system running Fedora or a compatible Linux distribution
  • Podman installed (for OpenWebUI deployment)
  • Sufficient disk space for storing models

For machines with GPUs:

  • NVIDIA GPU with CUDA support (for faster performance)
  • Properly installed and configured NVIDIA drivers
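
You can confirm the GPU and drivers are visible before installing anything else:

# Should list your GPU, driver version and supported CUDA version
nvidia-smi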

Installing Ollama

Ollama can be installed with a one-line command:

curl -fsSL https://ollama.com/install.sh | sh

Verify installation:

ollama --version
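
With Ollama installed, you can pull and chat with a first model to confirm everything works. The model name below is just an example; any model from the Ollama library will do:

# Download a model (several GB, so check your disk space first)
ollama pull llama3

# Start an interactive chat session in the terminal
ollama run llama3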

Deployment Process

1. Initial Setup:

# On your local machine
git init
git add deployhq.yml
git commit -m "Initial deployment configuration"
# Assumes a remote named "origin" already points at the repository connected to DeployHQ
git push -u origin main

2. Configure Deployment Keys:

  • Add DeployHQ's SSH key to your server's authorized_keys (see the snippet after this list)
  • Set up any necessary environment variables in DeployHQ
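
Adding the public key shown in your DeployHQ project to the deploy user's account is a one-off step on the server (the key below is a placeholder):

# Run as the deploy user; paste the key provided by DeployHQ
mkdir -p ~/.ssh && chmod 700 ~/.ssh
echo "ssh-ed25519 AAAA... deployhq" >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys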

3. First Deployment:

  • Trigger manual deployment through DeployHQ dashboard
  • Monitor deployment logs for any issues

Deploying Open WebUI with Podman

For users who prefer a graphical interface, Open WebUI provides an excellent solution.

1. Running the Open WebUI Container

podman run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Note: the Ollama API is exposed by default at 127.0.0.1:11434; the container uses host networking so it can reach it at that address.
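
Once the container is up, you can check that it started cleanly:

# List running containers and follow Open WebUI's startup logs
podman ps
podman logs -f open-webui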

2. Accessing Open WebUI

Access the interface at http://localhost:8080 locally, or via your server's public IP. If you have a domain, it is often nicer to serve Open WebUI from a dedicated subdomain behind a reverse proxy.
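
For example, a minimal nginx reverse proxy for such a subdomain might look like this (the domain is a placeholder, and the WebSocket headers are needed for the chat interface):

# /etc/nginx/conf.d/open-webui.conf
server {
    listen 80;
    server_name chat.example.com;   # placeholder subdomain

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}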

Managing Updates

When updating Ollama or Open WebUI:

  1. Update your configuration in the Git repository
  2. Push changes to trigger automatic deployment
  3. DeployHQ will:
    • Stop existing services
    • Deploy updates
    • Restart services
    • Verify deployment success
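
If you ever need to update Open WebUI by hand, outside of a DeployHQ deployment, the equivalent steps look roughly like this (container and volume names match the run command used earlier):

# Pull the latest image, replace the container, keep the data volume
podman pull ghcr.io/open-webui/open-webui:main
podman stop open-webui && podman rm open-webui
podman run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main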

Rollback Procedure

If issues occur:

  1. Access DeployHQ dashboard
  2. Select previous successful deployment
  3. Click "Rollback"
  4. Monitor rollback process

Using LLaVA for Image Analysis

1. Pulling the LLaVA Model

ollama pull llava

2. Running LLaVA

Once the model is pulled, select llava in the Open WebUI interface and upload images for analysis.
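
If you prefer the command line, you can also send an image straight to the Ollama API. A minimal sketch (the image path is a placeholder):

# Base64-encode a local image and ask LLaVA to describe it
IMG=$(base64 -w0 photo.jpg)
curl -s http://127.0.0.1:11434/api/generate -d "{
  \"model\": \"llava\",
  \"prompt\": \"Describe this image.\",
  \"images\": [\"$IMG\"],
  \"stream\": false
}"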

Monitoring Deployments

DeployHQ provides:

  • Deployment logs
  • Status notifications
  • Deployment history
  • Performance metrics

Best Practices

  1. Always test changes in a staging environment
  2. Use deployment hooks for custom actions
  3. Implement health checks
  4. Monitor system resources during deployments
  5. Maintain backup configurations

Other Use Cases

  • OCR: Extract text from images and scanned documents
  • Multimodal Interaction: Combine text and image inputs
  • Visual QA: Answer specific questions about images

Conclusion

This setup provides a robust solution for running generative AI models locally while maintaining professional deployment practices through DeployHQ. The combination of local AI capabilities and automated deployments creates an efficient, secure, and maintainable environment for both development and production use.

Happy experimenting!

A little bit about the author

Facundo is the CTO at DeployHQ. He oversees our software engineering team by day and, in his free time, enjoys hobbies such as cycling, spending time in nature, and the company of Bono 🐶
