
🚀 Deployment Guide

This guide covers deploying the LLM API Backend to various platforms.

📋 Prerequisites

  • Node.js 18+ installed
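  • Encore CLI installed (macOS: brew install encoredev/tap/encore; Linux: curl -L https://encore.dev/install.sh | bash). The encore.dev npm package is the application library, not the CLI.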
  • Git installed
  • API keys for your chosen LLM provider
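The tool checks in this list can be scripted before you start. The following is a minimal sketch for a POSIX shell; the function names `node_version_ok` and `check_prerequisites` are illustrative and not part of the project.

```shell
# Succeeds when a Node.js version string such as "v18.19.0" is 18 or newer.
node_version_ok() {
  major="${1#v}"        # strip the leading "v"
  major="${major%%.*}"  # keep only the major component
  case "$major" in
    ''|*[!0-9]*) return 1 ;;  # empty or not a number
  esac
  [ "$major" -ge 18 ]
}

# Prints one line per missing or outdated tool.
# Returns non-zero if any check fails.
check_prerequisites() {
  rc=0
  for tool in node git encore; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  if command -v node >/dev/null 2>&1 && ! node_version_ok "$(node --version)"; then
    echo "outdated: node $(node --version) (need 18+)"
    rc=1
  fi
  return "$rc"
}
```

Run `check_prerequisites` in the shell you plan to deploy from; it prints nothing when every tool is present.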

🌟 Hugging Face Spaces (Recommended for Demos)

Step 1: Create a Space

  1. Go to https://huggingface.co/spaces
  2. Click "Create new Space"
  3. Settings:
    • Space name: llm-api-backend (or your choice)
    • SDK: Docker
    • Visibility: Public or Private
  4. Click Create Space

Step 2: Clone and Push

# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME

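# Copy the project files, including dotfiles such as .dockerignore,
# but not the project's own .git directory, local .env, or node_modules
rsync -a --exclude .git --exclude .env --exclude node_modules /path/to/llm-api-backend/ .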

# Commit and push
git add .
git commit -m "Initial deployment"
git push

Step 3: Configure Secrets

  1. Go to your Space page
  2. Click Settings → Repository secrets
  3. Add the following secrets:

For Hugging Face Provider:

LLMProvider = huggingface
HuggingFaceAPIKey = hf_your_token_here
DefaultModel = mistralai/Mistral-7B-Instruct-v0.2

For Ollama Provider (requires custom Docker setup):

LLMProvider = ollama
OllamaBaseURL = http://localhost:11434
DefaultModel = llama3

Step 4: Wait for Build

  • Hugging Face will automatically build your Docker container
  • Watch the build logs in the Space interface
  • Once complete, your API is live!

Step 5: Test Your API

# Replace with your actual Space URL
export SPACE_URL="https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"

# Test chat endpoint
curl -X POST "$SPACE_URL/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

# Test health endpoint
curl "$SPACE_URL/health"
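A freshly built Space may need some time before it answers, so a single curl can fail even when the deployment is fine. The sketch below polls the health endpoint until it responds. It assumes `/health` returns HTTP 200 once the service is ready, which this guide does not state explicitly; `http_status` and `wait_for_health` are hypothetical helper names.

```shell
# Prints the HTTP status code for a URL ("000" if the connection fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Polls BASE_URL/health until it returns 200.
# Arguments: base URL, number of attempts (default 12), delay in seconds (default 5).
wait_for_health() {
  url="$1"
  attempts="${2:-12}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$(http_status "$url/health")" = "200" ]; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempts" >&2
  return 1
}
```

For example, `wait_for_health "$SPACE_URL"` waits up to about a minute with the defaults before giving up.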

☁️ Encore Cloud (Recommended for Production)

Step 1: Install Encore

# macOS
brew install encoredev/tap/encore

# Linux
curl -L https://encore.dev/install.sh | bash

Step 2: Create Encore App

# If starting fresh
encore app create

# Or link existing app
encore app link

Step 3: Set Secrets

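# The value is not passed as an argument: the CLI prompts for it (or reads it from stdin).
# --type selects the target environments (prod, dev, pr, local).
# Run "encore secret set --help" to confirm the options for your CLI version.

# For Hugging Face
encore secret set --type prod,dev LLMProvider         # enter: huggingface
encore secret set --type prod,dev HuggingFaceAPIKey   # enter: hf_your_token_here
encore secret set --type prod,dev DefaultModel        # enter: mistralai/Mistral-7B-Instruct-v0.2

# For Ollama (local development)
encore secret set --type local LLMProvider            # enter: ollama
encore secret set --type local OllamaBaseURL          # enter: http://localhost:11434
encore secret set --type local DefaultModel           # enter: llama3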

Step 4: Deploy

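# Encore Cloud builds and deploys when you push to the "encore" git remote
git push encore

# If your CLI version provides a deploy command, you can target an environment directly
# (run "encore deploy --help" to confirm it is available)
encore deploy --env production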

Step 5: Access Your API

# Your API will be available at:
# Staging: https://staging-YOUR_APP.encr.app
# Production: https://prod-YOUR_APP.encr.app

# Test it
curl https://staging-YOUR_APP.encr.app/health

🐳 Docker (Self-Hosted)

Step 1: Build Image

docker build -t llm-api-backend .

Step 2: Run Container

Using Hugging Face:

docker run -d \
  -p 7860:7860 \
  -e LLMProvider=huggingface \
  -e HuggingFaceAPIKey=hf_your_token \
  -e DefaultModel=mistralai/Mistral-7B-Instruct-v0.2 \
  --name llm-api \
  llm-api-backend

Using Ollama (with host network):

docker run -d \
  --network host \
  -e LLMProvider=ollama \
  -e OllamaBaseURL=http://localhost:11434 \
  -e DefaultModel=llama3 \
  --name llm-api \
  llm-api-backend
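Passing the API key with `-e` leaves it in your shell history. One alternative is an env file combined with docker's `--env-file` flag. The helper below is a sketch, with `write_env_file` and the file name `llm-api.env` chosen for illustration; the variable names are the ones used in this guide.

```shell
# Writes provider settings to an env file for "docker run --env-file".
# Arguments: file path, provider, model, provider-specific value
# (API key for huggingface, base URL for ollama).
write_env_file() {
  file="$1"
  provider="$2"
  model="$3"
  value="$4"
  case "$provider" in
    huggingface) key="HuggingFaceAPIKey" ;;
    ollama)      key="OllamaBaseURL" ;;
    *) echo "unknown provider: $provider" >&2; return 1 ;;
  esac
  (
    umask 077  # a newly created file is readable by the current user only
    printf '%s\n' "LLMProvider=$provider" "DefaultModel=$model" "$key=$value" > "$file"
  )
}

# Example:
#   write_env_file llm-api.env huggingface mistralai/Mistral-7B-Instruct-v0.2 hf_your_token
#   docker run -d -p 7860:7860 --env-file llm-api.env --name llm-api llm-api-backend
```

Keep the generated file out of version control, since it contains the token in plain text.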

Step 3: Test

curl http://localhost:7860/health

🖥️ VPS / Bare Metal

Step 1: Install Dependencies

# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Install Encore CLI (the encore.dev npm package is the application library, not the CLI)
curl -L https://encore.dev/install.sh | bash

Step 2: Clone Repository

git clone https://github.com/YOUR_USERNAME/llm-api-backend.git
cd llm-api-backend

Step 3: Configure Environment

# Copy example env file
cp .env.example .env

# Edit with your values
nano .env

Step 4: Set Encore Secrets

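# The CLI prompts for each value; --type local applies it to "encore run" on this machine
encore secret set --type local LLMProvider         # enter: huggingface
encore secret set --type local HuggingFaceAPIKey   # enter: hf_your_token
encore secret set --type local DefaultModel        # enter: mistralai/Mistral-7B-Instruct-v0.2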

Step 5: Run with PM2 (Production)

# Install PM2
npm install -g pm2

# Start application
pm2 start "encore run --port 8080" --name llm-api

# Save PM2 configuration
pm2 save

# Enable startup on boot (pm2 prints a command; run it as instructed)
pm2 startup

Step 6: Configure Nginx (Optional)

server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}

🔐 Security Checklist

Before deploying to production:

  • All secrets configured properly
  • API keys have appropriate permissions
  • CORS configured for your frontend domains
  • Rate limiting enabled (add middleware)
  • HTTPS enabled (Encore/HF Spaces handle this)
  • Environment variables not committed to git
  • Monitoring and logging set up
  • Error tracking configured
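The item about environment variables not being committed can be checked mechanically. This sketch relies on the git subcommands `ls-files --error-unmatch` and `check-ignore`; the function name `env_file_is_protected` is illustrative, and it only inspects a file named `.env` at the repository root.

```shell
# Succeeds only when .env in the given repository is untracked and ignored.
# Argument: path to the repository (defaults to the current directory).
env_file_is_protected() {
  repo="${1:-.}"
  # A tracked .env is already in history, whatever .gitignore says.
  if git -C "$repo" ls-files --error-unmatch .env >/dev/null 2>&1; then
    echo ".env is tracked by git; untrack it with: git rm --cached .env" >&2
    return 1
  fi
  # check-ignore exits 0 when an ignore rule matches the path.
  if ! git -C "$repo" check-ignore -q .env; then
    echo ".env is not covered by .gitignore" >&2
    return 1
  fi
}
```

If the file was committed in the past, untracking it does not remove it from history, so rotate any keys it contained.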

📊 Monitoring

Encore Cloud

  • Built-in dashboard at https://app.encore.dev
  • Real-time traces, logs, and metrics
  • Performance monitoring
  • Error tracking

Hugging Face Spaces

  • View logs in Space interface
  • Use /health endpoint for uptime monitoring
  • Configure external monitoring tools

Self-Hosted

  • Use /health endpoint
  • Set up monitoring tools like:
    • Prometheus + Grafana
    • Datadog
    • New Relic
    • Sentry for errors

🆘 Troubleshooting

Build Failures on HF Spaces

Issue: Docker build fails

# Check Dockerfile syntax
# Ensure all required files are committed
# Check Space build logs

"Secret not set" Errors

Issue: Application can't access secrets

# On Encore: Use 'encore secret set' command
# On HF Spaces: Configure in Space settings
# On Docker: Pass as environment variables (-e flag)
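For Docker and Hugging Face Spaces, where the settings arrive as environment variables, a short preflight check can report which ones are missing before the application starts. This is a sketch: `missing_settings` is a hypothetical helper, and the required names per provider are inferred from the configuration examples in this guide.

```shell
# Prints the names of required settings that are unset or empty
# for the provider selected by LLMProvider.
missing_settings() {
  required="LLMProvider DefaultModel"
  case "${LLMProvider:-}" in
    huggingface) required="$required HuggingFaceAPIKey" ;;
    ollama)      required="$required OllamaBaseURL" ;;
  esac
  for name in $required; do
    eval "value=\${$name:-}"   # indirect lookup; names come from the fixed list above
    [ -n "$value" ] || echo "$name"
  done
}

# Example use in a container entrypoint:
#   missing=$(missing_settings)
#   [ -z "$missing" ] || { echo "Secret not set: $missing" >&2; exit 1; }
```

On Encore Cloud the values are managed as Encore secrets rather than plain environment variables, so this check does not apply there.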

Model Loading Timeout

Issue: HF models take too long to load

# Solution: Wait 30-60 seconds for cold start
# Use smaller models for faster loading
# Check model availability on HF

Connection Refused (Ollama)

Issue: Can't connect to Ollama

# Ensure Ollama is running: ollama serve
# Check OllamaBaseURL is correct
# For Docker: Use --network host

🔄 Updates and Maintenance

Updating on Hugging Face Spaces

git pull origin main
# Make your changes
git add .
git commit -m "Update: description"
git push

Updating on Encore Cloud

# Make changes
git commit -am "Update: description"
git push encore   # or "encore deploy", if your CLI version provides it

Updating Docker

# Rebuild image
docker build -t llm-api-backend .

# Stop old container
docker stop llm-api
docker rm llm-api

# Run new container
docker run -d [your flags] llm-api-backend
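The rebuild, stop, remove, and run sequence can be wrapped in one function. The sketch below uses the image and container names from this guide; `run` and `redeploy` are illustrative names, and setting `DRY_RUN` prints the commands instead of executing them so you can review them first.

```shell
# Runs a command, or only prints it when DRY_RUN is non-empty.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Rebuilds the image and replaces the running container.
# Extra arguments are passed to "docker run" (ports, -e flags, and so on).
redeploy() {
  run docker build -t llm-api-backend . || return 1
  # Ignore errors here: the container may not exist on the first deployment.
  run docker stop llm-api 2>/dev/null || true
  run docker rm llm-api 2>/dev/null || true
  run docker run -d --name llm-api "$@" llm-api-backend
}

# Example:
#   DRY_RUN=1 redeploy -p 7860:7860 -e LLMProvider=huggingface
```

Because the build runs first and aborts the function on failure, the old container keeps serving if the new image does not build.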

📚 Additional Resources


Need help? Open an issue or check the main README.md for support options.