Deployment Guide
This guide covers deploying the LLM API Backend to various platforms.
Prerequisites
- Node.js 18+ installed
- Encore CLI installed (npm install -g encore.dev)
- Git installed
- API keys for your chosen LLM provider
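The prerequisites above can be checked with a small script before you start. This is a minimal sketch: the function names node_major and check_prereqs are illustrative and not part of the project, and it only assumes that the tools are expected on your PATH as node, git, and encore.

```shell
# Hypothetical helpers (not part of the project) for checking the prerequisites.

# Print the major version from a Node.js version string such as "v20.11.0".
node_major() {
  version="${1#v}"                 # drop the leading "v" if present
  printf '%s\n' "${version%%.*}"   # keep everything before the first dot
}

# Report each missing tool; return non-zero if anything is missing or too old.
check_prereqs() {
  status=0
  for tool in node git encore; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  if command -v node >/dev/null 2>&1; then
    major="$(node_major "$(node --version)")"
    if [ "$major" -lt 18 ]; then
      echo "Node.js 18+ required, found $(node --version)"
      status=1
    fi
  fi
  return "$status"
}
```

Run check_prereqs && echo "ready" to see whether anything is missing. API keys cannot be verified this way and still need to be checked by hand.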
Hugging Face Spaces (Recommended for Demos)
Step 1: Create a Space
- Go to https://huggingface.co/spaces
- Click "Create new Space"
- Settings:
  - Space name: llm-api-backend (or your choice)
  - SDK: Docker
  - Visibility: Public or Private
- Click Create Space
Step 2: Clone and Push
# Clone your new Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
cd YOUR_SPACE_NAME
# Copy all files from this project
cp -r /path/to/llm-api-backend/* .
# Commit and push
git add .
git commit -m "Initial deployment"
git push
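Copying project files into the Space can overwrite the README.md that Hugging Face generated, and that file carries the Space configuration. A quick check before pushing may help catch this. The sketch below assumes the Space is configured through the YAML block at the top of README.md (with an sdk: docker line) and that the Dockerfile sits at the repository root; check_space_repo is a hypothetical name.

```shell
# Hypothetical pre-push check; run it from the root of the cloned Space.
# Assumes the SDK is declared as "sdk: docker" in the README.md YAML block.
check_space_repo() {
  dir="${1:-.}"
  status=0
  if [ ! -f "$dir/Dockerfile" ]; then
    echo "missing Dockerfile in $dir"
    status=1
  fi
  if ! grep -Eq '^sdk:[[:space:]]*docker[[:space:]]*$' "$dir/README.md" 2>/dev/null; then
    echo "README.md does not declare 'sdk: docker'"
    status=1
  fi
  return "$status"
}
```

For example, check_space_repo . && git push only pushes when both checks pass. If the second check fails, restoring the YAML block at the top of README.md is usually enough.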
Step 3: Configure Secrets
- Go to your Space page
- Click Settings → Repository secrets
- Add the following secrets:
For Hugging Face Provider:
LLMProvider = huggingface
HuggingFaceAPIKey = hf_your_token_here
DefaultModel = mistralai/Mistral-7B-Instruct-v0.2
For Ollama Provider (requires custom Docker setup):
LLMProvider = ollama
OllamaBaseURL = http://localhost:11434
DefaultModel = llama3
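Spaces secrets reach the container as environment variables, so a missing one can be detected at startup. The following is a rough sketch of such a check; missing_settings is a hypothetical helper, and only the variable names are taken from the lists above.

```shell
# Hypothetical helper: print the names of required settings that are unset.
# The variable names come from the secrets listed above; the function is illustrative.
missing_settings() {
  case "${LLMProvider:-}" in
    huggingface) required="HuggingFaceAPIKey DefaultModel" ;;
    ollama)      required="OllamaBaseURL DefaultModel" ;;
    *)           echo "LLMProvider"; return 1 ;;   # unset or unknown provider
  esac
  status=0
  for name in $required; do
    eval "value=\${$name:-}"   # safe here: names are the fixed strings above
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}
```

Calling missing_settings || exit 1 at the top of a container entrypoint would stop the container early and print the names that still need to be configured.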
Step 4: Wait for Build
- Hugging Face will automatically build your Docker container
- Watch the build logs in the Space interface
- Once complete, your API is live!
Step 5: Test Your API
# Replace with your actual Space URL
export SPACE_URL="https://YOUR_USERNAME-YOUR_SPACE_NAME.hf.space"
# Test chat endpoint
curl -X POST "$SPACE_URL/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'
# Test health endpoint
curl "$SPACE_URL/health"
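Right after a build the Space may need some time before it answers, so a single curl can fail even though the deployment is fine. A retry loop around the health check is one way to handle this. The sketch below is an assumption-laden example: wait_for_health and its default of 12 attempts, 5 seconds apart, are illustrative choices, not project settings.

```shell
# Hypothetical helper: poll a health endpoint until it answers or attempts run out.
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  url="$1"
  attempts="${2:-12}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP error codes; -sS hides the progress meter
    if curl -fsS --max-time 10 "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Pause between attempts, but not after the last one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "no healthy response after $attempts attempt(s)"
  return 1
}
```

For example, wait_for_health "$SPACE_URL/health" polls for up to about a minute with the defaults.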
Encore Cloud (Recommended for Production)
Step 1: Install Encore
npm install -g encore.dev
Step 2: Create Encore App
# If starting fresh
encore app create
# Or link existing app
encore app link
Step 3: Set Secrets
# For Hugging Face
encore secret set LLMProvider huggingface
encore secret set HuggingFaceAPIKey hf_your_token_here
encore secret set DefaultModel mistralai/Mistral-7B-Instruct-v0.2
# For Ollama (local development)
encore secret set LLMProvider ollama
encore secret set OllamaBaseURL http://localhost:11434
encore secret set DefaultModel llama3
Step 4: Deploy
# Deploy to staging
encore deploy
# Deploy to production
encore deploy --env production
Step 5: Access Your API
# Your API will be available at:
# Staging: https://staging-YOUR_APP.encr.app
# Production: https://prod-YOUR_APP.encr.app
# Test it
curl https://staging-YOUR_APP.encr.app/health
Docker (Self-Hosted)
Step 1: Build Image
docker build -t llm-api-backend .
Step 2: Run Container
Using Hugging Face:
docker run -d \
-p 7860:7860 \
-e LLMProvider=huggingface \
-e HuggingFaceAPIKey=hf_your_token \
-e DefaultModel=mistralai/Mistral-7B-Instruct-v0.2 \
--name llm-api \
llm-api-backend
Using Ollama (with host network):
docker run -d \
--network host \
-e LLMProvider=ollama \
-e OllamaBaseURL=http://localhost:11434 \
-e DefaultModel=llama3 \
--name llm-api \
llm-api-backend
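Passing the token with -e leaves it in your shell history. As an alternative, docker run accepts an --env-file option that reads NAME=value lines from a file. The helper below is a sketch only: write_env_file and the file name llm-api.env are hypothetical, and the variable names are the ones used above.

```shell
# Hypothetical helper: write the provider settings to an env file.
# Usage: write_env_file PATH PROVIDER API_KEY MODEL
write_env_file() (
  # The subshell body keeps the umask change local to this function.
  umask 077   # a newly created file is readable by the owner only
  {
    printf 'LLMProvider=%s\n' "$2"
    printf 'HuggingFaceAPIKey=%s\n' "$3"
    printf 'DefaultModel=%s\n' "$4"
  } > "$1"
)

# Then start the container with the file instead of individual -e flags:
#   write_env_file ./llm-api.env huggingface hf_your_token mistralai/Mistral-7B-Instruct-v0.2
#   docker run -d -p 7860:7860 --env-file ./llm-api.env --name llm-api llm-api-backend
```

Keep the env file out of version control, the same as .env.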
Step 3: Test
curl http://localhost:7860/health
VPS / Bare Metal
Step 1: Install Dependencies
# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
# Install Encore CLI
npm install -g encore.dev
Step 2: Clone Repository
git clone https://github.com/YOUR_USERNAME/llm-api-backend.git
cd llm-api-backend
Step 3: Configure Environment
# Copy example env file
cp .env.example .env
# Edit with your values
nano .env
Step 4: Set Encore Secrets
encore secret set LLMProvider huggingface
encore secret set HuggingFaceAPIKey hf_your_token
encore secret set DefaultModel mistralai/Mistral-7B-Instruct-v0.2
Step 5: Run with PM2 (Production)
# Install PM2
npm install -g pm2
# Start application
pm2 start "encore run --port 8080" --name llm-api
# Save PM2 configuration
pm2 save
# Enable startup on boot
pm2 startup
Step 6: Configure Nginx (Optional)
server {
    listen 80;
    server_name your-domain.com;

    location / {
        proxy_pass http://localhost:8080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
Security Checklist
Before deploying to production:
- All secrets configured properly
- API keys have appropriate permissions
- CORS configured for your frontend domains
- Rate limiting enabled (add middleware)
- HTTPS enabled (Encore/HF Spaces handle this)
- Environment variables not committed to git
- Monitoring and logging set up
- Error tracking configured
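The "not committed to git" item can be partly automated. As a minimal sketch, the hypothetical helper below makes sure .env is listed in .gitignore; it assumes you run it from the repository root and that the file is named .env, as in the VPS steps.

```shell
# Hypothetical helper: make sure .env is listed in .gitignore.
# Usage: ensure_env_ignored [PATH_TO_GITIGNORE]
ensure_env_ignored() {
  file="${1:-.gitignore}"
  if grep -qxF '.env' "$file" 2>/dev/null; then
    echo ".env already ignored"
  else
    # Add a newline first if the file exists and does not end with one
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
      printf '\n' >> "$file"
    fi
    printf '.env\n' >> "$file"
    echo "added .env to $file"
  fi
}
```

Note that .gitignore only affects untracked files. If .env was committed earlier, it also has to be removed from the index with git rm --cached .env, and any key it contained should be rotated.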
Monitoring
Encore Cloud
- Built-in dashboard at https://app.encore.dev
- Real-time traces, logs, and metrics
- Performance monitoring
- Error tracking
Hugging Face Spaces
- View logs in Space interface
- Use the /health endpoint for uptime monitoring
- Configure external monitoring tools
Self-Hosted
- Use the /health endpoint
- Set up monitoring tools such as:
  - Prometheus + Grafana
  - Datadog
  - New Relic
  - Sentry for errors
Troubleshooting
Build Failures on HF Spaces
Issue: Docker build fails
# Check Dockerfile syntax
# Ensure all required files are committed
# Check Space build logs
"Secret not set" Errors
Issue: Application can't access secrets
# On Encore: Use 'encore secret set' command
# On HF Spaces: Configure in Space settings
# On Docker: Pass as environment variables (-e flag)
Model Loading Timeout
Issue: HF models take too long to load
# Solution: Wait 30-60 seconds for cold start
# Use smaller models for faster loading
# Check model availability on HF
Connection Refused (Ollama)
Issue: Can't connect to Ollama
# Ensure Ollama is running: ollama serve
# Check OllamaBaseURL is correct
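# For Docker on Linux: Use --network host
# For Docker Desktop (macOS/Windows): try http://host.docker.internal:11434 as OllamaBaseURL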
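To tell a wrong URL apart from a server that is not running, it can help to probe Ollama directly from the same machine or container that runs the backend. The sketch below assumes Ollama's GET /api/tags endpoint, which lists the locally available models; ollama_reachable is a hypothetical helper name.

```shell
# Hypothetical helper: check that an Ollama server answers at the given base URL.
# Falls back to $OllamaBaseURL, then to the default local address.
ollama_reachable() {
  base="${1:-${OllamaBaseURL:-http://localhost:11434}}"
  base="${base%/}"   # tolerate a trailing slash
  if curl -fsS --max-time 5 "$base/api/tags" >/dev/null 2>&1; then
    echo "Ollama reachable at $base"
  else
    echo "Ollama not reachable at $base"
    return 1
  fi
}
```

Inside a container without host networking, localhost refers to the container itself, so this check can fail there even when Ollama is running on the host.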
Updates and Maintenance
Updating on Hugging Face Spaces
git pull origin main
# Make your changes
git add .
git commit -m "Update: description"
git push
Updating on Encore Cloud
# Make changes
git commit -am "Update: description"
encore deploy
Updating Docker
# Rebuild image
docker build -t llm-api-backend .
# Stop old container
docker stop llm-api
docker rm llm-api
# Run new container
docker run -d [your flags] --name llm-api llm-api-backend
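The rebuild steps can be wrapped in one function so they always run in the same order. This is a sketch under the naming used in this guide (image llm-api-backend, container llm-api); the function name redeploy is hypothetical, and the run flags are passed through unchanged.

```shell
# Hypothetical redeploy helper that wraps the steps above.
# Usage: redeploy [docker run flags...]
redeploy() {
  # Build first: if the build fails, the old container keeps running
  docker build -t llm-api-backend . || return 1
  # Ignore errors here: the container may not exist on the first run
  docker stop llm-api >/dev/null 2>&1 || true
  docker rm llm-api >/dev/null 2>&1 || true
  docker run -d --name llm-api "$@" llm-api-backend
}
```

For example, redeploy -p 7860:7860 --env-file ./llm-api.env rebuilds the image and replaces the running container. Expect a short gap between the stop and the new container becoming ready.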
Additional Resources
Need help? Open an issue or check the main README.md for support options.