metadata
title: Mozhii RAG
emoji: 📚
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false

Mozhii AI — RAG System (v.0.1)


**Retrieval-Augmented Generation system for Mozhii v.0.1**
www.mozhii.online


Mozhii AI RAG

Student Question (Tamil)  →  RAG System  →  Top 5 Relevant Book Passages + Metadata

System Architecture

INDEXING PHASE (One Time)
─────────────────────────
PDF Books → Extract Text → Clean → Manual Chunking (200–400 chars)
→ multilingual-E5-large Embeddings → ChromaDB (Dense) + BM25 (Sparse)

RETRIEVAL PHASE (Every Query)
──────────────────────────────
Tamil Question → Dense Search (top 10) + BM25 Search (top 10)
→ RRF Merge → CrossEncoder Reranker → Top 5 Chunks → FastAPI Response
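The RRF merge step in the retrieval phase can be sketched as follows. This is a minimal illustration, not the code in hybrid_retriever.py: the function name `rrf_merge`, the chunk IDs, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60, limit=None):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each chunk scores 1 / (k + rank) in every list it appears in;
    the per-list scores are summed and the chunks re-sorted.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    merged = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return merged if limit is None else merged[:limit]


# Hypothetical chunk IDs standing in for the two top-10 result lists.
dense_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]
fused = rrf_merge([dense_hits, bm25_hits])
```

Chunks that appear in both lists (here `c1` and `c3`) rise above chunks found by only one retriever, which is why RRF works without having to normalise dense and BM25 scores against each other.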

Data Sources

| Book | Grade | Language | Chunks |
|---|---|---|---|
| Tamil History Textbook | Grade 10 | Tamil (ta) | ~1204 chunks |
| Tamil History Textbook | Grade 11 | Tamil (ta) | ~1256 chunks |

Chunk format:

{
  "chunk_id": "ta_edu_Grade11His_01",
  "heading": "இலங்கை சுதந்திரம் அடைதல்",
  "sub_heading": "அறிமுகம்",
  "text": "1948 ஆம் ஆண்டில் இலங்கைக்கு சுதந்திரம்...",
  "language": "ta",
  "category": "education",
  "source": "gov_textbook",
  "source_file": "Grade_11_History_Chapter_06",
  "chunk_index": 1,
  "status": "approved"
}

Chunks were manually extracted, cleaned, and approved — not auto-generated.
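Because the chunks are prepared by hand, a quick schema check before indexing can catch missing fields or unapproved entries. The sketch below assumes every chunk carries the fields shown in the sample above; `validate_chunk` is a hypothetical helper, not part of loader.py.

```python
import json

# Field names taken from the sample chunk above.
REQUIRED_FIELDS = {
    "chunk_id", "heading", "sub_heading", "text", "language",
    "category", "source", "source_file", "chunk_index", "status",
}


def validate_chunk(chunk):
    """Return a list of problems found in one chunk dict (empty if none)."""
    missing = sorted(REQUIRED_FIELDS - set(chunk))
    problems = [f"missing field: {name}" for name in missing]
    if chunk.get("status") != "approved":
        problems.append("status is not 'approved'")
    if not str(chunk.get("text", "")).strip():
        problems.append("text is empty")
    return problems


sample = json.loads("""
{
  "chunk_id": "ta_edu_Grade11His_01",
  "heading": "இலங்கை சுதந்திரம் அடைதல்",
  "sub_heading": "அறிமுகம்",
  "text": "1948 ஆம் ஆண்டில் இலங்கைக்கு சுதந்திரம்...",
  "language": "ta",
  "category": "education",
  "source": "gov_textbook",
  "source_file": "Grade_11_History_Chapter_06",
  "chunk_index": 1,
  "status": "approved"
}
""")
problems = validate_chunk(sample)
```

An empty list means the chunk passed every check.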


Tech Stack

| Component | Tool | Why |
|---|---|---|
| Embedding Model | intfloat/multilingual-e5-large | Best Tamil semantic understanding |
| Vector Store | ChromaDB | Local, persistent, prototype-ready |
| Sparse Search | BM25Okapi (rank-bm25) | Exact keyword match for names & dates |
| Reranker | cross-encoder/ms-marco-MiniLM-L-6-v2 | Precision boost on top-15 candidates |
| Merge Strategy | Reciprocal Rank Fusion (RRF) | Robust dense + sparse combination |
| API Framework | FastAPI | Clean REST API with auto docs |
| Deployment | Hugging Face Spaces (Docker) | Free, always-on, HTTPS |
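One detail worth checking when using the E5 family: its model card asks for inputs to be prefixed with `query: ` or `passage: `. Whether this project's indexing and retrieval code already does so is not stated here, so the helpers below are a hypothetical sketch of the convention rather than a description of embedder.py.

```python
def as_passage(text: str) -> str:
    """Format a book chunk the way E5 expects documents at indexing time."""
    return f"passage: {text.strip()}"


def as_query(question: str) -> str:
    """Format a student question the way E5 expects search queries."""
    return f"query: {question.strip()}"


def embed(texts):
    """Embed already-prefixed texts (assumed usage, not the project's code).

    Imported lazily because it needs sentence-transformers installed
    and downloads the model on first use.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("intfloat/multilingual-e5-large")
    return model.encode(texts, normalize_embeddings=True)


prefixed_query = as_query("  சோழர்களின் நீர்ப்பாசன முறை என்ன? ")
```

Using the same prefixes at indexing and query time keeps the dense search consistent with how the model was trained.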

Project Structure

mozhii-rag/
├── data/
│   └── chunks/
│       ├── grade10_history_chunks.json     ← manually prepared chunks
│       └── grade11_history_chunks.json
│
├── indexing/
│   ├── __init__.py
│   ├── embedder.py                         ← embed + store in ChromaDB + BM25
│   └── loader.py                           ← load JSON chunks
│
├── retrieval/
│   ├── __init__.py
│   └── hybrid_retriever.py                 ← Dense + BM25 + RRF + Rerank
│
├── vectorstore/                            ← generated after running indexing
│   ├── chroma.sqlite3
│   └── bm25_index.pkl
│
├── app.py                                  ← FastAPI application
├── run_indexing.py                         ← run once to build the index
├── evaluate.py                             ← measure retrieval hit rate
├── requirements.txt
├── Dockerfile                              ← for HF Spaces deployment
└── README.md

Getting Started

1. Clone & Install

git clone https://github.com/YOUR_USERNAME/mozhii-rag.git
cd mozhii-rag

pip install -r requirements.txt

2. Build the Index (Run Once)

Place your chunk JSON files inside data/chunks/, then run:

python run_indexing.py

This creates vectorstore/chroma.sqlite3 and vectorstore/bm25_index.pkl.

3. Run the API Locally

uvicorn app:app --reload --port 8000

Visit http://localhost:8000/docs for interactive API documentation.

4. Test a Query

curl -X POST "http://localhost:8000/retrieve" \
  -H "Content-Type: application/json" \
  -d '{"question": "சோழர்களின் நீர்ப்பாசன முறை என்ன?", "top_k": 5}'

API Reference

POST /retrieve

Retrieve relevant chunks for a Tamil question.

Request:

{
  "question": "உங்கள் தமிழ் கேள்வி இங்கே",
  "grade": "grade_10",
  "top_k": 5
}

grade is optional. Pass "grade_10" or "grade_11" to filter, or omit to search both.
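A Python client for this request might look like the sketch below. It uses only the standard library; `build_payload` and `retrieve` are hypothetical names, and the client-side grade check simply mirrors the two values documented above.

```python
import json
import urllib.request

VALID_GRADES = {"grade_10", "grade_11"}


def build_payload(question, grade=None, top_k=5):
    """Build the JSON body for POST /retrieve, omitting grade when unset."""
    if not question.strip():
        raise ValueError("question must not be empty")
    payload = {"question": question, "top_k": top_k}
    if grade is not None:
        if grade not in VALID_GRADES:
            raise ValueError(f"grade must be one of {sorted(VALID_GRADES)}")
        payload["grade"] = grade
    return payload


def retrieve(base_url, payload, timeout=30):
    """Send the payload to <base_url>/retrieve and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/retrieve",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


payload = build_payload("சோழர்களின் நீர்ப்பாசன முறை என்ன?", grade="grade_10")
# result = retrieve("http://localhost:8000", payload)  # needs the API running
```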

Response:

{
  "query": "உங்கள் தமிழ் கேள்வி இங்கே",
  "results": [
    {
      "rank": 1,
      "text": "..relevant Tamil passage..",
      "grade": "grade_10",
      "chapter": "Chapter_03",
      "page": 45,
      "score": 0.9423
    }
  ],
  "total_found": 5
}
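On the consuming side, the response can be turned into typed objects. The sketch below assumes every result carries exactly the fields shown in the example response; the `Passage` class and `parse_results` are illustrative names.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Passage:
    rank: int
    text: str
    grade: str
    chapter: str
    page: int
    score: float


def parse_results(response: dict) -> list[Passage]:
    """Convert the 'results' array into Passage objects ordered by rank."""
    names = [f.name for f in fields(Passage)]
    passages = [
        Passage(**{name: item[name] for name in names})
        for item in response.get("results", [])
    ]
    return sorted(passages, key=lambda p: p.rank)


example = {
    "query": "உங்கள் தமிழ் கேள்வி இங்கே",
    "results": [
        {"rank": 1, "text": "..relevant Tamil passage..", "grade": "grade_10",
         "chapter": "Chapter_03", "page": 45, "score": 0.9423}
    ],
    "total_found": 5,
}
passages = parse_results(example)
```

A missing field raises `KeyError`, which surfaces schema drift between the API and the client early.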

GET /health

{
  "status": "running",
  "model": "Mozhii RAG v1.0",
  "index": "Grade 10 & 11 Tamil History"
}

Deployment — Hugging Face Spaces (Docker)

Prerequisites

  • Hugging Face account
  • Git installed locally
  • Chunk files available in data/chunks/

1) Create a Docker Space

Option A (UI):

Option B (CLI):

pip install -U "huggingface_hub[cli]"
huggingface-cli login
huggingface-cli repo create mozhii-rag --type space --space_sdk docker

2) Add Secrets (required for /chat)

In Space Settings → Variables and secrets:

  • Secret name: GROQ_API_KEY
  • Secret value: your Groq API key

Optional:

  • GROQ_MODEL (default is llama-3.1-8b-instant)

If GROQ_API_KEY is not set, /retrieve still works, but /chat will return 503.
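The behaviour described above could be implemented along these lines. This is a sketch under stated assumptions, not the code in app.py: the helper names are hypothetical, and only the variable names, the default model, and the 503 status come from this README.

```python
import os

DEFAULT_GROQ_MODEL = "llama-3.1-8b-instant"


def load_groq_settings(env=os.environ):
    """Return (api_key or None, model) from environment-style settings."""
    api_key = env.get("GROQ_API_KEY", "").strip() or None
    model = env.get("GROQ_MODEL", "").strip() or DEFAULT_GROQ_MODEL
    return api_key, model


def chat_status_code(env=os.environ):
    """503 when the key is missing, mirroring the documented /chat behaviour."""
    api_key, _ = load_groq_settings(env)
    return 200 if api_key else 503


# Plain dicts stand in for the Space's environment in this example.
without_key = chat_status_code({})
with_key = chat_status_code({"GROQ_API_KEY": "example-key"})
```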

3) Push This Project to the Space Repo

# Clone your Space repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/mozhii-rag

# Copy this project into the cloned Space folder
rsync -av --delete --exclude ".git" /path/to/this/RAG/ /path/to/mozhii-rag/

cd /path/to/mozhii-rag
git add .
git commit -m "Deploy Mozhii RAG"
git push

4) Wait for Build

Hugging Face will automatically build the Docker image.

This project builds the index inside the Docker image (via run_indexing.py), so the first build can take several minutes because it must:

  • install dependencies
  • download the embedding model
  • generate ChromaDB + BM25 index files

5) Verify Deployment

After the build turns green, test:

curl https://YOUR_USERNAME-mozhii-rag.hf.space/health
curl -X POST "https://YOUR_USERNAME-mozhii-rag.hf.space/retrieve" \
  -H "Content-Type: application/json" \
  -d '{"question": "சோழர்களின் நீர்ப்பாசன முறை என்ன?", "top_k": 5}'

API docs:

  • https://YOUR_USERNAME-mozhii-rag.hf.space/docs

Evaluation

Run the evaluation script against your deployed API to measure retrieval quality:

python evaluate.py --api https://YOUR_USERNAME-mozhii-rag.hf.space

Target Metrics:

| Metric | Target | Description |
|---|---|---|
| Hit Rate | > 80% | Correct chunk appears in the top-5 results |
| MRR | > 0.75 | Mean reciprocal rank of the correct chunk (1.0 means always ranked first) |
| Avg Score | > 0.70 | Reranker score for relevant chunks |

If Hit Rate is below target, tune chunk_size, top_k, or overlap in config.
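For reference, Hit Rate and MRR can be computed as in the sketch below. It is not the code in evaluate.py: the function names are hypothetical, and it assumes each test question has exactly one expected chunk ID.

```python
def first_hit_rank(retrieved_ids, expected_id, top_k=5):
    """1-based position of the expected chunk in the top-k, or None if absent."""
    for position, chunk_id in enumerate(retrieved_ids[:top_k], start=1):
        if chunk_id == expected_id:
            return position
    return None


def hit_rate(ranks):
    """Fraction of questions whose expected chunk appeared in the top-k."""
    return sum(rank is not None for rank in ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all questions; a miss contributes 0."""
    return sum(1.0 / rank for rank in ranks if rank is not None) / len(ranks)


# Hypothetical outcomes for four questions: hits at ranks 1, 3, 2 and one miss.
ranks = [1, 3, None, 2]
hr = hit_rate(ranks)
mrr = mean_reciprocal_rank(ranks)
```

In this example three of four questions are hits, so the Hit Rate is 0.75, while the MRR is lower because two of the hits are not at rank 1.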


Frontend Integration

Your frontend connects to the deployed API like this:

const response = await fetch(
  "https://YOUR_USERNAME-mozhii-rag.hf.space/retrieve",
  {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      question: userQuestion,
      top_k: 5
    })
  }
);

const data = await response.json();
// data.results → array of Tamil chunks to display



Requirements

python >= 3.10
torch >= 2.2.0
chromadb >= 0.4.22
sentence-transformers >= 2.6.0
rank-bm25 >= 0.2.2
fastapi >= 0.110.0
uvicorn >= 0.27.0
transformers >= 4.38.0

License

MIT License — see LICENSE for details.


Built with ❤️ by Vipooshan  |  Mozhii AI