title: Mozhii RAG
emoji: 📚
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
Mozhii AI — RAG System (v.0.1)
**Retrieval-Augmented Generation system for mozhii v.0.1**
www.mozhii.online
Mozhii AI RAG
Student Question (Tamil) → RAG System → Top 5 Relevant Book Passages + Metadata
System Architecture
INDEXING PHASE (One Time)
─────────────────────────
PDF Books → Extract Text → Clean → Manual Chunking (200–400 chars)
→ multilingual-E5-large Embeddings → ChromaDB (Dense) + BM25 (Sparse)
RETRIEVAL PHASE (Every Query)
──────────────────────────────
Tamil Question → Dense Search (top 10) + BM25 Search (top 10)
→ RRF Merge → CrossEncoder Reranker → Top 5 Chunks → FastAPI Response
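The RRF merge step in the retrieval phase can be sketched as follows. This is a minimal illustration, not the project's actual `hybrid_retriever.py`: the function name `rrf_merge`, the constant `k=60` (a commonly used default), and the example chunk IDs are all assumptions.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical chunk IDs standing in for dense and BM25 results.
dense_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]
fused = rrf_merge([dense_hits, bm25_hits])
print([chunk_id for chunk_id, _ in fused])
```

Chunks that appear in both lists accumulate two contributions, so they tend to rise above chunks found by only one retriever. RRF uses only ranks, which avoids having to normalise dense similarity scores against BM25 scores.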
Data Sources
| Book | Grade | Language | Chunks |
|---|---|---|---|
| Tamil History Textbook | Grade 10 | Tamil (ta) | ~1204 chunks |
| Tamil History Textbook | Grade 11 | Tamil (ta) | ~1256 chunks |
Chunk format:
{
"chunk_id": "ta_edu_Grade11His_01",
"heading": "இலங்கை சுதந்திரம் அடைதல்",
"sub_heading": "அறிமுகம்",
"text": "1948 ஆம் ஆண்டில் இலங்கைக்கு சுதந்திரம்...",
"language": "ta",
"category": "education",
"source": "gov_textbook",
"source_file": "Grade_11_History_Chapter_06",
"chunk_index": 1,
"status": "approved"
}
Chunks were manually extracted, cleaned, and approved — not auto-generated.
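A loader for this format might check the expected fields and keep only approved chunks. The sketch below is an assumption about how such a loader could work, not the contents of `loader.py`; `REQUIRED_FIELDS`, `filter_approved`, `load_chunks`, and the idea that each file holds a JSON array of chunk objects are illustrative.

```python
import json
from pathlib import Path

# Fields taken from the chunk format above; treating exactly these as
# required is an assumption of this sketch.
REQUIRED_FIELDS = {"chunk_id", "heading", "text", "language", "status"}


def filter_approved(chunks):
    """Return chunks that have all required fields and status 'approved'."""
    valid = []
    for chunk in chunks:
        missing = REQUIRED_FIELDS - set(chunk)
        if missing:
            raise ValueError(
                f"{chunk.get('chunk_id', '?')} is missing {sorted(missing)}"
            )
        if chunk["status"] == "approved":
            valid.append(chunk)
    return valid


def load_chunks(chunk_dir):
    """Load every *.json file in chunk_dir, assuming each holds a list of chunks."""
    chunks = []
    for path in sorted(Path(chunk_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            chunks.extend(json.load(handle))
    return filter_approved(chunks)
```

Failing loudly on a missing field, rather than skipping the chunk, surfaces mistakes in manually prepared data before they reach the index.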
Tech Stack
| Component | Tool | Why |
|---|---|---|
| Embedding Model | intfloat/multilingual-e5-large | Best Tamil semantic understanding |
| Vector Store | ChromaDB | Local, persistent, prototype-ready |
| Sparse Search | BM25Okapi (rank-bm25) | Exact keyword match for names & dates |
| Reranker | cross-encoder/ms-marco-MiniLM-L-6-v2 | Precision boost on top-15 candidates |
| Merge Strategy | Reciprocal Rank Fusion (RRF) | Robust dense + sparse combination |
| API Framework | FastAPI | Clean REST API with auto docs |
| Deployment | Hugging Face Spaces (Docker) | Free, always-on, HTTPS |
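One detail worth noting about the embedding model: the E5 family expects each input to start with a `query: ` or `passage: ` prefix. The helpers below are a minimal sketch of that convention. The function names are hypothetical, and this README does not state whether `embedder.py` applies the prefixes this way.

```python
QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def as_query(question: str) -> str:
    """Format a student question for E5 query encoding."""
    return QUERY_PREFIX + question.strip()


def as_passage(chunk_text: str) -> str:
    """Format a textbook chunk for E5 passage encoding."""
    return PASSAGE_PREFIX + chunk_text.strip()


print(as_query("சோழர்களின் நீர்ப்பாசன முறை என்ன?"))
```

The prefixed strings would then be passed to the embedding model at indexing time (passages) and at query time (questions).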
Project Structure
mozhii-rag/
├── data/
│ └── chunks/
│ ├── grade10_history_chunks.json ← manually prepared chunks
│ └── grade11_history_chunks.json
│
├── indexing/
│ ├── __init__.py
│ ├── embedder.py ← embed + store in ChromaDB + BM25
│ └── loader.py ← load JSON chunks
│
├── retrieval/
│ ├── __init__.py
│ └── hybrid_retriever.py ← Dense + BM25 + RRF + Rerank
│
├── vectorstore/ ← generated after running indexing
│ ├── chroma.sqlite3
│ └── bm25_index.pkl
│
├── app.py ← FastAPI application
├── run_indexing.py ← run once to build the index
├── evaluate.py ← measure retrieval hit rate
├── requirements.txt
├── Dockerfile ← for HF Spaces deployment
└── README.md
Getting Started
1. Clone & Install
git clone https://github.com/YOUR_USERNAME/mozhii-rag.git
cd mozhii-rag
pip install -r requirements.txt
2. Build the Index (Run Once)
Place your chunk JSON files inside data/chunks/, then run:
python run_indexing.py
This creates vectorstore/chroma.sqlite3 and vectorstore/bm25_index.pkl.
3. Run the API Locally
uvicorn app:app --reload --port 8000
Visit http://localhost:8000/docs for interactive API documentation.
4. Test a Query
curl -X POST "http://localhost:8000/retrieve" \
-H "Content-Type: application/json" \
-d '{"question": "சோழர்களின் நீர்ப்பாசன முறை என்ன?", "top_k": 5}'
API Reference
POST /retrieve
Retrieve relevant chunks for a Tamil question.
Request:
{
"question": "உங்கள் தமிழ் கேள்வி இங்கே",
"grade": "grade_10",
"top_k": 5
}
grade is optional. Pass "grade_10" or "grade_11" to filter, or omit it to search both.
Response:
{
"query": "உங்கள் தமிழ் கேள்வி இங்கே",
"results": [
{
"rank": 1,
"text": "..relevant Tamil passage..",
"grade": "grade_10",
"chapter": "Chapter_03",
"page": 45,
"score": 0.9423
}
],
"total_found": 5
}
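For callers working in Python, the request and response shapes above could be wrapped roughly like this. It is a sketch using only the standard library; `build_retrieve_payload` and `retrieve` are hypothetical helper names, and the base URL in the usage comment assumes the API is running locally on port 8000.

```python
import json
import urllib.request


def build_retrieve_payload(question, grade=None, top_k=5):
    """Build the JSON body for POST /retrieve; grade is left out when None."""
    payload = {"question": question, "top_k": top_k}
    if grade is not None:
        payload["grade"] = grade
    return payload


def retrieve(base_url, question, grade=None, top_k=5, timeout=30):
    """Send the request and return the parsed JSON response as a dict."""
    body = json.dumps(build_retrieve_payload(question, grade, top_k)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/retrieve",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running API, so it is left commented out):
# data = retrieve("http://localhost:8000", "சோழர்களின் நீர்ப்பாசன முறை என்ன?")
# for hit in data["results"]:
#     print(hit["rank"], hit["score"], hit["text"][:60])
```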
GET /health
{
"status": "running",
"model": "Mozhii RAG v1.0",
"index": "Grade 10 & 11 Tamil History"
}
Deployment — Hugging Face Spaces (Docker)
Prerequisites
- Hugging Face account
- Git installed locally
- Chunk files available in data/chunks/
1) Create a Docker Space
Option A (UI):
- Go to https://huggingface.co/new-space
- Select SDK: Docker
- Set visibility (Public or Private)
Option B (CLI):
pip install -U "huggingface_hub[cli]"
huggingface-cli login
huggingface-cli repo create mozhii-rag --type space --space_sdk docker
2) Add Secrets (required for /chat)
In Space Settings → Variables and secrets:
- Secret name: GROQ_API_KEY
- Secret value: your Groq API key
Optional: GROQ_MODEL (default is llama-3.1-8b-instant)
If GROQ_API_KEY is not set, /retrieve still works, but /chat will return 503.
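The behaviour described above could be backed by a small settings helper like the one below. This is a sketch under assumptions, not the code in `app.py`: `load_groq_settings` and `chat_available` are hypothetical names, and only the two variable names and the default model come from this README.

```python
import os

DEFAULT_GROQ_MODEL = "llama-3.1-8b-instant"


def load_groq_settings(env=None):
    """Return (api_key, model); api_key is None when the secret is unset or empty."""
    env = os.environ if env is None else env
    api_key = env.get("GROQ_API_KEY") or None
    model = env.get("GROQ_MODEL") or DEFAULT_GROQ_MODEL
    return api_key, model


def chat_available(env=None):
    """True when a Groq key is configured, i.e. /chat can be served."""
    api_key, _ = load_groq_settings(env)
    return api_key is not None
```

In a FastAPI handler, a `False` result would typically be turned into an `HTTPException` with `status_code=503`, while `/retrieve` never needs to consult these settings.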
3) Push This Project to the Space Repo
# Clone your Space repository
git clone https://huggingface.co/spaces/YOUR_USERNAME/mozhii-rag
# Copy this project into the cloned Space folder
rsync -av --delete --exclude ".git" /path/to/this/RAG/ /path/to/mozhii-rag/
cd /path/to/mozhii-rag
git add .
git commit -m "Deploy Mozhii RAG"
git push
4) Wait for Build
Hugging Face will automatically build the Docker image.
This project builds the index inside the Docker image (via run_indexing.py), so the first build can take several minutes because it must:
- install dependencies
- download the embedding model
- generate ChromaDB + BM25 index files
5) Verify Deployment
After the build turns green, test:
curl https://YOUR_USERNAME-mozhii-rag.hf.space/health
curl -X POST "https://YOUR_USERNAME-mozhii-rag.hf.space/retrieve" \
-H "Content-Type: application/json" \
-d '{"question": "சோழர்களின் நீர்ப்பாசன முறை என்ன?", "top_k": 5}'
API docs:
https://YOUR_USERNAME-mozhii-rag.hf.space/docs
Evaluation
Run the evaluation script against your deployed API to measure retrieval quality:
python evaluate.py --api https://YOUR_USERNAME-mozhii-rag.hf.space
Target Metrics:
| Metric | Target | Description |
|---|---|---|
| Hit Rate | > 80% | Correct chunk in top-5 results |
| MRR | > 0.75 | Mean reciprocal rank (1/rank) of the correct chunk, averaged over queries |
| Avg Score | > 0.70 | Reranker score for relevant chunks |
If Hit Rate is below target, tune chunk_size, top_k, or overlap in config.
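For reference, Hit Rate and MRR can be computed as in the sketch below. It is not the contents of `evaluate.py`; the function name `hit_rate_and_mrr`, the input shape, and the sample data are assumptions made for illustration.

```python
def hit_rate_and_mrr(results, k=5):
    """Compute Hit Rate@k and MRR@k.

    results: list of (retrieved_ids, expected_id) pairs, one per test question.
    A question scores 1/rank if the expected chunk is in the top k, else 0.
    """
    total = len(results)
    if total == 0:
        return 0.0, 0.0
    hits = 0
    reciprocal_ranks = []
    for retrieved_ids, expected_id in results:
        top = list(retrieved_ids[:k])
        if expected_id in top:
            hits += 1
            reciprocal_ranks.append(1.0 / (top.index(expected_id) + 1))
        else:
            reciprocal_ranks.append(0.0)
    return hits / total, sum(reciprocal_ranks) / total


# Hypothetical evaluation data: three questions with known correct chunks.
sample = [
    (["c1", "c2", "c3"], "c1"),  # correct chunk at rank 1
    (["c4", "c5", "c6"], "c5"),  # correct chunk at rank 2
    (["c7", "c8", "c9"], "c0"),  # correct chunk not retrieved
]
hit_rate, mrr = hit_rate_and_mrr(sample)
print(f"Hit Rate: {hit_rate:.2f}, MRR: {mrr:.2f}")
```

Hit Rate only asks whether the correct chunk appears in the top k, while MRR also rewards placing it near the top, so the two can diverge when relevant chunks are found but ranked low.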
Frontend Integration
Your frontend connects to the deployed API like this:
const response = await fetch(
"https://YOUR_USERNAME-mozhii-rag.hf.space/retrieve",
{
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
question: userQuestion,
top_k: 5
})
}
);
const data = await response.json();
// data.results → array of Tamil chunks to display
Requirements
python >= 3.10
torch >= 2.2.0
chromadb >= 0.4.22
sentence-transformers >= 2.6.0
rank-bm25 >= 0.2.2
fastapi >= 0.110.0
uvicorn >= 0.27.0
transformers >= 4.38.0
License
MIT License — see LICENSE for details.