Spaces: Ab-Romia/Context-Aware-AI


Update README.md

by Ab-Romia - opened Aug 4, 2025

base: refs/heads/main ← from: refs/pr/1

+535 -1652

Files changed (17)

.dockerignore +0 -63
.gitattributes +35 -0
.gitignore +0 -4
Dockerfile +1 -6
LICENSE +0 -21
README.md +52 -307
README_HUGGINGFACE.md +0 -106
app/__init__.py +0 -0
app/config.py +31 -71
app/main.py +21 -31
app/rag_setup.py +79 -296
app/schemas.py +2 -18
app/services.py +210 -415
main.py +5 -5
requirements.txt +33 -22
static/app.js +36 -209
templates/index.html +30 -78

.dockerignore DELETED

@@ -1,63 +0,0 @@
-# Python
-__pycache__/
-*.py[cod]
-*$py.class
-*.so
-.Python
-*.egg-info/
-dist/
-build/
-*.egg
-# Virtual environments
-venv/
-env/
-ENV/
-.venv/
-# IDEs
-.idea/
-.vscode/
-*.swp
-*.swo
-*~
-# Git
-.git/
-.gitignore
-.gitattributes
-# OS
-.DS_Store
-Thumbs.db
-# Logs
-*.log
-# Temporary files
-*.tmp
-*.bak
-*.swp
-.cache/
-# ChromaDB local storage
-chroma_db/
-*.sqlite3
-# Documentation (if not needed in container)
-*.md
-!README.md
-# Tests (if any)
-tests/
-test_*
-*_test.py
-# CI/CD
-.github/
-.gitlab-ci.yml
-# Docker
-Dockerfile
-.dockerignore
-docker-compose.yml

.gitattributes ADDED

@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text

.gitignore DELETED

@@ -1,4 +0,0 @@
-.env
-__pycache__/
-*.pyc
-.venv/

Dockerfile CHANGED

@@ -1,15 +1,10 @@
-FROM python:3.11-slim
 WORKDIR /app
 # Install system dependencies
 RUN apt-get update && apt-get install -y \
     build-essential \
-    curl \
-    git \
-    wget \
-    gcc \
-    g++ \
     && rm -rf /var/lib/apt/lists/*
 # Upgrade pip and install wheel

+FROM python:3.10-slim
 WORKDIR /app
 # Install system dependencies
 RUN apt-get update && apt-get install -y \
     build-essential \
     && rm -rf /var/lib/apt/lists/*
 # Upgrade pip and install wheel

LICENSE DELETED

@@ -1,21 +0,0 @@
-MIT License
-Copyright (c) 2025 Abdelrahman Abouroumia (Ab-Romia)
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.

README.md CHANGED

@@ -1,332 +1,77 @@
 ---
-title: ContextIQ - Intelligent RAG Assistant
-emoji: 🧠
-colorFrom: blue
-colorTo: purple
 sdk: docker
-app_port: 7860
 pinned: false
 license: mit
 ---
-# 🧠 ContextIQ - Intelligent Context-Aware AI Assistant
-<div align="center">
-[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/downloads/)
-[![FastAPI](https://img.shields.io/badge/FastAPI-0.116+-green.svg)](https://fastapi.tiangolo.com/)
-[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
-[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97-Hugging%20Face-orange)](https://huggingface.co/spaces/Ab-Romia/Context-Aware-AI)
-**A sophisticated RAG (Retrieval-Augmented Generation) application powered by multiple AI providers**
-[Live Demo](https://huggingface.co/spaces/Ab-Romia/Context-Aware-AI) · [Report Bug](https://github.com/Ab-Romia/ContextIQ-RAG/issues) · [Request Feature](https://github.com/Ab-Romia/ContextIQ-RAG/issues)
-</div>
----
-## 🌟 What is ContextIQ?
-ContextIQ is an advanced **Retrieval-Augmented Generation (RAG)** application that transforms how you interact with your documents. Upload any document, ask questions, get summaries, or generate insights - all powered by state-of-the-art AI models from **OpenAI** and **OpenRouter**.
-### ✨ Key Highlights
-- 🎯 **Dual AI Provider Support**: Choose between OpenAI (GPT-4o, GPT-4, GPT-3.5) or OpenRouter (200+ models including DeepSeek R1 FREE, Claude, Gemini, and more)
-- 📚 **11+ File Formats Supported**: PDF, DOCX, PPTX, XLSX, CSV, TXT, MD, HTML, JSON, XML, RTF
-- 🚀 **Lightning-Fast RAG Pipeline**: Custom TF-IDF embeddings + ChromaDB vector search
-- 💎 **Beautiful Modern UI**: Dark-themed, responsive interface with Tailwind CSS
-- 🔒 **Privacy-First**: API keys stored locally in your browser, never on our servers
-- ⚡ **Smart Caching**: 10-minute response cache for faster interactions
-- 🎨 **Multiple Task Types**: Q&A, Summarization, Action Plans, Creative Writing
----
-## 🏗️ Architecture
-```
-┌──────────────────────────────────────────────────────────────┐
-│                    Frontend (HTML/JS/Tailwind)                │
-│  • Provider Selection (OpenAI/OpenRouter)                     │
-│  • File Upload & Text Input                                   │
-│  • Real-time Chat Interface                                   │
-│  • API Key Management                                          │
-└────────────────────┬─────────────────────────────────────────┘
-                     │ REST API
-┌────────────────────▼─────────────────────────────────────────┐
-│                    FastAPI Backend                             │
-│  • Request Validation (Pydantic)                               │
-│  • Multi-Provider LLM Support                                  │
-│  • File Processing Pipeline                                    │
-│  • Response Caching                                            │
-└────────────────────┬─────────────────────────────────────────┘
-                     │
-         ┌───────────┴───────────┐
-         │                       │
-┌────────▼────────┐    ┌─────────▼──────────┐
-│   ChromaDB      │    │  LLM Providers      │
-│ Vector Database │    │  • OpenAI API       │
-│ (TF-IDF)        │    │  • OpenRouter API   │
-└─────────────────┘    └────────────────────┘
-```
----
-## 🚀 Quick Start
-### Prerequisites
-- **Python 3.8+**
-- **API Key** from either:
-  - [OpenAI](https://platform.openai.com/api-keys) - For GPT models
-  - [OpenRouter](https://openrouter.ai/) - For 200+ models (FREE tier available)
-### Installation
-1. **Clone the repository**
-   ```bash
-   git clone https://github.com/Ab-Romia/ContextIQ-RAG.git
-   cd ContextIQ-RAG
-   ```
-2. **Install dependencies**
-   ```bash
-   pip install -r requirements.txt
-   ```
-3. **Run the application**
-   ```bash
-   python main.py
-   ```
-   Or use uvicorn directly:
-   ```bash
-   uvicorn main:app --host 0.0.0.0 --port 7860
-   ```
-4. **Access the web interface**
-   Open your browser and navigate to:
-   ```
-   http://localhost:7860
-   ```
-5. **Configure your AI provider**
-   - Choose between **OpenAI** or **OpenRouter** in the UI
-   - Enter your API key
-   - Test and save the key locally
----
-## 📖 How to Use
-### 1. Choose Your AI Provider
-- **OpenAI**: Access to GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5-turbo
-- **OpenRouter**: 200+ models including DeepSeek R1 (FREE), Claude, GPT-4, Gemini, Llama 3, and more
-  - **Default model**: DeepSeek R1 (completely free to use)
-### 2. Upload Your Documents
-ContextIQ supports a wide range of file formats:
-| Category | Formats |
-|----------|---------|
-| **Text** | .txt, .md, .rtf |
-| **Documents** | .pdf, .docx |
-| **Presentations** | .pptx |
-| **Data** | .xlsx, .csv, .json, .xml |
-| **Web** | .html, .htm |
-### 3. Index Your Content
-Click "Index Context" to process and store your documents in the vector database. The system will:
-- Extract text from your documents
-- Split into manageable chunks (600 characters)
-- Generate TF-IDF embeddings
-- Store in ChromaDB for fast retrieval
-### 4. Interact with Your AI Assistant
-Choose from multiple task types:
-- **Question & Answer**: Get precise answers from your documents
-- **Summarize**: Generate concise summaries
-- **Generate Action Plan**: Create actionable plans from your content
-- **Creative Writing**: Transform your ideas into creative content
----
-## 🎯 Features in Detail
-### 📁 Advanced File Processing
-Our robust file processing pipeline handles:
-- **PDF**: Multi-page extraction with PyMuPDF
-- **Word Documents**: Paragraphs and tables extraction
-- **PowerPoint**: Slide-by-slide text extraction
-- **Excel/CSV**: Structured data processing with Pandas
-- **HTML**: Clean text extraction with BeautifulSoup
-- **JSON/XML**: Intelligent parsing and formatting
-### 🧠 Intelligent RAG Pipeline
-1. **Custom TF-IDF Embeddings**
-   - 384-dimensional vectors
-   - N-gram support (1-2)
-   - English stop words filtering
-   - Fallback hashing mechanism
-2. **ChromaDB Vector Database**
-   - In-memory storage for speed
-   - Similarity-based retrieval
-   - Configurable chunk retrieval (default: 3)
-3. **Smart Context Assembly**
-   - Retrieves relevant chunks
-   - Constructs optimized prompts
-   - Respects token limits per task type
-### 🔧 Configurable Settings
-| Setting | Default | Description |
-|---------|---------|-------------|
-| MAX_TOKENS_CHAT | 4000 | Q&A response tokens |
-| MAX_TOKENS_SUMMARIZE | 3000 | Summary tokens |
-| MAX_TOKENS_PLAN | 5000 | Action plan tokens |
-| MAX_TOKENS_CREATIVE | 6000 | Creative writing tokens |
-| MAX_CHUNKS_RETRIEVE | 3 | Vector search results |
-| CACHE_EXPIRATION | 600s | Response cache duration |
----
-## 🛠️ Technology Stack
-### Backend
-- **FastAPI** - Modern, fast web framework
-- **ChromaDB** - Vector database for embeddings
-- **Scikit-learn** - TF-IDF vectorization
-- **Pydantic** - Data validation
-- **OpenAI SDK** - GPT models integration
-- **Requests** - HTTP client for OpenRouter
-### Frontend
-- **Tailwind CSS** - Utility-first CSS framework
-- **Marked.js** - Markdown rendering
-- **Vanilla JavaScript** - No framework bloat
-- **LocalStorage** - Client-side API key storage
-### File Processing
-- **PyMuPDF (fitz)** - PDF processing
-- **python-docx** - Word documents
-- **python-pptx** - PowerPoint files
-- **Pandas** - Excel/CSV handling
-- **BeautifulSoup** - HTML parsing
-- **striprtf** - RTF file support
----
-## 📊 API Endpoints
-| Endpoint | Method | Description |
-|----------|--------|-------------|
-| `/` | GET | Serve main interface |
-| `/health` | GET | Health check |
-| `/api/v1/test-api-key` | POST | Validate API key |
-| `/api/v1/index` | POST | Index text context |
-| `/api/v1/index-file` | POST | Upload & index file |
-| `/api/v1/generate` | POST | Generate AI response |
-| `/api/v1/task` | POST | Execute specialized task |
-| `/api/v1/clear_index` | POST | Clear vector database |
----
-## 🔒 Privacy & Security
-- ✅ API keys stored **only** in browser LocalStorage
-- ✅ No server-side API key storage
-- ✅ All requests use user-provided keys
-- ✅ HTTPS recommended for production
-- ✅ No telemetry or tracking
-- ✅ Open source - audit the code yourself
----
-## 🚢 Deployment
-### Docker
-```bash
-docker build -t contextiq .
-docker run -p 7860:7860 contextiq
-```
-### Hugging Face Spaces
-This project is optimized for Hugging Face Spaces deployment. Simply:
-1. Create a new Space
-2. Upload the repository files
-3. Set Space SDK to "Docker"
-4. Deploy!
-[View Live Demo](https://huggingface.co/spaces/Ab-Romia/Context-Aware-AI)
----
-## 🎨 UI Features
-- 🌙 **Dark Theme**: Easy on the eyes
-- 📱 **Fully Responsive**: Works on mobile, tablet, and desktop
-- 🎭 **Glass-morphism Effects**: Modern, elegant design
-- ⚡ **Real-time Updates**: Live status indicators
-- 📊 **Character/Word Counters**: Track your content
-- 🔄 **Collapsible Sections**: Clean, organized interface
-- 💬 **Markdown Support**: Rich text formatting in responses
----
-## 🤝 Contributing
-Contributions are welcome! Please feel free to submit a Pull Request.
-1. Fork the repository
-2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
-3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
-4. Push to the branch (`git push origin feature/AmazingFeature`)
-5. Open a Pull Request
 ---
-## 📝 License
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
----
-## 🙏 Acknowledgments
-- **OpenRouter** for providing access to 200+ AI models
-- **OpenAI** for GPT models
-- **ChromaDB** for the vector database
-- **FastAPI** for the amazing web framework
-- **Tailwind CSS** for the beautiful UI
----
-## 📬 Contact
-**Ab-Romia** - Abdelrahman Abouroumia
-- GitHub: [@Ab-Romia](https://github.com/Ab-Romia)
-- Hugging Face: [Ab-Romia](https://huggingface.co/Ab-Romia)
----
-<div align="center">
-**⭐ Star this repo if you find it helpful! ⭐**
-Made with ❤️ by Ab-Romia
-</div>

 ---
+title: Context Aware AI
+emoji: 🌍
+colorFrom: green
+colorTo: red
 sdk: docker
 pinned: false
 license: mit
 ---
+# ContextIQ - Smart Document Assistant 🧠
+A RAG-powered AI assistant that answers questions based on your documents. Upload any text, ask questions, and get intelligent responses.
+## 🚀 How to Use on Hugging Face
+### 1. **Get Your API Key (Free)**
+- Go to [OpenRouter.ai](https://openrouter.ai) and sign up
+- Copy your API key (starts with `sk-or-`)
+- No credit card required for basic usage
+### 2. **Configure the App**
+- Enter your API key in the configuration section
+- Click "Test Key" to validate
+- Click "Save" to remember it for future sessions
+### 3. **Add Your Documents**
+- Paste your text, documents, or notes in the "Knowledge Base" panel
+- Click "Index Context" to process the content
+- Wait for the green success message
+### 4. **Start Asking Questions**
+Choose your task:
+- **Question & Answer**: Ask specific questions about your content
+- **Summarize**: Get a concise summary of your documents
+- **Plan**: Generate action plans based on your content
+- **Creative**: Write stories or content inspired by your documents
+## ✨ What You Can Do
+- **Analyze Documents**: Research papers, meeting notes, reports
+- **Study Materials**: Summarize textbooks, generate study questions
+- **Business Intelligence**: Analyze customer feedback, market research
+- **Content Creation**: Generate blog posts, creative writing from source material
+## 🔧 Features
+- **Smart Context Search**: Finds relevant information from your documents
+- **Multiple AI Tasks**: Q&A, summarization, planning, creative writing
+- **Mobile Friendly**: Works perfectly on phones and tablets
+- **Secure**: Your API key stays in your browser, never shared
+- **Fast**: Cached responses for repeated questions
+## 💡 Example Usage
+1. **Upload a research paper** → Ask "What are the main findings?"
+2. **Paste meeting notes** → Generate "Create an action plan"
+3. **Add product specs** → Write "Create marketing copy"
+4. **Upload course material** → Ask "Explain the key concepts"
+## 🛠️ Technical Details
+- **Backend**: FastAPI + ChromaDB vector database
+- **AI Model**: DeepSeek R1 via OpenRouter
+- **Embeddings**: Custom TF-IDF for document similarity
+- **Frontend**: Vanilla JavaScript with Tailwind CSS
+## 🔒 Privacy
+- Your documents are processed in memory only
+- API keys stored locally in your browser
+- No data is saved on our servers
+- All communication is encrypted
 ---
+**Built by [Abdelrahman Abouroumia](https://github.com/Ab-Romia)** | **Try it now on [Hugging Face](https://huggingface.co/spaces/Ab-Romia/Context-Aware-AI) or [Github](https://github.com/Ab-Romia/ContextIQ-RAG)**
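
The "Custom TF-IDF for document similarity" and "Smart Context Search" lines in the new README can be illustrated with a minimal, standard-library sketch of TF-IDF retrieval. This is not the code in app/rag_setup.py, whose contents are not shown in this part of the diff. The function names `tokenize`, `build_idf`, `vectorize`, `cosine`, and `retrieve` are hypothetical, the smoothed IDF formula is an assumption, and only the default of 3 retrieved chunks comes from `MAX_CHUNKS_RETRIEVE` in app/config.py.

```python
import math
import re
from collections import Counter

MAX_CHUNKS_RETRIEVE = 3  # default taken from app/config.py in this PR


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(chunks):
    """Smoothed inverse document frequency per term (formula is an assumption)."""
    doc_freq = Counter()
    for chunk in chunks:
        doc_freq.update(set(tokenize(chunk)))
    n = len(chunks)
    return {term: math.log((1 + n) / (1 + df)) + 1.0 for term, df in doc_freq.items()}


def vectorize(text, idf):
    """Sparse TF-IDF vector (term -> weight); terms missing from idf are dropped."""
    counts = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=MAX_CHUNKS_RETRIEVE):
    """Return up to top_k chunks ranked by similarity, skipping zero-score chunks."""
    idf = build_idf(chunks)
    query_vec = vectorize(query, idf)
    scored = [(cosine(query_vec, vectorize(chunk, idf)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:top_k] if score > 0.0]


chunks = [
    "The cat sat on the mat.",
    "FastAPI serves the REST endpoints.",
    "ChromaDB stores the vectors.",
]
best = retrieve("Which database stores vectors?", chunks, top_k=1)
```

In this sketch the retrieved chunks would then be concatenated into the prompt sent to the model; a vector database such as ChromaDB replaces the linear scan in `retrieve` with an index lookup.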

README_HUGGINGFACE.md DELETED

@@ -1,106 +0,0 @@
----
-title: ContextIQ - Context-Aware AI Assistant
-emoji: 🧠
-colorFrom: purple
-colorTo: blue
-sdk: docker
-pinned: true
-license: mit
-app_port: 7860
----
-# 🧠 ContextIQ - Intelligent Context-Aware AI Assistant
-Welcome to **ContextIQ**, a sophisticated RAG (Retrieval-Augmented Generation) application that transforms how you interact with your documents!
-## 🌟 What Can You Do?
-- 📚 **Upload Documents**: Support for 11+ file formats (PDF, DOCX, PPTX, XLSX, CSV, TXT, MD, HTML, JSON, XML, RTF)
-- 🤖 **Ask Questions**: Get intelligent answers based on your uploaded documents
-- 📝 **Summarize**: Generate concise summaries of your content
-- 📋 **Action Plans**: Create actionable plans from your documents
-- ✍️ **Creative Writing**: Transform your ideas into creative content
-## 🎯 Dual AI Provider Support
-Choose your preferred AI provider:
-### OpenRouter (FREE DeepSeek Model!)
-- 200+ models including DeepSeek R1 (FREE), Claude, GPT-4, Gemini, Llama 3
-- **Default**: DeepSeek R1 - completely free to use
-- Get your key: [openrouter.ai](https://openrouter.ai/)
-### OpenAI
-- GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5-turbo
-- Production-ready models
-- Get your key: [platform.openai.com/api-keys](https://platform.openai.com/api-keys)
-## 🚀 How to Use
-1. **Choose Your AI Provider**
-   - Select OpenRouter (free) or OpenAI in the interface
-2. **Enter Your API Key**
-   - Your key is stored locally in your browser only
-   - Never sent to our servers
-3. **Upload Your Documents**
-   - Drag & drop or browse for files
-   - Or paste text directly
-4. **Index Your Content**
-   - Click "Index Context" to process your documents
-5. **Start Asking Questions!**
-   - Choose a task type (Q&A, Summarize, Plan, Creative)
-   - Type your question or prompt
-   - Get AI-powered responses based on your documents
-## 🔒 Privacy & Security
-- ✅ Your API keys are stored **only** in your browser
-- ✅ No server-side storage of API keys
-- ✅ All requests use your own API key
-- ✅ Open source - audit the code yourself
-## 🛠️ Technology Stack
-- **Backend**: FastAPI + Python
-- **Vector Database**: ChromaDB with custom TF-IDF embeddings
-- **Frontend**: Tailwind CSS + Vanilla JavaScript
-- **AI Providers**: OpenAI SDK + OpenRouter API
-- **File Processing**: PyMuPDF, python-docx, pandas, BeautifulSoup, and more
-## 📊 Supported File Formats
-| Category | Formats |
-|----------|---------|
-| **Text** | .txt, .md, .rtf |
-| **Documents** | .pdf, .docx |
-| **Presentations** | .pptx |
-| **Data** | .xlsx, .csv, .json, .xml |
-| **Web** | .html, .htm |
-## 💡 Tips for Best Results
-- **Clear Questions**: Ask specific questions about your documents
-- **Context Matters**: The more relevant text you provide, the better the answers
-- **Chunk Size**: Large documents are automatically split into manageable chunks
-- **Model Selection**:
-  - Use OpenRouter's DeepSeek R1 (FREE) for excellent reasoning at no cost
-  - Use OpenAI's GPT-4o for production workloads
-  - Default DeepSeek model is completely free - no credit card needed!
-## 🤝 Open Source
-This project is open source! Check out the code on GitHub:
-[github.com/Ab-Romia/ContextIQ-RAG](https://github.com/Ab-Romia/ContextIQ-RAG)
-## 📬 Feedback
-Found a bug or have a feature request?
-[Open an issue on GitHub](https://github.com/Ab-Romia/ContextIQ-RAG/issues)
----
-Made with ❤️ by Ab-Romia (Abdelrahman Abouroumia)
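
The deleted tip "Large documents are automatically split into manageable chunks" can be made concrete with a short sketch of fixed-size chunking with overlap. It assumes character-based chunks and borrows the values 500 and 100 from the `CHUNK_SIZE` and `CHUNK_OVERLAP` settings that this PR removes from app/config.py. The function name `chunk_text` is hypothetical, and the actual chunking code in the app may differ.

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into character chunks where neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
parts = chunk_text(sample)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some text twice.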

app/__init__.py DELETED

File without changes

app/config.py CHANGED

@@ -9,45 +9,35 @@ class Settings(BaseSettings):
     # OpenRouter Configuration
     OPENROUTER_API_KEY: str = ""
     OPENROUTER_URL: str = "https://openrouter.ai/api/v1"
-    OPENROUTER_MODEL: str = "deepseek/deepseek-r1-0528:free"
-    # OpenAI Configuration
-    OPENAI_API_KEY: str = ""
-    OPENAI_URL: str = "https://api.openai.com/v1"
-    OPENAI_MODEL: str = "gpt-4o-mini"  # Default to GPT-4o-mini for cost efficiency
-    # Legacy field for backward compatibility
     MODEL_NAME: str = "deepseek/deepseek-r1-0528:free"
     # Token Limits Configuration
-    MAX_TOKENS_CHAT: int = 4000  # For Q&A responses
-    MAX_TOKENS_SUMMARIZE: int = 3000  # For summaries
-    MAX_TOKENS_PLAN: int = 5000  # For action plans
-    MAX_TOKENS_CREATIVE: int = 6000  # For creative writing
-    MAX_TOKENS_TEST: int = 50  # For API key testing
-    # Context Limits - Optimized for better retrieval
-    MAX_CONTEXT_LENGTH_CHAT: int = 12000
-    MAX_CONTEXT_LENGTH_TASK: int = 16000
-    MAX_CHUNKS_RETRIEVE: int = 5
-    CHUNK_SIZE: int = 500
-    CHUNK_OVERLAP: int = 100
     # Performance Settings
-    REQUEST_TIMEOUT_BASE: int = 120  # Base timeout in seconds
-    REQUEST_TIMEOUT_PER_1K_TOKENS: int = 4  # Additional seconds per 1000 tokens
     # New setting to control fallback behavior
     REQUIRE_USER_API_KEY: bool = True
     class Config:
         env_file = ".env"
         case_sensitive = True
 # Create settings instance
 settings = Settings()
 # Debug logging for API key configuration
 logger.info("=" * 80)
 logger.info("🔧 CONFIGURATION DEBUG")
@@ -82,10 +72,9 @@ else:
 # Check if API key starts with expected prefix (only if present)
 if settings.OPENROUTER_API_KEY:
-    api_key_preview = settings.OPENROUTER_API_KEY[:20] + "..." if len(
-        settings.OPENROUTER_API_KEY) > 20 else settings.OPENROUTER_API_KEY
     logger.info(f"🔑 Server API Key Preview: {api_key_preview}")
     # OpenRouter API keys typically start with "sk-or-"
     if settings.OPENROUTER_API_KEY.startswith("sk-or-"):
         logger.info("✅ Server API key format looks correct (starts with 'sk-or-')")
@@ -114,56 +103,27 @@ def get_max_tokens_for_task(task_type: str) -> int:
     }
     return token_map.get(task_type, settings.MAX_TOKENS_CHAT)
 def get_timeout_for_tokens(max_tokens: int) -> int:
     """Calculate appropriate timeout based on token count."""
     additional_time = (max_tokens // 1000) * settings.REQUEST_TIMEOUT_PER_1K_TOKENS
     return settings.REQUEST_TIMEOUT_BASE + additional_time
-def validate_api_key(api_key: str, provider: str = "openrouter") -> bool:
-    """Validate API key format for OpenRouter or OpenAI"""
     if not api_key:
         return False
-    provider = provider.lower()
-    if provider == "openrouter":
-        # OpenRouter keys should start with "sk-or-" and be at least 40 characters
-        if not api_key.startswith("sk-or-"):
-            logger.warning("⚠️  OpenRouter API key should start with 'sk-or-'")
-            return False
-        if len(api_key) < 40:
-            logger.warning("⚠️  OpenRouter API key seems too short")
-            return False
-    elif provider == "openai":
-        # OpenAI keys should start with "sk-" and be at least 40 characters
-        if not api_key.startswith("sk-"):
-            logger.warning("⚠️  OpenAI API key should start with 'sk-'")
-            return False
-        if len(api_key) < 40:
-            logger.warning("⚠️  OpenAI API key seems too short")
-            return False
-    else:
-        logger.warning(f"⚠️  Unknown provider: {provider}")
         return False
     return True
-def detect_provider_from_key(api_key: str) -> str:
-    """Detect provider from API key format"""
-    if not api_key:
-        return "unknown"
-    if api_key.startswith("sk-or-"):
-        return "openrouter"
-    elif api_key.startswith("sk-proj-") or api_key.startswith("sk-"):
-        return "openai"
-    else:
-        return "unknown"
 # Validate the current server API key (if present)
 if settings.OPENROUTER_API_KEY:
     is_valid = validate_api_key(settings.OPENROUTER_API_KEY)
@@ -172,4 +132,4 @@ else:
     logger.info("🔍 No server API key to validate")
 # Export settings
-__all__ = ['settings', 'validate_api_key', 'detect_provider_from_key', 'get_max_tokens_for_task', 'get_timeout_for_tokens']

     # OpenRouter Configuration
     OPENROUTER_API_KEY: str = ""
     OPENROUTER_URL: str = "https://openrouter.ai/api/v1"
     MODEL_NAME: str = "deepseek/deepseek-r1-0528:free"
     # Token Limits Configuration
+    MAX_TOKENS_CHAT: int = 4000        # For Q&A responses
+    MAX_TOKENS_SUMMARIZE: int = 3000   # For summaries
+    MAX_TOKENS_PLAN: int = 5000        # For action plans
+    MAX_TOKENS_CREATIVE: int = 6000    # For creative writing
+    MAX_TOKENS_TEST: int = 50          # For API key testing
+    # Context Limits
+    MAX_CONTEXT_LENGTH_CHAT: int = 8000    # For chat context
+    MAX_CONTEXT_LENGTH_TASK: int = 12000   # For task context
+    MAX_CHUNKS_RETRIEVE: int = 3           # Number of chunks to retrieve
     # Performance Settings
+    REQUEST_TIMEOUT_BASE: int = 120         # Base timeout in seconds
+    REQUEST_TIMEOUT_PER_1K_TOKENS: int = 4 # Additional seconds per 1000 tokens
     # New setting to control fallback behavior
     REQUIRE_USER_API_KEY: bool = True
     class Config:
         env_file = ".env"
         case_sensitive = True
 # Create settings instance
 settings = Settings()
 # Debug logging for API key configuration
 logger.info("=" * 80)
 logger.info("🔧 CONFIGURATION DEBUG")
 # Check if API key starts with expected prefix (only if present)
 if settings.OPENROUTER_API_KEY:
+    api_key_preview = settings.OPENROUTER_API_KEY[:20] + "..." if len(settings.OPENROUTER_API_KEY) > 20 else settings.OPENROUTER_API_KEY
     logger.info(f"🔑 Server API Key Preview: {api_key_preview}")
     # OpenRouter API keys typically start with "sk-or-"
     if settings.OPENROUTER_API_KEY.startswith("sk-or-"):
         logger.info("✅ Server API key format looks correct (starts with 'sk-or-')")
     }
     return token_map.get(task_type, settings.MAX_TOKENS_CHAT)
 def get_timeout_for_tokens(max_tokens: int) -> int:
     """Calculate appropriate timeout based on token count."""
     additional_time = (max_tokens // 1000) * settings.REQUEST_TIMEOUT_PER_1K_TOKENS
     return settings.REQUEST_TIMEOUT_BASE + additional_time
+def validate_api_key(api_key: str) -> bool:
+    """Validate OpenRouter API key format"""
     if not api_key:
         return False
+    # OpenRouter keys should start with "sk-or-" and be at least 40 characters
+    if not api_key.startswith("sk-or-"):
+        logger.warning("⚠️  API key should start with 'sk-or-'")
         return False
+    if len(api_key) < 40:
+        logger.warning("⚠️  API key seems too short")
+        return False
     return True
 # Validate the current server API key (if present)
 if settings.OPENROUTER_API_KEY:
     is_valid = validate_api_key(settings.OPENROUTER_API_KEY)
     logger.info("🔍 No server API key to validate")
 # Export settings
+__all__ = ['settings', 'validate_api_key']
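
For reference, the behaviour of the two helpers on the new side of app/config.py can be checked with a standalone sketch. The constants and logic are copied from the diff above. The `logger.warning` calls are omitted and the settings object is replaced by module-level constants so the snippet runs on its own; both are simplifications. The keys below are made-up placeholders, not real credentials.

```python
REQUEST_TIMEOUT_BASE = 120          # base timeout in seconds
REQUEST_TIMEOUT_PER_1K_TOKENS = 4   # additional seconds per 1000 tokens


def get_timeout_for_tokens(max_tokens: int) -> int:
    """Calculate a request timeout from the token budget."""
    additional_time = (max_tokens // 1000) * REQUEST_TIMEOUT_PER_1K_TOKENS
    return REQUEST_TIMEOUT_BASE + additional_time


def validate_api_key(api_key: str) -> bool:
    """Check the OpenRouter key format: 'sk-or-' prefix, at least 40 characters."""
    if not api_key:
        return False
    if not api_key.startswith("sk-or-"):
        return False
    if len(api_key) < 40:
        return False
    return True


# Timeouts for the token limits defined in Settings.
chat_timeout = get_timeout_for_tokens(4000)      # 120 + 4 * 4
creative_timeout = get_timeout_for_tokens(6000)  # 120 + 6 * 4
test_timeout = get_timeout_for_tokens(50)        # 50 // 1000 == 0, so base only

# Placeholder keys for illustration only.
openrouter_style = "sk-or-" + "x" * 34   # exactly 40 characters
openai_style = "sk-" + "x" * 45          # missing the 'sk-or-' prefix
```

With the `provider` parameter removed, an OpenAI-style key that starts with `sk-` but not `sk-or-` is now rejected by `validate_api_key`.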

app/main.py CHANGED

@@ -13,39 +13,33 @@ from typing import Optional
 # Get the base directory (works both locally and on Hugging Face)
 if os.path.exists("/app"):  # Hugging Face environment
     BASE_DIR = Path("/app")
-    STATIC_DIR = BASE_DIR / "static"
-    TEMPLATES_DIR = BASE_DIR / "templates"
 else:  # Local environment
     BASE_DIR = Path(__file__).resolve().parent
-    STATIC_DIR = BASE_DIR.parent / "static"
-    TEMPLATES_DIR = BASE_DIR.parent / "templates"
 app = FastAPI(
     title="ContextIQ RAG - Intelligent Context-Aware Assistant",
     description="A sophisticated RAG-powered backend using FastAPI and OpenRouter.",
-    version="2.2.0"  # Version updated for new feature
 )
 # Mount static files and templates
-app.mount("/static", StaticFiles(directory=STATIC_DIR), name="static")
-templates = Jinja2Templates(directory=TEMPLATES_DIR)
 def get_api_key(x_api_key: Optional[str] = Header(None)) -> str:
     """Extract API key from header or use default."""
     if x_api_key and x_api_key.strip():
         return x_api_key.strip()
     # Fall back to server default if available
     if config.settings.OPENROUTER_API_KEY:
         return config.settings.OPENROUTER_API_KEY
     raise HTTPException(
         status_code=400,
         detail="No API key provided. Please provide your OpenRouter API key via the X-API-Key header."
     )
 @app.get("/debug")
 async def debug_config():
     """Debug endpoint to check configuration."""
@@ -57,23 +51,21 @@ async def debug_config():
         "accepts_user_keys": True
     }
 @app.post("/api/v1/test-api-key", response_model=schemas.ApiKeyTestResponse)
 async def test_api_key_endpoint(api_key_request: schemas.ApiKeyRequest):
     """
-    Test if the provided API key is valid (OpenRouter or OpenAI).
     """
     try:
-        result = await services.test_api_key(api_key_request.api_key, api_key_request.provider)
         return schemas.ApiKeyTestResponse(**result)
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 @app.post("/api/v1/task", response_model=schemas.TaskResponse)
 async def execute_task(
-        task_request: schemas.TaskRequest,
-        x_api_key: Optional[str] = Header(None)
 ):
     """
     Executes a specific task (e.g., summarize, plan) based on the provided context
@@ -86,7 +78,6 @@ async def execute_task(
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 @app.get("/", response_class=HTMLResponse)
 async def read_root(request: Request):
     """
@@ -94,17 +85,15 @@ async def read_root(request: Request):
     """
     return templates.TemplateResponse("index.html", {"request": request})
 @app.get("/health")
 async def health_check():
     """Health check endpoint for Hugging Face."""
     return {"status": "healthy", "message": "ContextIQ RAG is running!"}
 @app.post("/api/v1/generate", response_model=schemas.ChatResponse)
 async def generate_response(
-        chat_request: schemas.ChatRequest,
-        x_api_key: Optional[str] = Header(None)
 ):
     """
     Receives a prompt, retrieves relevant context from the vector DB,
@@ -117,7 +106,6 @@ async def generate_response(
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 @app.post("/api/v1/clear_index", response_model=schemas.GeneralResponse)
 async def clear_context_index(x_api_key: Optional[str] = Header(None)):
     """
@@ -129,11 +117,10 @@ async def clear_context_index(x_api_key: Optional[str] = Header(None)):
     except Exception as e:
         raise HTTPException(status_code=500, detail=f"Failed to clear index: {e}")
 @app.post("/api/v1/index", response_model=schemas.IndexResponse)
 async def index_context(
-        document_request: schemas.DocumentRequest,
-        x_api_key: Optional[str] = Header(None)
 ):
     """
     Receives text, clears the old index, chunks the new text,
@@ -142,22 +129,21 @@ async def index_context(
     try:
         # Validate API key access (but indexing doesn't require API calls)
         get_api_key(x_api_key)
         docs_added = services.index_document(document_request)
         return schemas.IndexResponse(
             message="Context has been successfully indexed.",
             documents_added=docs_added,
-            extracted_text=document_request.context  # Return the provided text
         )
     except Exception as e:
         raise HTTPException(status_code=500, detail=f"Failed to index document: {e}")
 # ✨ UPDATED: File Upload Endpoint now returns the extracted text
 @app.post("/api/v1/index-file", response_model=schemas.IndexResponse)
 async def index_file(
-        x_api_key: Optional[str] = Header(None),
-        file: UploadFile = File(...)
 ):
     """
     Receives a file (.txt, .pdf), extracts text, and indexes it.
@@ -180,3 +166,7 @@ async def index_file(
     except Exception as e:
         raise HTTPException(status_code=500, detail=f"Failed to process file: {str(e)}")
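
Review note: in `index_context` above, `get_api_key(x_api_key)` is called inside the `try` block, so the 400 it raises for a missing key appears to be caught by `except Exception` and re-raised as a 500. A minimal sketch of one way to let client errors pass through follows. It uses a stand-in `HTTPError` class rather than FastAPI's `HTTPException` so that it runs without dependencies; `require_key` and `index_document` are hypothetical names, not functions from this PR.

```python
class HTTPError(Exception):
    """Stand-in for fastapi.HTTPException (assumption: keeps the sketch dependency-free)."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def require_key(key):
    """Hypothetical helper mirroring get_api_key: reject a missing key with a 400."""
    if not key or not key.strip():
        raise HTTPError(400, "No API key provided.")
    return key.strip()


def index_document(key, text):
    """Hypothetical handler body: client errors pass through, other failures become 500."""
    try:
        require_key(key)
        if not text:
            raise ValueError("empty document")
        return {"documents_added": 1}
    except HTTPError:
        # Re-raise deliberate HTTP errors unchanged so a 400 stays a 400.
        raise
    except Exception as exc:
        # Anything unexpected is still reported as a server error.
        raise HTTPError(500, f"Failed to index document: {exc}") from exc
```

With FastAPI itself, the same shape would be an `except HTTPException: raise` clause placed before the generic handler.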

 # Get the base directory (works both locally and on Hugging Face)
 if os.path.exists("/app"):  # Hugging Face environment
     BASE_DIR = Path("/app")
 else:  # Local environment
     BASE_DIR = Path(__file__).resolve().parent
 app = FastAPI(
     title="ContextIQ RAG - Intelligent Context-Aware Assistant",
     description="A sophisticated RAG-powered backend using FastAPI and OpenRouter.",
+    version="2.2.0" # Version updated for new feature
 )
 # Mount static files and templates
+app.mount("/static", StaticFiles(directory=BASE_DIR / "static"), name="static")
+templates = Jinja2Templates(directory=BASE_DIR / "templates")
 def get_api_key(x_api_key: Optional[str] = Header(None)) -> str:
     """Extract API key from header or use default."""
     if x_api_key and x_api_key.strip():
         return x_api_key.strip()
     # Fall back to server default if available
     if config.settings.OPENROUTER_API_KEY:
         return config.settings.OPENROUTER_API_KEY
     raise HTTPException(
         status_code=400,
         detail="No API key provided. Please provide your OpenRouter API key via the X-API-Key header."
     )
 @app.get("/debug")
 async def debug_config():
     """Debug endpoint to check configuration."""
         "accepts_user_keys": True
     }
 @app.post("/api/v1/test-api-key", response_model=schemas.ApiKeyTestResponse)
 async def test_api_key_endpoint(api_key_request: schemas.ApiKeyRequest):
     """
+    Test if the provided API key is valid.
     """
     try:
+        result = await services.test_api_key(api_key_request.api_key)
         return schemas.ApiKeyTestResponse(**result)
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 @app.post("/api/v1/task", response_model=schemas.TaskResponse)
 async def execute_task(
+    task_request: schemas.TaskRequest,
+    x_api_key: Optional[str] = Header(None)
 ):
     """
     Executes a specific task (e.g., summarize, plan) based on the provided context
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 @app.get("/", response_class=HTMLResponse)
 async def read_root(request: Request):
     """
     """
     return templates.TemplateResponse("index.html", {"request": request})
 @app.get("/health")
 async def health_check():
     """Health check endpoint for Hugging Face."""
     return {"status": "healthy", "message": "ContextIQ RAG is running!"}
 @app.post("/api/v1/generate", response_model=schemas.ChatResponse)
 async def generate_response(
+    chat_request: schemas.ChatRequest,
+    x_api_key: Optional[str] = Header(None)
 ):
     """
     Receives a prompt, retrieves relevant context from the vector DB,
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 @app.post("/api/v1/clear_index", response_model=schemas.GeneralResponse)
 async def clear_context_index(x_api_key: Optional[str] = Header(None)):
     """
     except Exception as e:
         raise HTTPException(status_code=500, detail=f"Failed to clear index: {e}")
 @app.post("/api/v1/index", response_model=schemas.IndexResponse)
 async def index_context(
+    document_request: schemas.DocumentRequest,
+    x_api_key: Optional[str] = Header(None)
 ):
     """
     Receives text, clears the old index, chunks the new text,
     try:
         # Validate API key access (but indexing doesn't require API calls)
         get_api_key(x_api_key)
         docs_added = services.index_document(document_request)
         return schemas.IndexResponse(
             message="Context has been successfully indexed.",
             documents_added=docs_added,
+            extracted_text=document_request.context # Return the provided text
         )
     except Exception as e:
         raise HTTPException(status_code=500, detail=f"Failed to index document: {e}")
 # ✨ UPDATED: File Upload Endpoint now returns the extracted text
 @app.post("/api/v1/index-file", response_model=schemas.IndexResponse)
 async def index_file(
+    x_api_key: Optional[str] = Header(None),
+    file: UploadFile = File(...)
 ):
     """
     Receives a file (.txt, .pdf), extracts text, and indexes it.
     except Exception as e:
         raise HTTPException(status_code=500, detail=f"Failed to process file: {str(e)}")
+if __name__ == "__main__":
+    port = int(os.environ.get("PORT", 7860))
+    uvicorn.run(app, host="0.0.0.0", port=port)
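
Review note: the key lookup in `get_api_key` follows a fixed precedence: a non-blank `X-API-Key` header first, then the server default, otherwise a 400. The sketch below restates that precedence as a plain function so it can be checked in isolation. `resolve_api_key` is a hypothetical name, and `ValueError` stands in for the `HTTPException` the endpoint raises.

```python
from typing import Optional


def resolve_api_key(header_key: Optional[str], server_default: Optional[str]) -> str:
    """Return the header key if non-blank, else the server default, else fail."""
    if header_key and header_key.strip():
        return header_key.strip()
    if server_default:
        return server_default
    # The real endpoint raises HTTPException(status_code=400); ValueError stands in here.
    raise ValueError("No API key provided.")


# A whitespace-only header counts as missing, so the server default is used.
print(resolve_api_key("   ", "server-key"))
# A real header value wins over the default and is stripped.
print(resolve_api_key("  user-key ", "server-key"))
```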

app/rag_setup.py CHANGED Viewed

@@ -2,15 +2,14 @@ import chromadb
 import logging
 import requests
 import json
-from config import settings, detect_provider_from_key  # Fixed: removed 'app.' prefix
 import time
 import os
 import numpy as np
 from sklearn.feature_extraction.text import TfidfVectorizer
-from typing import List, Optional
 import hashlib
 import re
-from openai import OpenAI
 # Disable ChromaDB telemetry to avoid errors
 os.environ["ANONYMIZED_TELEMETRY"] = "False"
@@ -18,7 +17,6 @@ os.environ["ANONYMIZED_TELEMETRY"] = "False"
 logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
 logger = logging.getLogger("rag-setup")
 # Custom TF-IDF based embedding function
 class TFIDFEmbeddingFunction:
     def __init__(self, max_features=384):
@@ -30,37 +28,34 @@ class TFIDFEmbeddingFunction:
         )
         self.is_fitted = False
         self.max_features = max_features
-    def name(self):
-        """Return a name identifier for this embedding function."""
-        return "tfidf_embedder"
     def _preprocess_text(self, text: str) -> str:
         """Clean and preprocess text."""
         text = re.sub(r'\s+', ' ', text)
         text = re.sub(r'[^\w\s]', ' ', text)
         return text.strip().lower()
     def __call__(self, input: List[str]) -> List[List[float]]:
         """Generate TF-IDF based embeddings."""
         try:
             logger.info(f"🔢 Generating embeddings for {len(input)} texts")
             processed_texts = [self._preprocess_text(text) for text in input]
             if not self.is_fitted and processed_texts:
                 # Fit the vectorizer on the input texts
                 self.vectorizer.fit(processed_texts)
                 self.is_fitted = True
                 logger.info("✅ TF-IDF vectorizer fitted on input texts")
             if not self.is_fitted:
                 # Return simple fallback if no data to fit on
                 logger.warning("⚠️  Using fallback embeddings - vectorizer not fitted")
                 return self._fallback_embeddings(input)
             # Transform texts to vectors
             tfidf_matrix = self.vectorizer.transform(processed_texts)
             embeddings = tfidf_matrix.toarray()
             # Ensure consistent dimensions
             result_embeddings = []
             for embedding in embeddings:
@@ -70,14 +65,14 @@ class TFIDFEmbeddingFunction:
                     result_embeddings.append(padded.tolist())
                 else:
                     result_embeddings.append(embedding[:self.max_features].tolist())
             logger.info(f"✅ Generated {len(result_embeddings)} embeddings of dimension {self.max_features}")
             return result_embeddings
         except Exception as e:
             logger.error(f"❌ Error generating TF-IDF embeddings: {e}")
             return self._fallback_embeddings(input)
     def _fallback_embeddings(self, input: List[str]) -> List[List[float]]:
         """Simple fallback embedding method."""
         logger.info(f"🔧 Using fallback embeddings for {len(input)} texts")
@@ -85,28 +80,27 @@ class TFIDFEmbeddingFunction:
         for text in input:
             text_hash = hashlib.md5(text.encode()).hexdigest()
             embedding = []
             # Convert hash to numbers
             for i in range(0, min(len(text_hash), 32), 2):
-                hex_pair = text_hash[i:i + 2]
                 embedding.append(int(hex_pair, 16) / 255.0)
             # Add text features
             embedding.extend([
                 len(text) / 1000.0,
                 len(text.split()) / 100.0,
                 text.count('.') / 10.0,
             ])
             # Pad to desired size
             while len(embedding) < self.max_features:
                 embedding.extend(embedding[:min(len(embedding), self.max_features - len(embedding))])
             embeddings.append(embedding[:self.max_features])
         return embeddings
 # Simple ChromaDB setup - use in-memory storage for Hugging Face
 logger.info("🔧 Initializing ChromaDB with in-memory storage for Hugging Face compatibility")
@@ -114,14 +108,14 @@ try:
     # Use in-memory client to avoid permission issues
     client = chromadb.Client()
     embedding_function = TFIDFEmbeddingFunction()
     collection = client.get_or_create_collection(
         name="context_aware_collection",
         embedding_function=embedding_function
     )
     logger.info("✅ ChromaDB collection initialized successfully with in-memory storage")
 except Exception as e:
     logger.error(f"❌ Error initializing ChromaDB: {e}")
     raise RuntimeError(f"Failed to initialize ChromaDB: {e}")
@@ -133,7 +127,7 @@ class OpenRouterLLM:
         self.base_url = base_url
         self.model = model
         self.api_url = f"{base_url.rstrip('/')}/chat/completions"
         logger.info("=" * 60)
         logger.info("🚀 INITIALIZING OPENROUTER LLM")
         logger.info("=" * 60)
@@ -141,12 +135,12 @@ class OpenRouterLLM:
         logger.info(f"🔑 API Key present: {'Yes' if api_key else 'No'}")
         logger.info(f"📏 API Key length: {len(api_key) if api_key else 0}")
         logger.info(f"🌐 API URL: {self.api_url}")
         if not api_key or not api_key.strip():
             logger.error("❌ OpenRouter API key is missing or empty")
             self.client_ready = False
             return
         # Test the connection with minimal tokens
         try:
             logger.info("🔍 Testing OpenRouter connection...")
@@ -160,12 +154,12 @@ class OpenRouterLLM:
         except Exception as e:
             logger.error(f"❌ OpenRouter connection test failed: {e}")
             self.client_ready = False
         logger.info("=" * 60)
     def _make_api_request(self, prompt: str, max_tokens: int = 2000, timeout: int = None) -> dict:
         """Make a direct HTTP request to OpenRouter API with configurable token limits."""
         # Calculate dynamic timeout based on max_tokens and prompt length
         if timeout is None:
             base_timeout = 120
@@ -173,19 +167,19 @@ class OpenRouterLLM:
             token_timeout = max(20, max_tokens // 100)  # ~1 second per 100 tokens, at least 20s
             prompt_timeout = max(10, len(prompt) // 1000)  # ~1 second per 1000 characters, at least 10s
             timeout = min(base_timeout + token_timeout + prompt_timeout, 600)  # Cap at 600s (10 minutes)
         logger.info(f"🌐 Making API request to OpenRouter")
         logger.info(f"📏 Prompt length: {len(prompt)} characters")
         logger.info(f"🎯 Max tokens: {max_tokens}")
         logger.info(f"⏱️  Timeout: {timeout}s")
         headers = {
             "Authorization": f"Bearer {self.api_key}",
             "Content-Type": "application/json",
             "HTTP-Referer": "https://github.com/Ab-Romia/ContextIQ-RAG",
             "X-Title": "Context Aware AI"
         }
         # Optimize payload for longer responses
         payload = {
             "model": self.model,
@@ -198,15 +192,15 @@ class OpenRouterLLM:
             "presence_penalty": 0.1,  # Slight penalty for repetition
             "frequency_penalty": 0.1,  # Slight penalty for frequency
         }
         # Log the request payload (without sensitive data)
         safe_payload = payload.copy()
         safe_payload["messages"] = [{"role": "user", "content": f"[CONTENT: {len(prompt)} chars]"}]
         logger.info(f"📤 Request payload: {json.dumps(safe_payload, indent=2)}")
         try:
             start_time = time.time()
             with requests.Session() as session:
                 response = session.post(
                     self.api_url,
@@ -214,40 +208,39 @@ class OpenRouterLLM:
                     json=payload,
                     timeout=timeout
                 )
             request_time = time.time() - start_time
             logger.info(f"⏱️  API request completed in {request_time:.2f}s")
             logger.info(f"📊 Response status: {response.status_code}")
             if response.status_code == 200:
                 response_data = response.json()
                 logger.info("✅ API request successful")
                 # Log response details
                 if "choices" in response_data and response_data["choices"]:
                     content = response_data["choices"][0]["message"]["content"]
                     logger.info(f"📝 Response content length: {len(content)} characters")
                     # Check if response was truncated
                     if "usage" in response_data:
                         usage = response_data["usage"]
                         completion_tokens = usage.get("completion_tokens", 0)
                         logger.info(f"📊 Token usage: {usage}")
                         if completion_tokens >= max_tokens * 0.95:  # If we used 95% of max tokens
-                            logger.warning(
-                                f"⚠️  Response may be truncated (used {completion_tokens}/{max_tokens} tokens)")
                     content_preview = content[:300] + "..." if len(content) > 300 else content
                     logger.info(f"📄 Response preview: {content_preview}")
                 return response_data
             else:
                 logger.error(f"❌ API request failed with status {response.status_code}")
                 logger.error(f"📄 Response text: {response.text}")
                 return {"error": f"HTTP {response.status_code}: {response.text}"}
         except requests.exceptions.Timeout:
             logger.error(f"⏱️  API request timed out after {timeout}s")
             return {"error": f"Request timed out after {timeout}s. Try reducing the context length or max tokens."}
@@ -267,11 +260,11 @@ class OpenRouterLLM:
         logger.info(f"🎯 Requested max tokens: {max_tokens}")
         logger.info(f"🔧 Client status: {'Ready' if self.client_ready else 'Not ready'}")
         logger.info(f"🔑 API key status: {'Present' if self.api_key else 'Missing'}")
         # Dynamic prompt optimization based on max_tokens
         original_length = len(prompt)
         max_prompt_length = 12000 if max_tokens > 3000 else 8000  # Allow longer prompts for longer responses
         if len(prompt) > max_prompt_length:
             logger.warning(f"⚠️  Prompt is quite long ({original_length} chars), truncating for better performance")
             # Intelligent truncation that preserves structure
@@ -280,37 +273,36 @@ class OpenRouterLLM:
                 if len(parts) == 2:
                     context_part = parts[0]
                     question_part = "Question:" + parts[1]
                     # Keep the question and instructions, truncate context if needed
                     available_for_context = max_prompt_length - len(question_part) - 500  # Reserve space
                     if len(context_part) > available_for_context:
-                        context_part = context_part[
-                                       :available_for_context] + "\n\n[... content truncated for performance ...]"
                     prompt = context_part + question_part
                     logger.info(f"📏 Prompt intelligently truncated from {original_length} to {len(prompt)} characters")
             else:
                 prompt = prompt[:max_prompt_length] + "\n\n[... content truncated for performance ...]"
                 logger.info(f"📏 Prompt truncated from {original_length} to {len(prompt)} characters")
         # Log prompt preview
         prompt_preview = prompt[:400] + "..." if len(prompt) > 400 else prompt
         logger.info(f"📝 PROMPT PREVIEW:")
         logger.info(f"   {prompt_preview}")
         logger.info("-" * 60)
         # Check API key first
         if not self.api_key or not self.api_key.strip():
             error_msg = "❌ OpenRouter API key is not configured. Please set the OPENROUTER_API_KEY environment variable."
             logger.error(error_msg)
             return error_msg
         # Check client readiness
         if not self.client_ready:
             error_msg = "❌ OpenRouter client is not ready. Please check your API key and connection."
             logger.error(error_msg)
             return error_msg
         max_retries = 3
         retry_count = 0
         base_wait_time = 2
@@ -318,21 +310,21 @@ class OpenRouterLLM:
         while retry_count <= max_retries:
             try:
                 logger.info(f"🔄 API call attempt {retry_count + 1}/{max_retries + 1}")
                 # Adjust parameters based on retry attempt
                 current_max_tokens = max_tokens
                 timeout = None  # Let _make_api_request calculate dynamic timeout
                 if retry_count > 0:
                     # Reduce max_tokens on retries for faster responses
                     current_max_tokens = max(1000, max_tokens - (retry_count * 500))
                     logger.info(f"🔧 Retry attempt - reducing max_tokens to {current_max_tokens}")
                 response = self._make_api_request(prompt, max_tokens=current_max_tokens, timeout=timeout)
                 if "error" in response:
                     error_msg = response["error"]
                     # Handle specific error types
                     if "timeout" in error_msg.lower() or "408" in error_msg:
                         logger.warning(f"⏱️  Timeout error on attempt {retry_count + 1}")
@@ -348,29 +340,28 @@ class OpenRouterLLM:
                     elif "401" in error_msg or "403" in error_msg:
                         logger.error(f"🔑 Authentication error: {error_msg}")
                         return f"❌ Authentication error: {error_msg}"
                     raise Exception(error_msg)
                 if "choices" in response and len(response["choices"]) > 0:
                     content = response["choices"][0]["message"]["content"]
                     if content:
                         logger.info(f"✅ Successfully generated response")
                         logger.info(f"📏 Response length: {len(content)} characters")
                         # Check if response seems complete
                         if "usage" in response:
                             usage = response["usage"]
                             completion_tokens = usage.get("completion_tokens", 0)
                             if completion_tokens >= current_max_tokens * 0.95:
-                                logger.warning(
-                                    f"⚠️  Response may be incomplete (used {completion_tokens}/{current_max_tokens} tokens)")
                                 content += "\n\n[Note: Response may be truncated due to token limits. Consider asking for specific parts if needed.]"
                         response_preview = content[:400] + "..." if len(content) > 400 else content
                         logger.info(f"📤 RESPONSE PREVIEW:")
                         logger.info(f"   {response_preview}")
                         logger.info("=" * 80)
                         return content
                     else:
                         logger.error("❌ Received empty response from AI model")
@@ -385,253 +376,45 @@ class OpenRouterLLM:
                         retry_count += 1
                         continue
                     return "❌ Invalid response format from AI model."
             except Exception as e:
                 error_type = type(e).__name__
                 error_msg = str(e)
                 logger.error(f"❌ API call failed (attempt {retry_count + 1}): {error_type}: {error_msg}")
                 retry_count += 1
                 if retry_count > max_retries:
                     final_error = f"❌ Error: Failed to get response from AI model after {max_retries + 1} attempts. Final error: {error_msg}"
                     logger.error(final_error)
                     logger.info("=" * 80)
                     return final_error
                 wait_time = base_wait_time * retry_count + (retry_count * 0.5)
                 logger.info(f"⏳ Waiting {wait_time:.1f}s before retry...")
                 time.sleep(wait_time)
-class OpenAILLM:
-    """OpenAI LLM client for GPT models"""
-    def __init__(self, api_key: str, base_url: str, model: str):
-        self.api_key = api_key
-        self.base_url = base_url
-        self.model = model
-        self.client = None
-        logger.info("=" * 60)
-        logger.info("🚀 INITIALIZING OPENAI LLM")
-        logger.info("=" * 60)
-        logger.info(f"🤖 Model: {model}")
-        logger.info(f"🔑 API Key present: {'Yes' if api_key else 'No'}")
-        logger.info(f"📏 API Key length: {len(api_key) if api_key else 0}")
-        logger.info(f"🌐 API URL: {base_url}")
-        if not api_key or not api_key.strip():
-            logger.error("❌ OpenAI API key is missing or empty")
-            self.client_ready = False
-            return
-        # Initialize OpenAI client
-        try:
-            self.client = OpenAI(api_key=api_key, base_url=base_url)
-            logger.info("✅ OpenAI client initialized")
-            # Test the connection with minimal tokens
-            test_response = self.client.chat.completions.create(
-                model=self.model,
-                messages=[{"role": "user", "content": "Hello"}],
-                max_tokens=5
-            )
-            if test_response:
-                logger.info("✅ OpenAI connection test successful")
-                self.client_ready = True
-            else:
-                logger.error("❌ OpenAI connection test failed")
-                self.client_ready = False
-        except Exception as e:
-            logger.error(f"❌ OpenAI initialization failed: {e}")
-            self.client_ready = False
-        logger.info("=" * 60)
-    def generate_content(self, prompt: str, max_tokens: int = 2000) -> str:
-        """Generate content using OpenAI API"""
-        logger.info("=" * 80)
-        logger.info("🧠 OPENAI CONTENT GENERATION STARTED")
-        logger.info("=" * 80)
-        logger.info(f"📏 Input prompt length: {len(prompt)} characters")
-        logger.info(f"🎯 Requested max tokens: {max_tokens}")
-        logger.info(f"🔧 Client status: {'Ready' if self.client_ready else 'Not ready'}")
-        # Check client readiness
-        if not self.client_ready or not self.client:
-            error_msg = "❌ OpenAI client is not ready. Please check your API key and connection."
-            logger.error(error_msg)
-            return error_msg
-        # Optimize prompt length
-        original_length = len(prompt)
-        max_prompt_length = 12000 if max_tokens > 3000 else 8000
-        if len(prompt) > max_prompt_length:
-            logger.warning(f"⚠️  Prompt is quite long ({original_length} chars), truncating for better performance")
-            if "Context:" in prompt and "Question:" in prompt:
-                parts = prompt.split("Question:")
-                if len(parts) == 2:
-                    context_part = parts[0]
-                    question_part = "Question:" + parts[1]
-                    available_for_context = max_prompt_length - len(question_part) - 500
-                    if len(context_part) > available_for_context:
-                        context_part = context_part[:available_for_context] + "\n\n[... content truncated ...]"
-                    prompt = context_part + question_part
-                    logger.info(f"📏 Prompt intelligently truncated from {original_length} to {len(prompt)} characters")
-            else:
-                prompt = prompt[:max_prompt_length] + "\n\n[... content truncated ...]"
-                logger.info(f"📏 Prompt truncated from {original_length} to {len(prompt)} characters")
-        max_retries = 3
-        retry_count = 0
-        while retry_count <= max_retries:
-            try:
-                logger.info(f"🔄 API call attempt {retry_count + 1}/{max_retries + 1}")
-                current_max_tokens = max_tokens
-                if retry_count > 0:
-                    current_max_tokens = max(1000, max_tokens - (retry_count * 500))
-                    logger.info(f"🔧 Retry attempt - reducing max_tokens to {current_max_tokens}")
-                start_time = time.time()
-                response = self.client.chat.completions.create(
-                    model=self.model,
-                    messages=[{"role": "user", "content": prompt}],
-                    max_tokens=current_max_tokens,
-                    temperature=0.7,
-                    top_p=0.9,
-                )
-                request_time = time.time() - start_time
-                logger.info(f"⏱️  API request completed in {request_time:.2f}s")
-                if response and response.choices:
-                    content = response.choices[0].message.content
-                    if content:
-                        logger.info(f"✅ Successfully generated response")
-                        logger.info(f"📏 Response length: {len(content)} characters")
-                        # Check token usage
-                        if hasattr(response, 'usage') and response.usage:
-                            logger.info(f"📊 Token usage: {response.usage}")
-                            if response.usage.completion_tokens >= current_max_tokens * 0.95:
-                                logger.warning(f"⚠️  Response may be truncated")
-                                content += "\n\n[Note: Response may be truncated due to token limits.]"
-                        response_preview = content[:400] + "..." if len(content) > 400 else content
-                        logger.info(f"📤 RESPONSE PREVIEW: {response_preview}")
-                        logger.info("=" * 80)
-                        return content
-                    else:
-                        logger.error("❌ Received empty response")
-                        if retry_count < max_retries:
-                            retry_count += 1
-                            continue
-                        return "❌ Received empty response from OpenAI."
-                else:
-                    logger.error("❌ Invalid response format")
-                    if retry_count < max_retries:
-                        retry_count += 1
-                        continue
-                    return "❌ Invalid response format from OpenAI."
-            except Exception as e:
-                error_msg = str(e)
-                logger.error(f"❌ API call failed (attempt {retry_count + 1}): {error_msg}")
-                # Handle rate limits
-                if "rate_limit" in error_msg.lower() or "429" in error_msg:
-                    wait_time = 2 ** retry_count
-                    logger.info(f"⏳ Rate limit hit, waiting {wait_time}s...")
-                    time.sleep(wait_time)
-                retry_count += 1
-                if retry_count > max_retries:
-                    final_error = f"❌ Error: Failed after {max_retries + 1} attempts. Final error: {error_msg}"
-                    logger.error(final_error)
-                    logger.info("=" * 80)
-                    return final_error
-                time.sleep(2 * retry_count)
-def create_llm(api_key: str, provider: Optional[str] = None) -> 'OpenRouterLLM | OpenAILLM':
-    """
-    Factory function to create the appropriate LLM client based on API key or provider.
-    Args:
-        api_key: The API key to use
-        provider: Optional provider name ('openrouter' or 'openai'). If not provided, will auto-detect.
-    Returns:
-        An instance of OpenRouterLLM or OpenAILLM
-    """
-    # Auto-detect provider if not specified
-    if provider is None:
-        provider = detect_provider_from_key(api_key)
-        logger.info(f"🔍 Auto-detected provider: {provider}")
-    provider = provider.lower()
-    if provider == "openai":
-        logger.info("🎯 Creating OpenAI LLM client")
-        return OpenAILLM(
-            api_key=api_key,
-            base_url=settings.OPENAI_URL,
-            model=settings.OPENAI_MODEL
-        )
-    elif provider == "openrouter":
-        logger.info("🎯 Creating OpenRouter LLM client")
-        return OpenRouterLLM(
-            api_key=api_key,
-            base_url=settings.OPENROUTER_URL,
-            model=settings.OPENROUTER_MODEL
-        )
-    else:
-        logger.warning(f"⚠️  Unknown provider '{provider}', defaulting to OpenRouter")
-        return OpenRouterLLM(
-            api_key=api_key,
-            base_url=settings.OPENROUTER_URL,
-            model=settings.OPENROUTER_MODEL
-        )
-# Initialize the generation model with default settings
-logger.info("🚀 Creating default LLM instance...")
 try:
-    # Use OpenRouter by default if API key is available
-    if settings.OPENROUTER_API_KEY:
-        generation_model = OpenRouterLLM(
-            api_key=settings.OPENROUTER_API_KEY,
-            base_url=settings.OPENROUTER_URL,
-            model=settings.OPENROUTER_MODEL
-        )
-    elif settings.OPENAI_API_KEY:
-        generation_model = OpenAILLM(
-            api_key=settings.OPENAI_API_KEY,
-            base_url=settings.OPENAI_URL,
-            model=settings.OPENAI_MODEL
-        )
-    else:
-        raise ValueError("No API key configured")
     if generation_model.client_ready:
-        logger.info("✅ RAG setup completed successfully - LLM client is ready")
     else:
-        logger.error("❌ RAG setup completed but LLM client is not ready")
 except Exception as e:
-    logger.error(f"❌ Error creating LLM: {e}")
     # Create a dummy model for graceful degradation
     class DummyLLM:
-        def generate_content(self, prompt: str, max_tokens: int = 2000) -> str:
             return f"❌ AI model is not available. Initialization error: {str(e)}"
     generation_model = DummyLLM()
     logger.warning("⚠️  Using dummy LLM due to initialization failure")
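
Review note: one caveat with the `DummyLLM` fallback above. Python unbinds the `except ... as e` name when the handler exits, so a method that reads `e` later would likely raise `NameError` instead of returning the message. A minimal sketch of a variant that copies the message first follows; `build_fallback`, `failing_init`, and `init_error` are hypothetical names, not part of this PR.

```python
def build_fallback(init):
    """Run init(); on failure return a stub whose error message outlives the handler."""
    try:
        return init()
    except Exception as e:
        # Copy the text now: the name `e` is deleted when this except block ends.
        init_error = str(e)

        class DummyLLM:
            def generate_content(self, prompt: str, max_tokens: int = 2000) -> str:
                # Reads the captured string, not the exception variable.
                return f"AI model is not available. Initialization error: {init_error}"

        return DummyLLM()


def failing_init():
    """Hypothetical initializer that always fails, to exercise the fallback path."""
    raise ValueError("No API key configured")


model = build_fallback(failing_init)
print(model.generate_content("hello"))
```

The stub keeps the same `generate_content(prompt, max_tokens)` signature as the real client, so callers need no special case.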

 import logging
 import requests
 import json
+from config import settings  # Fixed: removed 'app.' prefix
 import time
 import os
 import numpy as np
 from sklearn.feature_extraction.text import TfidfVectorizer
+from typing import List
 import hashlib
 import re
 # Disable ChromaDB telemetry to avoid errors
 os.environ["ANONYMIZED_TELEMETRY"] = "False"
 logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
 logger = logging.getLogger("rag-setup")
 # Custom TF-IDF based embedding function
 class TFIDFEmbeddingFunction:
     def __init__(self, max_features=384):
         )
         self.is_fitted = False
         self.max_features = max_features
     def _preprocess_text(self, text: str) -> str:
         """Clean and preprocess text."""
         text = re.sub(r'\s+', ' ', text)
         text = re.sub(r'[^\w\s]', ' ', text)
         return text.strip().lower()
     def __call__(self, input: List[str]) -> List[List[float]]:
         """Generate TF-IDF based embeddings."""
         try:
             logger.info(f"🔢 Generating embeddings for {len(input)} texts")
             processed_texts = [self._preprocess_text(text) for text in input]
             if not self.is_fitted and processed_texts:
                 # Fit the vectorizer on the input texts
                 self.vectorizer.fit(processed_texts)
                 self.is_fitted = True
                 logger.info("✅ TF-IDF vectorizer fitted on input texts")
             if not self.is_fitted:
                 # Return simple fallback if no data to fit on
                 logger.warning("⚠️  Using fallback embeddings - vectorizer not fitted")
                 return self._fallback_embeddings(input)
             # Transform texts to vectors
             tfidf_matrix = self.vectorizer.transform(processed_texts)
             embeddings = tfidf_matrix.toarray()
             # Ensure consistent dimensions
             result_embeddings = []
             for embedding in embeddings:
                     result_embeddings.append(padded.tolist())
                 else:
                     result_embeddings.append(embedding[:self.max_features].tolist())
             logger.info(f"✅ Generated {len(result_embeddings)} embeddings of dimension {self.max_features}")
             return result_embeddings
         except Exception as e:
             logger.error(f"❌ Error generating TF-IDF embeddings: {e}")
             return self._fallback_embeddings(input)
     def _fallback_embeddings(self, input: List[str]) -> List[List[float]]:
         """Simple fallback embedding method."""
         logger.info(f"🔧 Using fallback embeddings for {len(input)} texts")
         for text in input:
             text_hash = hashlib.md5(text.encode()).hexdigest()
             embedding = []
             # Convert hash to numbers
             for i in range(0, min(len(text_hash), 32), 2):
+                hex_pair = text_hash[i:i+2]
                 embedding.append(int(hex_pair, 16) / 255.0)
             # Add text features
             embedding.extend([
                 len(text) / 1000.0,
                 len(text.split()) / 100.0,
                 text.count('.') / 10.0,
             ])
             # Pad to desired size
             while len(embedding) < self.max_features:
                 embedding.extend(embedding[:min(len(embedding), self.max_features - len(embedding))])
             embeddings.append(embedding[:self.max_features])
         return embeddings
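
Review note: the hash-based fallback above can be read as a standalone function. The following is a minimal sketch, not the repository's code: the name `fallback_embedding` and the `dim` parameter are hypothetical, and the sketch builds its own result list because the excerpt does not show where `embeddings` is initialized. It shows that the vector is 16 scaled MD5 bytes plus 3 length features, tiled until it reaches the target dimension.

```python
import hashlib
from typing import List


def fallback_embedding(text: str, dim: int = 384) -> List[float]:
    """Hypothetical helper mirroring the fallback logic in the diff above."""
    digest = hashlib.md5(text.encode()).hexdigest()  # 32 hex characters
    # Each pair of hex characters is one byte, scaled into [0, 1].
    vec = [int(digest[i:i + 2], 16) / 255.0 for i in range(0, 32, 2)]
    # Three coarse text features, using the same divisors as the diff.
    vec.extend([
        len(text) / 1000.0,
        len(text.split()) / 100.0,
        text.count(".") / 10.0,
    ])
    # Tile the 19 base values until the vector reaches the target dimension.
    while len(vec) < dim:
        vec.extend(vec[:dim - len(vec)])
    return vec[:dim]


vector = fallback_embedding("Hello world.")
print(len(vector))
```

Vectors built this way are deterministic but carry no semantic similarity, so retrieval quality on this path should be expected to degrade sharply compared with the fitted TF-IDF path.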
 # Simple ChromaDB setup - use in-memory storage for Hugging Face
 logger.info("🔧 Initializing ChromaDB with in-memory storage for Hugging Face compatibility")
     # Use in-memory client to avoid permission issues
     client = chromadb.Client()
     embedding_function = TFIDFEmbeddingFunction()
     collection = client.get_or_create_collection(
         name="context_aware_collection",
         embedding_function=embedding_function
     )
     logger.info("✅ ChromaDB collection initialized successfully with in-memory storage")
 except Exception as e:
     logger.error(f"❌ Error initializing ChromaDB: {e}")
     raise RuntimeError(f"Failed to initialize ChromaDB: {e}")
         self.base_url = base_url
         self.model = model
         self.api_url = f"{base_url.rstrip('/')}/chat/completions"
         logger.info("=" * 60)
         logger.info("🚀 INITIALIZING OPENROUTER LLM")
         logger.info("=" * 60)
         logger.info(f"🔑 API Key present: {'Yes' if api_key else 'No'}")
         logger.info(f"📏 API Key length: {len(api_key) if api_key else 0}")
         logger.info(f"🌐 API URL: {self.api_url}")
         if not api_key or not api_key.strip():
             logger.error("❌ OpenRouter API key is missing or empty")
             self.client_ready = False
             return
         # Test the connection with minimal tokens
         try:
             logger.info("🔍 Testing OpenRouter connection...")
         except Exception as e:
             logger.error(f"❌ OpenRouter connection test failed: {e}")
             self.client_ready = False
         logger.info("=" * 60)
     def _make_api_request(self, prompt: str, max_tokens: int = 2000, timeout: int = None) -> dict:
         """Make a direct HTTP request to OpenRouter API with configurable token limits."""
         # Calculate dynamic timeout based on max_tokens and prompt length
         if timeout is None:
             base_timeout = 120
             token_timeout = max(20, max_tokens // 100)  # ~1 second per 100 tokens
             prompt_timeout = max(10, len(prompt) // 1000)  # ~1 second per 1000 characters
             timeout = min(base_timeout + token_timeout + prompt_timeout, 600)  # Cap at 10 minutes (600 s)
         logger.info(f"🌐 Making API request to OpenRouter")
         logger.info(f"📏 Prompt length: {len(prompt)} characters")
         logger.info(f"🎯 Max tokens: {max_tokens}")
         logger.info(f"⏱️  Timeout: {timeout}s")
         headers = {
             "Authorization": f"Bearer {self.api_key}",
             "Content-Type": "application/json",
             "HTTP-Referer": "https://github.com/Ab-Romia/ContextIQ-RAG",
             "X-Title": "Context Aware AI"
         }
         # Optimize payload for longer responses
         payload = {
             "model": self.model,
             "presence_penalty": 0.1,  # Slight penalty for repetition
             "frequency_penalty": 0.1,  # Slight penalty for frequency
         }
         # Log the request payload (without sensitive data)
         safe_payload = payload.copy()
         safe_payload["messages"] = [{"role": "user", "content": f"[CONTENT: {len(prompt)} chars]"}]
         logger.info(f"📤 Request payload: {json.dumps(safe_payload, indent=2)}")
         try:
             start_time = time.time()
             with requests.Session() as session:
                 response = session.post(
                     self.api_url,
                     json=payload,
                     timeout=timeout
                 )
             request_time = time.time() - start_time
             logger.info(f"⏱️  API request completed in {request_time:.2f}s")
             logger.info(f"📊 Response status: {response.status_code}")
             if response.status_code == 200:
                 response_data = response.json()
                 logger.info("✅ API request successful")
                 # Log response details
                 if "choices" in response_data and response_data["choices"]:
                     content = response_data["choices"][0]["message"]["content"]
                     logger.info(f"📝 Response content length: {len(content)} characters")
                     # Check if response was truncated
                     if "usage" in response_data:
                         usage = response_data["usage"]
                         completion_tokens = usage.get("completion_tokens", 0)
                         logger.info(f"📊 Token usage: {usage}")
                         if completion_tokens >= max_tokens * 0.95:  # If we used 95% of max tokens
+                            logger.warning(f"⚠️  Response may be truncated (used {completion_tokens}/{max_tokens} tokens)")
                     content_preview = content[:300] + "..." if len(content) > 300 else content
                     logger.info(f"📄 Response preview: {content_preview}")
                 return response_data
             else:
                 logger.error(f"❌ API request failed with status {response.status_code}")
                 logger.error(f"📄 Response text: {response.text}")
                 return {"error": f"HTTP {response.status_code}: {response.text}"}
         except requests.exceptions.Timeout:
             logger.error(f"⏱️  API request timed out after {timeout}s")
             return {"error": f"Request timed out after {timeout}s. Try reducing the context length or max tokens."}
         logger.info(f"🎯 Requested max tokens: {max_tokens}")
         logger.info(f"🔧 Client status: {'Ready' if self.client_ready else 'Not ready'}")
         logger.info(f"🔑 API key status: {'Present' if self.api_key else 'Missing'}")
         # Dynamic prompt optimization based on max_tokens
         original_length = len(prompt)
         max_prompt_length = 12000 if max_tokens > 3000 else 8000  # Allow longer prompts for longer responses
         if len(prompt) > max_prompt_length:
             logger.warning(f"⚠️  Prompt is quite long ({original_length} chars), truncating for better performance")
             # Intelligent truncation that preserves structure
                 if len(parts) == 2:
                     context_part = parts[0]
                     question_part = "Question:" + parts[1]
                     # Keep the question and instructions, truncate context if needed
                     available_for_context = max_prompt_length - len(question_part) - 500  # Reserve space
                     if len(context_part) > available_for_context:
+                        context_part = context_part[:available_for_context] + "\n\n[... content truncated for performance ...]"
                     prompt = context_part + question_part
                     logger.info(f"📏 Prompt intelligently truncated from {original_length} to {len(prompt)} characters")
             else:
                 prompt = prompt[:max_prompt_length] + "\n\n[... content truncated for performance ...]"
                 logger.info(f"📏 Prompt truncated from {original_length} to {len(prompt)} characters")
         # Log prompt preview
         prompt_preview = prompt[:400] + "..." if len(prompt) > 400 else prompt
         logger.info(f"📝 PROMPT PREVIEW:")
         logger.info(f"   {prompt_preview}")
         logger.info("-" * 60)
         # Check API key first
         if not self.api_key or not self.api_key.strip():
             error_msg = "❌ OpenRouter API key is not configured. Please set the OPENROUTER_API_KEY environment variable."
             logger.error(error_msg)
             return error_msg
         # Check client readiness
         if not self.client_ready:
             error_msg = "❌ OpenRouter client is not ready. Please check your API key and connection."
             logger.error(error_msg)
             return error_msg
         max_retries = 3
         retry_count = 0
         base_wait_time = 2
         while retry_count <= max_retries:
             try:
                 logger.info(f"🔄 API call attempt {retry_count + 1}/{max_retries + 1}")
                 # Adjust parameters based on retry attempt
                 current_max_tokens = max_tokens
                 timeout = None  # Let _make_api_request calculate dynamic timeout
                 if retry_count > 0:
                     # Reduce max_tokens on retries for faster responses
                     current_max_tokens = max(1000, max_tokens - (retry_count * 500))
                     logger.info(f"🔧 Retry attempt - reducing max_tokens to {current_max_tokens}")
                 response = self._make_api_request(prompt, max_tokens=current_max_tokens, timeout=timeout)
                 if "error" in response:
                     error_msg = response["error"]
                     # Handle specific error types
                     if "timeout" in error_msg.lower() or "timed out" in error_msg.lower() or "408" in error_msg:
                         logger.warning(f"⏱️  Timeout error on attempt {retry_count + 1}")
                     elif "401" in error_msg or "403" in error_msg:
                         logger.error(f"🔑 Authentication error: {error_msg}")
                         return f"❌ Authentication error: {error_msg}"
                     raise Exception(error_msg)
                 if "choices" in response and len(response["choices"]) > 0:
                     content = response["choices"][0]["message"]["content"]
                     if content:
                         logger.info(f"✅ Successfully generated response")
                         logger.info(f"📏 Response length: {len(content)} characters")
                         # Check if response seems complete
                         if "usage" in response:
                             usage = response["usage"]
                             completion_tokens = usage.get("completion_tokens", 0)
                             if completion_tokens >= current_max_tokens * 0.95:
+                                logger.warning(f"⚠️  Response may be incomplete (used {completion_tokens}/{current_max_tokens} tokens)")
                                 content += "\n\n[Note: Response may be truncated due to token limits. Consider asking for specific parts if needed.]"
                         response_preview = content[:400] + "..." if len(content) > 400 else content
                         logger.info(f"📤 RESPONSE PREVIEW:")
                         logger.info(f"   {response_preview}")
                         logger.info("=" * 80)
                         return content
                     else:
                         logger.error("❌ Received empty response from AI model")
                         retry_count += 1
                         continue
                     return "❌ Invalid response format from AI model."
             except Exception as e:
                 error_type = type(e).__name__
                 error_msg = str(e)
                 logger.error(f"❌ API call failed (attempt {retry_count + 1}): {error_type}: {error_msg}")
                 retry_count += 1
                 if retry_count > max_retries:
                     final_error = f"❌ Error: Failed to get response from AI model after {max_retries + 1} attempts. Final error: {error_msg}"
                     logger.error(final_error)
                     logger.info("=" * 80)
                     return final_error
                 wait_time = base_wait_time * retry_count + (retry_count * 0.5)
                 logger.info(f"⏳ Waiting {wait_time:.1f}s before retry...")
                 time.sleep(wait_time)
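
Review note: the dynamic timeout and the retry loop above reduce to simple arithmetic, which is easier to check in isolation. The sketch below assumes the constants visible in the diff (base timeout 120 s, cap 600 s, 3 retries, base wait 2 s); `dynamic_timeout` and `retry_plan` are hypothetical helper names that do not exist in the repository.

```python
from typing import List, Tuple


def dynamic_timeout(prompt_len: int, max_tokens: int) -> int:
    """Timeout in seconds, following the formula in _make_api_request."""
    base_timeout = 120
    token_timeout = max(20, max_tokens // 100)   # at least 20 s
    prompt_timeout = max(10, prompt_len // 1000)  # at least 10 s
    return min(base_timeout + token_timeout + prompt_timeout, 600)


def retry_plan(max_tokens: int, max_retries: int = 3,
               base_wait_time: float = 2) -> List[Tuple[int, int, float]]:
    """List (attempt number, max_tokens used, seconds waited beforehand)."""
    plan = []
    for retry_count in range(max_retries + 1):
        if retry_count == 0:
            tokens, wait = max_tokens, 0.0
        else:
            # Retries lower the token budget by 500 each, with a floor of 1000.
            tokens = max(1000, max_tokens - retry_count * 500)
            # Linear back-off: 2.5 s, 5.0 s, 7.5 s with the defaults.
            wait = base_wait_time * retry_count + retry_count * 0.5
        plan.append((retry_count + 1, tokens, wait))
    return plan


print(dynamic_timeout(prompt_len=8000, max_tokens=2000))
for attempt in retry_plan(2000):
    print(attempt)
```

One consequence of the `max(1000, ...)` floor: if a caller asks for fewer than 1000 tokens, every retry requests more tokens than the first attempt did, which may not be the intent.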
+# Initialize the generation model
+logger.info("🚀 Creating OpenRouter LLM instance...")
 try:
+    generation_model = OpenRouterLLM(
+        api_key=settings.OPENROUTER_API_KEY,
+        base_url=settings.OPENROUTER_URL,
+        model=settings.MODEL_NAME
+    )
     if generation_model.client_ready:
+        logger.info("✅ RAG setup completed successfully - OpenRouter client is ready")
     else:
+        logger.error("❌ RAG setup completed but OpenRouter client is not ready")
 except Exception as e:
+    logger.error(f"❌ Error creating OpenRouter LLM: {e}")
     # Create a dummy model for graceful degradation
     class DummyLLM:
+        def generate_content(self, prompt: str) -> str:
             return f"❌ AI model is not available. Initialization error: {str(e)}"
     generation_model = DummyLLM()
     logger.warning("⚠️  Using dummy LLM due to initialization failure")
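
Review note: in Python 3 the name bound by `except Exception as e` is deleted when the `except` block ends, so a method that reads `e` later, as `DummyLLM.generate_content` does above, raises `NameError` when it is finally called. A minimal sketch of one way around this, assuming a hypothetical factory `build_fallback_model` and a simulated failure in place of the real constructor, is to copy the message into a separate variable first. The `max_tokens` parameter is kept here only as an assumption, on the guess that callers may still pass it.

```python
def build_fallback_model():
    """Return a stand-in model that remembers why initialization failed."""
    try:
        # Stand-in for the real constructor call, which is not reproduced here.
        raise RuntimeError("simulated init failure")
    except Exception as e:
        init_error = str(e)  # copy the message while 'e' is still bound

        class DummyLLM:
            client_ready = False

            def generate_content(self, prompt: str, max_tokens: int = 2000) -> str:
                # Reads the captured string, not the deleted exception name.
                return f"AI model is not available. Initialization error: {init_error}"

        return DummyLLM()


model = build_fallback_model()
print(model.generate_content("hello"))
```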

app/schemas.py CHANGED

@@ -1,12 +1,5 @@
 from pydantic import BaseModel, Field
-from typing import Optional, List
-class ConversationMessage(BaseModel):
-    """Schema for a single message in the conversation history"""
-    role: str = Field(..., description="Role of the message sender ('user' or 'assistant')")
-    content: str = Field(..., description="Content of the message")
 class DocumentRequest(BaseModel):
     """
@@ -18,7 +11,6 @@ class DocumentRequest(BaseModel):
         description="The full document or text to be indexed."
     )
 class ChatRequest(BaseModel):
     """
     Schema for the request to generate a response.
@@ -28,10 +20,6 @@ class ChatRequest(BaseModel):
         min_length=2,
         description="The user's question to be answered based on the indexed context."
     )
-    conversation_history: Optional[List[ConversationMessage]] = Field(
-        default=None,
-        description="Optional conversation history for context-aware responses"
-    )
 class TaskRequest(BaseModel):
     """
@@ -48,11 +36,7 @@ class ApiKeyRequest(BaseModel):
     api_key: str = Field(
         ...,
         min_length=10,
-        description="The API key to test (OpenRouter or OpenAI)."
-    )
-    provider: Optional[str] = Field(
-        None,
-        description="The provider for the API key ('openrouter' or 'openai'). Auto-detected if not provided."
     )
 class ChatResponse(BaseModel):

 from pydantic import BaseModel, Field
+from typing import Optional
 class DocumentRequest(BaseModel):
     """
         description="The full document or text to be indexed."
     )
 class ChatRequest(BaseModel):
     """
     Schema for the request to generate a response.
         min_length=2,
         description="The user's question to be answered based on the indexed context."
     )
 class TaskRequest(BaseModel):
     """
     api_key: str = Field(
         ...,
         min_length=10,
+        description="The OpenRouter API key to test."
     )
 class ChatResponse(BaseModel):

app/services.py CHANGED

@@ -6,13 +6,12 @@ import time
 import rag_setup
 from schemas import ChatRequest, DocumentRequest, TaskRequest
 from typing import Optional, Tuple
-from config import settings, detect_provider_from_key  # Fixed: removed 'app.' prefix
 from fastapi import UploadFile, HTTPException
 import json
 import xml.etree.ElementTree as ET
 from striprtf.striprtf import rtf_to_text
 import markdown
 try:
     import fitz  # PyMuPDF
 except ImportError:
@@ -22,31 +21,28 @@ except ImportError:
 try:
     import docx  # python-docx for Word documents
 except ImportError:
-    logging.error(
-        "python-docx is not installed. Word document processing will not work. Please run 'pip install python-docx'")
     docx = None
 try:
     from pptx import Presentation  # python-pptx for PowerPoint
 except ImportError:
-    logging.error(
-        "python-pptx is not installed. PowerPoint processing will not work. Please run 'pip install python-pptx'")
     Presentation = None
 try:
     import pandas as pd  # For Excel and CSV files
 except ImportError:
-    logging.error(
-        "pandas is not installed. Excel/CSV processing will not work. Please run 'pip install pandas openpyxl'")
     pd = None
 try:
     from bs4 import BeautifulSoup  # For HTML parsing
 except ImportError:
-    logging.error(
-        "BeautifulSoup is not installed. HTML processing will not work. Please run 'pip install beautifulsoup4'")
     BeautifulSoup = None
 logging.basicConfig(
     level=logging.INFO,
     format='%(asctime)s [%(levelname)s] %(message)s',
@@ -59,16 +55,18 @@ logger = logging.getLogger("rag-service")
 _response_cache = {}
 CACHE_EXPIRATION_SECONDS = 600  # 10 minutes
-def create_llm_instance(api_key: str, provider: Optional[str] = None):
-    """Create a new LLM instance with the provided API key (OpenRouter or OpenAI)."""
-    return rag_setup.create_llm(api_key=api_key, provider=provider)
-async def test_api_key(api_key: str, provider: Optional[str] = None) -> dict:
-    """Test if the provided API key is valid (OpenRouter or OpenAI)."""
     logger.info(f"🔍 Testing API key: {api_key[:10]}...")
     try:
         # Validate API key format first
         if not api_key or not api_key.strip():
@@ -78,37 +76,15 @@ async def test_api_key(api_key: str, provider: Optional[str] = None) -> dict:
                 "message": "API key cannot be empty",
                 "model_info": None
             }
-        # Auto-detect provider if not specified
-        if provider is None:
-            provider = detect_provider_from_key(api_key)
-            logger.info(f"🔍 Auto-detected provider: {provider}")
-        # Validate based on provider
-        if provider == "openrouter":
-            if not api_key.startswith('sk-or-'):
-                logger.error("❌ API key has incorrect format for OpenRouter")
-                return {
-                    "valid": False,
-                    "message": "OpenRouter API keys should start with 'sk-or-'",
-                    "model_info": None
-                }
-        elif provider == "openai":
-            if not api_key.startswith('sk-'):
-                logger.error("❌ API key has incorrect format for OpenAI")
-                return {
-                    "valid": False,
-                    "message": "OpenAI API keys should start with 'sk-'",
-                    "model_info": None
-                }
-        else:
-            logger.error("❌ Unknown provider")
             return {
                 "valid": False,
-                "message": f"Unknown provider: {provider}. Please use OpenRouter or OpenAI.",
                 "model_info": None
             }
         if len(api_key) < 40:
             logger.error("❌ API key is too short")
             return {
@@ -116,51 +92,18 @@ async def test_api_key(api_key: str, provider: Optional[str] = None) -> dict:
                 "message": "API key appears to be too short",
                 "model_info": None
             }
         # Create a temporary LLM instance
-        test_llm = create_llm_instance(api_key, provider)
         # Test with a minimal prompt to avoid quota usage
-        if provider == "openai":
-            # OpenAI uses the SDK, so we test differently
-            try:
-                test_content = test_llm.generate_content("Hi", max_tokens=5)
-                if test_content and not test_content.startswith("❌"):
-                    logger.info("✅ OpenAI API key test successful")
-                    return {
-                        "valid": True,
-                        "message": "OpenAI API key is valid and working!",
-                        "model_info": {"model": settings.OPENAI_MODEL, "provider": "openai"}
-                    }
-                else:
-                    return {
-                        "valid": False,
-                        "message": test_content or "API key test failed",
-                        "model_info": None
-                    }
-            except Exception as e:
-                error_msg = str(e)
-                if "401" in error_msg or "Incorrect API key" in error_msg:
-                    return {
-                        "valid": False,
-                        "message": "Invalid OpenAI API key: Authentication failed",
-                        "model_info": None
-                    }
-                else:
-                    return {
-                        "valid": False,
-                        "message": f"OpenAI API key test failed: {error_msg}",
-                        "model_info": None
-                    }
-        # For OpenRouter, use the existing logic
         test_response = test_llm._make_api_request("Hi", max_tokens=1)
         # Check for explicit errors first
         if "error" in test_response:
             error_msg = test_response["error"]
             logger.error(f"❌ API key test failed: {error_msg}")
             # Parse specific error types
             if "401" in str(error_msg) or "403" in str(error_msg) or "Unauthorized" in str(error_msg):
                 return {
@@ -188,7 +131,7 @@ async def test_api_key(api_key: str, provider: Optional[str] = None) -> dict:
                     "message": f"API key test failed: {error_msg}",
                     "model_info": None
                 }
         # Check for successful response with proper structure
         if "choices" in test_response and test_response["choices"]:
             choice = test_response["choices"][0]
@@ -204,7 +147,7 @@ async def test_api_key(api_key: str, provider: Optional[str] = None) -> dict:
                     "message": "API key is valid and working!",
                     "model_info": model_info
                 }
         # If we get here, the response format is unexpected
         logger.error(f"❌ API key test failed: Unexpected response format - {test_response}")
         return {
@@ -212,11 +155,11 @@ async def test_api_key(api_key: str, provider: Optional[str] = None) -> dict:
             "message": "API key test failed: Unexpected response format from OpenRouter",
             "model_info": None
         }
     except Exception as e:
         logger.error(f"❌ API key test failed with exception: {str(e)}")
         error_msg = str(e)
         # Parse common error patterns
         if "401" in error_msg or "403" in error_msg or "Unauthorized" in error_msg:
             return {
@@ -243,7 +186,6 @@ async def test_api_key(api_key: str, provider: Optional[str] = None) -> dict:
                 "model_info": None
             }
 async def process_and_index_file(file: UploadFile) -> Tuple[int, str]:
     """
     Processes an uploaded file, extracts text, calls the indexing function,
@@ -251,7 +193,7 @@ async def process_and_index_file(file: UploadFile) -> Tuple[int, str]:
     Supports: .txt, .pdf, .docx, .pptx, .xlsx, .csv, .json, .xml, .html, .md, .rtf
     """
     logger.info(f"📄 Processing file '{file.filename}' with content type '{file.content_type}'")
     # Read file content
     file_content = await file.read()
     text = ""
@@ -260,43 +202,42 @@ async def process_and_index_file(file: UploadFile) -> Tuple[int, str]:
     try:
         if file_extension == "txt":
             text = await _process_txt_file(file_content)
         elif file_extension == "pdf":
             text = await _process_pdf_file(file_content)
         elif file_extension == "docx":
             text = await _process_docx_file(file_content)
         elif file_extension in ["ppt", "pptx"]:
             text = await _process_pptx_file(file_content)
         elif file_extension in ["xls", "xlsx"]:
             text = await _process_excel_file(file_content, file.filename)
         elif file_extension == "csv":
             text = await _process_csv_file(file_content)
         elif file_extension == "json":
             text = await _process_json_file(file_content)
         elif file_extension == "xml":
             text = await _process_xml_file(file_content)
         elif file_extension in ["html", "htm"]:
             text = await _process_html_file(file_content)
         elif file_extension in ["md", "markdown"]:
             text = await _process_markdown_file(file_content)
         elif file_extension == "rtf":
             text = await _process_rtf_file(file_content)
         else:
-            supported_extensions = ['.txt', '.pdf', '.docx', '.pptx', '.xlsx', '.csv', '.json', '.xml', '.html', '.md',
-                                    '.rtf']
             logger.error(f"❌ Unsupported file type: {file.filename}")
             raise HTTPException(
-                status_code=400,
                 detail=f"Unsupported file type. Please upload one of: {', '.join(supported_extensions)}"
             )
@@ -309,24 +250,23 @@ async def process_and_index_file(file: UploadFile) -> Tuple[int, str]:
     # Validate extracted text
     if not text or not text.strip():
         logger.error("❌ Extracted text is empty or whitespace only")
-        raise HTTPException(status_code=400,
-                            detail="Extracted text is empty. The file might be empty, corrupted, or unreadable.")
     # Clean up the text
     text = text.strip()
     # Log processing stats
     word_count = len(text.split())
     logger.info(f"📊 Text processing complete: {len(text)} characters, {word_count} words")
     # Index the extracted text using existing logic
     try:
         doc_request = DocumentRequest(context=text)
         docs_added = index_document(doc_request)
         logger.info(f"✅ Successfully indexed {docs_added} document chunks from file")
         return docs_added, text
     except Exception as e:
         logger.error(f"❌ Failed to index extracted text: {e}")
         raise HTTPException(status_code=500, detail=f"Failed to index extracted text: {str(e)}")
@@ -350,14 +290,13 @@ async def _process_txt_file(file_content: bytes) -> str:
                     continue
             else:
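                 # UnicodeDecodeError takes five arguments: encoding, object, start, end, reason
                 raise UnicodeDecodeError("unknown", file_content, 0, len(file_content), "Unable to decode file with any common encoding")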
         logger.info(f"✅ Extracted {len(text)} characters from .txt file")
         return text
     except UnicodeDecodeError as e:
         logger.error(f"❌ Could not decode .txt file: {e}")
-        raise HTTPException(status_code=400,
-                            detail="Could not decode .txt file. Please ensure it uses UTF-8, Latin-1, or CP1252 encoding.")
 async def _process_pdf_file(file_content: bytes) -> str:
@@ -365,34 +304,34 @@ async def _process_pdf_file(file_content: bytes) -> str:
     if fitz is None:
         logger.error("❌ PyMuPDF not available for PDF processing")
         raise HTTPException(status_code=501, detail="PDF processing is not available. PyMuPDF is not installed.")
     logger.info("📖 Opening PDF document...")
     doc = fitz.open(stream=file_content, filetype="pdf")
     try:
         text_parts = []
         page_count = len(doc)
         logger.info(f"📑 PDF has {page_count} pages")
         for page_num in range(page_count):
             try:
                 page = doc[page_num]
                 page_text = page.get_text()
                 if page_text and page_text.strip():
                     text_parts.append(f"--- Page {page_num + 1} ---\n{page_text.strip()}")
                     logger.info(f"📄 Extracted text from page {page_num + 1}: {len(page_text)} characters")
                 else:
                     logger.info(f"📄 Page {page_num + 1} is empty or contains no extractable text")
             except Exception as page_error:
                 logger.warning(f"⚠️  Could not extract text from page {page_num + 1}: {page_error}")
                 continue
         text = "\n\n".join(text_parts)
         logger.info(f"✅ Extracted text from {len(text_parts)} pages of the PDF file ({len(text)} characters)")
         return text
     finally:
         doc.close()
         logger.info("📕 PDF document closed successfully")
@@ -401,17 +340,16 @@ async def _process_pdf_file(file_content: bytes) -> str:
 async def _process_docx_file(file_content: bytes) -> str:
     """Process .docx files using python-docx."""
     if docx is None:
-        raise HTTPException(status_code=501,
-                            detail="Word document processing is not available. python-docx is not installed.")
     from io import BytesIO
     doc = docx.Document(BytesIO(file_content))
     text_parts = []
     for paragraph in doc.paragraphs:
         if paragraph.text.strip():
             text_parts.append(paragraph.text.strip())
     # Also extract text from tables
     for table in doc.tables:
         for row in table.rows:
@@ -421,7 +359,7 @@ async def _process_docx_file(file_content: bytes) -> str:
                     row_text.append(cell.text.strip())
             if row_text:
                 text_parts.append(" | ".join(row_text))
     text = "\n\n".join(text_parts)
     logger.info(f"✅ Extracted {len(text)} characters from Word document")
     return text
@@ -430,23 +368,22 @@ async def _process_docx_file(file_content: bytes) -> str:
 async def _process_pptx_file(file_content: bytes) -> str:
     """Process .pptx files using python-pptx."""
     if Presentation is None:
-        raise HTTPException(status_code=501,
-                            detail="PowerPoint processing is not available. python-pptx is not installed.")
     from io import BytesIO
     prs = Presentation(BytesIO(file_content))
     text_parts = []
     for slide_num, slide in enumerate(prs.slides, 1):
         slide_text = [f"--- Slide {slide_num} ---"]
         for shape in slide.shapes:
             if hasattr(shape, "text") and shape.text.strip():
                 slide_text.append(shape.text.strip())
         if len(slide_text) > 1:  # More than just the slide header
             text_parts.append("\n".join(slide_text))
     text = "\n\n".join(text_parts)
     logger.info(f"✅ Extracted text from {len(prs.slides)} PowerPoint slides ({len(text)} characters)")
     return text
@@ -456,38 +393,37 @@ async def _process_excel_file(file_content: bytes, filename: str) -> str:
     """Process .xlsx/.xls files using pandas."""
     if pd is None:
         raise HTTPException(status_code=501, detail="Excel processing is not available. pandas is not installed.")
     from io import BytesIO
     try:
         # Read all sheets
         excel_file = pd.ExcelFile(BytesIO(file_content))
         text_parts = [f"Excel File: {filename}"]
         for sheet_name in excel_file.sheet_names:
             df = pd.read_excel(excel_file, sheet_name=sheet_name)
             if not df.empty:
                 text_parts.append(f"\n--- Sheet: {sheet_name} ---")
                 # Convert DataFrame to readable text
                 # Include column headers
                 text_parts.append("Columns: " + " | ".join(str(col) for col in df.columns))
                 # Add row data (limit to first 100 rows to avoid huge files)
                 for idx, row in df.head(100).iterrows():
                     row_text = " | ".join(str(val) for val in row.values if pd.notna(val))
                     if row_text.strip():
                         text_parts.append(row_text)
                 if len(df) > 100:
                     text_parts.append(f"... and {len(df) - 100} more rows")
         text = "\n".join(text_parts)
-        logger.info(
-            f"✅ Extracted data from Excel file with {len(excel_file.sheet_names)} sheets ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process Excel file: {str(e)}")
@@ -496,9 +432,9 @@ async def _process_csv_file(file_content: bytes) -> str:
     """Process .csv files using pandas."""
     if pd is None:
         raise HTTPException(status_code=501, detail="CSV processing is not available. pandas is not installed.")
     from io import StringIO
     try:
         # Try different encodings for CSV
         for encoding in ['utf-8', 'latin-1', 'cp1252']:
@@ -510,26 +446,26 @@ async def _process_csv_file(file_content: bytes) -> str:
                 continue
         else:
             raise ValueError("Could not decode CSV file with any common encoding")
         if df.empty:
             raise ValueError("CSV file is empty")
         text_parts = ["CSV Data:"]
         text_parts.append("Columns: " + " | ".join(str(col) for col in df.columns))
         # Add row data (limit to first 200 rows)
         for idx, row in df.head(200).iterrows():
             row_text = " | ".join(str(val) for val in row.values if pd.notna(val))
             if row_text.strip():
                 text_parts.append(row_text)
         if len(df) > 200:
             text_parts.append(f"... and {len(df) - 200} more rows")
         text = "\n".join(text_parts)
         logger.info(f"✅ Extracted data from CSV file with {len(df)} rows ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process CSV file: {str(e)}")
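
Reviewer note on the encoding fallback in `_process_csv_file`: the loop body is elided by the hunk boundary, so what follows is only a minimal sketch of the same try-each-encoding pattern, not the author's code. It swaps pandas for the standard-library `csv` module so it runs on its own, and the names `decode_with_fallback` and `csv_to_text` are hypothetical. One point it illustrates: `latin-1` maps every byte value, so it never raises, and any encoding listed after it (here `cp1252` in the diff's order) is never reached. The sketch therefore puts `latin-1` last.

```python
import csv
from io import StringIO
from typing import Tuple

# Assumed order: latin-1 goes last because it accepts any byte sequence.
ENCODINGS = ("utf-8", "cp1252", "latin-1")


def decode_with_fallback(raw: bytes) -> Tuple[str, str]:
    """Return (text, encoding) for the first encoding that decodes cleanly."""
    for encoding in ENCODINGS:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Unreachable while latin-1 is in the list; kept for safety if the list changes.
    raise ValueError("Could not decode bytes with any common encoding")


def csv_to_text(raw: bytes, max_rows: int = 200) -> str:
    """Render CSV bytes as pipe-separated text, capped at max_rows data rows."""
    text, _ = decode_with_fallback(raw)
    rows = list(csv.reader(StringIO(text)))
    if not rows:
        raise ValueError("CSV file is empty")
    header, body = rows[0], rows[1:]
    lines = ["CSV Data:", "Columns: " + " | ".join(header)]
    for row in body[:max_rows]:
        cells = [cell for cell in row if cell != ""]  # skip empty cells
        if cells:
            lines.append(" | ".join(cells))
    if len(body) > max_rows:
        lines.append(f"... and {len(body) - max_rows} more rows")
    return "\n".join(lines)
```

Returning the encoding alongside the text makes it easy to log which fallback was actually used.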
@@ -539,12 +475,12 @@ async def _process_json_file(file_content: bytes) -> str:
     try:
         json_text = file_content.decode('utf-8')
         data = json.loads(json_text)
         # Convert JSON to readable text format
         def json_to_text(obj, indent=0):
             lines = []
             prefix = "  " * indent
             if isinstance(obj, dict):
                 for key, value in obj.items():
                     if isinstance(value, (dict, list)):
@@ -561,15 +497,15 @@ async def _process_json_file(file_content: bytes) -> str:
                         lines.append(f"{prefix}[{i}]: {item}")
             else:
                 lines.append(f"{prefix}{obj}")
             return lines
         text_lines = ["JSON Data:"] + json_to_text(data)
         text = "\n".join(text_lines)
         logger.info(f"✅ Extracted data from JSON file ({len(text)} characters)")
         return text
     except json.JSONDecodeError as e:
         raise HTTPException(status_code=400, detail=f"Invalid JSON file: {str(e)}")
     except Exception as e:
@@ -581,34 +517,34 @@ async def _process_xml_file(file_content: bytes) -> str:
     try:
         xml_text = file_content.decode('utf-8')
         root = ET.fromstring(xml_text)
         def xml_to_text(element, indent=0):
             lines = []
             prefix = "  " * indent
             # Add element name and attributes
             if element.attrib:
                 attrs = " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
                 lines.append(f"{prefix}{element.tag} ({attrs}):")
             else:
                 lines.append(f"{prefix}{element.tag}:")
             # Add text content
             if element.text and element.text.strip():
                 lines.append(f"{prefix}  {element.text.strip()}")
             # Add child elements
             for child in element:
                 lines.extend(xml_to_text(child, indent + 1))
             return lines
         text_lines = ["XML Data:"] + xml_to_text(root)
         text = "\n".join(text_lines)
         logger.info(f"✅ Extracted data from XML file ({len(text)} characters)")
         return text
     except ET.ParseError as e:
         raise HTTPException(status_code=400, detail=f"Invalid XML file: {str(e)}")
     except Exception as e:
@@ -619,26 +555,26 @@ async def _process_html_file(file_content: bytes) -> str:
     """Process .html files using BeautifulSoup."""
     if BeautifulSoup is None:
         raise HTTPException(status_code=501, detail="HTML processing is not available. BeautifulSoup is not installed.")
     try:
         html_text = file_content.decode('utf-8')
         soup = BeautifulSoup(html_text, 'html.parser')
         # Remove script and style elements
         for script in soup(["script", "style"]):
             script.decompose()
         # Get text content
         text = soup.get_text()
         # Clean up whitespace
         lines = (line.strip() for line in text.splitlines())
         chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
         text = '\n'.join(chunk for chunk in chunks if chunk)
         logger.info(f"✅ Extracted text from HTML file ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process HTML file: {str(e)}")
@@ -647,7 +583,7 @@ async def _process_markdown_file(file_content: bytes) -> str:
     """Process .md files."""
     try:
         md_text = file_content.decode('utf-8')
         # Convert markdown to HTML then to plain text for better readability
         html = markdown.markdown(md_text)
         if BeautifulSoup:
@@ -656,10 +592,10 @@ async def _process_markdown_file(file_content: bytes) -> str:
         else:
             # Fallback: use raw markdown
             text = md_text
         logger.info(f"✅ Extracted text from Markdown file ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process Markdown file: {str(e)}")
@@ -669,66 +605,18 @@ async def _process_rtf_file(file_content: bytes) -> str:
     try:
         rtf_text = file_content.decode('utf-8')
         text = rtf_to_text(rtf_text)
         logger.info(f"✅ Extracted text from RTF file ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process RTF file: {str(e)}")
-def _create_overlapping_chunks(text: str, chunk_size: int = None, overlap: int = None) -> list:
-    """
-    Create overlapping text chunks with smart boundary detection.
-    """
-    if chunk_size is None:
-        chunk_size = settings.CHUNK_SIZE
-    if overlap is None:
-        overlap = settings.CHUNK_OVERLAP
-    chunks = []
-    start = 0
-    text_length = len(text)
-    while start < text_length:
-        end = start + chunk_size
-        if end < text_length:
-            # Try sentence boundaries first
-            sentence_end = max(
-                text.rfind('.', start, end),
-                text.rfind('!', start, end),
-                text.rfind('?', start, end)
-            )
-            if sentence_end > start + chunk_size // 2:
-                end = sentence_end + 1
-            else:
-                # Try paragraph or line break
-                newline_pos = text.rfind('\n', start, end)
-                if newline_pos > start + chunk_size // 2:
-                    end = newline_pos + 1
-                else:
-                    # Fall back to word boundary
-                    space_pos = text.rfind(' ', start, end)
-                    if space_pos > start + chunk_size // 2:
-                        end = space_pos
-        chunk = text[start:end].strip()
-        if chunk and len(chunk) > 20:
-            chunks.append(chunk)
-        next_start = end - overlap if end < text_length else text_length
-        start = max(next_start, start + 1)
-    return chunks
 def index_document(request_data: DocumentRequest) -> int:
     logger.info("=" * 80)
-    logger.info("📚 STARTING ENHANCED DOCUMENT INDEXING PROCESS")
     logger.info("=" * 80)
     # Log the incoming context
     context_preview = request_data.context[:200] + "..." if len(request_data.context) > 200 else request_data.context
     logger.info(f"📝 CONTEXT TO INDEX (length: {len(request_data.context)} chars):")
@@ -745,55 +633,64 @@ def index_document(request_data: DocumentRequest) -> int:
         else:
             logger.info("📂 No existing documents to clear.")
-        # Step 2: Enhanced chunking with overlap for better context preservation
-        logger.info(f"✂️  Creating overlapping chunks ({settings.CHUNK_SIZE} chars, {settings.CHUNK_OVERLAP} overlap)...")
-        text_chunks = _create_overlapping_chunks(request_data.context)
         if not text_chunks:
             logger.warning("⚠️  No text chunks were generated.")
             return 0
-        logger.info(f"✅ Document split into {len(text_chunks)} overlapping chunks")
-        # Log chunk statistics
-        avg_chunk_size = sum(len(chunk) for chunk in text_chunks) / len(text_chunks)
-        logger.info(f"📊 Average chunk size: {avg_chunk_size:.0f} characters")
-        # Step 3: Add chunks to ChromaDB with enhanced metadata
-        timestamp = int(time.time())
-        chunk_ids = [f"doc_chunk_{i}_{timestamp}" for i in range(len(text_chunks))]
-        logger.info(f"💾 Adding {len(chunk_ids)} chunks to ChromaDB with enhanced metadata...")
-        # Enhanced metadata for better retrieval
-        metadatas = [
-            {
-                "chunk_index": i,
-                "timestamp": timestamp,
-                "chunk_length": len(chunk),
-                "position": "start" if i == 0 else "end" if i == len(text_chunks) - 1 else "middle",
-                "total_chunks": len(text_chunks)
-            }
-            for i, chunk in enumerate(text_chunks)
-        ]
         rag_setup.collection.add(
-            documents=text_chunks,
             ids=chunk_ids,
             metadatas=metadatas
         )
-        logger.info("✅ ENHANCED DOCUMENT INDEXING COMPLETED SUCCESSFULLY")
         logger.info(f"📊 Total chunks indexed: {len(text_chunks)}")
-        logger.info(f"🔗 Chunks have 100-char overlap for better context continuity")
         logger.info("=" * 80)
         return len(text_chunks)
     except Exception as e:
         logger.error(f"❌ Error during indexing: {str(e)}", exc_info=True)
         raise
 def clear_index():
     """Clears all documents from the vector database."""
     logger.info("🗑️  Clearing vector index...")
@@ -810,194 +707,97 @@ def clear_index():
         logger.error(f"❌ Error clearing vector index: {e}")
         raise
-def _expand_query(query: str) -> list:
-    """
-    Expand the query with synonyms and related terms for better retrieval.
-    Returns list of query variations.
-    """
-    queries = [query]
-    # Add query without question words for better matching
-    question_words = ['what', 'where', 'when', 'who', 'why', 'how', 'is', 'are', 'can', 'do', 'does']
-    words = query.lower().split()
-    filtered_words = [w for w in words if w not in question_words and len(w) > 2]
-    if len(filtered_words) >= 2:
-        # Create a version with just the key terms
-        key_terms_query = ' '.join(filtered_words)
-        if key_terms_query != query.lower():
-            queries.append(key_terms_query)
-    return queries[:2]  # Limit to 2 variations to avoid too many retrievals
-def _deduplicate_and_rank_chunks(chunks_list: list, metadatas_list: list) -> tuple:
-    """
-    Deduplicate chunks and rank them by relevance.
-    Returns (unique_chunks, unique_metadatas)
-    """
-    seen = set()
-    unique_chunks = []
-    unique_metadatas = []
-    for chunks, metadatas in zip(chunks_list, metadatas_list):
-        for chunk, metadata in zip(chunks, metadatas):
-            # Use first 100 chars as fingerprint
-            fingerprint = chunk[:100]
-            if fingerprint not in seen:
-                seen.add(fingerprint)
-                unique_chunks.append(chunk)
-                unique_metadatas.append(metadata)
-    return unique_chunks, unique_metadatas
 async def get_rag_response(request_data: ChatRequest, api_key: Optional[str] = None) -> str:
     """
-    Enhanced RAG pipeline with conversation history, query expansion, and better retrieval.
     """
     start_total = time.time()
     logger.info("=" * 80)
-    logger.info("🤖 STARTING ENHANCED RAG PIPELINE")
     logger.info("=" * 80)
     logger.info(f"❓ USER PROMPT: '{request_data.prompt}'")
     logger.info(f"📏 Prompt length: {len(request_data.prompt)} characters")
-    logger.info(f"💬 Conversation history: {len(request_data.conversation_history or [])} messages")
     logger.info(f"🔑 Using custom API key: {'Yes' if api_key else 'No'}")
     logger.info("-" * 60)
     try:
-        # Step 1: Build cache key including conversation context
-        history_hash = ""
-        if request_data.conversation_history:
-            # Create a hash of recent conversation for cache key
-            recent_msgs = request_data.conversation_history[-3:]  # Last 3 messages
-            history_hash = str(hash("".join([m.content[:50] for m in recent_msgs])))
-        cache_key = f"{api_key or 'default'}:{history_hash}:{request_data.prompt}"
         cached_response = _get_cached_response(cache_key)
         if cached_response:
             logger.info("💾 CACHE HIT! Returning cached response.")
-            return cached_response
-        logger.info("🔍 Cache miss. Proceeding with enhanced RAG pipeline.")
         # Step 2: Check if the vector database has any content
         doc_count = rag_setup.collection.count()
         logger.info(f"📚 Vector DB contains {doc_count} documents")
         if doc_count == 0:
             logger.warning("⚠️  Vector DB is empty. Cannot answer query.")
             return "I don't have any specific context loaded right now. Please provide some context in the Knowledge Base and click 'Index Context' before asking questions. However, I'd be happy to help with general questions using my built-in knowledge!"
-        # Step 3: Query expansion for better retrieval
-        logger.info("🔍 Expanding query for better retrieval...")
-        query_variations = _expand_query(request_data.prompt)
-        logger.info(f"📝 Generated {len(query_variations)} query variations")
-        # Step 4: Retrieve chunks for each query variation
-        all_chunks = []
-        all_metadatas = []
-        for query_var in query_variations:
-            logger.info(f"🔎 Retrieving chunks for: '{query_var[:50]}...'")
-            retrieved = await _retrieve_chunks_async(
-                query_var,
-                n_results=settings.MAX_CHUNKS_RETRIEVE + 2  # Retrieve more for better ranking
-            )
-            if retrieved and retrieved.get('documents') and retrieved['documents'][0]:
-                all_chunks.append(retrieved['documents'][0])
-                all_metadatas.append(retrieved.get('metadatas', [[]])[0])
-        if not all_chunks:
             logger.warning("❌ No relevant chunks found in the vector DB for this query.")
-            return "I couldn't find specific information about that in the provided context. Let me help you with what I know from my general knowledge:\n\n" + await _generate_fallback_response(
-                request_data.prompt, api_key)
-        # Step 5: Deduplicate and rank chunks
-        logger.info("🎯 Deduplicating and ranking retrieved chunks...")
-        unique_chunks, unique_metadatas = _deduplicate_and_rank_chunks(all_chunks, all_metadatas)
-        # Limit to best chunks
-        final_chunks = unique_chunks[:settings.MAX_CHUNKS_RETRIEVE + 1]
-        logger.info(f"📋 Using {len(final_chunks)} unique, ranked chunks for context")
-        # Log chunk details
-        for i, (chunk, meta) in enumerate(zip(final_chunks, unique_metadatas[:len(final_chunks)])):
-            logger.info(f"   Chunk {i + 1}: {len(chunk)} chars, position: {meta.get('position', 'unknown')}")
-        context_for_prompt = "\n\n---\n\n".join(final_chunks)
         # Limit context length to prevent timeouts
         max_context_length = settings.MAX_CONTEXT_LENGTH_CHAT
         if len(context_for_prompt) > max_context_length:
             logger.warning(f"⚠️  Context too long, truncating to {max_context_length}")
             context_for_prompt = context_for_prompt[:max_context_length] + "\n\n[... content truncated ...]"
-        # Step 6: Build conversation history context
-        history_context = ""
-        if request_data.conversation_history and len(request_data.conversation_history) > 0:
-            logger.info(f"💬 Including {len(request_data.conversation_history)} previous messages for context")
-            history_messages = request_data.conversation_history[-6:]  # Last 6 messages (3 exchanges)
-            history_parts = []
-            for msg in history_messages:
-                role_label = "User" if msg.role == "user" else "Assistant"
-                # Truncate very long messages
-                content = msg.content[:300] + "..." if len(msg.content) > 300 else msg.content
-                history_parts.append(f"{role_label}: {content}")
-            history_context = (
-                "\n\nPREVIOUS CONVERSATION:\n"
-                + "\n".join(history_parts)
-                + "\n"
-            )
-        # Step 7: Construct enhanced prompt with conversation history
         full_prompt = (
-            "You are an intelligent assistant with access to specific context information and conversation history. "
-            "Your goal is to provide comprehensive, helpful answers that:\n"
-            "• Take into account the previous conversation flow\n"
-            "• Use the provided context as your PRIMARY source when relevant\n"
-            "• Build naturally on previous exchanges\n"
-            "• Provide coherent, contextually appropriate responses\n\n"
             "INSTRUCTIONS:\n"
-            "• Reference previous conversation when relevant for continuity\n"
-            "• Use the document context as your primary source for factual information\n"
-            "• If the user asks follow-up questions, refer back to previous answers\n"
-            "• Be natural and conversational - maintain the conversation flow\n"
-            "• Provide detailed, well-structured responses\n"
-            "• If information isn't in the context, acknowledge it and provide general knowledge help\n"
-        )
-        if history_context:
-            full_prompt += history_context + "\n"
-        full_prompt += (
-            "DOCUMENT CONTEXT:\n"
             f"{context_for_prompt}\n\n"
-            f"CURRENT USER QUESTION: {request_data.prompt}\n\n"
-            "Please provide a comprehensive, contextually appropriate response:"
         )
-        # Step 8: Generate the response using the LLM
-        logger.info("🧠 Generating response with conversation context...")
         response_text = await _generate_response_async(full_prompt, api_key)
-        # Step 9: Cache the newly generated response
         _cache_response(cache_key, response_text)
         logger.info("💾 Response cached for future use")
         total_time = time.time() - start_total
         logger.info(f"⏱️  Total processing time: {total_time:.2f}s")
-        logger.info("✅ ENHANCED RAG PIPELINE COMPLETED SUCCESSFULLY")
         logger.info("=" * 80)
         return response_text
     except asyncio.TimeoutError:
@@ -1015,21 +815,20 @@ async def _generate_fallback_response(prompt: str, api_key: Optional[str] = None
         f"Question: {prompt}\n\n"
         f"Answer:"
     )
     try:
         return await _generate_response_async(fallback_prompt, api_key)
     except Exception as e:
         logger.error(f"❌ Fallback response generation failed: {e}")
         return "I'm having trouble generating a response right now. Please try again or rephrase your question."
 async def execute_task(request_data: TaskRequest, api_key: Optional[str] = None) -> str:
     """
     Executes a specific task on the given context.
     Uses provided API key or falls back to default.
     """
     start_total = time.time()
     logger.info("=" * 80)
     logger.info("🎯 STARTING TASK EXECUTION")
     logger.info("=" * 80)
@@ -1072,7 +871,7 @@ async def execute_task(request_data: TaskRequest, api_key: Optional[str] = None)
         logger.info(f"⏱️  Task execution time: {total_time:.2f}s")
         logger.info("✅ TASK EXECUTION COMPLETED SUCCESSFULLY")
         logger.info("=" * 80)
         return response_text
     except asyncio.TimeoutError:
@@ -1082,7 +881,6 @@ async def execute_task(request_data: TaskRequest, api_key: Optional[str] = None)
         logger.error(f"❌ An unexpected error occurred during task execution: {e}", exc_info=True)
         return f"An unexpected error occurred: {e}"
 # --- ASYNC WRAPPERS & CACHE HELPERS ---
 async def _retrieve_chunks_async(prompt: str, n_results: int = 2):
@@ -1096,12 +894,11 @@ async def _retrieve_chunks_async(prompt: str, n_results: int = 2):
     logger.info(f"📊 ChromaDB query returned {len(result.get('documents', [[]])[0])} chunks")
     return result
 async def _generate_response_async(full_prompt: str, api_key: Optional[str] = None):
     """Asynchronously calls the LLM to generate content."""
     logger.info("🤖 Calling LLM for content generation...")
     logger.info(f"📏 Prompt length sent to LLM: {len(full_prompt)} characters")
     # Use custom API key if provided, otherwise use default
     if api_key:
         llm_instance = create_llm_instance(api_key)
@@ -1109,18 +906,17 @@ async def _generate_response_async(full_prompt: str, api_key: Optional[str] = No
     else:
         llm_instance = rag_setup.generation_model
         logger.info("🔑 Using default API key")
     loop = asyncio.get_event_loop()
     response = await loop.run_in_executor(
         None,
         llm_instance.generate_content,
         full_prompt
     )
     logger.info(f"✅ LLM response received (length: {len(response)} chars)")
     return response
 def _get_cached_response(key: str):
     """Checks the cache for a valid (non-expired) entry."""
     if key in _response_cache:
@@ -1134,7 +930,6 @@ def _get_cached_response(key: str):
             logger.info(f"🗑️  Expired cache entry removed for key: '{key[:50]}...'")
     return None
 def _cache_response(key: str, response: str):
     """Adds a response to the cache with the current timestamp."""
     _response_cache[key] = (time.time(), response)

 import rag_setup
 from schemas import ChatRequest, DocumentRequest, TaskRequest
 from typing import Optional, Tuple
+from config import settings  # Fixed: removed 'app.' prefix
 from fastapi import UploadFile, HTTPException
 import json
 import xml.etree.ElementTree as ET
 from striprtf.striprtf import rtf_to_text
 import markdown
 try:
     import fitz  # PyMuPDF
 except ImportError:
 try:
     import docx  # python-docx for Word documents
 except ImportError:
+    logging.error("python-docx is not installed. Word document processing will not work. Please run 'pip install python-docx'")
     docx = None
 try:
     from pptx import Presentation  # python-pptx for PowerPoint
 except ImportError:
+    logging.error("python-pptx is not installed. PowerPoint processing will not work. Please run 'pip install python-pptx'")
     Presentation = None
 try:
     import pandas as pd  # For Excel and CSV files
 except ImportError:
+    logging.error("pandas is not installed. Excel/CSV processing will not work. Please run 'pip install pandas openpyxl'")
     pd = None
 try:
     from bs4 import BeautifulSoup  # For HTML parsing
 except ImportError:
+    logging.error("BeautifulSoup is not installed. HTML processing will not work. Please run 'pip install beautifulsoup4'")
     BeautifulSoup = None
 logging.basicConfig(
     level=logging.INFO,
     format='%(asctime)s [%(levelname)s] %(message)s',
 _response_cache = {}
 CACHE_EXPIRATION_SECONDS = 600  # 10 minutes
+def create_llm_instance(api_key: str) -> rag_setup.OpenRouterLLM:
+    """Create a new LLM instance with the provided API key."""
+    return rag_setup.OpenRouterLLM(
+        api_key=api_key,
+        base_url=settings.OPENROUTER_URL,
+        model=settings.MODEL_NAME
+    )
+async def test_api_key(api_key: str) -> dict:
+    """Test if the provided API key is valid."""
     logger.info(f"🔍 Testing API key: {api_key[:10]}...")
     try:
         # Validate API key format first
         if not api_key or not api_key.strip():
                 "message": "API key cannot be empty",
                 "model_info": None
             }
+        if not api_key.startswith('sk-or-'):
+            logger.error("❌ API key has incorrect format")
             return {
                 "valid": False,
+                "message": "OpenRouter API keys should start with 'sk-or-'",
                 "model_info": None
             }
         if len(api_key) < 40:
             logger.error("❌ API key is too short")
             return {
                 "message": "API key appears to be too short",
                 "model_info": None
             }
         # Create a temporary LLM instance
+        test_llm = create_llm_instance(api_key)
         # Test with a minimal prompt to avoid quota usage
         test_response = test_llm._make_api_request("Hi", max_tokens=1)
         # Check for explicit errors first
         if "error" in test_response:
             error_msg = test_response["error"]
             logger.error(f"❌ API key test failed: {error_msg}")
             # Parse specific error types
             if "401" in str(error_msg) or "403" in str(error_msg) or "Unauthorized" in str(error_msg):
                 return {
                     "message": f"API key test failed: {error_msg}",
                     "model_info": None
                 }
         # Check for successful response with proper structure
         if "choices" in test_response and test_response["choices"]:
             choice = test_response["choices"][0]
                     "message": "API key is valid and working!",
                     "model_info": model_info
                 }
         # If we get here, the response format is unexpected
         logger.error(f"❌ API key test failed: Unexpected response format - {test_response}")
         return {
             "message": "API key test failed: Unexpected response format from OpenRouter",
             "model_info": None
         }
     except Exception as e:
         logger.error(f"❌ API key test failed with exception: {str(e)}")
         error_msg = str(e)
         # Parse common error patterns
         if "401" in error_msg or "403" in error_msg or "Unauthorized" in error_msg:
             return {
                 "model_info": None
             }
 async def process_and_index_file(file: UploadFile) -> Tuple[int, str]:
     """
     Processes an uploaded file, extracts text, calls the indexing function,
     Supports: .txt, .pdf, .docx, .pptx, .xlsx, .csv, .json, .xml, .html, .md, .rtf
     """
     logger.info(f"📄 Processing file '{file.filename}' with content type '{file.content_type}'")
     # Read file content
     file_content = await file.read()
     text = ""
     try:
         if file_extension == "txt":
             text = await _process_txt_file(file_content)
         elif file_extension == "pdf":
             text = await _process_pdf_file(file_content)
         elif file_extension == "docx":
             text = await _process_docx_file(file_content)
         elif file_extension in ["ppt", "pptx"]:
             text = await _process_pptx_file(file_content)
         elif file_extension in ["xls", "xlsx"]:
             text = await _process_excel_file(file_content, file.filename)
         elif file_extension == "csv":
             text = await _process_csv_file(file_content)
         elif file_extension == "json":
             text = await _process_json_file(file_content)
         elif file_extension == "xml":
             text = await _process_xml_file(file_content)
         elif file_extension in ["html", "htm"]:
             text = await _process_html_file(file_content)
         elif file_extension in ["md", "markdown"]:
             text = await _process_markdown_file(file_content)
         elif file_extension == "rtf":
             text = await _process_rtf_file(file_content)
         else:
+            supported_extensions = ['.txt', '.pdf', '.docx', '.pptx', '.xlsx', '.csv', '.json', '.xml', '.html', '.md', '.rtf']
             logger.error(f"❌ Unsupported file type: {file.filename}")
             raise HTTPException(
+                status_code=400,
                 detail=f"Unsupported file type. Please upload one of: {', '.join(supported_extensions)}"
             )
     # Validate extracted text
     if not text or not text.strip():
         logger.error("❌ Extracted text is empty or whitespace only")
+        raise HTTPException(status_code=400, detail="Extracted text is empty. The file might be empty, corrupted, or unreadable.")
     # Clean up the text
     text = text.strip()
     # Log processing stats
     word_count = len(text.split())
     logger.info(f"📊 Text processing complete: {len(text)} characters, {word_count} words")
     # Index the extracted text using existing logic
     try:
         doc_request = DocumentRequest(context=text)
         docs_added = index_document(doc_request)
         logger.info(f"✅ Successfully indexed {docs_added} document chunks from file")
         return docs_added, text
     except Exception as e:
         logger.error(f"❌ Failed to index extracted text: {e}")
         raise HTTPException(status_code=500, detail=f"Failed to index extracted text: {str(e)}")
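
Reviewer note on the extension dispatch in `process_and_index_file`: the line that derives `file_extension` is not visible in this view, so the sketch below assumes it comes from the filename via `os.path.splitext`. It also shows a table-driven alternative to the `if`/`elif` chain. Every name here (`file_extension_of`, `register`, `HANDLERS`, `extract_text`) is hypothetical, and the handlers are synchronous for brevity, whereas the ones in the diff are `async`.

```python
import os
from typing import Callable, Dict

# Maps a lowercase extension (without the dot) to a text extractor.
HANDLERS: Dict[str, Callable[[bytes], str]] = {}


def file_extension_of(filename: str) -> str:
    """Return the lowercase extension without the leading dot, or ''."""
    return os.path.splitext(filename or "")[1].lower().lstrip(".")


def register(*extensions: str):
    """Decorator that registers one handler under several extensions."""
    def decorator(func: Callable[[bytes], str]) -> Callable[[bytes], str]:
        for ext in extensions:
            HANDLERS[ext] = func
        return func
    return decorator


@register("txt")
def process_txt(content: bytes) -> str:
    return content.decode("utf-8")


@register("md", "markdown")
def process_markdown(content: bytes) -> str:
    return content.decode("utf-8")


def extract_text(filename: str, content: bytes) -> str:
    """Look up the handler for the file's extension and run it."""
    ext = file_extension_of(filename)
    handler = HANDLERS.get(ext)
    if handler is None:
        supported = ", ".join("." + e for e in sorted(HANDLERS))
        raise ValueError(f"Unsupported file type. Please upload one of: {supported}")
    return handler(content)
```

Building the error message from the registry keeps the list of supported extensions in step with the handlers that actually exist.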
                     continue
             else:
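                 # UnicodeDecodeError requires five arguments; a single-string call raises TypeError instead
                 raise UnicodeDecodeError("utf-8", file_content, 0, len(file_content), "Unable to decode file with any common encoding")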
         logger.info(f"✅ Extracted {len(text)} characters from .txt file")
         return text
     except UnicodeDecodeError as e:
         logger.error(f"❌ Could not decode .txt file: {e}")
+        raise HTTPException(status_code=400, detail="Could not decode .txt file. Please ensure it uses UTF-8, Latin-1, or CP1252 encoding.")
 async def _process_pdf_file(file_content: bytes) -> str:
     if fitz is None:
         logger.error("❌ PyMuPDF not available for PDF processing")
         raise HTTPException(status_code=501, detail="PDF processing is not available. PyMuPDF is not installed.")
     logger.info("📖 Opening PDF document...")
     doc = fitz.open(stream=file_content, filetype="pdf")
     try:
         text_parts = []
         page_count = len(doc)
         logger.info(f"📑 PDF has {page_count} pages")
         for page_num in range(page_count):
             try:
                 page = doc[page_num]
                 page_text = page.get_text()
                 if page_text and page_text.strip():
                     text_parts.append(f"--- Page {page_num + 1} ---\n{page_text.strip()}")
                     logger.info(f"📄 Extracted text from page {page_num + 1}: {len(page_text)} characters")
                 else:
                     logger.info(f"📄 Page {page_num + 1} is empty or contains no extractable text")
             except Exception as page_error:
                 logger.warning(f"⚠️  Could not extract text from page {page_num + 1}: {page_error}")
                 continue
         text = "\n\n".join(text_parts)
         logger.info(f"✅ Extracted text from {len(text_parts)} pages of the PDF file ({len(text)} characters)")
         return text
     finally:
         doc.close()
         logger.info("📕 PDF document closed successfully")
 async def _process_docx_file(file_content: bytes) -> str:
     """Process .docx files using python-docx."""
     if docx is None:
+        raise HTTPException(status_code=501, detail="Word document processing is not available. python-docx is not installed.")
     from io import BytesIO
     doc = docx.Document(BytesIO(file_content))
     text_parts = []
     for paragraph in doc.paragraphs:
         if paragraph.text.strip():
             text_parts.append(paragraph.text.strip())
     # Also extract text from tables
     for table in doc.tables:
         for row in table.rows:
                     row_text.append(cell.text.strip())
             if row_text:
                 text_parts.append(" | ".join(row_text))
     text = "\n\n".join(text_parts)
     logger.info(f"✅ Extracted {len(text)} characters from Word document")
     return text
 async def _process_pptx_file(file_content: bytes) -> str:
     """Process .pptx files using python-pptx."""
     if Presentation is None:
+        raise HTTPException(status_code=501, detail="PowerPoint processing is not available. python-pptx is not installed.")
     from io import BytesIO
     prs = Presentation(BytesIO(file_content))
     text_parts = []
     for slide_num, slide in enumerate(prs.slides, 1):
         slide_text = [f"--- Slide {slide_num} ---"]
         for shape in slide.shapes:
             if hasattr(shape, "text") and shape.text.strip():
                 slide_text.append(shape.text.strip())
         if len(slide_text) > 1:  # More than just the slide header
             text_parts.append("\n".join(slide_text))
     text = "\n\n".join(text_parts)
     logger.info(f"✅ Extracted text from {len(prs.slides)} PowerPoint slides ({len(text)} characters)")
     return text
     """Process .xlsx/.xls files using pandas."""
     if pd is None:
         raise HTTPException(status_code=501, detail="Excel processing is not available. pandas is not installed.")
     from io import BytesIO
     try:
         # Read all sheets
         excel_file = pd.ExcelFile(BytesIO(file_content))
         text_parts = [f"Excel File: {filename}"]
         for sheet_name in excel_file.sheet_names:
             df = pd.read_excel(excel_file, sheet_name=sheet_name)
             if not df.empty:
                 text_parts.append(f"\n--- Sheet: {sheet_name} ---")
                 # Convert DataFrame to readable text
                 # Include column headers
                 text_parts.append("Columns: " + " | ".join(str(col) for col in df.columns))
                 # Add row data (limit to first 100 rows to avoid huge files)
                 for idx, row in df.head(100).iterrows():
                     row_text = " | ".join(str(val) for val in row.values if pd.notna(val))
                     if row_text.strip():
                         text_parts.append(row_text)
                 if len(df) > 100:
                     text_parts.append(f"... and {len(df) - 100} more rows")
         text = "\n".join(text_parts)
+        logger.info(f"✅ Extracted data from Excel file with {len(excel_file.sheet_names)} sheets ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process Excel file: {str(e)}")
     """Process .csv files using pandas."""
     if pd is None:
         raise HTTPException(status_code=501, detail="CSV processing is not available. pandas is not installed.")
     from io import StringIO
     try:
         # Try different encodings for CSV
         for encoding in ['utf-8', 'latin-1', 'cp1252']:
                 continue
         else:
             raise ValueError("Could not decode CSV file with any common encoding")
         if df.empty:
             raise ValueError("CSV file is empty")
         text_parts = ["CSV Data:"]
         text_parts.append("Columns: " + " | ".join(str(col) for col in df.columns))
         # Add row data (limit to first 200 rows)
         for idx, row in df.head(200).iterrows():
             row_text = " | ".join(str(val) for val in row.values if pd.notna(val))
             if row_text.strip():
                 text_parts.append(row_text)
         if len(df) > 200:
             text_parts.append(f"... and {len(df) - 200} more rows")
         text = "\n".join(text_parts)
         logger.info(f"✅ Extracted data from CSV file with {len(df)} rows ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process CSV file: {str(e)}")
     try:
         json_text = file_content.decode('utf-8')
         data = json.loads(json_text)
         # Convert JSON to readable text format
         def json_to_text(obj, indent=0):
             lines = []
             prefix = "  " * indent
             if isinstance(obj, dict):
                 for key, value in obj.items():
                     if isinstance(value, (dict, list)):
                         lines.append(f"{prefix}[{i}]: {item}")
             else:
                 lines.append(f"{prefix}{obj}")
             return lines
         text_lines = ["JSON Data:"] + json_to_text(data)
         text = "\n".join(text_lines)
         logger.info(f"✅ Extracted data from JSON file ({len(text)} characters)")
         return text
     except json.JSONDecodeError as e:
         raise HTTPException(status_code=400, detail=f"Invalid JSON file: {str(e)}")
     except Exception as e:
     try:
         xml_text = file_content.decode('utf-8')
         root = ET.fromstring(xml_text)
         def xml_to_text(element, indent=0):
             lines = []
             prefix = "  " * indent
             # Add element name and attributes
             if element.attrib:
                 attrs = " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
                 lines.append(f"{prefix}{element.tag} ({attrs}):")
             else:
                 lines.append(f"{prefix}{element.tag}:")
             # Add text content
             if element.text and element.text.strip():
                 lines.append(f"{prefix}  {element.text.strip()}")
             # Add child elements
             for child in element:
                 lines.extend(xml_to_text(child, indent + 1))
             return lines
         text_lines = ["XML Data:"] + xml_to_text(root)
         text = "\n".join(text_lines)
         logger.info(f"✅ Extracted data from XML file ({len(text)} characters)")
         return text
     except ET.ParseError as e:
         raise HTTPException(status_code=400, detail=f"Invalid XML file: {str(e)}")
     except Exception as e:
     """Process .html files using BeautifulSoup."""
     if BeautifulSoup is None:
         raise HTTPException(status_code=501, detail="HTML processing is not available. BeautifulSoup is not installed.")
     try:
         html_text = file_content.decode('utf-8')
         soup = BeautifulSoup(html_text, 'html.parser')
         # Remove script and style elements
         for script in soup(["script", "style"]):
             script.decompose()
         # Get text content
         text = soup.get_text()
         # Clean up whitespace
         lines = (line.strip() for line in text.splitlines())
         chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
         text = '\n'.join(chunk for chunk in chunks if chunk)
         logger.info(f"✅ Extracted text from HTML file ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process HTML file: {str(e)}")
     """Process .md files."""
     try:
         md_text = file_content.decode('utf-8')
         # Convert markdown to HTML then to plain text for better readability
         html = markdown.markdown(md_text)
         if BeautifulSoup:
         else:
             # Fallback: use raw markdown
             text = md_text
         logger.info(f"✅ Extracted text from Markdown file ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process Markdown file: {str(e)}")
     try:
         rtf_text = file_content.decode('utf-8')
         text = rtf_to_text(rtf_text)
         logger.info(f"✅ Extracted text from RTF file ({len(text)} characters)")
         return text
     except Exception as e:
         raise HTTPException(status_code=400, detail=f"Could not process RTF file: {str(e)}")
 def index_document(request_data: DocumentRequest) -> int:
     logger.info("=" * 80)
+    logger.info("📚 STARTING DOCUMENT INDEXING PROCESS")
     logger.info("=" * 80)
     # Log the incoming context
     context_preview = request_data.context[:200] + "..." if len(request_data.context) > 200 else request_data.context
     logger.info(f"📝 CONTEXT TO INDEX (length: {len(request_data.context)} chars):")
         else:
             logger.info("📂 No existing documents to clear.")
+        # Step 2: Chunk document with better chunking strategy
+        text_chunks = textwrap.wrap(
+            request_data.context,
+            width=600,
+            break_long_words=False,
+            replace_whitespace=False,
+            break_on_hyphens=False
+        )
+        # If that produced too few chunks, fall back to splitting on paragraphs
+        if len(text_chunks) < 3 and len(request_data.context) > 1200:
+            logger.info("🔧 Using paragraph-based chunking for better granularity")
+            paragraphs = request_data.context.split('\n\n')
+            text_chunks = []
+            for para in paragraphs:
+                para = para.strip()
+                if not para:
+                    continue
+                if len(para) <= 600:
+                    text_chunks.append(para)
+                else:
+                    sub_chunks = textwrap.wrap(para, width=600, break_long_words=False)
+                    text_chunks.extend(sub_chunks)
+        # Filter out empty chunks
+        text_chunks = [chunk.strip() for chunk in text_chunks if chunk.strip()]
         if not text_chunks:
             logger.warning("⚠️  No text chunks were generated.")
             return 0
+        logger.info(f"✂️  Document split into {len(text_chunks)} chunks")
+        # Step 3: Add chunks to ChromaDB
+        chunk_ids = [f"doc_chunk_{i}_{int(time.time())}" for i in range(len(text_chunks))]
+        logger.info(f"💾 Adding {len(chunk_ids)} chunks to ChromaDB...")
+        # Add documents with metadata
+        metadatas = [{"chunk_index": i, "timestamp": int(time.time())} for i in range(len(text_chunks))]
         rag_setup.collection.add(
+            documents=text_chunks,
             ids=chunk_ids,
             metadatas=metadatas
         )
+        logger.info("✅ DOCUMENT INDEXING COMPLETED SUCCESSFULLY")
         logger.info(f"📊 Total chunks indexed: {len(text_chunks)}")
         logger.info("=" * 80)
         return len(text_chunks)
     except Exception as e:
         logger.error(f"❌ Error during indexing: {str(e)}", exc_info=True)
         raise
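The chunking step in `index_document` can be read in isolation: wrap the text at a fixed width, and if that yields very few chunks for a long input, split on blank lines and wrap each paragraph separately. The sketch below restates that logic as a standalone function so it can be exercised without ChromaDB. The name `chunk_text` and the `width` parameter are assumptions for illustration; the 600-character width and the "fewer than 3 chunks for more than 1200 characters" threshold are taken from the diff.

```python
import textwrap
from typing import List


def chunk_text(text: str, width: int = 600) -> List[str]:
    """Split text into chunks of roughly `width` characters (illustrative sketch)."""
    chunks = textwrap.wrap(
        text,
        width=width,
        break_long_words=False,    # never cut a word in half
        replace_whitespace=False,  # keep newlines inside a chunk
        break_on_hyphens=False,
    )
    # Fallback: long input but very few chunks, so split on paragraphs instead.
    if len(chunks) < 3 and len(text) > 2 * width:
        chunks = []
        for para in text.split("\n\n"):
            para = para.strip()
            if not para:
                continue
            if len(para) <= width:
                chunks.append(para)
            else:
                chunks.extend(textwrap.wrap(para, width=width, break_long_words=False))
    # Drop chunks that are empty after stripping.
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

Because `break_long_words=False`, a single token longer than `width` stays whole, so a chunk can exceed the width in that case.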
 def clear_index():
     """Clears all documents from the vector database."""
     logger.info("🗑️  Clearing vector index...")
         logger.error(f"❌ Error clearing vector index: {e}")
         raise
 async def get_rag_response(request_data: ChatRequest, api_key: Optional[str] = None) -> str:
     """
+    Performs the RAG pipeline: checks cache, retrieves context, generates a response.
+    Uses provided API key or falls back to default.
     """
     start_total = time.time()
     logger.info("=" * 80)
+    logger.info("🤖 STARTING RAG PIPELINE")
     logger.info("=" * 80)
     logger.info(f"❓ USER PROMPT: '{request_data.prompt}'")
     logger.info(f"📏 Prompt length: {len(request_data.prompt)} characters")
     logger.info(f"🔑 Using custom API key: {'Yes' if api_key else 'No'}")
     logger.info("-" * 60)
     try:
+        # Step 1: Check cache for a recent, identical query
+        cache_key = f"{api_key or 'default'}:{request_data.prompt}"
         cached_response = _get_cached_response(cache_key)
         if cached_response:
             logger.info("💾 CACHE HIT! Returning cached response.")
+            return f"{cached_response}\n\n(This response was retrieved from cache)"
+        logger.info("🔍 Cache miss. Proceeding with RAG pipeline.")
         # Step 2: Check if the vector database has any content
         doc_count = rag_setup.collection.count()
         logger.info(f"📚 Vector DB contains {doc_count} documents")
         if doc_count == 0:
             logger.warning("⚠️  Vector DB is empty. Cannot answer query.")
             return "I don't have any specific context loaded right now. Please provide some context in the Knowledge Base and click 'Index Context' before asking questions. However, I'd be happy to help with general questions using my built-in knowledge!"
+        # Step 3: Retrieve relevant chunks from ChromaDB
+        logger.info("🔎 Retrieving relevant chunks from vector DB...")
+        retrieved_chunks = await _retrieve_chunks_async(
+            request_data.prompt,
+            n_results=settings.MAX_CHUNKS_RETRIEVE
+        )
+        if not retrieved_chunks or not retrieved_chunks.get('documents') or not retrieved_chunks['documents'][0]:
             logger.warning("❌ No relevant chunks found in the vector DB for this query.")
+            return "I couldn't find specific information about that in the provided context. Let me help you with what I know from my general knowledge:\n\n" + await _generate_fallback_response(request_data.prompt, api_key)
+        # Log retrieved chunks
+        chunks = retrieved_chunks['documents'][0]
+        logger.info(f"📋 Retrieved {len(chunks)} relevant chunks")
+        context_for_prompt = "\n\n---\n\n".join(chunks)
         # Limit context length to prevent timeouts
         max_context_length = settings.MAX_CONTEXT_LENGTH_CHAT
         if len(context_for_prompt) > max_context_length:
             logger.warning(f"⚠️  Context too long, truncating to {max_context_length}")
             context_for_prompt = context_for_prompt[:max_context_length] + "\n\n[... content truncated ...]"
+        # Step 4: Construct improved prompt for the LLM
         full_prompt = (
+            "You are an intelligent assistant with access to specific context information. "
+            "Your goal is to provide comprehensive, helpful answers that combine the provided context with your expertise.\n\n"
             "INSTRUCTIONS:\n"
+            "• Use the provided context as your PRIMARY source when it's relevant\n"
+            "• If the context fully answers the question, focus on that information and enhance it with practical insights\n"
+            "• If the context only partially addresses the question, build upon it with your knowledge\n"
+            "• If the context isn't relevant to the question, briefly mention this and provide a helpful answer based on your expertise\n"
+            "• Be natural and conversational - avoid robotic phrases like 'based solely on the context'\n"
+            "• Provide actionable, practical advice when appropriate\n"
+            "• Structure your response clearly with headings or bullet points when helpful\n\n"
+            "CONTEXT INFORMATION:\n"
             f"{context_for_prompt}\n\n"
+            f"USER QUESTION: {request_data.prompt}\n\n"
+            "Please provide a comprehensive, helpful response:"
         )
+        # Step 5: Generate the response using the LLM
+        logger.info("🧠 Generating response from OpenRouter...")
         response_text = await _generate_response_async(full_prompt, api_key)
+        # Step 6: Cache the newly generated response
         _cache_response(cache_key, response_text)
         logger.info("💾 Response cached for future use")
         total_time = time.time() - start_total
         logger.info(f"⏱️  Total processing time: {total_time:.2f}s")
+        logger.info("✅ RAG PIPELINE COMPLETED SUCCESSFULLY")
         logger.info("=" * 80)
         return response_text
     except asyncio.TimeoutError:
         f"Question: {prompt}\n\n"
         f"Answer:"
     )
     try:
         return await _generate_response_async(fallback_prompt, api_key)
     except Exception as e:
         logger.error(f"❌ Fallback response generation failed: {e}")
         return "I'm having trouble generating a response right now. Please try again or rephrase your question."
 async def execute_task(request_data: TaskRequest, api_key: Optional[str] = None) -> str:
     """
     Executes a specific task on the given context.
     Uses provided API key or falls back to default.
     """
     start_total = time.time()
     logger.info("=" * 80)
     logger.info("🎯 STARTING TASK EXECUTION")
     logger.info("=" * 80)
         logger.info(f"⏱️  Task execution time: {total_time:.2f}s")
         logger.info("✅ TASK EXECUTION COMPLETED SUCCESSFULLY")
         logger.info("=" * 80)
         return response_text
     except asyncio.TimeoutError:
         logger.error(f"❌ An unexpected error occurred during task execution: {e}", exc_info=True)
         return f"An unexpected error occurred: {e}"
 # --- ASYNC WRAPPERS & CACHE HELPERS ---
 async def _retrieve_chunks_async(prompt: str, n_results: int = 2):
     logger.info(f"📊 ChromaDB query returned {len(result.get('documents', [[]])[0])} chunks")
     return result
 async def _generate_response_async(full_prompt: str, api_key: Optional[str] = None):
     """Asynchronously calls the LLM to generate content."""
     logger.info("🤖 Calling LLM for content generation...")
     logger.info(f"📏 Prompt length sent to LLM: {len(full_prompt)} characters")
     # Use custom API key if provided, otherwise use default
     if api_key:
         llm_instance = create_llm_instance(api_key)
     else:
         llm_instance = rag_setup.generation_model
         logger.info("Using default API key")
     loop = asyncio.get_running_loop()
     response = await loop.run_in_executor(
         None,
         llm_instance.generate_content,
         full_prompt
     )
     logger.info(f"✅ LLM response received (length: {len(response)} chars)")
     return response
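`_generate_response_async` keeps the event loop free by handing a blocking client call to the default thread pool. A minimal sketch of that pattern follows, with a stand-in blocking function in place of the real LLM client. `slow_generate`, `generate_async`, and `main` are hypothetical names, not part of this PR.

```python
import asyncio
import time
from typing import List


def slow_generate(prompt: str) -> str:
    """Hypothetical stand-in for a blocking LLM client call."""
    time.sleep(0.05)
    return f"echo: {prompt}"


async def generate_async(prompt: str) -> str:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, slow_generate, prompt)


async def main() -> List[str]:
    # Both calls run in worker threads, so the event loop stays responsive.
    return await asyncio.gather(generate_async("a"), generate_async("b"))


results = asyncio.run(main())
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which thread finishes first.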
 def _get_cached_response(key: str):
     """Checks the cache for a valid (non-expired) entry."""
     if key in _response_cache:
             logger.info(f"🗑️  Expired cache entry removed for key: '{key[:50]}...'")
     return None
 def _cache_response(key: str, response: str):
     """Adds a response to the cache with the current timestamp."""
     _response_cache[key] = (time.time(), response)
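The cache helpers store `(timestamp, response)` tuples and treat an entry as valid only while it has not expired; the expiry check itself is elided in this diff. The sketch below shows one way such a time-to-live cache could work. The `TTLCache` class, the 300-second default, and the injectable `clock` are assumptions made for illustration and testability, not the PR's actual implementation.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Illustrative in-memory cache; class name and default TTL are assumptions."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def set(self, key: str, value: str) -> None:
        # Store the value together with the time it was cached.
        self._store[key] = (self._clock(), value)

    def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at < self._ttl:
            return value
        # Entry has expired: remove it so the store does not grow unbounded.
        del self._store[key]
        return None
```

Injecting the clock lets a test advance time without sleeping, for example by passing `clock=lambda: now[0]` and mutating `now[0]`.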

main.py CHANGED Viewed

@@ -24,19 +24,19 @@ try:
     # Import and run the FastAPI app
     from app.main import app
     import uvicorn
     logger.info("Successfully imported FastAPI app")
     if __name__ == "__main__":
         port = int(os.environ.get("PORT", 7860))
         logger.info(f"Starting server on port {port}")
         uvicorn.run(
-            app,
-            host="0.0.0.0",
             port=port,
             log_level="info"
         )
 except Exception as e:
     logger.error(f"Error starting application: {e}")
     raise

     # Import and run the FastAPI app
     from app.main import app
     import uvicorn
     logger.info("Successfully imported FastAPI app")
     if __name__ == "__main__":
         port = int(os.environ.get("PORT", 7860))
         logger.info(f"Starting server on port {port}")
         uvicorn.run(
+            app,
+            host="0.0.0.0",
             port=port,
             log_level="info"
         )
 except Exception as e:
     logger.error(f"Error starting application: {e}")
     raise

requirements.txt CHANGED Viewed

@@ -1,24 +1,35 @@
-# Core Framework
-fastapi~=0.116.1
-uvicorn~=0.35.0
-pydantic~=2.11.7
-pydantic-settings~=2.10.1
-python-multipart~=0.0.20
-jinja2~=3.1.2
-# AI/ML Libraries
-openai~=1.62.0
-chromadb~=1.0.15
-scikit-learn~=1.7.1
-numpy~=2.3.1
-requests~=2.32.4
-# File Processing Libraries
-PyMuPDF~=1.25.2
-python-docx~=1.1.2
-python-pptx~=1.0.2
-pandas~=2.2.3
-openpyxl~=3.1.5
-beautifulsoup4~=4.12.3
-striprtf~=0.0.26
-markdown~=3.7

+# Core FastAPI dependencies
+fastapi==0.104.1
+uvicorn==0.24.0
+python-multipart==0.0.6
+jinja2==3.1.2
+aiofiles==23.2.1
+# Configuration management
+pydantic==2.5.0
+pydantic-settings==2.1.0
+python-dotenv==1.0.0
+# Vector database and embeddings
+chromadb==0.4.18
+sentence-transformers==2.2.2
+scikit-learn==1.3.2
+numpy==1.24.4
+# HTTP client
+requests==2.31.0
+# Existing file processing
+pymupdf==1.23.9  # PDF processing
+# New file processing dependencies
+python-docx==1.1.0          # Word documents (.docx)
+python-pptx==0.6.23         # PowerPoint presentations (.pptx)
+pandas==2.1.4               # Excel and CSV files
+openpyxl==3.1.2             # Excel file support for pandas
+xlrd==2.0.1                 # Legacy Excel file support
+beautifulsoup4==4.12.2      # HTML parsing
+lxml==4.9.3                 # XML parsing (faster than built-in)
+markdown==3.5.1             # Markdown processing
+striprtf==0.0.26            # RTF file processing
+chardet==5.2.0              # Character encoding detection

static/app.js CHANGED Viewed

@@ -9,17 +9,13 @@ class ContextAwareApp {
             chatContainer: document.getElementById('chat-container'),
             statusIndicator: document.getElementById('status-indicator'),
             clearContextBtn: document.getElementById('clear-context-btn'),
-            clearHistoryBtn: document.getElementById('clear-history-btn'),
             indexContextBtn: document.getElementById('index-context-btn'),
             taskSelect: document.getElementById('task-select'),
             charCount: document.getElementById('char-count'),
             wordCount: document.getElementById('word-count'),
             // API Key elements
             apiKeyInput: document.getElementById('api-key-input'),
-            providerSelect: document.getElementById('provider-select'),
-            providerLink: document.getElementById('provider-link'),
-            providerModels: document.getElementById('provider-models'),
             testApiKeyBtn: document.getElementById('test-api-key'),
             saveApiKeyBtn: document.getElementById('save-api-key'),
             apiKeyStatus: document.getElementById('api-key-status'),
@@ -40,18 +36,6 @@ class ContextAwareApp {
             assistantHeader: document.getElementById('assistant-header'),
             assistantContent: document.getElementById('assistant-content'),
             assistantToggleIcon: document.getElementById('assistant-toggle-icon'),
-            // RAG Info Panel elements
-            ragInfoPanel: document.getElementById('rag-info-panel'),
-            ragInfoContent: document.getElementById('rag-info-content'),
-            toggleRagInfo: document.getElementById('toggle-rag-info'),
-            ragToggleText: document.getElementById('rag-toggle-text'),
-            ragStatus: document.getElementById('rag-status'),
-            ragChunks: document.getElementById('rag-chunks'),
-            ragContextSize: document.getElementById('rag-context-size'),
-            ragModel: document.getElementById('rag-model'),
-            ragRetrieval: document.getElementById('rag-retrieval'),
-            ragHistory: document.getElementById('rag-history'),
         };
         // Application state
@@ -62,11 +46,7 @@ class ContextAwareApp {
             apiKeyValidated: false,
             isTestingApiKey: false,
             userApiKey: '',
-            provider: 'openrouter',
-            conversationHistory: [],
-            ragInfoCollapsed: false,
-            chunksIndexed: 0,
-            contextSize: 0,
             apiSectionCollapsed: false,
             kbSectionCollapsed: false,
             assistantSectionCollapsed: false,
@@ -82,20 +62,18 @@ class ContextAwareApp {
         this.addEventListeners();
         this.loadStoredApiKey();
         this.setupResponsiveUI();
         // Show welcome message
         this.addMessageToChat(
             "👋 **Welcome to ContextIQ!**\n\n" +
             "To get started:\n" +
-            "1. **Choose your AI provider** (OpenRouter or OpenAI) in the configuration section above.\n" +
-            "2. **Enter your API key** for your chosen provider.\n" +
-            "3. **Add your context** by uploading a file or pasting text in the Knowledge Base.\n" +
-            "4. **Index the context** and start asking questions!\n\n" +
-            "🆓 **OpenRouter** offers free access to 200+ models including Claude, GPT, and Gemini!\n" +
-            "💡 **OpenAI** provides GPT-4o, GPT-4o-mini, and other cutting-edge models!",
             'system'
         );
         // Initial UI update
         this.updateUI();
         this.updateContextStats();
@@ -107,7 +85,6 @@ class ContextAwareApp {
     addEventListeners() {
         this.elements.indexContextBtn.addEventListener('click', () => this.handleIndexContext());
         this.elements.clearContextBtn.addEventListener('click', () => this.handleClearContext());
-        this.elements.clearHistoryBtn.addEventListener('click', () => this.clearConversationHistory());
         this.elements.sendButton.addEventListener('click', () => this.handleSubmit());
         this.elements.chatInput.addEventListener('keydown', e => {
             if (e.key === 'Enter' && !e.shiftKey) {
@@ -125,16 +102,11 @@ class ContextAwareApp {
             this.updateUI();
         });
         this.elements.chatInput.addEventListener('input', () => this.autoResizeTextarea(this.elements.chatInput));
         // File input listener
         this.elements.fileInput.addEventListener('change', () => this.handleFileSelection());
-        // Provider selection listener
-        this.elements.providerSelect.addEventListener('change', () => {
-            this.handleProviderChange();
-        });
         // API Key listeners
         this.elements.testApiKeyBtn.addEventListener('click', (e) => {
             e.preventDefault();
@@ -154,17 +126,12 @@ class ContextAwareApp {
                 this.testApiKey();
             }
         });
         // Toggle listeners for collapsible sections
         this.elements.toggleApiSection.addEventListener('click', () => this.toggleSection('api'));
         this.elements.kbHeader.addEventListener('click', () => this.toggleSection('kb'));
         this.elements.assistantHeader.addEventListener('click', () => this.toggleSection('assistant'));
-        // RAG info panel toggle
-        if (this.elements.toggleRagInfo) {
-            this.elements.toggleRagInfo.addEventListener('click', () => this.toggleRagInfo());
-        }
         // Listen for window resize to adjust UI
         window.addEventListener('resize', () => this.setupResponsiveUI());
     }
@@ -191,10 +158,10 @@ class ContextAwareApp {
      */
     setupResponsiveUI() {
         const isMobile = window.innerWidth < 1024;
         this.state.kbSectionCollapsed = isMobile;
         this.state.assistantSectionCollapsed = false;
         if (this.state.apiKeyValidated) {
             this.state.apiSectionCollapsed = true;
         }
@@ -237,23 +204,10 @@ class ContextAwareApp {
      */
     loadStoredApiKey() {
         try {
-            const storedKey = localStorage.getItem('ai_api_key');
-            const storedProvider = localStorage.getItem('ai_provider');
             if (storedKey) {
                 this.elements.apiKeyInput.value = storedKey;
                 this.state.userApiKey = storedKey;
-            }
-            if (storedProvider) {
-                this.state.provider = storedProvider;
-                this.elements.providerSelect.value = storedProvider;
-            }
-            // Update UI based on provider
-            this.handleProviderChange();
-            if (storedKey) {
                 this.onApiKeyInputChange();
             }
         } catch (error) {
@@ -266,17 +220,14 @@ class ContextAwareApp {
      */
     onApiKeyInputChange() {
         const apiKey = this.elements.apiKeyInput.value.trim();
-        const provider = this.state.provider;
         this.state.apiKeyValidated = false;
         this.state.userApiKey = '';
         if (!apiKey) {
             this.updateApiKeyStatus('pending', 'Enter API key and click Test');
-        } else if (provider === 'openrouter' && !apiKey.startsWith('sk-or-')) {
-            this.updateApiKeyStatus('error', 'OpenRouter keys should start with "sk-or-"');
-        } else if (provider === 'openai' && !apiKey.startsWith('sk-')) {
-            this.updateApiKeyStatus('error', 'OpenAI keys should start with "sk-"');
         } else if (apiKey.length < 40) {
             this.updateApiKeyStatus('error', 'API key appears too short');
         } else {
@@ -307,10 +258,7 @@ class ContextAwareApp {
             const response = await fetch('/api/v1/test-api-key', {
                 method: 'POST',
                 headers: { 'Content-Type': 'application/json' },
-                body: JSON.stringify({
-                    api_key: apiKey,
-                    provider: this.state.provider
-                }),
                 signal: controller.signal
             });
@@ -322,7 +270,6 @@ class ContextAwareApp {
                 this.state.apiKeyValidated = true;
                 this.state.userApiKey = apiKey;
                 this.updateApiKeyStatus('success', result.message || 'API key is valid');
-                this.updateRagInfo();
                 if (!silent) {
                     this.addMessageToChat("✅ **API Key Validated!** You can now use the assistant.", 'system');
                     this.state.apiSectionCollapsed = true;
@@ -338,7 +285,7 @@ class ContextAwareApp {
             console.error('API key test error:', error);
             this.state.apiKeyValidated = false;
             this.state.userApiKey = '';
             let errorMessage = (error.name === 'AbortError') ? 'Request timed out.' : error.message;
             this.updateApiKeyStatus('error', errorMessage);
             if (!silent) this.addMessageToChat(`❌ **Connection Error**: ${errorMessage}`, 'system');
@@ -358,56 +305,29 @@ class ContextAwareApp {
             return;
         }
         try {
-            localStorage.setItem('ai_api_key', apiKey);
-            localStorage.setItem('ai_provider', this.state.provider);
             this.updateApiKeyStatus('success', 'API key saved locally!');
-            const providerName = this.state.provider === 'openai' ? 'OpenAI' : 'OpenRouter';
-            this.addMessageToChat(`💾 **API Key Saved!** Your ${providerName} key will be remembered for future sessions.`, 'system');
         } catch (error) {
             console.error('Save error:', error);
             this.addMessageToChat("❌ **Save Failed**: Could not save API key to local storage.", 'system');
         }
     }
-    /**
-     * Handle provider selection change
-     */
-    handleProviderChange() {
-        this.state.provider = this.elements.providerSelect.value;
-        // Update placeholder text
-        if (this.state.provider === 'openai') {
-            this.elements.apiKeyInput.placeholder = 'sk-your-openai-api-key-here';
-            this.elements.providerLink.innerHTML = '• Get your OpenAI API key from <a href="https://platform.openai.com/api-keys" target="_blank" class="text-indigo-400 hover:text-indigo-300">platform.openai.com</a>';
-            this.elements.providerModels.textContent = '• Access GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5-turbo, and more models';
-        } else {
-            this.elements.apiKeyInput.placeholder = 'sk-or-your-openrouter-api-key-here';
-            this.elements.providerLink.innerHTML = '• Get your free API key from <a href="https://openrouter.ai/" target="_blank" class="text-indigo-400 hover:text-indigo-300">openrouter.ai</a>';
-            this.elements.providerModels.textContent = '• OpenRouter provides access to 200+ models including Claude, GPT, Gemini, and more';
-        }
-        // Reset validation state when provider changes
-        this.state.apiKeyValidated = false;
-        this.state.userApiKey = '';
-        this.onApiKeyInputChange();
-        this.updateUI();
-        this.updateRagInfo();
-    }
     /**
      * Update API key status display
      */
     updateApiKeyStatus(status, message) {
         const statusEl = this.elements.apiStatusText;
         const iconEl = this.elements.apiStatusIcon;
         const statusConfig = {
             testing: { icon: 'bg-blue-500 animate-pulse', text: 'Testing...' },
             success: { icon: 'bg-green-500', text: 'API Key Valid' },
             error:   { icon: 'bg-red-500', text: 'API Key Invalid' },
             pending: { icon: 'bg-yellow-500', text: 'API Key Pending' },
         };
         iconEl.className = `w-3 h-3 ${statusConfig[status].icon} rounded-full flex-shrink-0`;
         statusEl.textContent = statusConfig[status].text;
@@ -446,7 +366,7 @@ class ContextAwareApp {
             this.handleExecuteTask();
         }
     }
     /**
      * ✨ REFACTORED: Unified logic for indexing from file or text.
      */
@@ -501,20 +421,14 @@ class ContextAwareApp {
             }
             this.state.isIndexed = true;
-            this.state.chunksIndexed = result.documents_added || 0;
-            this.state.contextSize = textContext.length || result.extracted_text?.length || 0;
             this.showStatus(result.message || 'Successfully indexed context.', 'success');
-            // Populate textarea with extracted text if available
             if (result.extracted_text) {
                 this.elements.contextInput.value = result.extracted_text;
-                this.state.contextSize = result.extracted_text.length;
                 this.updateContextStats();
             }
-            // Update RAG info panel
-            this.updateRagInfo();
         } catch (error) {
             console.error('Indexing error:', error);
             this.showStatus(`Error: ${error.message}`, 'error');
@@ -529,44 +443,36 @@ class ContextAwareApp {
     }
     /**
-     * Handles sending a user's prompt to the backend for a response with conversation history.
      */
     async handleSendPrompt() {
         const prompt = this.elements.chatInput.value.trim();
         if (prompt.length < 2 || this.state.isGenerating) return;
         if (!this.state.isIndexed) {
             this.showStatus('Please index your context before asking questions.', 'error');
             return;
         }
-        // Add user message to chat and conversation history
         this.addMessageToChat(prompt, 'user');
-        this.state.conversationHistory.push({ role: 'user', content: prompt });
         this.elements.chatInput.value = '';
         this.autoResizeTextarea(this.elements.chatInput);
         this.state.isGenerating = true;
         this.updateUI();
-        this.updateRagInfo();
-        this.showStatus('AI is thinking with full conversation context...', 'loading');
         try {
             const controller = new AbortController();
-            const timeoutId = setTimeout(() => controller.abort(), 90000); // Increased timeout for better responses
-            // Send prompt with conversation history for context-aware responses
             const response = await fetch('/api/v1/generate', {
                 method: 'POST',
-                headers: {
                     'Content-Type': 'application/json',
                     'X-API-Key': this.state.userApiKey
                 },
-                body: JSON.stringify({
-                    prompt,
-                    conversation_history: this.state.conversationHistory.slice(-20) // Last 20 messages (10 exchanges)
-                }),
                 signal: controller.signal
             });
@@ -574,15 +480,7 @@ class ContextAwareApp {
             const result = await response.json();
             if (!response.ok) throw new Error(result.detail || 'An unknown error occurred.');
-            // Add AI response to chat and conversation history
             this.addMessageToChat(result.response, 'ai');
-            this.state.conversationHistory.push({ role: 'assistant', content: result.response });
-            // Limit history size to prevent memory issues (keep last 40 messages = 20 exchanges)
-            if (this.state.conversationHistory.length > 40) {
-                this.state.conversationHistory = this.state.conversationHistory.slice(-40);
-            }
             this.showStatus('Ready for your next question.', 'success');
         } catch (error) {
             console.error('Generation error:', error);
@@ -592,7 +490,6 @@ class ContextAwareApp {
         } finally {
             this.state.isGenerating = false;
             this.updateUI();
-            this.updateRagInfo();
         }
     }
@@ -626,7 +523,7 @@ class ContextAwareApp {
             const response = await fetch('/api/v1/task', {
                 method: 'POST',
-                headers: {
                     'Content-Type': 'application/json',
                     'X-API-Key': this.state.userApiKey
                 },
@@ -656,95 +553,25 @@ class ContextAwareApp {
      */
     async handleClearContext() {
         this.elements.contextInput.value = '';
-        this.elements.fileInput.value = '';
         this.elements.fileName.textContent = 'Choose a file...';
         this.updateContextStats();
         this.state.isIndexed = false;
-        this.state.chunksIndexed = 0;
-        this.state.contextSize = 0;
         this.showStatus('Clearing knowledge base...', 'loading');
         try {
-            await fetch('/api/v1/clear_index', {
                 method: 'POST',
                 headers: { 'X-API-Key': this.state.userApiKey }
             });
             this.showStatus('Knowledge base cleared. Ready for new context.', 'success');
-            if (this.elements.ragInfoPanel) {
-                this.elements.ragInfoPanel.classList.add('hidden');
-            }
         } catch (error) {
             console.error('Clear index error:', error);
             this.showStatus(`Error clearing index: ${error.message}`, 'error');
         } finally {
             this.updateUI();
-            this.updateRagInfo();
-        }
-    }
-    /**
-     * Clear conversation history for a fresh start
-     */
-    clearConversationHistory() {
-        if (this.state.conversationHistory.length === 0) {
-            this.showStatus('Conversation history is already empty.', 'success');
-            return;
         }
-        if (confirm(`Clear ${this.state.conversationHistory.length} message(s) from conversation history?\n\nThis will reset the AI's memory of the conversation.`)) {
-            this.state.conversationHistory = [];
-            this.showStatus('Conversation history cleared. The AI will start fresh.', 'success');
-            this.addMessageToChat('💭 **Conversation history cleared.** I\'ll start fresh with your next question!', 'system');
-            this.updateRagInfo();
-        }
-    }
-    toggleRagInfo() {
-        this.state.ragInfoCollapsed = !this.state.ragInfoCollapsed;
-        if (this.state.ragInfoCollapsed) {
-            this.elements.ragInfoContent.classList.add('hidden');
-            this.elements.ragToggleText.textContent = 'Show';
-        } else {
-            this.elements.ragInfoContent.classList.remove('hidden');
-            this.elements.ragToggleText.textContent = 'Hide';
-        }
-    }
-    updateRagInfo() {
-        if (!this.elements.ragInfoPanel) return;
-        // Show panel when indexed
-        if (this.state.isIndexed) {
-            this.elements.ragInfoPanel.classList.remove('hidden');
-        }
-        // Update status
-        if (this.state.isGenerating) {
-            this.elements.ragStatus.textContent = 'Processing...';
-            this.elements.ragStatus.className = 'font-medium text-yellow-400';
-        } else if (this.state.isIndexed) {
-            this.elements.ragStatus.textContent = 'Active';
-            this.elements.ragStatus.className = 'font-medium text-green-400';
-        } else {
-            this.elements.ragStatus.textContent = 'Ready';
-            this.elements.ragStatus.className = 'font-medium text-slate-400';
-        }
-        // Update chunks and context size
-        this.elements.ragChunks.textContent = this.state.chunksIndexed;
-        this.elements.ragContextSize.textContent = this.state.contextSize.toLocaleString() + ' chars';
-        // Update model info
-        const modelMap = {
-            'openrouter': 'DeepSeek R1 (Free)',
-            'openai': 'GPT-4o-mini'
-        };
-        this.elements.ragModel.textContent = this.state.apiKeyValidated ?
-            modelMap[this.state.provider] || this.state.provider : 'Not set';
-        // Update conversation history count
-        this.elements.ragHistory.textContent = this.state.conversationHistory.length + ' messages';
     }
     /**
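Review note on what this PR drops: the removed lines in `handleSendPrompt` kept a client-side `conversationHistory`, sent only its last 20 entries with each `/api/v1/generate` request, and trimmed the stored array to its last 40. Below is a minimal sketch of that windowing in isolation. The `createHistory` helper and its option names are hypothetical and not part of the repository, and it trims on every append, whereas the removed code trimmed only after an assistant reply.

```javascript
// Hypothetical helper (not in the repo) that roughly mirrors the removed history logic.
function createHistory({ maxStored = 40, maxSent = 20 } = {}) {
    let messages = [];
    return {
        // Append a message, then trim so at most `maxStored` entries remain.
        add(role, content) {
            messages.push({ role, content });
            if (messages.length > maxStored) {
                messages = messages.slice(-maxStored);
            }
        },
        // The slice that would be sent as `conversation_history` in the request body.
        payload() {
            return messages.slice(-maxSent);
        },
        size() {
            return messages.length;
        },
    };
}

const chatHistory = createHistory();
for (let i = 0; i < 45; i++) {
    chatHistory.add(i % 2 === 0 ? 'user' : 'assistant', `message ${i}`);
}
console.log(chatHistory.size());                // 40
console.log(chatHistory.payload().length);      // 20
console.log(chatHistory.payload()[0].content);  // "message 25"
```

With the history gone, each request carries only `{ prompt }`, so follow-up questions that rely on earlier turns will presumably need to restate their context.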

             chatContainer: document.getElementById('chat-container'),
             statusIndicator: document.getElementById('status-indicator'),
             clearContextBtn: document.getElementById('clear-context-btn'),
             indexContextBtn: document.getElementById('index-context-btn'),
             taskSelect: document.getElementById('task-select'),
             charCount: document.getElementById('char-count'),
             wordCount: document.getElementById('word-count'),
             // API Key elements
             apiKeyInput: document.getElementById('api-key-input'),
             testApiKeyBtn: document.getElementById('test-api-key'),
             saveApiKeyBtn: document.getElementById('save-api-key'),
             apiKeyStatus: document.getElementById('api-key-status'),
             assistantHeader: document.getElementById('assistant-header'),
             assistantContent: document.getElementById('assistant-content'),
             assistantToggleIcon: document.getElementById('assistant-toggle-icon'),
         };
         // Application state
             apiKeyValidated: false,
             isTestingApiKey: false,
             userApiKey: '',
+            // Collapse states for mobile view
             apiSectionCollapsed: false,
             kbSectionCollapsed: false,
             assistantSectionCollapsed: false,
         this.addEventListeners();
         this.loadStoredApiKey();
         this.setupResponsiveUI();
         // Show welcome message
         this.addMessageToChat(
             "👋 **Welcome to ContextIQ!**\n\n" +
             "To get started:\n" +
+            "1. **Enter your OpenRouter API key** in the configuration section above.\n" +
+            "2. **Add your context** by uploading a file or pasting text in the Knowledge Base.\n" +
+            "3. **Index the context** and start asking questions!\n\n" +
+            "🆓 You can get a free API key from [openrouter.ai](https://openrouter.ai) - no credit card required!",
             'system'
         );
         // Initial UI update
         this.updateUI();
         this.updateContextStats();
     addEventListeners() {
         this.elements.indexContextBtn.addEventListener('click', () => this.handleIndexContext());
         this.elements.clearContextBtn.addEventListener('click', () => this.handleClearContext());
         this.elements.sendButton.addEventListener('click', () => this.handleSubmit());
         this.elements.chatInput.addEventListener('keydown', e => {
             if (e.key === 'Enter' && !e.shiftKey) {
             this.updateUI();
         });
         this.elements.chatInput.addEventListener('input', () => this.autoResizeTextarea(this.elements.chatInput));
         // File input listener
         this.elements.fileInput.addEventListener('change', () => this.handleFileSelection());
         // API Key listeners
         this.elements.testApiKeyBtn.addEventListener('click', (e) => {
             e.preventDefault();
                 this.testApiKey();
             }
         });
         // Toggle listeners for collapsible sections
         this.elements.toggleApiSection.addEventListener('click', () => this.toggleSection('api'));
         this.elements.kbHeader.addEventListener('click', () => this.toggleSection('kb'));
         this.elements.assistantHeader.addEventListener('click', () => this.toggleSection('assistant'));
         // Listen for window resize to adjust UI
         window.addEventListener('resize', () => this.setupResponsiveUI());
     }
      */
     setupResponsiveUI() {
         const isMobile = window.innerWidth < 1024;
         this.state.kbSectionCollapsed = isMobile;
         this.state.assistantSectionCollapsed = false;
         if (this.state.apiKeyValidated) {
             this.state.apiSectionCollapsed = true;
         }
      */
     loadStoredApiKey() {
         try {
+            const storedKey = localStorage.getItem('openrouter_api_key');
             if (storedKey) {
                 this.elements.apiKeyInput.value = storedKey;
                 this.state.userApiKey = storedKey;
                 this.onApiKeyInputChange();
             }
         } catch (error) {
      */
     onApiKeyInputChange() {
         const apiKey = this.elements.apiKeyInput.value.trim();
         this.state.apiKeyValidated = false;
         this.state.userApiKey = '';
         if (!apiKey) {
             this.updateApiKeyStatus('pending', 'Enter API key and click Test');
+        } else if (!apiKey.startsWith('sk-or-')) {
+            this.updateApiKeyStatus('error', 'Key should start with "sk-or-"');
         } else if (apiKey.length < 40) {
             this.updateApiKeyStatus('error', 'API key appears too short');
         } else {
             const response = await fetch('/api/v1/test-api-key', {
                 method: 'POST',
                 headers: { 'Content-Type': 'application/json' },
+                body: JSON.stringify({ api_key: apiKey }),
                 signal: controller.signal
             });
                 this.state.apiKeyValidated = true;
                 this.state.userApiKey = apiKey;
                 this.updateApiKeyStatus('success', result.message || 'API key is valid');
                 if (!silent) {
                     this.addMessageToChat("✅ **API Key Validated!** You can now use the assistant.", 'system');
                     this.state.apiSectionCollapsed = true;
             console.error('API key test error:', error);
             this.state.apiKeyValidated = false;
             this.state.userApiKey = '';
             let errorMessage = (error.name === 'AbortError') ? 'Request timed out.' : error.message;
             this.updateApiKeyStatus('error', errorMessage);
             if (!silent) this.addMessageToChat(`❌ **Connection Error**: ${errorMessage}`, 'system');
             return;
         }
         try {
+            localStorage.setItem('openrouter_api_key', apiKey);
             this.updateApiKeyStatus('success', 'API key saved locally!');
+            this.addMessageToChat("💾 **API Key Saved!** It will be remembered for future sessions.", 'system');
         } catch (error) {
             console.error('Save error:', error);
             this.addMessageToChat("❌ **Save Failed**: Could not save API key to local storage.", 'system');
         }
     }
     /**
      * Update API key status display
      */
     updateApiKeyStatus(status, message) {
         const statusEl = this.elements.apiStatusText;
         const iconEl = this.elements.apiStatusIcon;
         const statusConfig = {
             testing: { icon: 'bg-blue-500 animate-pulse', text: 'Testing...' },
             success: { icon: 'bg-green-500', text: 'API Key Valid' },
             error:   { icon: 'bg-red-500', text: 'API Key Invalid' },
             pending: { icon: 'bg-yellow-500', text: 'API Key Pending' },
         };
         iconEl.className = `w-3 h-3 ${statusConfig[status].icon} rounded-full flex-shrink-0`;
         statusEl.textContent = statusConfig[status].text;
             this.handleExecuteTask();
         }
     }
     /**
      * ✨ REFACTORED: Unified logic for indexing from file or text.
      */
             }
             this.state.isIndexed = true;
             this.showStatus(result.message || 'Successfully indexed context.', 'success');
+            // NEW: Populate textarea with extracted text if available
             if (result.extracted_text) {
                 this.elements.contextInput.value = result.extracted_text;
                 this.updateContextStats();
             }
         } catch (error) {
             console.error('Indexing error:', error);
             this.showStatus(`Error: ${error.message}`, 'error');
     }
     /**
+     * Handles sending a user's prompt to the backend for a response.
      */
     async handleSendPrompt() {
         const prompt = this.elements.chatInput.value.trim();
         if (prompt.length < 2 || this.state.isGenerating) return;
         if (!this.state.isIndexed) {
             this.showStatus('Please index your context before asking questions.', 'error');
             return;
         }
         this.addMessageToChat(prompt, 'user');
         this.elements.chatInput.value = '';
         this.autoResizeTextarea(this.elements.chatInput);
         this.state.isGenerating = true;
         this.updateUI();
+        this.showStatus('AI is thinking...', 'loading');
         try {
             const controller = new AbortController();
+            const timeoutId = setTimeout(() => controller.abort(), 60000);
             const response = await fetch('/api/v1/generate', {
                 method: 'POST',
+                headers: {
                     'Content-Type': 'application/json',
                     'X-API-Key': this.state.userApiKey
                 },
+                body: JSON.stringify({ prompt }),
                 signal: controller.signal
             });
             const result = await response.json();
             if (!response.ok) throw new Error(result.detail || 'An unknown error occurred.');
             this.addMessageToChat(result.response, 'ai');
             this.showStatus('Ready for your next question.', 'success');
         } catch (error) {
             console.error('Generation error:', error);
         } finally {
             this.state.isGenerating = false;
             this.updateUI();
         }
     }
             const response = await fetch('/api/v1/task', {
                 method: 'POST',
+                headers: {
                     'Content-Type': 'application/json',
                     'X-API-Key': this.state.userApiKey
                 },
      */
     async handleClearContext() {
         this.elements.contextInput.value = '';
+        this.elements.fileInput.value = ''; // Also clear the file input
         this.elements.fileName.textContent = 'Choose a file...';
         this.updateContextStats();
         this.state.isIndexed = false;
         this.showStatus('Clearing knowledge base...', 'loading');
         try {
+            await fetch('/api/v1/clear_index', {
                 method: 'POST',
                 headers: { 'X-API-Key': this.state.userApiKey }
             });
             this.showStatus('Knowledge base cleared. Ready for new context.', 'success');
         } catch (error) {
             console.error('Clear index error:', error);
             this.showStatus(`Error clearing index: ${error.message}`, 'error');
         } finally {
             this.updateUI();
         }
     }
     /**
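The `+` lines in `onApiKeyInputChange` add a client-side format check before the key is tested: empty input is pending, a key without the `sk-or-` prefix is rejected, and a key shorter than 40 characters is flagged as too short. The sketch below expresses the same checks as a pure function, which may be easier to unit-test than the DOM-bound method. `checkApiKeyFormat` is a hypothetical name, and the final `ok` branch is an assumption, since the diff elides what follows the last `else`.

```javascript
// Hypothetical pure version of the checks added in onApiKeyInputChange().
// Returns the status and message the UI would pass to updateApiKeyStatus().
function checkApiKeyFormat(rawKey) {
    const apiKey = rawKey.trim();
    if (!apiKey) {
        return { status: 'pending', message: 'Enter API key and click Test' };
    }
    if (!apiKey.startsWith('sk-or-')) {
        return { status: 'error', message: 'Key should start with "sk-or-"' };
    }
    if (apiKey.length < 40) {
        return { status: 'error', message: 'API key appears too short' };
    }
    // Assumption: the elided else-branch treats the key as ready to be tested.
    return { status: 'ok', message: 'Format looks plausible; not yet verified' };
}

console.log(checkApiKeyFormat('   ').status);                      // "pending"
console.log(checkApiKeyFormat('sk-abc').message);                  // Key should start with "sk-or-"
console.log(checkApiKeyFormat('sk-or-short').message);             // API key appears too short
console.log(checkApiKeyFormat('sk-or-' + 'x'.repeat(40)).status);  // "ok"
```

A passing format check is only a hint. The `/api/v1/test-api-key` call remains the step that actually validates the key.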

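The new `handleSendPrompt` aborts the request after 60 seconds (down from 90) through an `AbortController`. The diff does not show whether the timer is cleared once the response arrives, so the sketch below pairs the signal with an explicit cancel step. `createTimeoutSignal` and `postJson` are hypothetical names; only the headers, the error-detail fallback, and the 60000 ms value are taken from the diff.

```javascript
// Hypothetical helper: an AbortSignal that fires after `ms` unless cancelled.
function createTimeoutSignal(ms) {
    const controller = new AbortController();
    const timeoutId = setTimeout(() => controller.abort(), ms);
    return {
        signal: controller.signal,
        // Call once the request settles so the timer cannot fire later.
        cancel: () => clearTimeout(timeoutId),
    };
}

// Hypothetical wrapper around the kind of request made in handleSendPrompt().
async function postJson(url, body, apiKey, timeoutMs = 60000) {
    const { signal, cancel } = createTimeoutSignal(timeoutMs);
    try {
        const response = await fetch(url, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json', 'X-API-Key': apiKey },
            body: JSON.stringify(body),
            signal,
        });
        const result = await response.json();
        if (!response.ok) throw new Error(result.detail || 'An unknown error occurred.');
        return result;
    } finally {
        cancel(); // runs on success, HTTP error, and abort alike
    }
}
```

A call would look like `await postJson('/api/v1/generate', { prompt }, userApiKey)`. When the timer fires, `fetch` rejects with an error named `AbortError`, the same name the `testApiKey` catch block checks to report "Request timed out."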
templates/index.html CHANGED

@@ -30,28 +30,28 @@
     </script>
     <style>
         /* General styling for a modern look and feel */
-        body {
-            font-family: 'Inter', sans-serif;
-            background: linear-gradient(135deg, #0f172a 0%, #1e293b 100%);
             overflow-x: hidden; /* Prevent horizontal scroll */
         }
-        .glass-effect {
-            background: rgba(30, 41, 59, 0.7);
-            backdrop-filter: blur(12px);
-            border: 1px solid rgba(148, 163, 184, 0.1);
         }
-        .gradient-text {
-            background: linear-gradient(135deg, #6366f1 0%, #8b5cf6 50%, #06b6d4 100%);
-            -webkit-background-clip: text;
-            -webkit-text-fill-color: transparent;
-            background-clip: text;
         }
         /* Custom scrollbar for a cleaner UI */
         .scroll-container::-webkit-scrollbar { width: 6px; }
         .scroll-container::-webkit-scrollbar-track { background: transparent; }
         .scroll-container::-webkit-scrollbar-thumb { background: #475569; border-radius: 3px; }
         /* Styling for markdown content rendered by marked.js */
         .markdown-content { word-wrap: break-word; }
         .markdown-content p { margin-bottom: 0.75rem; }
@@ -81,11 +81,11 @@
                         <svg width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M12 2L2 7l10 5 10-5-10-5zM2 17l10 5 10-5M2 12l10 5 10-5"/></svg>
                     </div>
                     <div>
-                        <h1 class="text-xl sm:text-2xl font-bold gradient-text">Context AI by Ab-Romia</h1>
                         <p class="text-xs sm:text-sm text-slate-400">Abdelrahman Abouroumia...</p>
                     </div>
                 </div>
                 <div id="api-status" class="flex items-center space-x-2 mt-2 sm:mt-0">
                     <div id="api-status-icon" class="w-3 h-3 bg-red-500 rounded-full flex-shrink-0"></div>
                     <span id="api-status-text" class="text-sm text-slate-400">API Key Required</span>
@@ -104,21 +104,13 @@
                         </svg>
                     </button>
                 </div>
                 <div id="api-key-content" class="space-y-4 mt-4">
-                    <div class="mb-3">
-                        <label for="provider-select" class="block text-sm font-medium text-slate-300 mb-2">Choose AI Provider:</label>
-                        <select id="provider-select" class="w-full bg-slate-900/50 border border-slate-600/50 rounded-lg p-3 text-slate-200 focus:ring-2 focus:ring-indigo-500 focus:outline-none transition">
-                            <option value="openrouter">OpenRouter (Free & Multiple Models)</option>
-                            <option value="openai">OpenAI (GPT-4, GPT-3.5, etc.)</option>
-                        </select>
-                    </div>
                     <div class="flex flex-col sm:flex-row space-y-2 sm:space-y-0 sm:space-x-3">
-                        <input
-                            type="password"
-                            id="api-key-input"
-                            placeholder="Enter your API key here..."
                             class="flex-1 bg-slate-900/50 border border-slate-600/50 rounded-lg p-3 text-slate-200 placeholder-slate-500 focus:ring-2 focus:ring-indigo-500 focus:outline-none transition"
                         >
                         <div class="flex space-x-2">
@@ -130,13 +122,12 @@
                             </button>
                         </div>
                     </div>
-                    <div id="provider-info" class="text-xs text-slate-400 space-y-1">
-                        <p id="provider-link">• Get your free API key from <a href="https://openrouter.ai/" target="_blank" class="text-indigo-400 hover:text-indigo-300">openrouter.ai</a></p>
                         <p>• Your API key is stored locally in your browser and never sent to our servers</p>
-                        <p id="provider-models">• OpenRouter provides access to 200+ models including Claude, GPT, Gemini, and more</p>
                     </div>
                     <div id="api-key-status" class="hidden p-3 rounded-lg text-sm"></div>
                 </div>
             </div>
@@ -147,11 +138,11 @@
                         <h2 class="text-lg font-semibold text-slate-200">📚 Knowledge Base</h2>
                         <svg id="kb-toggle-icon" class="w-5 h-5 transition-transform duration-300" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M6 9l6 6 6-6"/></svg>
                     </div>
                     <div id="kb-content" class="flex-1 p-4 sm:p-6 flex flex-col lg:flex">
                         <h2 class="hidden lg:block text-lg font-semibold text-slate-200">📚 Knowledge Base</h2>
                         <p class="hidden lg:block text-sm text-slate-400 mt-1 mb-4">Provide context for the AI to learn from.</p>
                         <div class="mb-4">
                             <label for="file-input" class="block text-sm font-medium text-slate-300 mb-2">Upload a document:</label>
                             <div class="flex items-center space-x-2">
@@ -170,7 +161,7 @@
                                 <p class="text-slate-400 italic">Uploading a file will replace the text in the text area below.</p>
                             </div>
                         </div>
                         <div class="flex-1 flex flex-col">
                            <textarea id="context-input" class="w-full flex-1 bg-slate-900/50 border border-slate-600/50 rounded-xl p-4 text-slate-200 placeholder-slate-500 focus:ring-2 focus:ring-indigo-500 focus:outline-none resize-none transition scroll-container min-h-[250px] lg:min-h-0" placeholder="... or paste your documents, meeting notes, or any relevant context here"></textarea>
                         </div>
@@ -206,43 +197,6 @@
                                 <option value="creative">Creative Writing</option>
                             </select>
                         </div>
-                        <!-- RAG System Info Panel -->
-                        <div id="rag-info-panel" class="hidden p-4 sm:p-4 border-b border-slate-600/30 bg-slate-900/30">
-                            <div class="flex items-center justify-between mb-3">
-                                <h3 class="text-sm font-semibold text-indigo-400">📊 RAG System Info</h3>
-                                <button id="toggle-rag-info" class="text-xs text-slate-400 hover:text-slate-300">
-                                    <span id="rag-toggle-text">Hide</span>
-                                </button>
-                            </div>
-                            <div id="rag-info-content" class="space-y-2 text-xs text-slate-300">
-                                <div class="flex justify-between">
-                                    <span class="text-slate-400">Status:</span>
-                                    <span id="rag-status" class="font-medium text-green-400">Ready</span>
-                                </div>
-                                <div class="flex justify-between">
-                                    <span class="text-slate-400">Indexed Chunks:</span>
-                                    <span id="rag-chunks" class="font-medium">0</span>
-                                </div>
-                                <div class="flex justify-between">
-                                    <span class="text-slate-400">Context Size:</span>
-                                    <span id="rag-context-size" class="font-medium">0 chars</span>
-                                </div>
-                                <div class="flex justify-between">
-                                    <span class="text-slate-400">AI Model:</span>
-                                    <span id="rag-model" class="font-medium">Not set</span>
-                                </div>
-                                <div class="flex justify-between">
-                                    <span class="text-slate-400">Retrieval Mode:</span>
-                                    <span id="rag-retrieval" class="font-medium">Smart (Overlapping)</span>
-                                </div>
-                                <div class="flex justify-between">
-                                    <span class="text-slate-400">Conversation:</span>
-                                    <span id="rag-history" class="font-medium">0 messages</span>
-                                </div>
-                            </div>
-                        </div>
                         <div id="chat-container" class="flex-1 overflow-y-auto scroll-container p-6 space-y-6 min-h-[300px] lg:min-h-0">
                             </div>
                         <div class="p-4 sm:p-6 border-t border-slate-600/30">
@@ -253,10 +207,8 @@
                                 </button>
                             </div>
                             <div class="flex items-center justify-between mt-3 text-xs text-slate-400 h-5">
-                                <div id="status-indicator" class="hidden"></div>
-                                <button id="clear-history-btn" class="px-3 py-1 text-xs bg-slate-700/50 text-slate-300 rounded-lg hover:bg-slate-600/50 transition-colors" title="Clear conversation history">
-                                    🗑️ Clear History
-                                </button>
                             </div>
                         </div>
                     </div>

     </script>
     <style>
         /* General styling for a modern look and feel */
+        body {
+            font-family: 'Inter', sans-serif;
+            background: linear-gradient(135deg, #0f172a 0%, #1e293b 100%);
             overflow-x: hidden; /* Prevent horizontal scroll */
         }
+        .glass-effect {
+            background: rgba(30, 41, 59, 0.7);
+            backdrop-filter: blur(12px);
+            border: 1px solid rgba(148, 163, 184, 0.1);
         }
+        .gradient-text {
+            background: linear-gradient(135deg, #6366f1 0%, #8b5cf6 50%, #06b6d4 100%);
+            -webkit-background-clip: text;
+            -webkit-text-fill-color: transparent;
+            background-clip: text;
         }
         /* Custom scrollbar for a cleaner UI */
         .scroll-container::-webkit-scrollbar { width: 6px; }
         .scroll-container::-webkit-scrollbar-track { background: transparent; }
         .scroll-container::-webkit-scrollbar-thumb { background: #475569; border-radius: 3px; }
         /* Styling for markdown content rendered by marked.js */
         .markdown-content { word-wrap: break-word; }
         .markdown-content p { margin-bottom: 0.75rem; }
                         <svg width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M12 2L2 7l10 5 10-5-10-5zM2 17l10 5 10-5M2 12l10 5 10-5"/></svg>
                     </div>
                     <div>
+                        <h1 class="text-xl sm:text-2xl font-bold gradient-text">ContextIQ by Ab-Romia</h1>
                         <p class="text-xs sm:text-sm text-slate-400">Abdelrahman Abouroumia...</p>
                     </div>
                 </div>
                 <div id="api-status" class="flex items-center space-x-2 mt-2 sm:mt-0">
                     <div id="api-status-icon" class="w-3 h-3 bg-red-500 rounded-full flex-shrink-0"></div>
                     <span id="api-status-text" class="text-sm text-slate-400">API Key Required</span>
                         </svg>
                     </button>
                 </div>
                 <div id="api-key-content" class="space-y-4 mt-4">
                     <div class="flex flex-col sm:flex-row space-y-2 sm:space-y-0 sm:space-x-3">
+                        <input
+                            type="password"
+                            id="api-key-input"
+                            placeholder="sk-or-your-openrouter-api-key-here"
                             class="flex-1 bg-slate-900/50 border border-slate-600/50 rounded-lg p-3 text-slate-200 placeholder-slate-500 focus:ring-2 focus:ring-indigo-500 focus:outline-none transition"
                         >
                         <div class="flex space-x-2">
                             </button>
                         </div>
                     </div>
+                    <div class="text-xs text-slate-400 space-y-1">
+                        <p>• Get your free API key from <a href="https://openrouter.ai/" target="_blank" rel="noopener noreferrer" class="text-indigo-400 hover:text-indigo-300">openrouter.ai</a></p>
                         <p>• Your API key is stored locally in your browser and never sent to our servers</p>
                     </div>
                     <div id="api-key-status" class="hidden p-3 rounded-lg text-sm"></div>
                 </div>
             </div>
                         <h2 class="text-lg font-semibold text-slate-200">📚 Knowledge Base</h2>
                         <svg id="kb-toggle-icon" class="w-5 h-5 transition-transform duration-300" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><path d="M6 9l6 6 6-6"/></svg>
                     </div>
                     <div id="kb-content" class="flex-1 p-4 sm:p-6 flex flex-col lg:flex">
                         <h2 class="hidden lg:block text-lg font-semibold text-slate-200">📚 Knowledge Base</h2>
                         <p class="hidden lg:block text-sm text-slate-400 mt-1 mb-4">Provide context for the AI to learn from.</p>
                         <div class="mb-4">
                             <label for="file-input" class="block text-sm font-medium text-slate-300 mb-2">Upload a document:</label>
                             <div class="flex items-center space-x-2">
                                 <p class="text-slate-400 italic">Uploading a file will replace the text in the text area below.</p>
                             </div>
                         </div>
                         <div class="flex-1 flex flex-col">
                             <textarea id="context-input" class="w-full flex-1 bg-slate-900/50 border border-slate-600/50 rounded-xl p-4 text-slate-200 placeholder-slate-500 focus:ring-2 focus:ring-indigo-500 focus:outline-none resize-none transition scroll-container min-h-[250px] lg:min-h-0" placeholder="... or paste your documents, meeting notes, or any relevant context here"></textarea>
                         </div>
                                 <option value="creative">Creative Writing</option>
                             </select>
                         </div>
                         <div id="chat-container" class="flex-1 overflow-y-auto scroll-container p-6 space-y-6 min-h-[300px] lg:min-h-0">
                             </div>
                         <div class="p-4 sm:p-6 border-t border-slate-600/30">
                                 </button>
                             </div>
                             <div class="flex items-center justify-between mt-3 text-xs text-slate-400 h-5">
+                                <div id="status-indicator" class="hidden"></div>
                             </div>
                         </div>
                     </div>