| --- |
| language: en |
| tags: |
| - chatbot |
| - natural language processing |
| license: Apache 2.0 |
| datasets: |
| - Custom Dataset (Dronealexa) |
| --- |
|
|
| Model Card: NLP-Based Chatbot |
|
|
| Overview |
|
|
| The NLP-Based Chatbot is designed to explore Science & Technology topics. It utilizes a combination of semantic search and summarization techniques to provide relevant and concise responses to user queries. |
|
|
| Model Details |
|
|
| - **Model Name:** NLP-Based Chatbot |
| - **Model Type:** Natural Language Processing (NLP) Chatbot |
| - **Framework:** Gradio Blocks Interface, spaCy, Transformers |
|
|
| Components |
|
|
| 1. Semantic Search |
|
|
| The chatbot employs semantic search to retrieve relevant information from a preprocessed dataset (Dronealexa.csv). The search is based on a TF-IDF vectorizer and cosine similarity calculations. |
|
|
| 2. Summarization |
|
|
| A summarization pipeline is used to generate concise summaries of the retrieved information. The Hugging Face Transformers library is utilized for summarization tasks. |
|
|
| 3. Custom Embeddings |
|
|
| The model incorporates custom text embeddings using spaCy and pre-trained word embeddings. These embeddings enhance the understanding of user queries and contribute to the semantic search. |
|
|
| 4. Gradio Blocks Interface |
|
|
| The chatbot's frontend is built using Gradio Blocks Interface, providing an interactive and user-friendly platform for users to input queries and receive responses. |
|
|
| 5. Model Card Generation |
|
|
| The model card generation involves constructing prompts based on search results and utilizing a summarization pipeline to produce model card content. |
|
|
| Intended Use |
|
|
| The NLP-Based Chatbot is intended for users interested in exploring Science & Technology topics. It can be used to obtain information from the provided dataset, and users are encouraged to provide feedback for continuous improvement. |
|
|
| Training Data |
|
|
| The model is trained on a custom dataset (Dronealexa.csv) containing Science & Technology-related information. The dataset has been preprocessed to handle missing values and ensure efficient semantic search. |
|
|
|
|
| Evaluation Metrics |
|
|
| - Semantic Search: TF-IDF Vectorizer, Cosine Similarity |
| - Summarization: Hugging Face Transformers Pipeline |
|
|
|
|
| Ethical Considerations |
|
|
| The chatbot aims to provide accurate and relevant information. However, users are advised to critically evaluate the responses and understand that the model's knowledge is based on the training data. |
|
|
|
|
| Usage Instructions |
|
|
| 1. Input your query in the provided textbox. |
| 2. Click the "Send" button to receive a response. |
| 3. Optionally, submit feedback using the "Submit Feedback" button. |
|
|
|
|
| License |
|
|
| This model is released under the Apache 2.0 License. |
|
|
|
|
| Contact Information |
|
|
| For inquiries or issues, please contact varsagupta07@gmail.com. |
|
|
|
|