This model is a fine-tuned version of Naman0807/gujarati-lm on a custom Gujarati Question Answering dataset. It is designed to generate answers given a context and a question.
This model is intended to answer questions based on a provided context in Gujarati. It uses a generative approach where the input is formatted as:
Context: <context> Question: <question> Answer:
and the model generates the answer.
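As a minimal sketch, the prompt template above can be wrapped in a small helper. The name `build_prompt` is hypothetical, and the newline separators are an assumption taken from the usage snippet in this card rather than a confirmed detail of the training data.

```python
def build_prompt(context: str, question: str) -> str:
    """Format a context/question pair into the Context/Question/Answer layout."""
    # Strip stray whitespace so the template stays stable across inputs.
    return f"Context: {context.strip()}\nQuestion: {question.strip()}\nAnswer:"


prompt = build_prompt(
    "ગુજરાત ભારતનું એક રાજ્ય છે. તેનું પાટનગર ગાંધીનગર છે.",
    "ગુજરાતનું પાટનગર કયું છે?",
)
print(prompt)
```

The prompt deliberately ends at `Answer:` so that the model's continuation is the answer itself.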
You can use this model with the Hugging Face pipeline.
from transformers import pipeline
# Load the pipeline
generator = pipeline("text-generation", model="Naman0807/fine_tuned_gujarati_qa") # Replace with your actual repo name if different
# Define context and question
context = "ગુજરાત ભારતનું એક રાજ્ય છે. તેનું પાટનગર ગાંધીનગર છે."
question = "ગુજરાતનું પાટનગર કયું છે?"
# Format the prompt
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
# Generate answer
output = generator(prompt, max_length=256, num_return_sequences=1, do_sample=True, temperature=0.7)
generated_text = output[0]['generated_text']
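# Extract the answer part: the pipeline returns the prompt followed by the continuation
answer = generated_text[len(prompt):].strip()
print(answer)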
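Because the text-generation pipeline returns the prompt followed by the continuation, it can be convenient to wrap prompting and answer extraction in one function. The sketch below makes several assumptions: `answer_question` is a hypothetical helper, and cutting the continuation at the first line break is a heuristic, not documented behavior of this model. It accepts any callable with the pipeline's return shape, so it is demonstrated here with a stand-in generator rather than the real model.

```python
from typing import Callable


def answer_question(generator: Callable, context: str, question: str, **gen_kwargs) -> str:
    """Build the prompt, run the generator, and return only the answer text."""
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
    output = generator(prompt, **gen_kwargs)
    full_text = output[0]["generated_text"]
    # Drop the echoed prompt if present, then keep the first line of the continuation.
    continuation = full_text[len(prompt):] if full_text.startswith(prompt) else full_text
    continuation = continuation.strip()
    return continuation.splitlines()[0].strip() if continuation else ""


def fake_generator(prompt, **kwargs):
    # Stand-in for the real pipeline: echoes the prompt and appends a fixed continuation.
    return [{"generated_text": prompt + " ગાંધીનગર\nContext: ..."}]


answer = answer_question(
    fake_generator,
    "ગુજરાત ભારતનું એક રાજ્ય છે. તેનું પાટનગર ગાંધીનગર છે.",
    "ગુજરાતનું પાટનગર કયું છે?",
)
print(answer)
```

With the real model, the `generator` object created by `pipeline(...)` can be passed in place of `fake_generator`, along with the same generation arguments used above.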
The model was fine-tuned on the Gujarati subset of the L3Cube-Pune IndicSQuAD dataset (l3cube-pune/indic-squad).
The dataset was formatted into a generative text format:
Context: ... Question: ... Answer: ...
training_args = TrainingArguments(
    output_dir="./fine_tuned_gujarati_qa",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=5e-5,
    fp16=True,
    ...
)
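The conversion of dataset records into the generative text format described above might look like the following sketch. The field names (`context`, `question`, and `answers` with a `text` list) follow the common SQuAD layout and are an assumption about this dataset's schema; `format_example` is a hypothetical helper, not the author's preprocessing code.

```python
def format_example(example: dict) -> dict:
    """Turn one SQuAD-style record into a single training string."""
    answers = example["answers"]["text"]
    # Use the first reference answer; fall back to an empty string when none is given.
    answer = answers[0] if answers else ""
    text = (
        f"Context: {example['context']}\n"
        f"Question: {example['question']}\n"
        f"Answer: {answer}"
    )
    return {"text": text}


sample = {
    "context": "ગુજરાત ભારતનું એક રાજ્ય છે. તેનું પાટનગર ગાંધીનગર છે.",
    "question": "ગુજરાતનું પાટનગર કયું છે?",
    "answers": {"text": ["ગાંધીનગર"], "answer_start": [0]},  # answer_start is unused here
}
formatted = format_example(sample)
print(formatted["text"])
```

If the data is loaded with the `datasets` library, a function of this shape could be applied to every record with `dataset.map(format_example)`.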