Audio Transcription Model Stages
This repository contains comparable exports of the same audio transcription stack at two training stages:
- stage_a
- stage_b
Each stage subfolder contains:
- the full Whisper encoder weights
- the full LLM weights
- the stage-specific audio_projector weights
- tokenizer and feature extractor files
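To check what each stage subfolder actually holds before downloading anything large, the repository's file list can be grouped by top-level folder. The following is a minimal sketch, assuming `huggingface_hub` is installed; `group_by_stage` and `list_stage_files` are hypothetical helper names, not part of this repository.

```python
from collections import defaultdict


def group_by_stage(paths):
    """Group repo-relative file paths by their top-level folder."""
    groups = defaultdict(list)
    for path in paths:
        head, sep, rest = path.partition("/")
        if sep:
            # "stage_a/config.json" -> groups["stage_a"] gets "config.json"
            groups[head].append(rest)
        else:
            # Files at the repository root are stored under the "" key.
            groups[""].append(path)
    return dict(groups)


def list_stage_files(repo_id="RaghaRao314159/transcription-models"):
    """Fetch the file list from the Hub (needs network access) and group it."""
    # Imported lazily so group_by_stage() works without huggingface_hub.
    from huggingface_hub import list_repo_files

    return group_by_stage(list_repo_files(repo_id))
```

Calling `list_stage_files()` returns a dictionary keyed by folder name, so the `stage_a` and `stage_b` entries can be compared side by side.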
Load A Specific Stage
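import torch
from transformers import AutoModel, AutoTokenizer, WhisperFeatureExtractor

# Training stage to load: "stage_a" or "stage_b".
stage = "stage_b"

# trust_remote_code=True allows Transformers to run custom model code from the repository.
model = AutoModel.from_pretrained(
    "RaghaRao314159/transcription-models",
    subfolder=stage,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # load weights in bfloat16
)

# Tokenizer for the LLM side of the stack.
tokenizer = AutoTokenizer.from_pretrained(
    "RaghaRao314159/transcription-models",
    subfolder=stage,
    trust_remote_code=True,
)

# Feature extractor that prepares audio input for the Whisper encoder.
feature_extractor = WhisperFeatureExtractor.from_pretrained(
    "RaghaRao314159/transcription-models",
    subfolder=stage,
)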
Compare Both Stages
python pull_model_and_infer.py --model-source "RaghaRao314159/transcription-models" --stage both --audio-path test.mp3
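The internals of pull_model_and_infer.py are not shown in this README. As a rough sketch of how a `--stage both` comparison could be driven from Python, the loading code from the previous section can be wrapped in a loop over both subfolders. `resolve_stages` and `load_stage` are hypothetical names, and accepting a single stage name in addition to `both` is an assumption.

```python
REPO_ID = "RaghaRao314159/transcription-models"
STAGES = ("stage_a", "stage_b")


def resolve_stages(choice):
    """Map a --stage style value to the list of subfolders to load."""
    if choice == "both":
        return list(STAGES)
    if choice in STAGES:
        return [choice]
    raise ValueError(f"unknown stage {choice!r}; expected one of {STAGES} or 'both'")


def load_stage(stage, repo_id=REPO_ID):
    """Load model, tokenizer and feature extractor for one stage."""
    # Heavy imports stay inside the function so resolve_stages()
    # can be used without torch or transformers installed.
    import torch
    from transformers import AutoModel, AutoTokenizer, WhisperFeatureExtractor

    model = AutoModel.from_pretrained(
        repo_id,
        subfolder=stage,
        trust_remote_code=True,
        torch_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(
        repo_id, subfolder=stage, trust_remote_code=True
    )
    feature_extractor = WhisperFeatureExtractor.from_pretrained(
        repo_id, subfolder=stage
    )
    return model, tokenizer, feature_extractor
```

Looping with `for stage in resolve_stages("both"): model, tokenizer, feature_extractor = load_stage(stage)` loads one stage at a time. The inference call itself depends on the repository's custom model code and is left out here.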