---
tags:
- generated_from_triptuner
- transformer
- character-level
- custom-model
license: mit
library_name: torch
pipeline_tag: text-generation
---

# Triptuner Model

Triptuner generates travel itineraries for locations in Sri Lanka's Central Province. It is a custom transformer-based language model that operates on character-level sequences.

## Usage

Because Triptuner uses a custom architecture, it cannot be served through Hugging Face's built-in Inference API. The instructions below show how to load and run the model manually with PyTorch.

### Load and Use the Model with PyTorch

```python
import torch
import torch.nn as nn

# Define your custom model class. It must match the architecture used in
# training exactly; otherwise load_state_dict() below fails on missing or
# unexpected keys.
class BigramLanguageModel(nn.Module):
    # Include the complete definition of your BigramLanguageModel here

    # Example method definitions:
    def __init__(self):
        super().__init__()
        # Define your model layers here as per the training setup
        # Example:
        # self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        # self.position_embedding_table = nn.Embedding(block_size, n_embd)
        # self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)])
        # self.ln_f = nn.LayerNorm(n_embd)
        # self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        # Define the forward pass as per your model
        raise NotImplementedError("Fill in the forward pass used in training")

    def generate(self, idx, max_new_tokens):
        # Implement the generate method for text generation
        raise NotImplementedError("Fill in the sampling loop")

# Load the model weights from Hugging Face
model = BigramLanguageModel()
model_url = "https://huggingface.co/yoonusajwardapiit/triptuner/resolve/main/pytorch_model.bin"
model_weights = torch.hub.load_state_dict_from_url(
    model_url, map_location=torch.device("cpu"), weights_only=True
)
model.load_state_dict(model_weights)
model.eval()

# Define your character mappings
chars = sorted(set("your_training_text_here"))  # Replace with the actual character set used in training
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for i, ch in enumerate(chars)}  # integer id -> character

def encode(s):
    """Map a string to a list of integer ids (raises KeyError on unseen characters)."""
    return [stoi[c] for c in s]

def decode(ids):
    """Map a list of integer ids back to a string."""
    return "".join(itos[i] for i in ids)

# Test the model with a sample prompt
prompt = "Hanthana"  # Replace with any relevant location or prompt
context = torch.tensor([encode(prompt)], dtype=torch.long)

# Generate text using the model
with torch.no_grad():
    generated = model.generate(context, max_new_tokens=250)  # Adjust the number of new tokens as needed

# Decode and print the generated text
generated_text = decode(generated[0].tolist())
print(generated_text)
```

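The `generate` method in the skeleton above is left unimplemented. As a rough sketch, autoregressive sampling for a character-level model often looks like the function below. It assumes that calling the model returns a `(logits, loss)` pair with logits of shape `(batch, time, vocab_size)`, which this card does not confirm. The name `sample_tokens` and its `block_size` argument are hypothetical; the default of 32 is taken from the context length listed under Model Architecture.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def sample_tokens(model, idx, max_new_tokens, block_size=32):
    """Append max_new_tokens sampled ids to idx, a (batch, time) LongTensor.

    Assumption: model(idx) returns (logits, loss), with logits shaped
    (batch, time, vocab_size).
    """
    for _ in range(max_new_tokens):
        # Crop to the last block_size ids so the position embedding
        # table is never indexed out of range.
        idx_cond = idx[:, -block_size:]
        logits, _ = model(idx_cond)
        # Keep only the prediction for the final position.
        probs = F.softmax(logits[:, -1, :], dim=-1)
        # Sample one id per batch row and append it to the sequence.
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat((idx, next_id), dim=1)
    return idx
```

Cropping the context before each forward pass is what lets `max_new_tokens` exceed the 32-character window: the model only ever sees the most recent 32 characters, while the full sequence keeps growing.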
## Training Data

The model was trained on a dataset describing locations in Sri Lanka's Central Province.

## Model Architecture

- Number of Layers: 4
- Embedding Size: 64
- Number of Heads: 4
- Context Length: 32 tokens (characters, since the model is character-level)

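These hyperparameters are enough to sketch what such a model might look like. The block below is not the author's definition: it is a minimal decoder-only transformer whose attribute names follow the commented hints in the usage skeleton (`token_embedding_table`, `blocks`, `ln_f`, `lm_head`). `VOCAB_SIZE = 65` is a placeholder, since the card does not state the vocabulary size, and `nn.MultiheadAttention` is swapped in for whatever attention code was used in training, so the parameter names will likely not match the published checkpoint.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Values from the Model Architecture list.
N_LAYER, N_EMBD, N_HEAD, BLOCK_SIZE = 4, 64, 4, 32
VOCAB_SIZE = 65  # placeholder: the real vocabulary size is not stated


class Block(nn.Module):
    """Pre-norm transformer block: causal self-attention, then an MLP."""

    def __init__(self, n_embd, n_head):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.ReLU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        T = x.size(1)
        # True above the diagonal: a position may not attend to later ones.
        mask = torch.triu(
            torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out
        return x + self.mlp(self.ln2(x))


class CharTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.token_embedding_table = nn.Embedding(VOCAB_SIZE, N_EMBD)
        self.position_embedding_table = nn.Embedding(BLOCK_SIZE, N_EMBD)
        self.blocks = nn.Sequential(*[Block(N_EMBD, N_HEAD) for _ in range(N_LAYER)])
        self.ln_f = nn.LayerNorm(N_EMBD)
        self.lm_head = nn.Linear(N_EMBD, VOCAB_SIZE)

    def forward(self, idx, targets=None):
        B, T = idx.shape  # T must not exceed BLOCK_SIZE
        pos = torch.arange(T, device=idx.device)
        x = self.token_embedding_table(idx) + self.position_embedding_table(pos)
        logits = self.lm_head(self.ln_f(self.blocks(x)))
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.view(B * T, -1), targets.view(B * T))
        return logits, loss


model = CharTransformer()
logits, loss = model(torch.zeros((2, BLOCK_SIZE), dtype=torch.long))
print(logits.shape)  # torch.Size([2, 32, 65])
```

The position embedding table has exactly `BLOCK_SIZE` rows, which is why prompts longer than 32 characters have to be cropped before they are passed to the model.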
## License

MIT License