	---
	tags:
	- generated_from_triptuner
	- transformer
	- character-level
	- custom-model
	license: mit
	library_name: torch
	pipeline_tag: text-generation
	---

	# Triptuner Model

	Triptuner is trained to generate itineraries for locations in Sri Lanka's Central Province.
	It is a custom transformer-based language model that reads and generates text one character at a time.
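
To make "character-level" concrete, the sketch below builds a tiny vocabulary in which every distinct character is one token, then round-trips a string through it. The `sample_text` string is a hypothetical stand-in; the real vocabulary has to come from the text the model was actually trained on.

```python
# Hypothetical sample text; the real vocabulary comes from the model's training data.
sample_text = "Hanthana mountain range, Kandy"

# One token per distinct character, in sorted order.
chars = sorted(set(sample_text))
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> character


def encode(s):
    """Map a string to a list of integer ids, one per character."""
    return [stoi[c] for c in s]


def decode(ids):
    """Map a list of integer ids back to a string."""
    return "".join(itos[i] for i in ids)


ids = encode("Kandy")
print(len(ids))     # one id per character
print(decode(ids))  # round-trips back to the original string

# Characters that never appeared in the training text have no id,
# so a prompt containing them would raise KeyError in encode().
unknown = [c for c in "Kandy!" if c not in stoi]
print(unknown)
```

Because the vocabulary is derived from the training text, a prompt should only use characters that appear in it.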

	## Usage

	The Triptuner model cannot be used directly with Hugging Face's built-in Inference API because it uses a custom architecture. Instead, define the model class yourself and load the weights manually with PyTorch, as described in the following section.

	### Load and Use the Model with PyTorch

	```python
	import torch
	import torch.nn as nn

	# Define your custom model class. It must match the architecture used in
	# training, otherwise load_state_dict() below fails on mismatched keys.
	class BigramLanguageModel(nn.Module):
	    # Include the complete definition of your BigramLanguageModel here

	    # Example method definitions:
	    def __init__(self):
	        super().__init__()
	        # Define your model layers here as per the training setup
	        # Example:
	        # self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
	        # self.position_embedding_table = nn.Embedding(block_size, n_embd)
	        # self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)])
	        # self.ln_f = nn.LayerNorm(n_embd)
	        # self.lm_head = nn.Linear(n_embd, vocab_size)

	    def forward(self, idx, targets=None):
	        # Define the forward pass as per your model
	        raise NotImplementedError

	    def generate(self, idx, max_new_tokens):
	        # Implement the generate method for text generation
	        raise NotImplementedError

	# Load the model weights from Hugging Face
	model = BigramLanguageModel()
	model_url = "https://huggingface.co/yoonusajwardapiit/triptuner/resolve/main/pytorch_model.bin"
	model_weights = torch.hub.load_state_dict_from_url(
	    model_url, map_location=torch.device("cpu"), weights_only=True
	)
	model.load_state_dict(model_weights)
	model.eval()  # switch layers such as dropout to inference behavior

	# Define your character mappings
	chars = sorted(set("your_training_text_here"))  # Replace with the actual character set used in training
	stoi = {ch: i for i, ch in enumerate(chars)}
	itos = {i: ch for i, ch in enumerate(chars)}

	def encode(s):
	    # Every character in s must be present in chars, or this raises KeyError
	    return [stoi[c] for c in s]

	def decode(ids):
	    return "".join(itos[i] for i in ids)

	# Test the model with a sample prompt
	prompt = "Hanthana"  # Replace with any relevant location or prompt
	context = torch.tensor([encode(prompt)], dtype=torch.long)

	# Generate text using the model
	with torch.no_grad():
	    generated = model.generate(context, max_new_tokens=250)  # Adjust the number of new tokens as needed

	# Decode and print the generated text
	generated_text = decode(generated[0].tolist())
	print(generated_text)
	```


	## Training Data

	The model was trained on a dataset containing information about various locations in Sri Lanka's Central Province.

	## Model Architecture

	- Number of Layers: 4
	- Embedding Size: 64
	- Number of Heads: 4
	- Context Length: 32 tokens (one token per character, since the model is character-level)
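
As a rough illustration of how these hyperparameters fit together, here is a minimal sketch of a character-level transformer with 4 layers, an embedding size of 64, 4 heads and a context length of 32. The top-level attribute names follow the commented example in the usage section. Everything else is an assumption: `vocab_size` is a placeholder, and the internals of `Block` (built here on `nn.MultiheadAttention`) are illustrative. The published checkpoint may use different submodules and parameter names, in which case `load_state_dict` will not accept this class without changes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hyperparameters taken from the Model Architecture list
n_layer, n_embd, n_head, block_size = 4, 64, 4, 32
vocab_size = 65  # ASSUMPTION: placeholder; use len(chars) from the real training text


class Block(nn.Module):
    """Hypothetical pre-norm transformer block: causal self-attention, then an MLP."""

    def __init__(self, n_embd, n_head):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.ReLU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        T = x.size(1)
        # Boolean mask: True marks future positions a token may not attend to
        causal = torch.triu(
            torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=causal, need_weights=False)
        x = x + attn_out                    # residual connection around attention
        return x + self.ffwd(self.ln2(x))   # residual connection around the MLP


class BigramLanguageModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd)
        self.position_embedding_table = nn.Embedding(block_size, n_embd)
        self.blocks = nn.Sequential(
            *[Block(n_embd, n_head=n_head) for _ in range(n_layer)]
        )
        self.ln_f = nn.LayerNorm(n_embd)
        self.lm_head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx, targets=None):
        B, T = idx.shape  # T must not exceed block_size
        tok = self.token_embedding_table(idx)
        pos = self.position_embedding_table(torch.arange(T, device=idx.device))
        logits = self.lm_head(self.ln_f(self.blocks(tok + pos)))
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.reshape(B * T, -1), targets.reshape(B * T))
        return logits, loss

    @torch.no_grad()
    def generate(self, idx, max_new_tokens):
        for _ in range(max_new_tokens):
            idx_cond = idx[:, -block_size:]              # keep the last 32 characters
            logits, _ = self(idx_cond)
            probs = F.softmax(logits[:, -1, :], dim=-1)  # distribution over next char
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat((idx, idx_next), dim=1)
        return idx


model = BigramLanguageModel()
idx = torch.zeros((1, 8), dtype=torch.long)
logits, loss = model(idx)
out = model.generate(idx, max_new_tokens=40)
print(logits.shape, out.shape)
```

Note the crop to the last `block_size` characters inside `generate`: the position embedding table has only 32 entries, so a longer prompt has to be truncated before each forward pass.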

	## License

	MIT License