license: mit
datasets:
  - takara-ai/micropajama
language:
  - en

From the Frontier Research Team at takara.ai, we present a linear projection model that maps Qwen embeddings (4096 dimensions) into the RWKV embedding space (768 dimensions) for cross-model compatibility.

Model Details

Input Dimensions: 4096 (Qwen embeddings)
Output Dimensions: 768 (RWKV embeddings)
Architecture: Linear layer (no bias)
Training: Cosine similarity loss on L2-normalized pairs
Dataset: takara-ai/micropajama_embedded_concat
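
The training code is not part of this card, so the following is only a minimal sketch of what a cosine similarity loss on L2-normalized pairs could look like. The loss form (1 minus cosine similarity, averaged over the batch), the SGD optimizer, the learning rate, the random stand-in data, and the name `cosine_projection_loss` are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def cosine_projection_loss(projected, target, eps=1e-8):
    """Mean (1 - cosine similarity) between projected and target embeddings.

    Hypothetical helper: the exact loss used for the released weights
    is not documented here.
    """
    projected = F.normalize(projected, p=2, dim=-1, eps=eps)
    target = F.normalize(target, p=2, dim=-1, eps=eps)
    return (1.0 - (projected * target).sum(dim=-1)).mean()


torch.manual_seed(0)

# Same layer shape as described above: 4096 -> 768, no bias.
projection = torch.nn.Linear(4096, 768, bias=False)
optimizer = torch.optim.SGD(projection.parameters(), lr=0.1)  # assumed optimizer

# Random stand-ins for a batch of paired (Qwen, RWKV) embeddings.
qwen_batch = F.normalize(torch.randn(8, 4096), p=2, dim=-1)
rwkv_batch = F.normalize(torch.randn(8, 768), p=2, dim=-1)

# One gradient step on the stand-in batch.
optimizer.zero_grad()
loss_before = cosine_projection_loss(projection(qwen_batch), rwkv_batch)
loss_before.backward()
optimizer.step()

with torch.no_grad():
    loss_after = cosine_projection_loss(projection(qwen_batch), rwkv_batch)

print(f"loss before: {loss_before.item():.4f}, after: {loss_after.item():.4f}")
```

Because the loss depends only on the direction of the projected vector, it leaves the scale of the output unconstrained.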

Usage

Quick Start

import torch
from huggingface_hub import PyTorchModelHubMixin

# Define the model class (copy this exactly)
class QwenRwkvProjection(torch.nn.Module, PyTorchModelHubMixin,
                        library_name="takara-ai",
                        tags=["embedding", "projection", "qwen", "rwkv"],
                        license="mit"):
    def __init__(self, din=4096, dout=768):
        super().__init__()
        self.linear = torch.nn.Linear(din, dout, bias=False)

    def forward(self, x):
        return self.linear(x)

# Load from Hub
model = QwenRwkvProjection.from_pretrained("takara-ai/qwen_rwkv_projection")
model.eval()

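# Project embeddings (don't forget to L2-normalize the inputs first!)
# Placeholder input for illustration; replace with real Qwen embeddings of shape (batch_size, 4096)
your_qwen_embeddings = torch.randn(2, 4096)
normalized_qwen_embeddings = torch.nn.functional.normalize(your_qwen_embeddings, p=2, dim=-1, eps=1e-8)
with torch.no_grad():  # inference only, so gradients are not needed
    projected_embeddings = model(normalized_qwen_embeddings)  # shape: (batch_size, 768)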

Important Notes

Dimensions: Input must be (batch_size, 4096), output will be (batch_size, 768)
Bias: Model uses no bias term (trained on normalized pairs)
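
The notes above can be turned into a small guard around the projection call. This is a hedged sketch, not part of the released code: `project_and_renormalize` is a hypothetical helper, and `stand_in` is an untrained layer with the same shape as the model, used so that the example runs without downloading weights. It also re-normalizes the output, on the assumption that the projected vectors will be compared by dot product, since a linear map does not generally preserve unit length.

```python
import torch
import torch.nn.functional as F


def project_and_renormalize(model, qwen_embeddings, din=4096):
    """Check the input shape, L2-normalize, project, and L2-normalize the result.

    Hypothetical helper for illustration only.
    """
    if qwen_embeddings.ndim != 2 or qwen_embeddings.shape[-1] != din:
        raise ValueError(
            f"expected shape (batch_size, {din}), got {tuple(qwen_embeddings.shape)}"
        )
    normalized = F.normalize(qwen_embeddings, p=2, dim=-1, eps=1e-8)
    with torch.no_grad():
        projected = model(normalized)
    # The projection does not keep vectors at unit length, so normalize again.
    return F.normalize(projected, p=2, dim=-1, eps=1e-8)


# Untrained stand-in with the documented layer shape (4096 -> 768, no bias).
stand_in = torch.nn.Linear(4096, 768, bias=False)

out = project_and_renormalize(stand_in, torch.randn(3, 4096))
print(out.shape)  # torch.Size([3, 768])

# With no bias term, an all-zero input maps to an all-zero output.
with torch.no_grad():
    zero_out = stand_in(torch.zeros(1, 4096))
print(bool(torch.all(zero_out == 0)))  # True
```

To use the released weights, pass the `model` loaded in the Quick Start instead of `stand_in`.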