---
license: apache-2.0
language:
- en
library_name: transformers.js
tags:
- code
- python
- maincoder
- code-generation
- reinforcement-learning
- mcpo
- onnx
pipeline_tag: text-generation
base_model: Maincode/Maincoder-1B
---

<img src="https://huggingface.co/datasets/Maincode/assets/resolve/e51154e034201be1a5dad0e9c8de31d8b9f17643/maincoder_logo.png" alt="Maincoder logo" width="1250">

[**Maincoder-1B-ONNX**](https://maincode.com/maincoder/) is the ONNX-optimized version of [Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B), a code-focused language model for code generation and completion tasks. This version enables fast inference with ONNX Runtime in Python and runs directly in the browser via Transformers.js.

# Key Features

- **ONNX Optimized**: Efficient inference with ONNX Runtime and KV-cache support.
- **Cross-Platform**: Runs in Python, Node.js, or directly in the browser.
- **Code Generation**: Optimized for Python code completion and generation tasks.
- **Compact Size**: 1 billion parameters, lightweight enough to run on consumer hardware.
- **SOTA Performance**: State-of-the-art results on the Python coding benchmarks HumanEval, HumanEval+, and MBPP+.

# Benchmark Results

<img src="https://huggingface.co/datasets/Maincode/assets/resolve/main/performance_h.png" alt="Benchmark Performance Across Baseline LLMs" width="1050">

| Model | HumanEval | HumanEval+ | MBPP+ | MMLU | GSM8K |
|---|---:|---:|---:|---:|---:|
| [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B) | **0.7622** | **0.7256** | **0.7090** | 0.3054 | 0.2976 |
| [deepseek-ai/deepseek-coder-1.3b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct) | 0.5610 | 0.5305 | 0.6217 | 0.2705 | 0.0413 |
| [HuggingFaceTB/SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B) | 0.5366 | 0.5000 | 0.6799 | **0.5928** | 0.5505 |
| [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) | 0.4634 | 0.4451 | 0.6561 | 0.4984 | 0.4944 |
| [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) | 0.4024 | 0.3780 | 0.5582 | 0.5571 | **0.6865** |

# Model Overview

Maincoder uses a modern transformer decoder architecture with:

- **Rotary Position Embeddings**: RoPE with a theta of 1,000,000.
- **RMSNorm**: Pre-normalization for stable training.
- **Grouped Query Attention**: 4:1 ratio of query to key-value heads.
- **QK Normalization**: RMSNorm applied to attention queries and keys.
- **SwiGLU MLP**: Gated linear units with SiLU activation, as sketched below.

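For readers who want the MLP spelled out, here is a minimal PyTorch sketch of a SwiGLU block matching the description above. The intermediate size is a placeholder, since this card does not publish it:

```python
import torch
import torch.nn.functional as F
from torch import nn

class SwiGLUMLP(nn.Module):
    """Gated MLP with SiLU activation, as described above.

    Note: `intermediate_size` is illustrative; the value used by
    Maincoder-1B is not listed in this card.
    """

    def __init__(self, hidden_size: int = 1536, intermediate_size: int = 4096):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: SiLU-gated linear unit followed by a down projection.
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))
```
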
| Attribute | Value |
|-----------|-------|
| Parameters | 1B |
| Hidden Size | 1536 |
| Layers | 32 |
| Attention Heads | 16 (4 KV heads) |
| Head Dimension | 96 |
| Vocabulary Size | 151,936 |
| Context Length | 2,048 |
| Format | ONNX |

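As a quick consistency check, the numbers in the table line up: 16 query heads of dimension 96 exactly span the 1536-dimensional hidden state, and the 4 KV heads give the 4:1 grouped-query ratio noted above:

```python
# Sanity check on the attention configuration listed above.
hidden_size = 1536
num_heads = 16
num_kv_heads = 4
head_dim = 96

# 16 query heads x 96 dims per head span the full hidden size.
assert num_heads * head_dim == hidden_size
# Grouped Query Attention: each KV head is shared by 4 query heads.
assert num_heads // num_kv_heads == 4
```
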
# Usage

## Python (ONNX Runtime)

### Installation

```bash
pip install optimum[onnxruntime] transformers
```

For GPU acceleration:

```bash
pip install optimum[onnxruntime-gpu]
```

### Quick Start

```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# Load the ONNX model with KV-cache support
model = ORTModelForCausalLM.from_pretrained(
    "Maincode/Maincoder-1B-ONNX",
    file_name="decoder_with_past_model.onnx",
    use_cache=True
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("Maincode/Maincoder-1B-ONNX")

# Code completion example
prompt = '''def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number."""
'''

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    temperature=0.2,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

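Since `ORTModelForCausalLM` exposes the standard `generate()` API, token streaming with `transformers`' `TextStreamer` should also work. A minimal sketch, reusing `model`, `tokenizer`, and `inputs` from the Quick Start:

```python
from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

model.generate(
    **inputs,
    max_new_tokens=128,
    temperature=0.2,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
    streamer=streamer,
)
```
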
### GPU Acceleration

```python
from optimum.onnxruntime import ORTModelForCausalLM

model = ORTModelForCausalLM.from_pretrained(
    "Maincode/Maincoder-1B-ONNX",
    use_cache=True,
    file_name="decoder_with_past_model.onnx",
    provider="CUDAExecutionProvider"
)
```

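If CUDA is not picked up, confirm that the GPU build of ONNX Runtime is installed; `onnxruntime.get_available_providers()` lists the execution providers available in your environment:

```python
import onnxruntime as ort

# CUDAExecutionProvider should appear here when onnxruntime-gpu is installed.
print(ort.get_available_providers())
```
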
---

## JavaScript (Transformers.js)

### Installation

```bash
npm install @huggingface/transformers
```

### Node.js

```javascript
import { AutoModelForCausalLM, AutoTokenizer } from '@huggingface/transformers';

// Load the tokenizer and model
const tokenizer = await AutoTokenizer.from_pretrained('Maincode/Maincoder-1B-ONNX');
const model = await AutoModelForCausalLM.from_pretrained('Maincode/Maincoder-1B-ONNX', {
  subfolder: '.', // model files live at the repo root
  model_file_name: 'decoder_with_past_model',
  use_external_data_format: true,
});

// Code completion example
const prompt = `def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number."""
`;

// The tokenizer returns input_ids and attention_mask tensors directly
const inputs = await tokenizer(prompt);

const outputs = await model.generate({
  ...inputs,
  max_new_tokens: 128,
  temperature: 0.2,
  do_sample: true,
});

const [decoded] = tokenizer.batch_decode(outputs, { skip_special_tokens: true });
console.log(decoded);
```

---

## Code Completion Examples

```python
# Function completion
prompt = '''def quicksort(arr: list) -> list:
    """Sort a list using the quicksort algorithm."""
'''

# Class completion
prompt = '''class BinarySearchTree:
    """A binary search tree implementation."""

    def __init__(self):
'''

# Algorithm implementation
prompt = '''def dijkstra(graph: dict, start: str, end: str) -> tuple:
    """Find the shortest path using Dijkstra's algorithm.

    Args:
        graph: Adjacency list representation of the graph
        start: Starting node
        end: Target node

    Returns:
        Tuple of (distance, path)
    """
'''
```

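Each of these prompts can be run through the same generation call. A small helper keeps that tidy; this is a sketch that assumes the `model` and `tokenizer` loaded in the Quick Start above:

```python
def complete(prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion for a code prompt with the loaded ONNX model."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=0.2,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(complete('''def quicksort(arr: list) -> list:
    """Sort a list using the quicksort algorithm."""
'''))
```
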
# Additional Notes

## Limitations

- Context length is limited to 2,048 tokens; truncate longer prompts (see the sketch below)
- Primarily optimized for Python; performance may vary on other languages
- May generate code with bugs or security issues, so always review generated code
- Browser performance depends on device capabilities

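Because of the 2,048-token context window, long prompts should be truncated before generation so that the prompt plus the completion still fit. A minimal sketch using the tokenizer's built-in truncation, reusing the `model` and `tokenizer` from the Quick Start (the oversized prompt is illustrative):

```python
MAX_CONTEXT = 2048
MAX_NEW_TOKENS = 128

long_prompt = "# ...a very long source file...\n" + "x = 0\n" * 5000  # illustrative

# Truncate so that prompt + completion fit within the context window.
inputs = tokenizer(
    long_prompt,
    return_tensors="pt",
    truncation=True,
    max_length=MAX_CONTEXT - MAX_NEW_TOKENS,
)
outputs = model.generate(**inputs, max_new_tokens=MAX_NEW_TOKENS)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
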
<div style="margin-left:14px; border-left:4px solid #3b82f6; background:rgba(59,130,246,0.08); padding:8px 10px; border-radius:8px; font-size:0.92em; margin:10px 0;">
<strong>Disclaimer</strong>: This model has <strong>not</strong> undergone any alignment or safety tuning (e.g., RLHF/RLAIF, DPO, or safety fine-tuning). Outputs may be unsafe or biased. Please use appropriate safeguards and evaluate carefully for your use case.
</div>

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Citation

```bibtex
@misc{maincoder2025,
  title        = {Maincoder-1B: A High-Performance 1B Parameter Coding Model},
  author       = {Maincode Team},
  year         = {2025},
  organization = {Maincode},
  howpublished = {\url{https://huggingface.co/Maincode/Maincoder-1B}}
}
```

## Related Models

- [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B) - the original PyTorch model

## Contact

For questions, issues, or collaboration inquiries, please visit [Maincode](https://maincode.com).