** THIS MODEL IS STILL BEING FINISHED AND POLISHED **

🧠 MiniAxion1-0.9M

MiniAxion1-0.9M is a Nano Reasoning Model (NRM) with ~920K parameters designed to explore the emergence of structured reasoning in extremely small neural networks.

Despite its minimal size, the model demonstrates strong consistency in reasoning format and step-based thinking using explicit <THINK> and <STEP> tokens.


🚀 Overview

  • Model Type: Nano Reasoning Model (NRM)
  • Parameters: ~920,833
  • Architecture: Transformer (6 layers: 2 entry + 2 shared + 2 exit)
  • d_model: 256
  • Heads: 8
  • FFN size: 512
  • LoRA Rank: 16
  • Vocabulary Size: 2048
  • Training Time: ~80 minutes (CPU)

🧠 Key Capabilities

✅ Structured Reasoning

The model reliably produces structured reasoning traces:

<THINK>
<STEP> ...
<STEP> ...
</THINK>
<ANS>...</ANS>
  • 100% usage of reasoning tokens
  • Consistent multi-step formatting
  • Stable output structure across tasks
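A trace in this format can be pulled apart with a few lines of standard-library code. This is a sketch: the tag names follow the format shown above, but the helper name and example trace are illustrative, not part of the released code.

```python
import re

def parse_trace(text):
    """Split a MiniAxion1-style trace into reasoning steps and the final answer."""
    # Reasoning steps live between <THINK> and </THINK>, one per <STEP> tag
    think = re.search(r"<THINK>(.*?)</THINK>", text, re.DOTALL)
    steps = (re.findall(r"<STEP>\s*(.*?)\s*(?=<STEP>|$)", think.group(1), re.DOTALL)
             if think else [])
    # The final answer is wrapped in <ANS> ... </ANS>
    ans = re.search(r"<ANS>(.*?)</ANS>", text, re.DOTALL)
    return steps, (ans.group(1).strip() if ans else None)

trace = "<THINK>\n<STEP> add the units\n<STEP> carry the ten\n</THINK>\n<ANS>12</ANS>"
steps, answer = parse_trace(trace)
# steps -> ["add the units", "carry the ten"], answer -> "12"
```

Because the model emits this structure so reliably, a parser like this rarely needs a fallback path.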

⚡ Ultra-Lightweight

  • Runs efficiently on CPU
  • Designed for experimentation and rapid iteration
  • Suitable for embedded or game-like environments

🧪 Research-Oriented Design

MiniAxion1 is not intended to compete with large-scale models. Instead, it is built to:

  • Study reasoning emergence in small models
  • Explore structure vs correctness trade-offs
  • Enable fast iteration cycles for AI research

📊 Evaluation Results

Task                       Accuracy
Arithmetic                     3.3%
Two-Step Arithmetic           10.0%
Even/Odd                     100.0%
Comparison                     5.0%
Pattern Completion             0.0%
Word Problems                  0.0%
Sorting                        0.0%
Chain-of-Thought Format      100.0%

Average Accuracy (across the seven reasoning tasks, excluding Chain-of-Thought Format): 16.9%
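The reported 16.9% matches the mean of the seven reasoning tasks with the Chain-of-Thought Format metric left out (it measures structure, not correctness). A quick check of that arithmetic:

```python
# Per-task accuracies from the evaluation table (percent)
results = {
    "Arithmetic": 3.3,
    "Two-Step Arithmetic": 10.0,
    "Even/Odd": 100.0,
    "Comparison": 5.0,
    "Pattern Completion": 0.0,
    "Word Problems": 0.0,
    "Sorting": 0.0,
    "Chain-of-Thought Format": 100.0,
}

# Average over reasoning tasks only, excluding the format metric
tasks = [v for k, v in results.items() if k != "Chain-of-Thought Format"]
average = sum(tasks) / len(tasks)
print(round(average, 1))  # 16.9
```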


🔍 Observations

  • The model learns reasoning structure before reasoning correctness
  • Chain-of-thought formatting is highly reliable
  • Arithmetic and symbolic reasoning remain limited at this scale
  • Evidence of partial decoupling between reasoning steps and final answers
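One way to quantify the "structure before correctness" observation is to score format validity and answer correctness independently for each sample. The sketch below assumes the tag format shown earlier; the sample traces are illustrative, not real model outputs.

```python
import re

def is_well_formed(trace):
    """Format check: a <THINK> block with at least one <STEP>, followed by <ANS>."""
    return bool(re.search(r"<THINK>.*<STEP>.*</THINK>\s*<ANS>.*</ANS>", trace, re.DOTALL))

def is_correct(trace, expected):
    """Answer check: the <ANS> content matches the expected answer."""
    m = re.search(r"<ANS>(.*?)</ANS>", trace, re.DOTALL)
    return m is not None and m.group(1).strip() == expected

# Illustrative samples: both are well-formed, but only the first answer is right
samples = [
    ("<THINK>\n<STEP> 2 + 2 = 4\n</THINK>\n<ANS>4</ANS>", "4"),
    ("<THINK>\n<STEP> 3 + 5 = 7\n</THINK>\n<ANS>7</ANS>", "8"),
]
format_rate = sum(is_well_formed(t) for t, _ in samples) / len(samples)
answer_rate = sum(is_correct(t, e) for t, e in samples) / len(samples)
# format_rate -> 1.0, answer_rate -> 0.5
```

Scoring the two axes separately is what makes the decoupling visible: a model can hit 100% on the first metric while staying near zero on the second.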

⚠️ Limitations

  • Weak performance on arithmetic and multi-step reasoning tasks
  • Susceptible to incorrect intermediate reasoning steps
  • Limited generalization beyond trained patterns
  • Not suitable for production use in critical systems
  • With only ~920K parameters, low evaluation scores are expected

🎯 Intended Use Cases

  • 🧪 AI research and experimentation
  • 🎮 Game AI / NPC reasoning simulation
  • 📚 Educational demonstrations of reasoning structure
  • ⚙️ Lightweight reasoning prototypes

Quick start


import torch
from model import NRMModel
from tokenizer import Tokenizer

# Load the model on CPU and switch to inference mode
model = NRMModel.from_config("config.json")
model.load_state_dict(torch.load("model.pt", map_location="cpu"))
model.eval()

tokenizer = Tokenizer.load("tokenizer.json")

def generate(prompt):
    # Encode, generate without tracking gradients, then decode
    tokens = tokenizer.encode(prompt)
    with torch.no_grad():
        output = model.generate(tokens)
    return tokenizer.decode(output)

print(generate("<INST>What is 2 + 2?</INST>"))

🧠 Philosophy

MiniAxion1 explores a key question:

Can structured reasoning emerge in extremely small models?

This model provides early evidence that:

  • Reasoning format can be learned efficiently
  • Structure and correctness are separable capabilities
  • Useful behavior can emerge even at sub-1M scale

🔮 Future Directions

  • Improved dataset alignment for arithmetic reasoning
  • Scaling parameters (1M → 10M range)
  • Better coupling between reasoning steps and final answers
  • Task-specific specialization (e.g., math-only variants)
  • Knowledge distillation from larger models

🤝 Acknowledgments

This model was developed as part of ongoing experimentation in nano-scale reasoning systems. The main question was: "How small can a model be and still think (or mimic thinking)?"


📎 Model

๐Ÿ‘‰ https://huggingface.co/AxionLab-Co/MiniAxion1-0.9M


🧪 Disclaimer

This is an experimental research model. Outputs may be incorrect even when reasoning appears structured or convincing.
