jph
phvv8
AI & ML interests
None yet
Recent Activity
reacted to DedeProGames's post about 1 hour ago
Introducing GRM2, a powerful 3-billion-parameter model designed for long-horizon reasoning and high performance on complex tasks.
Despite having only 3 billion parameters, it outperforms Qwen3-32B on several benchmarks and complex reasoning tasks.
With just 3 billion parameters, it can also generate extensive, complex code of over 1,000 lines, use tools at a level comparable to larger models, and is well suited to agentic tasks.
GRM2 is licensed under Apache 2.0, making it an ideal base for fine-tuning on other tasks.
You can see more here: https://huggingface.co/OrionLLM/GRM2-3b
reacted to Shrijanagain's post 2 days ago
Surya-1.1T: Scaling Beyond Human-Level Reasoning via 146 Trillion Token Pre-training
Author: SKT AI LABS
Affiliation: SKT AI Labs / Project Surya
Model Architecture: Optimized Dense Transformer
Parameters: 1.1 Trillion
Training Tokens: 146 Trillion
Want to collaborate? Let's start this journey together: we have collected 146 trillion tokens and completed pre-training, but we need help making the model more powerful.
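The parameter and token counts above imply an unusually large training budget. A quick sanity check of the implied token-to-parameter ratio (the ~20:1 "compute-optimal" rule of thumb from Hoffmann et al., 2022 is an outside reference point, not something stated in the post):

```python
# Numbers as claimed in the post.
params = 1.1e12   # 1.1 trillion parameters
tokens = 146e12   # 146 trillion training tokens

# Tokens seen per parameter during pre-training.
ratio = tokens / params
print(f"tokens per parameter: {ratio:.1f}")  # ~132.7

# The widely cited compute-optimal heuristic is roughly 20 tokens
# per parameter, so the claimed budget is about 6-7x beyond it.
heuristic_optimal_tokens = 20 * params
print(f"heuristic-optimal tokens: {heuristic_optimal_tokens:.2e}")
```

At roughly 133 tokens per parameter, the claimed run would be far past the compute-optimal point, which is plausible for inference-oriented "over-training" but is worth verifying against the whitepaper.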
Whitepaper - https://github.com/SHRIJANAGAIN/PROFF
Organizations
None yet