
Jinki Jeong PRO

Anserwise

AI & ML interests

None yet


Organizations

None yet

updated a model about 8 hours ago

Anserwise/AWAXIS-Think-27b

Text Generation • 27B • Updated about 8 hours ago

published a model about 8 hours ago

Anserwise/AWAXIS-Think-27b

Text Generation • 27B • Updated about 8 hours ago

reacted to SeaWolf-AI's post with ❤️👍🔥 about 16 hours ago

Post

1571

Darwin-TTS: 3% of an LLM's Brain Makes TTS Speak with Emotion — Zero Training

We blended 3% of Qwen3-1.7B (LLM) FFN weights into Qwen3-TTS-1.7B's talker module. The result: emotionally enhanced speech synthesis — with zero training, zero data, and zero GPU hours.

Try the Demo: https://huggingface.co/spaces/FINAL-Bench/Darwin-TTS-1.7B-Cross

Model Weights: https://huggingface.co/FINAL-Bench/Darwin-TTS-1.7B-Cross

Full Research Article: https://huggingface.co/blog/FINAL-Bench/darwin-tts

Qwen3-1.7B (LLM) and Qwen3-TTS-1.7B's talker share 100% identical architecture — same hidden_size (2048), same layers (28), same heads (16). This enabled pure 1:1 weight blending across 84 FFN tensors with a single lerp operation. At 3% blend, emotion appears. At 5%, emotion intensifies. At 10%, the model breaks — producing 655-second outputs for a 3-second sentence, because the LLM's "keep generating" pattern overwhelms the TTS stop signal.
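As a rough illustration of why the matching shapes matter, the sketch below compares two backbone configurations and counts the FFN tensors a 1:1 blend would touch. The `BackboneConfig` class and the assumption of three FFN projection matrices per layer are illustrative and not taken from the post; with 28 layers, that assumption happens to give the 84 tensors mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackboneConfig:
    """Hypothetical container for the three dimensions quoted in the post."""
    hidden_size: int
    num_layers: int
    num_heads: int


def can_blend_one_to_one(a: BackboneConfig, b: BackboneConfig) -> bool:
    """1:1 blending needs every tensor to have a same-shaped counterpart."""
    return a == b


def expected_ffn_tensors(cfg: BackboneConfig, projections_per_layer: int = 3) -> int:
    """Assumes three FFN projection matrices per layer (an assumption)."""
    return cfg.num_layers * projections_per_layer


llm = BackboneConfig(hidden_size=2048, num_layers=28, num_heads=16)
tts_talker = BackboneConfig(hidden_size=2048, num_layers=28, num_heads=16)

print(can_blend_one_to_one(llm, tts_talker))
print(expected_ffn_tensors(llm))
```

If any of the three dimensions differed, the tensors would no longer line up element by element and a plain interpolation would not be possible without extra projection or padding.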

To our knowledge, this is the first training-free cross-modal weight transfer between an LLM and a TTS model. Prior work either requires adapter training (SmolTolk, 2025), fine-tuning (CSLM, 2025), or massive end-to-end compute (GPT-4o). Darwin-TTS achieves cross-modal capability transfer in under 2 minutes on CPU.

The key insight: TTS models with LLM backbones already "think" in language. We're just restoring 3% of the original LLM's language understanding patterns — particularly those related to emotional semantics and prosody planning. The code is three lines: load the model, load the LLM FFN, call p.lerp_(llm_weight, 0.03).
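The blend described above can be sketched in plain Python, without any framework. This is a minimal illustration, not the authors' code: weights are modeled as dictionaries of float lists, the `.mlp.` name filter and the example parameter names are assumptions, and only the interpolation formula `a + t * (b - a)` comes from the lerp operation itself. PyTorch's in-place `Tensor.lerp_(end, weight)` computes the same formula on real tensors.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation: returns a when t == 0 and b when t == 1."""
    return a + t * (b - a)


def blend_ffn(tts_weights, llm_weights, alpha=0.03, ffn_marker=".mlp."):
    """Blend LLM FFN weights into the TTS weights at ratio alpha.

    Only parameters whose name contains ffn_marker (a hypothetical naming
    convention) and that exist in both models are blended; everything else
    is copied unchanged. Returns the new weights and the blended count.
    """
    blended = {}
    blended_count = 0
    for name, values in tts_weights.items():
        donor = llm_weights.get(name)
        if ffn_marker in name and donor is not None:
            if len(donor) != len(values):
                raise ValueError(f"shape mismatch for {name}")
            blended[name] = [lerp(v, d, alpha) for v, d in zip(values, donor)]
            blended_count += 1
        else:
            blended[name] = list(values)
    return blended, blended_count


# Toy weights: one FFN tensor and one attention tensor per model.
tts = {
    "layers.0.mlp.up_proj.weight": [1.0, 2.0],
    "layers.0.self_attn.q_proj.weight": [5.0],
}
llm = {
    "layers.0.mlp.up_proj.weight": [2.0, 0.0],
    "layers.0.self_attn.q_proj.weight": [9.0],
}

blended, count = blend_ffn(tts, llm, alpha=0.03)
print(count, blended)
```

At `alpha=0.03` each blended value moves 3% of the way from the TTS weight toward the LLM weight, while the attention tensor is left as it was.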

creators of the Darwin Evolutionary Merge Framework.
Darwin LLM V7 achieved GPQA Diamond 86.9% (HF Benchmark #3) through CMA-ES optimized FFN crossbreeding. Darwin-TTS extends this principle from LLM-to-LLM merging into cross-modal LLM-to-TTS transfer. Apache 2.0.

upvoted an article about 16 hours ago

Article

Darwin-TTS: We Gave a TTS Model 3% of an LLM's Brain — It Started Showing Emotion

about 17 hours ago


liked a Space about 16 hours ago

Darwin TTS 1.7B Cross

🦀

Darwin-TTS-1.7B-Cross

liked a model about 16 hours ago

FINAL-Bench/Darwin-TTS-1.7B-Cross

Text-to-Speech • Updated about 14 hours ago • 10

liked a dataset 3 days ago

Idavidrein/gpqa

Benchmark • Updated Mar 5 • 1.25k • 103k • 415

liked a model 3 days ago

FINAL-Bench/Darwin-27B-Opus

Text Generation • 28B • Updated 2 days ago • 59 • 25

liked a Space 11 days ago

Darwin 9B Opus

👀

The child surpassed both parents — that is evolution

liked a model 11 days ago

FINAL-Bench/Darwin-9B-Opus

Text Generation • 10B • Updated 5 days ago • 2.21k • 20

liked a Space 13 days ago

Gemma-4 Multichat

👀

Gemma 4 — MoE 26B or Dense 31B, Vision, Thinking

liked a model 14 days ago

FINAL-Bench/Darwin-35B-A3B-Opus-Q8-GGUF

Text Generation • 35B • Updated 5 days ago • 2.69k • 16

reacted to SeaWolf-AI's post with 🤗👍🚀❤️🔥 15 days ago

Post

2166

🧬 Darwin-35B-A3B-Opus — The Child That Surpassed Both Parents

What if a merged model could beat both its parents? We proved it can.
Darwin-35B-A3B-Opus is a 35B MoE model (3B active) built with our Darwin V5 engine — the first evolution system that CT-scans parent models before merging them.
🤗 Model: FINAL-Bench/Darwin-35B-A3B-Opus

The result speaks for itself: GPQA Diamond 90.0%, versus Father (Qwen3.5-35B-A3B) at 84.2% and Mother (Claude 4.6 Opus Distilled) at 85.0%. That's a relative gain of +6.9% over Father and +5.9% over Mother (5.8 and 5.0 percentage points, respectively). Not a tradeoff, but a genuine leap. Meanwhile, MMMLU sits at 85.0% (Father: 85.2%), multimodal is fully intact, and all 201 languages are preserved.
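The quoted gains can be checked with a few lines of arithmetic. The helper below is only an illustration (the function name is made up here); it computes both the absolute difference in percentage points and the relative improvement, which is the reading under which the post's 6.9% and 5.9% figures come out.

```python
def gains(child: float, parent: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute_points = child - parent
    relative_percent = (child - parent) / parent * 100
    return round(absolute_points, 1), round(relative_percent, 1)


over_father = gains(90.0, 84.2)
over_mother = gains(90.0, 85.0)

print("vs Father:", over_father)
print("vs Mother:", over_mother)
```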

How? Model MRI changed everything. Traditional merging is guesswork. Darwin V4 added evolution. Darwin V5 added X-ray vision. Model MRI scans each parent layer by layer and discovers: Mother's L34–L38 is the reasoning engine (peak cosine distance), 50–65% of Mother's experts are dead (killed by text-only distillation), and Father is a healthy generalist with every expert alive. The prescription: transplant Mother's reasoning brain at L38 (90% weight), replace her dead experts with Father's living ones, and let Father's router handle the output layer. Reasoning went up. Versatility stayed intact. No tradeoff — just evolution.
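Darwin V5's algorithm is unpublished, so the following is only a guess at the kind of measurements the paragraph above describes, not the actual Model MRI. It assumes that "peak cosine distance" means comparing corresponding layers of two models as flattened vectors, and that a "dead" expert is one that receives almost none of the routed tokens. All function names, the toy data, and the `min_share` threshold are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def peak_layer(layers_a, layers_b):
    """Index of the layer where the two models differ most, plus all distances."""
    distances = [cosine_distance(a, b) for a, b in zip(layers_a, layers_b)]
    peak = max(range(len(distances)), key=distances.__getitem__)
    return peak, distances


def dead_expert_fraction(routing_counts, min_share=0.001):
    """Fraction of experts receiving less than min_share of routed tokens."""
    total = sum(routing_counts)
    if total == 0:
        return 1.0
    dead = sum(1 for count in routing_counts if count / total < min_share)
    return dead / len(routing_counts)


# Toy data: three "layers" of two weights each, and four experts.
model_a = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
model_b = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

peak, distances = peak_layer(model_a, model_b)
dead_share = dead_expert_fraction([500, 0, 0, 500])

print(peak, distances, dead_share)
```

In a real setting the per-layer vectors would come from the models' weight tensors and the routing counts from running the MoE router over a sample of tokens; both are outside the scope of this sketch.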

35B total, 3B active (MoE) · GPQA Diamond 90.0% · MMMLU 85.0% (201 languages) · Multimodal Image & Video · 262K native context · 147.8 tok/s on H100 · Runs on a single RTX 4090 (Q4) · Apache 2.0
Darwin V5's full algorithm and technical details will be released alongside an upcoming paper.

🚀 Live Demo: FINAL-Bench/Darwin-35B-A3B-Opus

🏆 FINAL Bench Leaderboard: FINAL-Bench/Leaderboard

📊 ALL Bench Leaderboard: FINAL-Bench/all-bench-leaderboard

Built by VIDRAFT · Supported by the Korean Government GPU Support Program