
Jeremy Haschal

JermemyHaschal

AI & ML interests

None yet

Recent Activity

liked a model 4 days ago

unsloth/GLM-5.2-GGUF

new activity 9 days ago

TheDrummer/Rocinante-XL-16B-v1-GGUF:Query

new activity 2 months ago

cstr/parakeet-tdt-0.6b-v3-GGUF:CrispASR repo is gone - What to do?


Organizations

None yet

liked a model 4 days ago

unsloth/GLM-5.2-GGUF

Text Generation • 754B • Updated 2 days ago • 22.6k • 202

New activity in TheDrummer/Rocinante-XL-16B-v1-GGUF 9 days ago

Query

#3 opened 12 days ago by Dafurias

New activity in cstr/parakeet-tdt-0.6b-v3-GGUF 2 months ago

CrispASR repo is gone - What to do?

#1 opened 2 months ago by JermemyHaschal

New activity in nvidia/audio-flamingo-next-hf 2 months ago

FP8 version

#1 opened 2 months ago by JermemyHaschal

reacted to albertvillanova's post with 🤗 4 months ago

Post

2948

🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.

This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)

We’re excited to see what the community builds on top of this.

If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗

The future of ML tooling is agent-native.
🔗 https://github.com/huggingface/trl/releases/tag/v0.29.0

reacted to OzTianlu's post with 🤗 4 months ago

Post

3465

O(1) inference is the foundational design of Spartacus-1B-Instruct 🛡️ !

NoesisLab/Spartacus-1B-Instruct

We have successfully replaced the KV-cache bottleneck inherent in Softmax Attention with Causal Monoid State Compression. By defining the causal history as a monoid recurrence, the entire prefix is lossily compressed into a fixed-size state matrix per head.

The technical core of this architecture relies on the associativity of the monoid operator:

- Training: parallel prefix scan using Triton-accelerated JIT kernels to compute all prefix states simultaneously.
- Inference: true sequential updates. Memory and time complexity per token are decoupled from sequence length.
- Explicit Causality: we discard RoPE and attention masks. Causality is a first-class citizen, explicitly modeled through learned, content-dependent decay gates.
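
To make the role of associativity concrete, here is a minimal pure-Python sketch of a gated linear recurrence. It is an illustration only, not the Spartacus implementation: the post does not give the exact recurrence, so the update rule `S_t = a_t * S_(t-1) + outer(k_t, v_t)`, the scalar decay `a_t`, and the names `combine`, `sequential_states`, and `scan_states` are all assumptions. The sketch shows that a token-by-token update (inference style) and a parallel-friendly prefix scan (training style) produce the same fixed-size states whenever the combining operator is associative.

```python
# Hypothetical sketch: a decay-gated state recurrence written as a monoid.
# Each token contributes a pair (decay, update); nothing here is taken from
# the Spartacus code base.

def outer(k, v):
    """Outer product of two vectors, as a list of rows."""
    return [[ki * vj for vj in v] for ki in k]

def combine(left, right):
    """Associative operator on (decay, state) pairs: `left` happens first."""
    a1, s1 = left
    a2, s2 = right
    state = [[a2 * x + y for x, y in zip(r1, r2)] for r1, r2 in zip(s1, s2)]
    return (a1 * a2, state)

def sequential_states(elems):
    """Inference style: one fixed-size update per token."""
    out, acc = [], None
    for e in elems:
        acc = e if acc is None else combine(acc, e)
        out.append(acc)
    return out

def scan_states(elems):
    """Training style: Hillis-Steele inclusive scan. Within each round every
    combine is independent, so a real kernel could run them in parallel."""
    out = list(elems)
    step = 1
    while step < len(out):
        out = [out[i] if i < step else combine(out[i - step], out[i])
               for i in range(len(out))]
        step *= 2
    return out

# Toy inputs: (decay gate, key, value) per token. Values are chosen to be
# exactly representable in binary floating point.
tokens = [
    (0.5,  [1.0, 2.0], [1.0, 0.0]),
    (0.25, [0.0, 1.0], [2.0, 1.0]),
    (0.5,  [1.0, 1.0], [1.0, 1.0]),
    (1.0,  [2.0, 0.0], [0.0, 1.0]),
]
elems = [(a, outer(k, v)) for a, k, v in tokens]

seq = sequential_states(elems)
par = scan_states(elems)
print("final state (sequential):", seq[-1][1])
print("final state (scan):      ", par[-1][1])
```

In the sequential path the state stays the same size no matter how many tokens have been seen, which is the property the post describes as per-token cost being decoupled from sequence length. The scan path needs about log2(n) rounds instead of n dependent steps, which is what makes a parallel training kernel possible.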

Current zero-shot benchmarks demonstrate that Spartacus-1B-Instruct (1.3B) is already outperforming established sub-quadratic models like Mamba-1.4B and RWKV-6-1.6B on ARC-Challenge (0.3063). Recent integration of structured Chain-of-Thought (CoT) data has further pushed reasoning accuracy to 75%.

The "Spartacus" era is about scaling intelligence, not the memory wall ♾️.

New activity in TheDrummer/Rocinante-X-12B-v1-GGUF 4 months ago

Comparison with Rivermind-Lux-12B-v1b?

#1 opened 4 months ago by JermemyHaschal

reacted to Reubencf's post with 🔥 5 months ago

Post

2224

📢 New release! The World_events dataset is now available, featuring global events spanning 2023 through 2025.
🌍 https://huggingface.co/collections/Reubencf/world-events

🚀 2026 dataset dropping soon