Community Blog & Articles

Community Articles

Bringing Nunchaku 4-bit Diffusion Inference to Diffusers

The State of Simulation for Physical AI: An Overview

Grabette: an open system to record robot-manipulation data

Newer Models, Same Advantage

Security incident disclosure — July 2026

What building Shippy taught us about building agents

Model Routing Is Simple. Until It Isn’t.

Welcome Inkling by Thinking Machines

Introducing Real World VoiceEQ: Measuring the human quality of voice AI

Profiling in PyTorch (Part 3): Attention is all you profile

Native-speed vLLM transformers modeling backend

From Hugging Face to Amazon SageMaker Studio in one click

Hugging Face Models on Foundry Managed Compute

Run AI workloads on any cloud, store on Hugging Face: zero-egress storage with SkyPilot

New: Articles from Team or Enterprise organizations will get promoted to the main section.

Fine-tune video and image models at scale with NVIDIA NeMo Automodel and 🤗 Diffusers

Kimi K3 Model Overview: 2.8T Parameters, MXFP4 Quantization, and What the Open Weights Mean for the Community

NVIDIA Nemotron 3 Embed Ranks #1 Overall on RTEB, Advancing Agentic Retrieval

Introducing Cosmos 3 Edge

Aether-7B-5Attn: A 100% Open-Source Sovereign Foundation Model — and a Controlled Experiment in Heterogeneous Attention

Be Ready Before the Attack: A Practical Guide to Self-Hosting an Open Model for Cyber Defense

One Adapter, Both Modalities: Field Notes from Building and Serving a Multimodal Reranker

KV Caching Explained: Optimizing Transformer Inference Efficiency

POCKET: a 35-billion-parameter model that runs on your iPhone — and on your PC with no GPU

J-Space: Yet Another LLM Mind Reader?

The influx of specialist models on the Open SLM Leaderboard

Uncensor any LLM with abliteration

Introduction to State Space Models (SSM)

Code a simple RAG from scratch

Hugging Face on AMD Instinct MI455X: First Transformers Results

When will language models be good enough?

Tokenization is Killing our Multilingual LLM Dream

Introducing North Mini Code: Cohere’s First Model For Developers

Data for Agents

Small Language Models (SLM): A Comprehensive Overview
