`metadata.json` missing required fields on torch 2.7/2.8/2.9 build variants — breaks `kernels>=0.14`
Under `kernels>=0.14`, loading this kernel fails on older torch build variants because their `build/<variant>/metadata.json` is missing the now-required `name` / `id` fields (and in some cases the file is missing entirely). Newer variants (torch 2.10+) have already been migrated correctly, so the breakage is easy to miss: whether it occurs depends on the host's torch version.
Repro
On any environment that resolves to a torch 2.8 / cu128 build (e.g. the `pytorch/pytorch:2.8.0-cuda12.8-cudnn9-runtime` image):
```python
from kernels import get_kernel
get_kernel("kernels-community/flash-attn2")
```

```
ValueError: Cannot parse metadata from
`.../build/torch28-cxx11-cu128-x86_64-linux/metadata.json`:
missing field `name` at line 4 column 1
```
Current state of metadata.json across variants (main)
| Variant | Status |
|---|---|
| `torch27-cxx11-cu118/126/128` | missing file |
| `torch28-cxx11-cu126/128/129` | present but only `{"version": 1, "python-depends": []}`; missing `name`, `id`, `license`, `backend` |
| `torch29-cxx11-cu128-x86_64-linux` | partially migrated: has `license`/`backend` but still missing `name`/`id` |
| `torch210-*`, `torch211-*`, `torch212-*` | fully migrated ✓ |
For reference, a correct file (from `torch210-cxx11-cu128-x86_64-linux`):

```json
{
  "name": "flash-attn2",
  "id": "_flash_attn2_cuda_042c80b",
  "version": 1,
  "license": "BSD-3-Clause",
  "python-depends": [],
  "backend": {"type": "cuda", "archs": ["10.0", "12.0", "8.0", "9.0"]}
}
```
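To reproduce the per-variant audit above on a local checkout, a small script along these lines could help. It is only a sketch: the required-field list is taken from this report (`name`, `id`, `license`, `backend`), not from the `kernels` schema itself, and `missing_fields` / `audit_build_dir` are hypothetical helper names.

```python
import json
from pathlib import Path

# Assumption: this list mirrors the fields named in this report;
# the authoritative schema lives in the kernels library and may differ.
REQUIRED_FIELDS = ("name", "id", "license", "backend")


def missing_fields(metadata_text):
    """Return the required fields that are absent from a metadata.json document."""
    data = json.loads(metadata_text)
    return [field for field in REQUIRED_FIELDS if field not in data]


def audit_build_dir(build_dir):
    """Map each variant directory under build_dir to a short status string."""
    report = {}
    for variant in sorted(Path(build_dir).iterdir()):
        if not variant.is_dir():
            continue
        meta = variant / "metadata.json"
        if not meta.is_file():
            report[variant.name] = "missing file"
            continue
        missing = missing_fields(meta.read_text(encoding="utf-8"))
        report[variant.name] = ("missing " + ", ".join(missing)) if missing else "ok"
    return report
```

Calling `audit_build_dir("build")` from the root of a local clone of the repository would return one entry per variant directory, which can be compared against the table above.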
Impact
Anyone on torch 2.7 / 2.8 / 2.9, including users of the official `pytorch/pytorch:2.8.0-cuda12.8` base image (widely used in CI and HF Jobs), can no longer load this kernel under `kernels>=0.14`. `transformers --attn-implementation kernels-community/flash-attn2` breaks on those hosts. The two workarounds (pin `kernels<0.14`, or bump torch to 2.10) both work but require user-side changes.
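For users who want to know whether their environment falls into the affected range before choosing a workaround, a check like the following may be useful. It is a sketch based only on the combinations reported in this issue (`kernels` 0.14 or later with torch 2.7, 2.8 or 2.9); `major_minor`, `is_affected` and `check_installed` are hypothetical helper names, and other combinations are simply not covered.

```python
import re
from importlib import metadata


def major_minor(version):
    """Extract (major, minor) from a version string such as '2.8.0+cu128'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(kernels_version, torch_version):
    """True when the versions match the failing combinations reported in this issue."""
    new_kernels = major_minor(kernels_version) >= (0, 14)
    old_torch = major_minor(torch_version) in {(2, 7), (2, 8), (2, 9)}
    return new_kernels and old_torch


def check_installed():
    """Check the installed packages; returns None if either package is absent."""
    try:
        kernels_version = metadata.version("kernels")
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None
    return is_affected(kernels_version, torch_version)
```

If `check_installed()` returns `True`, either workaround above (pinning `kernels<0.14` or moving to torch 2.10) should apply until the metadata is backfilled.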
Ask
Backfill `metadata.json` (with `name`, `id`, `license`, `backend`) on the older torch build variants so the kernel loads under `kernels>=0.14` regardless of torch version. The `torch210+` entries can serve as the template.
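One possible shape for the backfill, sketched under assumptions: fields already present in a variant's file are kept, missing ones are copied from a migrated template, and `id` is supplied per variant rather than copied, since the example `id` above looks build-specific. `backfill_metadata` is a hypothetical helper, and whether `backend.archs` can be reused across variants is something the maintainers would need to confirm.

```python
def backfill_metadata(existing, template, variant_id):
    """Return a metadata dict with template fields filled in where missing.

    existing   -- the variant's current metadata dict, or None if the file is absent
    template   -- a fully migrated metadata dict (e.g. from a torch210 variant)
    variant_id -- the id for this variant (assumed to be build-specific)
    """
    # Start from the template, but never copy its id to another build.
    merged = {key: value for key, value in template.items() if key != "id"}
    # Values the variant already has take precedence over the template.
    merged.update(existing or {})
    # Only set the id if the variant did not already carry one.
    merged.setdefault("id", variant_id)
    return merged
```

For a torch 2.8 variant, `existing` would be `{"version": 1, "python-depends": []}` and the result would gain `name`, `license`, `backend` and the supplied `id`.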
Environment
- `kernels==0.14.0`
- `torch==2.8.0`, CUDA 12.8, Linux x86_64
- `kernels-community/flash-attn2@main` (commit `46e3484` at time of writing)