FlashHead: Accelerating Language Model Inference ~ *Efficient drop-in replacement for the classification head* 4 days ago • 1
Benchmarks + Report: Optimized Cosmos-Reason2 (Qwen3-VL) for on-device inference on 8GB RAM (Jetson Orin Nano Super) 15 days ago