embedl/Cosmos-Reason2-2B-W4A16-Edge2 Image-Text-to-Text • 2B • Updated 1 day ago • 14.6k • 11
Cosmos-Reason2 Collection nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl. • 8 items • Updated 2 days ago • 4
EdgeN Collection Quantization strategy where most weights are converted to INT4, activations remain in FP16, and sensitive layers are preserved in FP16. • 5 items • Updated 2 days ago • 1
FlashHead Collection Efficient Drop-In Replacement for the Classification Head in Language Model Inference. • 19 items • Updated 2 days ago • 1
Post 362 Pro tip: If you are finetuning any model with TensorBoard logs enabled, be sure to upload them to the HF Hub as event artifacts, where they can be viewed instantly. I remember this being done in the notus model release: argilla/notus-7b-v1 Examples: AINovice2005/ModernBERT-base-lora-cicflow-1m-r8, AINovice2005/ModernBERT-base-lora-cicflow-1m-r4, AINovice2005/ModernBERT-base-lora-cicflow-1m-r16 cc: @davidberenstein1957
Want to learn more about FlashHead? Check out this blog post: https://huggingface.co/blog/JonnaMat/flashhead
Article FlashHead: Accelerating Language Model Inference ~ *Efficient drop-in replacement for the classification head* 3 days ago • 1