All HF Hub posts

sagar007
posted an update 2 days ago
🚀 I built a Multimodal Vision-Language Model using Gemma-270M + CLIP!

Just finished training my multimodal model on the full LLaVA-Instruct-150K dataset (157K samples) and wanted to share the results!

🔧 What I Built:
A vision-language model that can understand images and answer questions about them, combining:
- Google Gemma-3-270M (language)
- OpenAI CLIP ViT-Large/14 (vision)
- LoRA fine-tuning for efficiency

📊 Training Stats:
- 157,712 training samples (full LLaVA dataset)
- 3 epochs on A100 40GB
- ~9 hours training time
- Final loss: 1.333 training / 1.430 validation
- Only 18.6M trainable params (3.4% of 539M total)
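
The figures in the list above can be sanity-checked with a few lines of arithmetic. This is only a sketch built from the numbers quoted in the post. The helper names are hypothetical, and the layer shape and rank passed to `lora_param_count` are made-up illustration values, not the model's actual configuration.

```python
def trainable_fraction(trainable: float, total: float) -> float:
    """Share of all parameters that receive gradient updates."""
    return trainable / total


def samples_per_second(samples: int, epochs: int, hours: float) -> float:
    """Average training throughput over the whole run."""
    return samples * epochs / (hours * 3600)


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Numbers quoted in the post
frac = trainable_fraction(18.6e6, 539e6)
throughput = samples_per_second(157_712, 3, 9)

# Hypothetical layer shape and rank, for illustration only
full_layer = 1024 * 1024
adapter = lora_param_count(1024, 1024, rank=8)

print(f"trainable share: {frac:.2%}")
print(f"average throughput: {throughput:.1f} samples/s")
print(f"LoRA adapter vs. full layer: {adapter / full_layer:.2%}")
```

The trainable share comes out at roughly 3.4 to 3.5%, in line with the quoted figure once rounding of the two parameter counts is taken into account.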

📈 Benchmark Results (sagar007/multigemma):
- VQA Accuracy: 53.8%
- Works great for: animal detection, room identification, scene understanding



🔗 **Try it yourself:**
- 🤗 Model: sagar007/multigemma
- 🎮 Demo: https://huggingface.co/spaces/sagar007/Multimodal-Gemma
- 💻 GitHub: https://github.com/sagar431/multimodal-gemma-270m

Built with PyTorch Lightning + MLflow for experiment tracking. Full MLOps pipeline with CI/CD!

Would love to hear your feedback! 🙏

#multimodal #gemma #clip #llava #vision-language #pytorch
Ujjwal-Tyagi
posted an update about 20 hours ago
So, Korean labs are also making great progress, right behind the Chinese ones.
Two of their open-source AI models are actually good at coding: upstage/Solar-Open-100B and skt/A.X-K1
DawnC
posted an update 1 day ago
VividFlow: Complete AI Image Transformation Platform 🎬🎨✨
Three powerful creative tools in one streamlined workspace. VividFlow combines professional video generation, intelligent background replacement, and artistic style transfer to transform your images with precision and creativity.

🎭 Triple Creative Powers
- Cinematic Video Generation transforms static images into smooth motion sequences from 0.5 to 5 seconds. Eight curated motion categories cover portraits, products, landscapes, and artistic content with precision-tuned templates.

- Intelligent Background Replacement generates photorealistic scenes from 24 professionally crafted presets spanning studios, natural environments, urban settings, and seasonal atmospheres. Advanced edge refinement handles complex subjects, while the built-in Touch Up tool eliminates artifacts through AI-powered inpainting for flawless results.

- Artistic Style Transfer converts photographs into stunning interpretations across six distinct styles including 3D Cartoon, Anime, Watercolor, and Oil Painting. Five balanced style blends create unique hybrid aesthetics, with optional Face Restore preserving subject identity during transformation.

⚡ Optimized Performance
Video generation completes in approximately 4 minutes with ongoing optimization targeting sub-60-second processing. Background replacement finishes in 30-40 seconds, while style transfer delivers results in 20-30 seconds. The independent three-tab architecture ensures smooth workflow without performance conflicts.

🎯 Professional Control
Seed-based reproducibility guarantees consistent results across all features. Background generation offers flexible composition modes, adjustable edge softening, and instant mask preview. Comprehensive parameter controls enable precise creative direction.
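
As a minimal illustration of what seed-based reproducibility means, the sketch below draws a small "noise" vector from a seeded generator. It is not VividFlow's code: `sample_latent` is a hypothetical helper, and a real image pipeline would seed its framework's own random number generator instead of Python's `random` module.

```python
import random


def sample_latent(seed: int, size: int = 4) -> list[float]:
    """Draw a small noise vector from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


first = sample_latent(seed=42)
second = sample_latent(seed=42)
other = sample_latent(seed=7)

print(first == second)  # same seed, same starting noise
print(first == other)   # different seed, different starting noise
```

Fixing the seed fixes the starting noise. On GPUs, bit-identical output can additionally depend on hardware and library versions.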

👉 Try it now: DawnC/VividFlow

Support with a ❤️: your engagement drives priorities!
#AI #DeepLearning #ImageToVideo #StyleTransfer #CreativeAI
ZennyKenny
posted an update 2 days ago
😎 My new personal website is live! Check out https://kennethhamilton.me to chat with an LLM about my professional skills and personal projects.

🙈 Think of it like a really, really vain version of ChatGPT.
MonsterMMORPG
posted an update 3 days ago
Quality and speed comparison (with CUDA 13 & Sage Attention) of BF16 vs GGUF Q8 vs FP8 Scaled vs NVFP4 for Z Image Turbo, FLUX Dev, FLUX SRPO, FLUX Kontext, and FLUX 2. A full 4K step-by-step tutorial is also published.

Full 4K tutorial : https://youtu.be/XDzspWgnzxI

Check the full 4K tutorial above to learn more and to see the images in their uncompressed original quality and size.

It has long been an open question how much quality and speed differ between BF16, GGUF, FP8 Scaled, and NVFP4. In this tutorial I compare all of these precision and quantization variants for both speed and quality, and the results are pretty surprising. We have also developed and published an NVFP4 quant generator app and an FP8 Scaled quant generator app. The links to the apps are below if you want to use them. Upgrading ComfyUI to CUDA 13 with properly compiled libraries is now strongly recommended, because we have observed noticeable performance gains with CUDA 13. This applies to both SwarmUI users and standalone ComfyUI users.
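
For rough context on the formats compared above, the weight-memory footprint of each one follows from its bits per weight. The sketch below is a back-of-the-envelope estimate, not a result from the tutorial. The 12-billion parameter count is a hypothetical example, and the bits-per-weight values are assumptions based on the usual block layouts (GGUF Q8 taken as Q8_0 with one 16-bit scale per 32 weights, NVFP4 with one 8-bit scale per 16 weights). Per-tensor scales and layers kept in higher precision are ignored.

```python
# Approximate storage cost per weight, in bits (assumptions, see the text above).
BITS_PER_WEIGHT = {
    "BF16": 16.0,
    "GGUF Q8": 8 + 16 / 32,   # Q8_0 layout: 8-bit values + one 16-bit scale per 32 weights
    "FP8 Scaled": 8.0,        # 8-bit values; scale overhead ignored
    "NVFP4": 4 + 8 / 16,      # 4-bit values + one 8-bit scale per 16 weights
}


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 12e9  # hypothetical model size, not taken from the post

for name, bits in BITS_PER_WEIGHT.items():
    size = weight_gib(N_PARAMS, bits)
    print(f"{name:<10} {bits:>5.1f} bits/weight  ~{size:.1f} GiB")
```

Smaller weights mainly reduce memory use and bandwidth. Actual speed and quality depend on kernel and hardware support, which is what the tutorial measures.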
marksverdhei
posted an update about 7 hours ago
Inspired by the heroes of day-zero quants (@TheBloke @danielhanchen @shimmyshimmer @bartowski), I decided to join the race by releasing the first FP8 quant of GLM-4.7-Flash! It was not as easy as I expected, but I'm happy I was still able to get it working within a few hours of the original model's release. I'd welcome feedback if anyone wants to try it out!

marksverdhei/GLM-4.7-Flash-FP8

Note: If my PR to vLLM isn't merged yet, you might have to use my fork. Cheers! 🤗
ZomiLanguage
posted an update about 16 hours ago
🧠🌍 Zomi Language AI: Community-Driven, Open-Source

[Image: Zomi Language AI – From Community to Model]

The **Zomi language** carries identity, faith, and history for its people, yet it remains underrepresented in modern AI systems.

This project introduces a **community-driven, open-source AI translation framework** that enables Zomi to be trained into AI systems **ethically, transparently, and sustainably**, by native speakers, for future generations.

### How It Works
🧑‍🤝‍🧑 Community Texts → 📦 Open Datasets → 🤖 AI Training → 📊 Evaluation → Community Review

### 🔓 Why Open-Source Matters
- 🤝 Community ownership
- 🕊️ Cultural & faith integrity
- ♻️ Long-term sustainability
- Transparent datasets & models

This initiative demonstrates how **low-resource languages can shape the future of inclusive AI** through open collaboration.

> *No language should be digitally invisible.*

**@Zomi Language | fb.com/ZomiLanguage**

### 🏷️ Tags
#OpenSourceAI #LowResourceLanguages #NLP #MachineTranslation #LanguagePreservation #CommunityAI #ZomiLanguage
AdinaY
posted an update about 23 hours ago
Z.ai just released a powerful, lightweight variant of GLM 4.7

✨ 30B total / 3B active, MoE

zai-org/GLM-4.7-Flash
MikeDoes
posted an update 1 day ago
How do you prove your new, specialized AI model is a better solution? You test it against the best.

That's why we were excited to see the new AdminBERT paper from researchers at Nantes Université and others. To show the strength of their new model for French administrative texts, they compared it to the state-of-the-art generalist model, NERmemBERT.

The direct connection to our work is clear: NERmemBERT was trained on a combination of datasets, including the Pii-masking-200k dataset by Ai4Privacy.

This is a perfect win-win for the open-source community. Our foundational dataset helps create a strong, general-purpose benchmark, which in turn helps researchers prove the value of their specialized work. This is how we all get better.

🔗 Great work by Thomas Sebbag, Solen Quiniou, Nicolas Stucky, and Emmanuel Morin on tackling a challenging domain! Check out their paper: https://aclanthology.org/2025.coling-main.27.pdf

🚀 Stay updated on the latest in privacy-preserving AI. Follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/

#OpenSource
#DataPrivacy
#LLM
#Anonymization
#AIsecurity
#HuggingFace
#Ai4Privacy
#Worldslargestopensourceprivacymaskingdataset