Article • Deploying Open Source Vision Language Models (VLM) on Jetson • nvidia • Feb 24 • 37
Article • NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI • nvidia • Jan 5 • 64
Article • Introducing NVIDIA Cosmos Policy for Advanced Robot Control • nvidia • Jan 29 • 48
Running on Zero • Agents • HunyuanWorld-Mirror • 183 • Universal 3D World Reconstruction with Any Prior Prompting
Article • SmolVLM - small yet mighty Vision Language Model • andito, merve, mfarre, eliebak, pcuenq • Nov 26, 2024 • 418
Post • 3928 • The new Qwen2-VL models seem to perform quite well in object detection. You can prompt them to respond with bounding boxes in a reference frame of 1k x 1k pixels and scale those boxes to the original image size. You can try it out with my space maxiw/Qwen2-VL-Detection. • 6 replies
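The rescaling step mentioned in the post can be sketched as follows. This is a minimal illustration, not the code used in the maxiw/Qwen2-VL-Detection space: the function name `scale_boxes`, the `(x1, y1, x2, y2)` box layout, and the default reference size of 1000 are assumptions made for the example, and parsing the model's text output into numbers is left out.

```python
from typing import Iterable, List, Tuple

# Assumed box layout: (x1, y1, x2, y2), expressed in the model's reference frame.
Box = Tuple[float, float, float, float]


def scale_boxes(
    boxes: Iterable[Box],
    image_width: int,
    image_height: int,
    ref_size: float = 1000.0,  # assumed side length of the 1k x 1k reference frame
) -> List[Box]:
    """Map boxes from a ref_size x ref_size frame to original pixel coordinates."""
    scaled = []
    for x1, y1, x2, y2 in boxes:
        scaled.append(
            (
                x1 * image_width / ref_size,   # x coordinates scale with the width
                y1 * image_height / ref_size,  # y coordinates scale with the height
                x2 * image_width / ref_size,
                y2 * image_height / ref_size,
            )
        )
    return scaled


if __name__ == "__main__":
    # A hypothetical box covering part of a 1920 x 1080 image.
    print(scale_boxes([(100, 200, 500, 1000)], image_width=1920, image_height=1080))
```

Because the width and height are scaled independently, the sketch also handles non-square images, where the reference frame's aspect ratio differs from the original.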
Article • Welcome PaliGemma 2 – New vision language models by Google • merve, andsteing, pcuenq, ariG23498 • Dec 5, 2024 • 166
Running on Zero • Agents • Featured • Florence2 + SAM2 🔥 • 517 • Segment and label objects in images or videos using text prompts