Moondream 2 has been very useful for me: I've been using it to automatically label object detection datasets for novel classes and distill an orders of magnitude smaller but similarly accurate CNN.
One oddity is that I haven't seen the claimed improvements beyond the 2025-01-09 tag - subsequent releases improve recall but degrade precision pretty significantly. It'd be amazing if object detection VLMs like this reported class confidences to better address this issue. That said, having a dedicated object detection API is very nice and absent from other models/wrappers AFAIK.
Looking forward to Moondream 3 post-inference optimizations. Congrats to the team. The founder Vik is a great follow on X if that's your thing.
One oddity is that I haven't seen the claimed improvements beyond the 2025-01-09 tag - subsequent releases improve recall but degrade precision pretty significantly. It'd be amazing if object detection VLMs like this reported class confidences to better address this issue. That said, having a dedicated object detection API is very nice and absent from other models/wrappers AFAIK.
Looking forward to Moondream 3 post-inference optimizations. Congrats to the team. The founder Vik is a great follow on X if that's your thing.