
This is very cool. I've been playing around in the same space with a simple tracked robot and a 2-DOF gripper. You seem to be quite a bit ahead of me in functionality.

https://imgur.com/a/WAHUIjQ

I'm using PaliGemma2 and MobileSAM for the vision part and Gemma for the thinking part. I'm hoping to stick with weights-available models as it's just a toy project.

For what it's worth, this contraption cost under £200, but I'm using a desktop and a 3090 as the brains.




Super cool, congrats man!

This is how it started for us too! Check this out: https://x.com/ax_pey/status/1853462975216234851

And like you, SAM + a VLM is the first thing we tried, and it already felt high-potential. It takes a lot of software work to put the right pieces together, but we think we've now ended up with something promising, scalable and extendable for a lot of people.

And on the price: same, our initial prototype was around $250, but I had to connect it to my computer. It's unclear to many in the field whether we'll be able to offload compute, with low enough latency, to a computer somewhere else in the house or even in the cloud. In the meantime at least, we decided to have onboard compute so that you can get started quickly. Even for you it would be useful, just because we did the work of putting all the hardware and electronics together, and the onboard computer is pretty good :)


I forgot to mention, there's a Raspberry Pi 4-ish on board, but yes, latency is something I'm trying to optimise for right now :-D


Same for us back then! If I did it today, though, I would love to try using an RPi 5; these look incredible. But honestly, NVIDIA just released their new Jetson Orin Nano Super for $250, and I think at this point it's a no-brainer to use this instead of an RPi.



