
I'd check out the OpenCV documentation and examples. The example in [0] is basically what I use for face recognition in videos; for recognising cars or other objects, you'd probably want to either train your own model or run a pretrained detector such as YOLOv3 through OpenCV (example: [1], but you'd need to steal the video reading code from the first link[0]).

[0] https://github.com/ageitgey/face_recognition/blob/master/exa...

[1] https://github.com/deveth0/python-opencv/tree/master/objectD...





Thanks. Also just kinda wondering if there have been any leaps lately, as I guess this is the same way one would have done it a few years ago. But now that one can upload images to multimodal LLMs and chat about them, I'm wondering if there are easier ways now (preferably without uploading a million images to the ChatGPT API and paying the cost).

Like, could I avoid training a model, specifying much, or becoming very knowledgeable in this domain? Are we there yet?

Could I say "detect the frame where each car passes position X in the video, and then grab the frame when the same car passes position Y", and then use the frame difference (together with the frame rate and the distance between X and Y) to work out the speeds? Or would I still have to do loads of code and training for something like this?

(I know I'm asking for a lot here, just curious what the SOTA is for this right now)
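
For what it's worth, the speed arithmetic is the easy part once you have the two frame numbers. A minimal sketch in plain Python, assuming you already know the video's frame rate and the real-world distance between X and Y (the function name `speed_kmh` and the numbers in the example are made up for illustration, and finding and matching the cars is not covered here):

```python
def speed_kmh(frame_at_x: int, frame_at_y: int, fps: float, distance_m: float) -> float:
    """Average speed of one car between positions X and Y.

    frame_at_x / frame_at_y: frame indices at which the car crossed X and Y.
    fps: frame rate of the video (assumed constant).
    distance_m: measured real-world distance between X and Y, in metres.
    """
    frames = frame_at_y - frame_at_x
    if frames <= 0:
        raise ValueError("the car must cross Y after it crosses X")
    if fps <= 0 or distance_m <= 0:
        raise ValueError("fps and distance_m must be positive")

    seconds = frames / fps               # elapsed time between the two crossings
    metres_per_second = distance_m / seconds
    return metres_per_second * 3.6       # convert m/s to km/h


# Hypothetical numbers: 30 fps video, X and Y 20 m apart,
# car crosses X at frame 120 and Y at frame 156 (36 frames = 1.2 s).
print(round(speed_kmh(120, 156, fps=30.0, distance_m=20.0), 1))
```

With those made-up numbers the result comes out at roughly 60 km/h. Note that the resolution is limited by the frame rate: being off by one frame at either position shifts the estimate by a few percent at these values, so a longer X-to-Y distance gives a steadier number.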



