Canny edge detection is optimal in some settings. Often, though, edge detection is in service of another goal, such as segmentation. Nowadays, computers are big/fast enough (GPUs) that building models that learn edge detection implicitly and do segmentation or the end task directly is feasible.
Still fun to learn about Hough transforms, canny edge detection, and all that 1990s computer vision though!
> Nowadays, computers are big/fast enough (GPUs) that building models that learn edge detection implicitly and do segmentation or the end task directly is feasible.
Wouldn't re-using Meta's SAM (Segment Anything Model) be sufficient here? (It's freely available and, I think, free to use?) Or do you think you need to build your own model specifically for detecting bills/sheets of paper?
Yeah, where licensing allows it, reusing existing model weights (possibly continuing to train on your specific task) is reasonable. I was just pointing out that these methods aren’t SOTA anymore.
To be fair, I have no idea how I could use modern techniques instead of the Hough Transform. One use case is recognizing car speed and RPM dials. Using the Hough Transform, it's trivial to reliably detect the needle's slope with a high degree of accuracy.
Have vision models started offering better alternatives for such use cases? It's a genuine question; it's been a while since I last looked.
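To make the dial use case concrete, here is a minimal, numpy-only sketch of line-Hough voting (the function name and the synthetic "needle" image are made up for illustration; a real dial reader would run this on a Canny edge map of the gauge):

```python
import numpy as np

def hough_dominant_angle(edge_img, n_thetas=180):
    """Vote each edge pixel into a (theta, rho) accumulator and return
    the theta (degrees) of the strongest line's normal."""
    ys, xs = np.nonzero(edge_img)
    thetas = np.deg2rad(np.arange(n_thetas))              # 0..179 degrees
    diag = int(np.ceil(np.hypot(*edge_img.shape)))
    acc = np.zeros((n_thetas, 2 * diag + 1), dtype=int)
    # rho = x*cos(theta) + y*sin(theta), shifted so indices are >= 0
    rhos = np.round(np.outer(xs, np.cos(thetas))
                    + np.outer(ys, np.sin(thetas))).astype(int) + diag
    for t in range(n_thetas):
        np.add.at(acc[t], rhos[:, t], 1)                  # one vote per pixel
    t_best, _ = np.unravel_index(np.argmax(acc), acc.shape)
    return t_best

# synthetic "needle": a vertical line of edge pixels at x = 10
img = np.zeros((50, 50), dtype=bool)
img[5:45, 10] = True
print(hough_dominant_angle(img))  # → 0 (the normal of a vertical line is horizontal)
```

OpenCV's `cv2.HoughLines` / `cv2.HoughLinesP` implement the same voting scheme much faster, and return the line parameters directly.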
You can pretty much solve this using modern DL models. There are options depending on how accurate you want your model to be and how much compute you have.
There is an entire spectrum of models, from something like Mask-RCNN or the U-Net family up to something like Meta's SAM, which you can use without even training.
In cases like this I'd probably use the Hough-based algorithm as ground truth to see if you can indeed fine-tune a DNN on that regression task. If it works with reasonable accuracy, you have a baseline that could be improved in multiple ways to surpass the original.
That said, there are not that many shapes of speedometers and wheels, and the viewpoint is likely controlled, so your old-school method is probably the better way ;)
For the purpose of learning, would you recommend some tutorials, articles or videos that help achieve that? Accuracy aside, this would make a great learning experience!
Is it better to look in the PyTorch community, or is this where some TensorFlow approaches shine? (CUDA is ok)
PyTorch is much nicer to play with in my opinion. Maybe start with their official tutorial, I've also heard good things about Karpathy's YouTube channel from beginners.
What if these filters are explicitly used as preprocessing steps when training a segmentation model? Would that at least save some epochs, if not increase accuracy?
I'm suspecting it could be similar to the learned vs. predefined positional embeddings in GPTs. That is, the learned version is a "warped and distorted" version of the exact predefined pattern, and yet somehow it performs a bit better, and no one knows exactly why.
> Still fun to learn about Hough transforms, canny edge detection, and all that 1990s computer vision though!
In my domain of metrology, this part isn't acceptable:
> you will get a model that can predict the exact edges of every input image.
Prediction is a nice way to limit the space I have to consider for a proper edge detection, but usually it's better to just go straight to something more deterministic.
The accuracy is probably quite bad when you look at individual manipulations, but the overall result is jaw-dropping - even if it's only a demonstration.
Not really related, but I personally dislike GaussianBlur as the default noise-removal method in all CV articles. In a lot of cases it's more useful to filter by color (like BLACK letters on white paper); otherwise a bilateral filter (also available out of the box in OpenCV) usually works better, especially if you want to do edge detection (Gaussian blur also blurs edges). Yes, GaussianBlur may be better performance-wise, but that's only something to consider if your real-time app is short on CPU power.
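As a toy 1D illustration of that point (hand-rolled kernels with made-up parameters, not OpenCV's implementations): a bilateral filter weights neighbors by both spatial distance and intensity difference, so it smooths noise within each flat region but refuses to average across a strong edge, while a Gaussian smears the edge along with the noise:

```python
import numpy as np

def gaussian_blur_1d(signal, sigma=2.0, radius=6):
    """Plain Gaussian smoothing: every neighbor contributes by distance only."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    return np.convolve(signal, k, mode='same')

def bilateral_1d(signal, sigma_s=2.0, sigma_r=20.0, radius=6):
    """Bilateral smoothing: neighbors across a big intensity jump get ~0 weight."""
    out = np.empty_like(signal, dtype=float)
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        spatial = np.exp(-((np.arange(lo, hi) - i) ** 2) / (2 * sigma_s**2))
        range_w = np.exp(-((window - signal[i]) ** 2) / (2 * sigma_r**2))
        w = spatial * range_w
        out[i] = (w * window).sum() / w.sum()
    return out

# noisy step edge: roughly "black letter (0) meeting white paper (255)"
rng = np.random.default_rng(0)
signal = np.concatenate([np.full(50, 255.0), np.full(50, 0.0)]) + rng.normal(0, 5, 100)

g = gaussian_blur_1d(signal)
b = bilateral_1d(signal)
# gradient magnitude at the edge: the bilateral output keeps the step far sharper
print(abs(np.diff(g)).max(), abs(np.diff(b)).max())
```

In OpenCV the 2D equivalents are `cv2.GaussianBlur(img, ksize, sigma)` versus `cv2.bilateralFilter(img, d, sigmaColor, sigmaSpace)`.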
> As for our case, we need to apply edge detection in order to find the edges of documents in order for our Document Scanner SDK to detect the document in real time and scan it.
I've seen some kind of edge detection used in automated KYC/AML pipelines, to detect obvious cut/paste in utility bills, passport scans, etc.
Forgot the name of the (commercial) software but it's a thing.
It's of course possible to fool these but a great many dumb copy/paste which do look perfect to a human will actually become obvious after applying simple edge detection.
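A toy illustration of why that works (numpy-only Sobel; the synthetic "document" values are made up): a pasted patch just 3 grey levels off the background is essentially invisible to the eye, but its boundary lights up in the gradient magnitude:

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude via the 3x3 Sobel operators."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode='edge')
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for dy in range(3):                       # correlate with both kernels
        for dx in range(3):
            patch = pad[dy:dy + h, dx:dx + w]
            gx += kx[dy, dx] * patch
            gy += ky[dy, dx] * patch
    return np.hypot(gx, gy)

# "scan": background grey 200; the pasted patch is 203 -- about a 1%
# brightness difference that a human reviewer would likely never notice
doc = np.full((40, 40), 200.0)
doc[10:20, 10:20] = 203.0

mag = sobel_magnitude(doc)
print(mag.max())  # strong response only along the paste boundary
```

GIMP's Filters → Edge-Detect menu and OpenCV's `cv2.Sobel` give the same kind of picture on a real scan.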
FWIW, GIMP has Sobel edge detection for sure (don't know about the others), so you can "play" with edge detection if you're on Linux.
Are there good / standard pre-trained models that exist for doing edge detection? Surely the only good answer isn't "train one yourself from scratch" or "use the Docutain SDK" -- what good pre-trained models have people used for this task?
I can't say for sure now, but one time I was helping with a similar project for bank agents working "in the field." The core requirement was that the data was strictly confidential and could not be processed anywhere the bank hadn't approved. These were scanned documents of a credit application form. The value added by this software was that internal processes could start right away after sending the pictures. Using this application, the bank assumed the chain of trust was maintained; without it, the credit application waited for manual form verification.
EDIT: I remembered one anecdote. At some point, we had poor accuracy with the white paper on a white background. I really liked how it was solved: agents were equipped with a branded black silk cloth to use as a background.