Their documentation is very nice, and back when I was implementing linear MSER for OpenCV, I learned a lot from their semilinear MSER implementation.
But I never understood why they wanted to mimic an object-oriented interface in their implementation. Their abstraction in particular strikes me as "weird": each algorithm implementation is an "object" that encapsulates some intermediate data or retains some parameter settings? That is a far stretch from what these computer vision algorithms actually do.
In my mind, computer vision algorithms are functions: they take an image or image sequence in and emit semantic information out. Reusable intermediate data should be an implementation detail; it shouldn't dictate what the interface looks like.
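To make that concrete, this is roughly the shape of interface I mean. It is an illustrative sketch only, not the actual libccv API: the names and types here are hypothetical.

```c
/* Hypothetical function-style interface: the image and parameters go in,
 * semantic information comes out, and any reusable scratch data (pyramids,
 * buffers, ...) stays hidden inside the call. */
typedef struct { float x, y, scale, angle; } feature_t;
typedef struct { float peak_thresh, edge_thresh; int nlevels; } feature_params_t;

/* Takes an image, emits features; intermediate data is managed internally. */
int detect_features(const unsigned char *image, int width, int height,
                    const feature_params_t *params,
                    feature_t **features, int *nfeatures);
```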
Sorry if I am being too opinionated on this topic; I am just very passionate about good interface design.
In vlfeat there is a strong focus on the algorithms themselves, since it is mainly developed by a researcher for his own research. This is the reason for the "weird" object-oriented interface.
Keeping the "status" of the algorithm in an object gives the following advantages:
- The interface encourages cleaner code: computer vision algorithms often come with many parameters. Otherwise one has to keep them in a homemade data structure, or in many variables and constants, and then call a function with a very long signature. That produces not-so-clean code, and the long signature can lead to mistakes in the parameter values. An object with get/set functions makes it easier to define the parameters incrementally and call the algorithm in a clean, less error-prone way (see the sketch after this list).
- Many computer vision algorithms are iterative. For this kind of algorithm, having an object that represents its state makes it easy to stop, restart, or continue it. Being able to check the convergence of certain algorithms is vital in computer vision (SVMs, for example).
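For a concrete feel of both points, here is roughly how the SIFT filter object is driven in vlfeat, based on my recollection of its documentation (exact names and signatures may differ slightly): parameters are set on the object one at a time, and the scale-space is processed octave by octave, so the caller can stop early or inspect the state between steps.

```c
#include <vl/sift.h>

void run_sift(const float *image, int width, int height)
{
    /* The filter object owns the scale-space buffers and the parameters. */
    VlSiftFilt *filt = vl_sift_new(width, height,
                                   -1,   /* noctaves: let the library decide */
                                   3,    /* nlevels per octave */
                                   0);   /* o_min: first octave index */

    /* Parameters are set incrementally instead of one huge signature. */
    vl_sift_set_peak_thresh(filt, 0.01);
    vl_sift_set_edge_thresh(filt, 10.0);

    /* The algorithm is driven one octave at a time, so the caller can
       stop early, inspect intermediate results, or interleave other work. */
    int err = vl_sift_process_first_octave(filt, image);
    while (err != VL_ERR_EOF) {
        vl_sift_detect(filt);
        int nkeys = vl_sift_get_nkeypoints(filt);
        const VlSiftKeypoint *keys = vl_sift_get_keypoints(filt);
        /* ... compute orientations/descriptors for keys[0..nkeys) ... */
        (void)keys; (void)nkeys;
        err = vl_sift_process_next_octave(filt);
    }

    vl_sift_delete(filt);
}
```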
You are right that implementation details shouldn't leak into the interfaces. But vlfeat doesn't have a general "extractFeatures" function; it implements different well-known algorithms. So it makes sense to assume that someone who uses SIFT has a basic knowledge of how it works and prefers to exploit it to the fullest (the documentation provides basic information on every algorithm anyway).
So I think the two libraries are different: libccv focuses on developers who want to use computer vision algorithms without deep knowledge of them, while vlfeat requires proper computer vision knowledge from the developer but gives them more power.
It is widely used in the research community, with new algorithms coming soon (check this: http://eccv2012.unifi.it/program/tutorials/modern-features-a...).