One advantage of this approach is that there is less compiler magic going on. I use a similar approach, but I prefer type-safe upcasting and model-checked downcasting via inline functions or explicit references to base members, instead of direct C-style casting.
This also makes it easier to develop a uniform resource management strategy with allocator abstraction. Being able to easily switch between tuned bucket, pool, or bump allocation strategies can do wonders for optimization.
It's possible to model-check that downcasting is done correctly by adding support for type checks at analysis time. In this case, a type variable is added to the base type that can be compared before casting. Since this is an analysis-only variable, it can be wrapped in a macro so that it is eliminated during normal compilation. Static assertions checked by the model checker at analysis time may need to be refactored to extract the type information as a proof obligation on the caller. This technique works quite well with open-source model checkers like CBMC.
Either way, some C OOP is not only useful to provide some optimization knobs, but it's also quite useful for introducing annotations that can help to formally verify C but that don't actually incur any runtime overhead.
The biggest advantage of C-style polymorphism over, say, C++'s is that it offers much better encapsulation.
Having private methods declared in the header file, which is supposed to be the public contract for the class, is such an anti-pattern. And the usual solutions offered for this problem are ugly in their own right (pImpl).
I only properly learnt to appreciate the power and beauty of OOP by reading people’s C code.
Excellent writeup and straight to the point. As the author demonstrated one can get quite a lot of OOP constructs using C primitives.
What seemed impossible to implement (at least to me) was something like interfaces: a way to decouple such that high-level functions don't need to know about the low-level building blocks.
Um, this is how software was built in the olden days.
C++ literally began as "we could write a preprocessor to automate the tricks everyone uses to implement polymorphism in C" (CFront).
And prior to that, the same tricks were used in assembly language programming.
So this article has recreated the history of OOP, which was to create tooling to better support programming techniques already widely used. It wasn't some religion invented by the priests and sent forth on tablets, although due to humans loving them some cult, it became that eventually.
Yes, you can. The entire Linux device interface, to name just one example, is full of interfaces. The way to accomplish this is via pointers to functions, and to have it as an object, group those function pointers in a struct.
GTK/Glib is notably full of these interfaces too.
Yup. In a past job I did a lot of work writing ALSA drivers for custom sound cards. The ALSA interface is a good example of this. It provides an API app developers can use to do sound stuff (change the volume, for example). In your sound-card driver you provide an implementation of that API to do whatever changing the volume means for your particular hardware (in my case, sending an i2c message to a digital potentiometer).
One thing to look at is ffmpeg in its encoders and decoders. At the bottom of the file there's a struct with pointers to functions (among other things). Anything that wants to do decoding can just call init(), decode(), close() on an AVCodec and the internal functions do whatever they need to do. Here's one from h264.c:
When you read or write from a FILE in C, do you know or have to care about whether the FILE comes from disk, a pipe, a CD drive, or a device driver or a network mounted drive? What do you think FILE is if not an interface?
Hmm, I don't have much to disagree with for this link, unlike many things from that site.
One minor point - the method implementations should not be `static`, so that you can support further subclassing and reuse the base class implementations.
Note that to support both virtual and non-virtual method binding, the dispatcher also needs to be exported (with the same signature). This is already the case in the linked code, but a point isn't made of it; it can be tempting to abuse `inline`, but remember that it is primarily about visibility [1].
It also doesn't mention how to implement `dynamic_cast` (practically mandatory for multimethod-like things), which can be quite tricky, especially in the multiple-inheritance case, and/or when you don't know all the subclasses ahead of time, and/or when you have classes used across shared libraries. There are cases where you really do need multiple vtables.
Virtual inheritance, despite its uses, is probably a mistake so it's fine that it ignores that.
Is there anything you need multimethods for that can't be patched with visitors and other design patterns? They have always seemed to me like a neat feature that is devilishly tricky to implement and difficult to reason about for the average programmer.
You don't need multimethods per se, but you need something and `dynamic_cast` is usually easiest (and with reasonable restrictions, most efficient).
Overloaded operators are a major category of problem here. The "which subclass (if any) is more derived" check might be done by the compiler proper, but that still needs to use the cast internally. And of course if you ignore operator overloading, you're just pulling a Java and mandating extra verbosity; the user's problem still has to be solved the exact same way.
(most other use cases for multimethods I don't find compelling)
>Object oriented programming, polymorphism in particular, is essential to nearly any large, complex software system. Without it, decoupling different system components is difficult. (Update in 2017: I no longer agree with this statement.)
The author doesn't seem to elaborate on this. I was taught OOP in university and then promptly learned that it's frowned upon in performance sensitive code, which is my main interest in programming.
(And that it apparently doesn't even achieve its stated goal of making the code easier to understand -- I've certainly had the experience of wading through a deep inheritance hierarchy (or call stack) looking for the "actual code that actually runs"...)
I'd love to hear an elaboration on that idea (OOP is essential for decoupling components) and its counterargument (decoupling can apparently be done just fine without OOP?).
Central to OOP is message passing between objects. Few languages utilize message passing, so it seems you can decouple components without OOP just fine.
All OOP languages use message passing, but perhaps you mean languages with objects that are not oriented?
While not completely identical to Smalltalk's message passing design, the Qt project once went to all the trouble of building their own compiler just to be able to graft message passing onto C++. I think that goes to show that there really is a difference – otherwise, why not use the standard constructs C++ offered?
Whether or not that difference makes for better software is debatable. It does seem that at one time it did lend itself exceptionally well to GUI programming. NeXTSTEP/macOS/iOS also would never have been what they are without OOP. But we've also learned some programming tricks along the way, so it may not even shine there anymore. Swift, for example, has given up on OOP (except where @objc mode is enabled) and it seems like it manages to do quite well with GUIs (granted, having @objc mode to fall back on clouds that somewhat).
I mean that, to me, the difference between message passing and method calling is not significant enough to say that these languages are following different programming paradigms - like you say OOP vs languages with objects.
Afaict the difference between Smalltalk and Objective-C style message passing and Java and C# style method calling is purely syntactic.
> Afaict the difference between Smalltalk and Objective-C style message passing and Java and C# style method calling is purely syntactic.
doesNotUnderstand:/forwardInvocation: isn't different? That is not just syntactical. How would you even begin to orient your objects without like functionality?
I agree that if you squint really hard they look the same. But if we say "they are all the same", what are you trying to communicate when you say OOP? Virtually all programming languages we use have objects. You may as well drop the OO and just use "programming". It would communicate the same intent.
>doesNotUnderstand:/forwardInvocation: isn't different? That is not just syntactical. How would you even begin to orient your objects without like functionality?
Not sure what you mean. Could you please elaborate?
> what are you trying to communicate when you say OOP?
The idea of using dynamic dispatch as a means of taming complexity.
> Not sure what you mean. Could you please elaborate?
I'm not sure where your understanding falls short. Which part are you unsure of?
> The idea of using dynamic dispatch as a means of taming complexity.
Erlang utilizes the idea of dynamic dispatch as a means of taming complexity. What is it in particular about its dynamic dispatch mechanisms that you want me to know when you call attention to its OOP properties?
C++ also utilizes the idea of dynamic dispatch. When you call attention to its OOP properties, are the particulars being pointed to the same as in Erlang, or does it mean something different in the context of that language?
Yeah, the callee doesn’t have to have the method defined for it to be called, and Smalltalk objects have a default you can use to do things with messages you don’t handle, for example forwarding them on to another object.
You can definitely do that in Java too. This is how, e.g., Spring framework handles transactional behavior. It injects a proxy for each bean that is marked as @Transactional and the proxy object handles the coordination with the transaction manager and passes all the arguments to the real object.
Right but it’s not a core part of the language, anything Turing complete can implement this sort of thing. The comment I was replying to was asking about the difference between method calls and message passing.
It's a little different, and a little better. But not that much. In fact, right after that famous quote, Alan goes on to say: "I have many of the same feelings about Smalltalk".
For me, the problem with OO as it is, is that it really is just a Better Old Thing, not an actual New Thing, as the "object" part is quite underdeveloped.
When you have an object-oriented system, it is objects that are connected (somehow) and that then communicate (somehow).
But our object-oriented languages really only still have algorithms (procedures) and data structures. We do get to group and scope those procedures with the data structures, but that's not really too different from what we do in procedural coding.
So when we build a true OO system, we don't have any language support for it, so the program we write is a meta program that constructs the OO system procedurally. The system is not visible in the program text; it is created as a side effect of running the procedures and remains invisible, unless we develop tooling to make it visible.
And when that OO system runs, after you've built it procedurally, how is the communication between objects mediated? Also procedures. Again, if there are other communication patterns, they can only be implemented using procedures in the language; they cannot be expressed in the language.
So with current programming languages (even OO ones), a good OO system will, by necessity, be highly indirect compared to the program text. A good OO system will also have sufficient benefits that this trade-off is very much worthwhile, but it is a significant trade-off. And when systems are not good the trade-off is not worth it. What's worse, people get confused and see the indirection not as the trade-off, but as the point of OO. I think those are the examples that people who are extremely jaded by OO have been exposed to: layers upon layers of indirection without a point. Indirection for indirection's sake.
And so they say that this is all BS and you should just not use OO. And they have a point, though they are not correct. Good OO developers handle this tradeoff by getting the benefits of OO with the minimum amount of indirection needed. Tooling like that found in modern Smalltalk systems can help you interact with the OO system that is not visible in the program text.
My approach to the tradeoff is to remove the indirection in the program text by making it possible to directly express components, connectors and systems in the program text, rather than having to build them all procedurally.
> so the program we write is a meta program that constructs the OO system procedurally. The system is not visible in the program text, it is created as a side effect of running the procedures and remains invisible,
Great way of stating it! Most people who cargo-cult "OO is bad" don't get this.
> a good OO system will, by necessity, be highly indirect compared to the program text. A good OO system will also have sufficient benefits that this trade-off is very much worthwhile,
Very true. This is the reason OOD/OOP has been a great success that has led to the explosion of software that we take for granted today.
> My approach to the tradeoff is to remove the indirection in the program text by making it possible to directly express components, connectors and systems in the program text, rather than having to build them all procedurally.
At language source level (eg. DSL) or binary component level (needs runtime support a la COM) ?
> Most people who cargo-cult "OO is bad" don't get this [that OO programs are meta programs that build the system].
Alas, many people who advocate OO don't get this either, and in particular they don't see this as a problem to be solved.
Having to do things indirectly is not a good state of affairs. See "goto statement considered harmful". [1]
It is similar to the way we had to work with text editors made for printing terminals: "...requires a mental skill like that of blindfold chess; the user must keep a mental image of the text he is editing, which he cannot easily see, and calculate how each of his editing command `moves' changes it." [2]
> OOD/OOP has been a great success that has led to the explosion of software that we take for granted today.
Yes, people forget that the problems we are now having are the ones that are due to OO success.
> At language source level (eg. DSL) or binary component level (needs runtime support a la COM)
Language level. At the systems level we know how to build these types of system (COM, Smalltalk, Objective-C, Unix pipes and filters, REST, notification systems, ...); what we lack is the ability to express them in the program text: https://objective.st/
> Having to do things indirectly is not a good state of affairs.
One key point to note here is that "Indirection" is often the mechanism used to design and express an "Abstraction". Given that Abstraction is the key to taming complexity and building large Systems they often go together i.e. Indirection becomes a key element in the design of Abstractions. It is only the non-designer of the Abstraction/Indirection who finds it hard to understand the System in the absence of Documentation/Communication.
Yes, indirection in the source "abstraction" is the way to get to a new abstraction.
However, the result has to be an actual abstraction for that to work, and it often isn't. Often it's just indirection. And once you have the new abstraction, you have to be able to express yourself using that abstraction with little or no leakage.
For example, the "procedure" abstraction works that way, we really and truly don't have to care how it is constructed from assembly language instructions roughly 99.999% of the time.
But here's the kicker: since the actual abstraction mechanism in our languages is procedural, we can only create essentially procedural abstractions. For all other kinds of abstractions, the leakage is close to 100% and we are left with just indirection.
In the code.
Which means we need to take our abstractions and hand-compile them to fit our procedural languages, which we do with varying degrees of deftness. And then mechanisms of communicating the actual but mostly implicit "source" program become crucial, as you point out.
I'd rather be able to create and express those abstractions in the code itself. And then be able to program with those abstractions, rather than having to program in the target language of the human compiler.
> But here's the kicker: since the actual abstraction mechanism in our languages is procedural, we can only create essentially procedural abstractions. For all other kinds of abstractions, the leakage is close to 100% and we are left with just indirection.
Not quite true. "Abstraction" is what you imagine/design according to your needs and treat as a "Single Concept" (eg. a Design Pattern). The fact that it is made out of more granular elements i.e. machine instructions/language statements/procedures for a state machine does not change your ability to reason and work with abstractions at a higher level. While there may be some "leakage" (mainly when things fail or for performance), it is by no means total.
Also many people often equate Abstraction with Indirection since the latter is so prevalent in defining the former. But they are different concepts and have to be treated as such.
> I'd love to hear an elaboration on that idea (OOP is essential for decoupling components) and its counterargument (decoupling can apparently be done just fine without OOP?).
When programmers faced the "Software Crisis" in the 1960s (https://en.wikipedia.org/wiki/Software_crisis) the idea of "Software as Components" was born as the solution (https://en.wikipedia.org/wiki/Component-based_software_engin...). This was a direct result of research on "Separation of Concerns/Information Hiding/Modularization/Structured Programming" design requirements. The idea was to make it similar to how "Components" were used in the Hardware Industry where you could substitute different components across different products/product families and all can be developed independently but used in a drop-in/plug-and-play manner to build up an entire System.
OOD/OOP turned out to be a natural architecture for Components since it provided the necessary support for the above-mentioned design requirements. There are two aspects to this architecture a) Source/Language level b) Binary/Usage level. The latter was the "holy grail" and people invented complete binary runtime architectures like COM/DCOM/CORBA/etc. (eg. https://en.wikipedia.org/wiki/Component_Object_Model) with language-neutral interfaces defined via an IDL (https://en.wikipedia.org/wiki/Interface_description_language). You could now have binary software components publish well-defined interfaces which clients could call at runtime to discover and use services. The OO idea of encapsulation of data+procedures in a single "Object" keeps the model clean. But note that OO itself is merely a way of composing/structuring procedural code with a strong emphasis on a certain method of architecting the whole System.
Thus OOD/OOP helps greatly in designing systems where components can be kept well decoupled and extensible. The same can be done in a non-OO way (depending on your definition of OO) but is much harder. The design requirements mentioned above have to be satisfied whichever path you take; an OO language merely makes it easier or harder to architect such a System.
I've found OOP can increase coupling. I actually have this problem at work.
Say I have classes A and B. I could declare `int FunctionOfAandB(B b)` as a method on A, but then A depends on B.
It would be much preferable to have a free function of A and B. Then A and B no longer depend on each other; only the function that computes the result depends on both.
IMO: The real thing of value OOP provides is calling with object.method. A lot of OOP code looks like `void obj.mutate()`. If you pulled that to what's really happening in "procedural" syntax, `mutate(obj)`, that exposes it for the bad code it is.
>The real thing of value OOP provides is calling with object.method. A lot of OOP code looks like `void obj.mutate()`. If you pulled that to what's really happening in "procedural" syntax, `mutate(obj)`, that exposes it for the bad code it is.
I'm confused here, are you saying there is value in the object.method() syntax? D language for example has Uniform Function Call Syntax[0] where f(x) can be written as x.f()
(I missed this syntax in C, and tried to approximate it by putting functions in structs, but you still have to pass the caller as an arg, so it ends up being player.update(player), which looks stupid; at that point player_update(player) seems an acceptable alternative)
Later you say obj.mutate() is bad code, are you referring only to methods which mutate the object?
I've also seen template hierarchies so deep it was a major effort trying to figure out which template did anything besides forward to another template.
Yep, that's because indirection is a necessary tradeoff for implementing OO systems in languages that don't fully support it, which surprisingly includes all current OO languages.
People then get confused and think that indirection is the point. It's not.
It would be better if the indirection weren't needed and we could express more than just procedural abstraction in our PLs.
I had the great fortune to work briefly on the MS Word codebase, and I remember some ancient C code that manually implemented vtables. Probably not uncommon for that era.
One domain that a little OO seems to map to without too much pain is GUI libraries. The first OO-flavoured API I ever used was Sunview, the early GUI I used on Sun-3 workstations with SunOS. It was a beautiful API; I was never tempted to mess with the verbose, complex "Intrinsics-based" toolkits that followed it. It carried on with xview; that's what I'd try if I wanted to write a GUI in C today.