I've had to deploy lots of soft real time apps on GCed environments over the years, and it's always a problem. You can work around it with things like object pools, but some library or API will assume the GC is fine and will quietly spit out objects continuously, which eventually leads to a GC pause.
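To make the object pool workaround concrete, here's a minimal sketch (a hypothetical Pool class, not any particular library): pre-allocate everything up front so the steady state never touches the allocator and the GC has nothing to do.

    import java.util.ArrayDeque;
    import java.util.function.Supplier;

    // Minimal fixed-size object pool: everything is pre-allocated up front so
    // the steady state does no allocation and gives the GC nothing to trigger on.
    final class Pool<T> {
        private final ArrayDeque<T> free = new ArrayDeque<>();

        Pool(Supplier<T> factory, int size) {
            for (int i = 0; i < size; i++) free.push(factory.get());
        }

        T acquire() {
            // Returning null on exhaustion is a policy choice; a soft real time
            // system would rather degrade or fail loudly than allocate here.
            return free.isEmpty() ? null : free.pop();
        }

        void release(T obj) {
            free.push(obj);
        }
    }

You acquire() when you need a scratch object and release() when you're done. The ugly part is that every library you call has to play along, which is exactly the complaint above.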
It's worth pointing out that the Android devs finally started noticing this with Lollipop (probably due to their animations), and the API now has lots of places where it passes Java primitives instead of objects, which is the distinction between passing by value and by reference. Even in C++, modern compilers can only make the most of it if you pass by value, as this enables all sorts of other optimisations to kick in.
The key benefit of reference counting is that it's predictable. Real time systems are not strictly about the lowest latency either; they are defined by predictability. This becomes a preoccupation with minimising your worst case scenario.
Jellybean and Lollipop didn't get better smoothness by replacing objects with primitives; the Android API is fixed by backwards compatibility requirements. They did it through a mix of better graphics code and implementing a stronger GC.
If you look at the most advanced garbage collectors like G1 you can actually give them a pause time goal. They will do their best to never pause longer than that. If pauses are getting too long they increase memory usage to give more breathing room. If pauses are reliably less, they shrink the heap and give memory back to the OS.
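For example, with HotSpot that's just a pair of command line flags (the 50 ms figure and MyApp.jar are purely illustrative placeholders):

    java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -jar MyApp.jar

G1 treats the pause goal as a target, not a guarantee, which is exactly the trade-off being discussed here.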
Reference counting is not inherently predictable and can sometimes be less predictable than GC. The problem with refcounting is it can cause deallocation storms where a large object graph is suddenly released all at once because some root object was de-reffed. And then the code has to go through and recursively unref the entire object graph and call free() on it, right at that moment. If the user is dragging their finger at that time, tough cookies. GC on the other hand can take a hint from the OS that it'd be better to wait a moment before going in and cleaning up .... and it does.
It gets even worse when you consider that malloc/free are themselves not real time. A malloc implementation is allowed to spend arbitrary amounts of time doing bookkeeping, collapsing free regions and so on, and this can happen any time you allocate or free. With a modern GC, an allocation is almost always just a pointer increment (unless you've actually run out of memory).
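Conceptually the fast path is nothing more than this toy sketch (ignoring alignment, object headers and the slow path that would trigger a collection):

    // Toy bump allocator: the common case in a modern GC nursery is just
    // "check space, add to pointer", which is why allocation is so cheap.
    final class BumpArena {
        private final byte[] space;
        private int top = 0;

        BumpArena(int capacityBytes) {
            space = new byte[capacityBytes];
        }

        // Returns the offset of the new "object", or -1 when the arena is
        // exhausted -- the point at which a real GC would start a collection.
        int allocate(int sizeBytes) {
            if (top + sizeBytes > space.length) return -1;
            int offset = top;
            top += sizeBytes;
            return offset;
        }
    }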
The problem Apple has is that their entire toolchain is based on early 1990's era NeXT technology. That was great, 25 years ago. It's less great now. Objective-C manages to excel at neither safety nor performance and because it's basically just C with extra bits, it's hard to use any modern garbage collection techniques with it. For instance Boehm GC doesn't support incremental or generational collection on OS X and I'm unaware of any better conservative GC implementation.
Some years ago there was a paper showing how to integrate GC with kernel swap systems to avoid paging storms, which have been the traditional weakness of GC'd desktop apps. Unfortunately the Linux guys never did anything with it, and neither has Apple. If you spend all day writing kernels, "just use malloc" doesn't seem like bad advice.
Objective-C uses autorelease pools so deallocation doesn't happen immediately when the refcount goes to zero. Its reference counting implementation is smarter than a simple naïve one.
Apple's GC implementation wasn't a Boehm GC [1].
It's true that it's hard to use a tracing GC with Objective-C, because of the C. But, if you want interoperability with C, you're kind of stuck.
> The problem with refcounting is it can cause deallocation storms where a large object graph is suddenly released all at once because some root object was de-reffed
This only happens if you choose to organize the data this way. This is a big difference from GC, where the whole memory layout and GC algorithm is out of your control.
Depends on the language. Go, for instance, gives you a lot of freedom when it comes to memory layout, and allows you to stack-allocate objects to avoid GC.
I'm not arguing in favor of GC. The argument was that a GC takes away memory control, but memory control is up to the language.
Go doesn't just limit you to the stack/heap distinction either: you can create structs which embed other structs. This simplifies the job the GC has to do, even if you don't allocate on the stack.
You can do something similar with Struct types in C#.
Better graphics code is exactly what I'm on about: it's about removing any triggers for GC, which means removing allocations.
If you're blocking your UI thread by deallocating a giant graph of doom, then you have other problems. Deferring pauses, however, is not a realistic option.
I was referring to the triple buffering, full use of GL for rendering and better vsyncing when I talked about graphics changes, not GC stuff. That was separate and also makes things smoother but it's unrelated.
Deferring pauses is quite realistic for many kinds of UI interaction and animation. If your animation is continuous or lasts a long time and requires lots of heap mutation, then you need a good GC or careful object reuse, but then you can run into issues with malloc/free too. But lots of cases where you need something smooth don't fit those criteria.
> It's worth pointing out the Android devs finally started noticing this for Lollipop (probably due to their animations) and the API now has lots of places where it passes Java primitives instead of objects, which is the distinction between passing by value and by reference.
This is simply not true. The API has always been heavily based on Java primitives. They didn't even use enums in the older APIs, preferring int constants instead (I hate that one personally). GC pauses have always been a point of focus for the Android platform.
Notice the introduction in API level 21 of methods that recreate existing functionality without allocating RectF objects. Touch events, for example, still spit ludicrous amounts of crap onto the heap.
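For instance (going from memory on the exact API level), Canvas gained float overloads in 21 so a hot drawing path no longer needs a RectF at all:

    import android.graphics.Canvas;
    import android.graphics.Paint;
    import android.graphics.RectF;

    final class RoundRects {
        // Pre-21 style: allocates a RectF on every call unless you carefully cache one.
        static void drawOld(Canvas c, Paint p, float l, float t, float r, float b) {
            c.drawRoundRect(new RectF(l, t, r, b), 8f, 8f, p);
        }

        // API 21+ overload: same drawing, no RectF allocation in the hot path.
        static void drawNew(Canvas c, Paint p, float l, float t, float r, float b) {
            c.drawRoundRect(l, t, r, b, 8f, 8f, p);
        }
    }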
All this is why low latency audio goes via the NDK: it's basically impossible to write Android apps which do not pause the managed threads at some point. Oddly, this is stuff the J2ME people got right from day one.
That's terribly ugly. Can they not do escape analysis or something to avoid allocations in obvious places? Or only allocate when the value is moved off the stack?
Doesn't Android use a different flavor of Java anyways, allowing them to make these changes?
Yes, it's possible in theory, but Dalvik/ART don't do it. HotSpot does some escape analysis and the Graal compiler for HotSpot does a much more advanced form called partial escape analysis, which is pretty close to ideal if you have aggressive enough inlining.
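As a rough illustration of what escape analysis buys you (assuming the JIT inlines plus(), which is why inlining matters so much here): the Vec2 temporaries below never escape the method, so a compiler like HotSpot can scalar-replace them rather than allocating in the loop.

    final class Vec2 {
        final double x, y;
        Vec2(double x, double y) { this.x = x; this.y = y; }
        Vec2 plus(Vec2 o) { return new Vec2(x + o.x, y + o.y); }
    }

    final class EscapeDemo {
        // The Vec2 temporaries never escape this method, so a JIT with escape
        // analysis can turn them into plain doubles in registers; a less
        // aggressive compiler allocates millions of short-lived objects instead.
        static double sumLengths(double[] xs, double[] ys) {
            double total = 0;
            for (int i = 0; i < xs.length; i++) {
                Vec2 v = new Vec2(xs[i], ys[i]).plus(new Vec2(1.0, 1.0));
                total += Math.sqrt(v.x * v.x + v.y * v.y);
            }
            return total;
        }
    }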
The problem Google has is that Dalvik wasn't written all that well originally. It had problems with deadlocking due to lock cycles and was sort of a mishmash of C and basic C++. But then again it was basically written by one guy under tight time pressure, so we can give them a break. ART was a from scratch rewrite that moved to AOT compilation with bits of JITC, amongst other things. But ART is quite new. So it doesn't have most of the advanced stuff that HotSpot got in the past 20 years.
Yes, this is why I always find it sad that language performance gets thrown around in discussion forums without reference to which implementations are actually being discussed.
Reference counting on its own does not provide any guarantees about when objects get deallocated either: any removed reference may drop the counter to zero and trigger a deallocation. It may of course be worse with a full blown garbage collector building up a huge pile of unused objects and then cleaning them up all at once, but that is not a necessary limitation; there are already garbage collectors that perform the entire work in parallel with normal application execution.
Objective-C uses autorelease pools, so deallocation doesn't necessarily happen immediately when the reference count hits zero. Apple's reference counting implementation is fairly smart.
Over on the Reddit discussion there was a comment from Ridiculous Fish, who was an Apple developer at the time (and probably still is) and worked on adding GC to the Cocoa frameworks:
Basically, because of interop with C, there's only so much you can do. Plus, the tracing GC wasn't on iOS so if you want unified frameworks (for those that make sense cross-platform), supporting the tracing GC along with ARC is added work.