> but not for serious engineering of long-lived applications
Oh come on. Does Lucene (or Solr or Elasticsearch, built on top of it) not qualify as serious engineering? Elasticsearch is quite successful, and is indeed intended to be used as a long-lived application!
Does this mean that the likes of Lucene don't run into GC issues? Of course not. I've certainly diagnosed problems in Elasticsearch related to GC (which, more often than not, is a symptom of something else going wrong), but saying it doesn't qualify as "serious engineering" is just patently ridiculous.
And that's only one example. There are loads more!
> These slices all come with the (huge) overhead of adding a reference to the original runtime object
Yes, I knew I shouldn't have pulled the "serious engineering" card going in... But there I go, giving a mostly clueless answer to a high-profile HN user :-)
I don't know Elasticsearch, but if it is something like a database where millions of objects are tracked (as in an RDBMS, or in a 3D game coded by an inexperienced programmer who likes to isolate everything, down to the vertex or scalar level, into "objects"), then I would assume at least one of the following applies:
- The objects in the datastore are not represented as individual runtime objects after all.
- The GC for objects in the datastore is highly tuned (GC run only manually, at certain points), and the memory overhead of having individual DB objects represented by runtime objects is just accepted.
I mean, I did finish said Java application, but I got good performance from it only after transforming it into an unreadable mess based on SoAs (structs of arrays) of int[] (which means unboxed integers, not objects) and lots of boilerplate code. It would have been easier to do in C, hands down (the language was not my own choice).
> and object/GC overhead? It's GC tracked objects after all, right? (again, I admit to knowing next to nothing about Go's runtime)
Go has value semantics. So when you have a `[]T` ("slice of T"), then what you have is 24 bytes on the stack consisting of the aforementioned `SliceHeader` type. So there's no extra indirection there, but there might be a write barrier lurking somewhere. :-)
> I don't know elasticsearch, but if this is something like a database where millions of objects are tracked
Elasticsearch is built on top of Lucene, which is a search engine library, which is itself a form of database. I don't think there's any inherent requirement that a database needs to have millions of objects in memory at any given point in time. There are various things you can ask Elasticsearch to do that will invariably cause many objects to be loaded into memory; and usually this ends up being a problem that you need to fix. It exposes itself as "OMG Elasticsearch is spending all its time in GC," but the GC isn't the problem. The allocation is.
In normal operation, Elasticsearch isn't going to load millions of objects into memory. Instead, it's going to read the objects it needs from disk, and of course, the on-disk data structures are cleverly designed (like any database). This in turn relies heavily on the operating system's page cache!
> These slices all come with the (huge) overhead of adding a reference to the original runtime object
Huh? This is the representation of a slice: https://golang.org/pkg/reflect/#SliceHeader --- It's pretty standard for a dynamically growable region of memory.