Your argument basically boils down to "If you write fast C++ it will be fast", which is true. But a significant fraction of the code out there is not C++ carefully written by experts to be fast.
This is different from the "Java will be faster than C++ because of HotSpot" arguments, because there Java is competing with C++. Here it's not a competition between JS and native C++; it's a competition between JS and WASM.
A for loop is not a very interesting application -- it is not what Java optimizes for, and chances are you didn't benchmark it correctly.
To optimize your program in a low-level language you basically have to plan the whole architecture of your program beforehand, and every major change to that plan will break your optimizations. Also, don't forget about non-standard object life cycles, which are really common. Complex C++ programs basically end up employing their own GCs, which will be inferior to any of the ones included in the JVM.
Of course low-level programs have their place (plenty of them: audio processing, embedded, a million others), but the average business/CRUD app will be faster* both to execute and to produce in Java, as well as more maintainable.
* With enough time a competent team could of course write a faster version of it in C++, but it's not a good use of their time, and you would be surprised how hard it is, especially with ever-changing requirements.
A language either cares about low-level details or it doesn't; you can't have it both ways. And C++ is absolutely a low-level language.
> I don't know any complex C++ program that employs its own GC when C++ has RAII, which is superior to GC.
RAII is not at all a replacement for GC. It is only suitable for a subset of object lifetimes. There are plenty of cases where you can't really pinpoint a scope exit at which a given object should be reclaimed.
A GC is a necessity in many concurrent algorithms that simply could not be written without one.
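To make that concrete, here is a minimal sketch (hypothetical TreiberStack class, not from any library) of a lock-free stack in Java. It is safe precisely because the GC keeps a popped node alive for as long as some racing thread might still be reading it; a C++ port would need hazard pointers or epoch-based reclamation on top to avoid use-after-free:

    import java.util.concurrent.atomic.AtomicReference;

    // Minimal lock-free (Treiber) stack. Safe memory reclamation is
    // delegated entirely to the GC: a node popped by one thread may
    // still be inspected by another thread's CAS loop, and the GC
    // guarantees it stays valid until the last reference is dropped.
    final class TreiberStack<T> {
        private static final class Node<T> {
            final T value;
            Node<T> next;
            Node(T value) { this.value = value; }
        }

        private final AtomicReference<Node<T>> head = new AtomicReference<>();

        public void push(T value) {
            Node<T> node = new Node<>(value);
            do {
                node.next = head.get();
            } while (!head.compareAndSet(node.next, node));
        }

        public T pop() {
            Node<T> current;
            do {
                current = head.get();
                if (current == null) return null;   // empty stack
            } while (!head.compareAndSet(current, current.next));
            return current.value;
        }
    }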
> Just give C++11/14/17 a try
I have and I like it. There are domains where I would not even start writing Java, and vice versa with C++.
Your CRUD app may have been a breeze, but what if a requirement changes and now touches the core of your program? You have to refactor, and in a low-level language that will be really expensive compared to a high-level one. Every memory allocation/deallocation has to be thought out again and tested (and while Rust can warn you about mistakes, you still have to write a major refactor, as it is another low-level language).
Being multi-paradigm is a different axis altogether. Low-level (which is, by the way, not a well-defined concept -- C is technically also high-level and only assembly is low-level, but that usage is not very useful) means that low-level details leak into your high-level description of the code, coupling the two. You can't make them invisible.
Also, as an example, think of Qt. A widget's lifetime is absolutely not scope-based, nor does it live for the whole duration of the program. You have to destroy it explicitly somewhere. And there are plenty of other examples.
And as I said, I'm familiar with RAII; it's really great when the given object's lifetime is scope-based, but it can't do anything otherwise.
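Java's closest analogue, try-with-resources, shows the same limitation from the other side: scope-based cleanup is perfect when the lifetime is a lexical scope and has nothing to attach to otherwise. A small sketch (hypothetical names):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    class ScopedLifetime {
        // Scope-based cleanup shines when the lifetime IS the scope:
        static String firstLine(String path) throws IOException {
            try (BufferedReader r = new BufferedReader(new FileReader(path))) {
                return r.readLine();   // r.close() runs on scope exit, RAII-style
            }
        }
        // But a Qt-like widget, a cache entry, or a node shared between
        // threads outlives every scope that touches it; there is no single
        // scope exit at which it can be reclaimed, so neither RAII nor
        // try-with-resources can manage it.
    }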
> C++ is an OOP language just like Java. You do it the same way as you do in Java. Use inheritance.
And if the new subclass has some non-standard object life cycle, you HAVE to handle that case somewhere else, modifying another aspect of the code. It is not invisible, unless you want memory leaks or memory corruption.
The main problem with Java isn't that it's JITted; it's that the language is not expressive enough. It doesn't have SIMD (yet) or value types (yet…?).
I wouldn't expect a JIT to find a lot of magic optimization opportunities, though maybe there are some, and it'd actually be annoying if it could. The most important thing in a tool like that is predictability, because you can't make development decisions based on magic.
That may be part of it, but I imagine the JVM's safety obligations are also a significant factor. If the JIT can't elide array bounds checks, they must be performed at runtime. Runtime type checks might be needed, and runtime arithmetic checks might also be needed. The JVM is also more constraining than the C/C++ memory model regarding concurrency gone awry. [0] More broadly, the JVM's lack of undefined behaviour constrains the optimiser in ways the C/C++ approach does not (although I'm open to the idea that the performance win owed to C and C++ having many kinds of undefined behaviour is overstated).
And of course there's the GC and Java's high object-churn, even where lifetimes are known statically. To my knowledge, escape analysis (the relevant family of JIT optimisations) still hasn't really addressed this.
The JIT can elide array bounds checks quite often, and most "low-hanging" optimizations are handled quite cleverly (it's way beyond my knowledge, but I remember reading that null checks are elided by trapping segfaults? Does that make sense?).
There are no over/underflow checks, so I don't know what you mean by arithmetic checks -- in pure number crunching the JVM is insanely fast.
And you are right that many Java libs/programs are quite happy to create garbage, though with generational GCs it is really cheap. Escape analysis is great, but primitive classes in Project Valhalla will solve this last problem of object locality.
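As a sketch of the allocation pattern in question (hypothetical Point class): whether HotSpot scalar-replaces these allocations via escape analysis depends on inlining succeeding first, which is exactly the guarantee Valhalla's primitive classes are meant to provide at the language level:

    class Churn {
        // A tiny value-like class; today every instance is a heap allocation.
        static final class Point {
            final double x, y;
            Point(double x, double y) { this.x = x; this.y = y; }
            Point plus(Point o) { return new Point(x + o.x, y + o.y); }
        }

        // Hot loop churning short-lived Points. HotSpot's escape analysis
        // *may* scalar-replace the allocations after inlining, but that is
        // not guaranteed; Valhalla would make the flat, allocation-free
        // representation a property of the type itself.
        static double sum(Point[] pts) {
            Point acc = new Point(0, 0);
            for (Point p : pts) acc = acc.plus(p);   // possibly one allocation per step
            return acc.x + acc.y;
        }
    }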
Sounds right. No need to generate instructions to perform the check if you can rely on a hardware trap, by means of signal-handling cleverness.
> There are no over/underflow checks, so I don't know what you mean by arithmetic checks -- in pure number crunching the JVM is insanely fast.
Integer multiplication, addition, and subtraction are all defined in Java to have wrapping behaviour, and are easily implemented: whatever the input values, there's no way those operations can fail. (Incidentally, this is a terrible way of handling overflow. This turned up recently in discussion. [0]) Division is trickier. In Java, integer division by zero results in an exception being thrown. Apparently JVMs can implement this with signal-handling cleverness similar to dereferencing null references. [1] Two's complement integer division has another edge case, INT_MIN / -1, which is undefined behaviour in C/C++; in Java it is defined to wrap, yielding INT_MIN again. Since the x86 division instruction traps on that input, I believe the JIT still has to emit instructions to check for it rather than rely on signal handling alone.
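A quick Java illustration of those defined behaviours (the printed results are what the JLS mandates, division by zero being the one integer operation that throws):

    public class Arithmetic {
        public static void main(String[] args) {
            // Signed overflow wraps: no UB, no exception.
            System.out.println(Integer.MAX_VALUE + 1);    // -2147483648
            System.out.println(Integer.MIN_VALUE / -1);   // -2147483648 (wraps, no throw)

            // Division by zero throws ArithmeticException.
            try {
                System.out.println(1 / args.length);      // throws when run with no args
            } catch (ArithmeticException e) {
                System.out.println("caught: " + e.getMessage());   // "/ by zero"
            }

            // Since Java 8, checked arithmetic is available as an opt-in.
            try {
                Math.addExact(Integer.MAX_VALUE, 1);
            } catch (ArithmeticException e) {
                System.out.println("caught: " + e.getMessage());   // "integer overflow"
            }
        }
    }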
I don't know how well modern Java performs in floating-point arithmetic. Here's an old tirade about it [2] and discussion. [3]
> with generational GCs it is really cheap.
At the risk of going off topic: doesn't Java tend to perform at somewhere around 60% of the speed of C/C++, while using considerably more memory? Perhaps the GC isn't to blame, but clearly the blame belongs somewhere. It's like the way advocates of Electron insist that modern HTML rendering engines are fast and efficient, the DOM is fast and efficient, and JavaScript is fast and efficient... and yet here we are, with Electron-based applications reliably taking several times the computational resources of competing solutions built on conventional toolkits.
> primitive classes in Project Valhalla will solve this last problem of object locality
Interesting, sounds like the kind of ambitious initiative that will require deep changes to the JVM.
> At the risk of going off topic: doesn't Java tend to perform at somewhere around 60% of the speed of C/C++, while using considerably more memory?
It is hard to benchmark this properly in general: for small programs it is “at most” within 2-3x, but I believe more complex applications close the gap quite well (many things can be “dynamically” inlined, even between classes far from each other). Not sure how it fares against C++ built with PGO.
And yeah, it does use more memory: the runtime/JIT/GC and each individual object all carry considerable overhead. But I don't think comparing it to Electron is apt. Electron is slow because it adds additional steps to the picture, not because of the JS engine itself. V8 is similarly an engineering gem, and it can be stupidly fast from time to time.
As for the GC:
The GC itself is required for some programs to work correctly. C/C++ codebases often create their own GC, and that will surely be slower than any of the multiple GCs found in the JVM. But for short-lived programs the GC doesn't even run (similar to how some short-lived C programs leave cleanup to the OS), so it is rather the runtime and per-object overhead that is responsible for the bigger memory usage.
All in all, where ultimate control over memory/execution is not required (that is, where you don't need a low-level language), Java is fast enough, especially combined with it being productive, easy (and safe) to refactor, and equipped with top-notch profiling tools (with overhead so low that they can be run in production as well).
Optimizations like "these two function arguments are always int31" in V8 or SpiderMonkey are 100% predictable at this point and result in all your type checks and boxing being eliminated. With the known types it also becomes much cheaper/faster to create object instances (since now, if you store those values into properties of an object, that object's shape is fully known). Properties like this can extend out into larger parts of your JS application.
There's still a lot of magic you can't rely on, but you'd be surprised how much you CAN rely on. Asm.js was built on this observation: if you write your JS following some basic rules, it's actually pretty easy to land on predictable, well-optimized paths. Of course, one of WASM's advantages is that by design you're almost always on those paths and don't have to worry.
> The most important thing in a tool like that is predictability, because you can't make development decisions based on magic.
Fortunately you've got excellent profiling tools available, so you don't have to guess. You also get to see the relative importance of the function you're trying to optimize, i.e. whether it actually is the bottleneck (people often guess wrong about where the bottleneck is).
Java has had AVX support for several releases, albeit via autovectorization, and explicit SIMD has been available as an incubator API since Java 16.
Autovectorization is the kind of magic you can't rely on. It sort of works on a single platform, but you will always run into cases it doesn't handle, even if you have your own team of autovectorization engineers who tell you it's perfect.
On the other hand, the explicit Vector API will use the correct "flavor" of SIMD instructions for the platform and will gracefully fall back to a non-SIMD version where SIMD is not supported. And as far as I know, the SIMD story is quite bad with C.
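For reference, a minimal sketch of what the incubating API looks like (jdk.incubator.vector as shipped in JDK 16; run with --add-modules jdk.incubator.vector). The preferred species picks the widest SIMD shape the CPU offers, and the same code degrades to scalar execution where there is none:

    import jdk.incubator.vector.FloatVector;
    import jdk.incubator.vector.VectorSpecies;

    public class VectorAdd {
        // Widest shape the hardware supports, e.g. 8 floats with AVX2.
        static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

        static void add(float[] a, float[] b, float[] c) {
            int i = 0;
            int bound = SPECIES.loopBound(a.length);   // largest lane-count multiple
            for (; i < bound; i += SPECIES.length()) {
                FloatVector va = FloatVector.fromArray(SPECIES, a, i);
                FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
                va.add(vb).intoArray(c, i);
            }
            for (; i < a.length; i++) {                // scalar tail
                c[i] = a[i] + b[i];
            }
        }
    }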
It's pretty good in C with assembly, inline or not. SIMD usually involves a lot of aliasing violations, and intrinsics have weird, hard-to-read names, so I find assembly easier to deal with than C here.