Hacker News | fniephaus's comments

We are working on a Python runtime as well: https://github.com/oracle/graalpython


The students used both the Community and Enterprise editions of GraalVM. Indeed, G1 is an Enterprise feature: https://www.graalvm.org/22.2/reference-manual/native-image/o...
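For reference, the collector is selected at image build time; a minimal sketch (`app.jar` is a placeholder name):

```shell
# --gc=G1 selects the G1 garbage collector for the native executable
# (an Enterprise feature, Linux-only at the time of writing).
native-image --gc=G1 -jar app.jar
```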


It fails for some reason when reading user data from disk. The error also goes away if you nuke the user data, but that's less convenient.


Just look at the 22.2 release notes [1]:

> Updated the OpenJDK release on which GraalVM Community Edition is built ...

and

> Updated the Oracle JDK release on which GraalVM Enterprise Edition is built ...

[1] https://www.graalvm.org/release-notes/22_2/


Got it, thank you @fniephaus. Really appreciate the info, and please keep up the fantastic work!


Disclaimer: I work on the GraalVM team.

The students "measured noticeable reductions in terms of memory footprint of up to 43%" [1] in some preliminary experiments. More from the accompanying blog post:

"We also hope that the Minecraft community builds on our work and helps benchmark different configurations for native Minecraft servers in more detail and in larger settings."

Please feel free to share any numbers on CPU/memory usage with us!

[1] https://medium.com/graalvm/native-minecraft-servers-with-gra...


Note that the memory usage _could_ potentially be significantly improved for the JVM by just using an alternative allocator, such as jemalloc. In our system, we saw, in some instances, native memory usage decrease by about 60%, and it also resolved a slow "leak" that we saw, since glibc was allocating memory, and not returning it to the OS. In our case it was because we were opening a lot of class loaders, and hence zip files, from different threads.
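Swapping allocators typically needs no code changes; a minimal sketch (the jemalloc path is distro-specific, and `server.jar` is a placeholder):

```shell
# Preload jemalloc so the JVM's native allocations bypass glibc malloc.
# Path shown is typical for Debian/Ubuntu; adjust for your distro.
LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 \
  java -jar server.jar
```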


I can second what you wrote about jemalloc. Some internal services at Amazon are using it with solid outcomes. I also recommend trying out version 5.3.0, released earlier this year.


Last time I did benchmarking, the vast majority of memory allocations were strings that were typically dereferenced right away and cleaned up in the gen-1 GC phase. I had contemplated whether string pooling would be useful but never got around to it. It would be interesting to see whether you could get reduced memory usage, and potentially better performance, by decreasing pressure on the GC during the gen-1 phase.

(Side note: this was back when I was co-maintaining MCPC, so it was typically with mods installed, and mods heavily use NBT, which I suspect is where a lot of that string allocation was happening.)
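A manual pool along these lines is one way to sketch the idea (hypothetical; names are made up, and real code would pool at the point of deserialization):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of manual string pooling: return one canonical
// instance per distinct value instead of keeping every freshly parsed
// String alive until the next young-generation collection.
public class StringPool {
    private final Map<String, String> pool = new HashMap<>();

    public String canonical(String s) {
        return pool.computeIfAbsent(s, k -> k); // first caller's instance wins
    }

    public static void main(String[] args) {
        StringPool p = new StringPool();
        // `new String(...)` forces distinct instances, as a parser would.
        String a = p.canonical(new String("minecraft:stone"));
        String b = p.canonical(new String("minecraft:stone"));
        System.out.println(a == b); // the same canonical instance is reused
    }
}
```

Whether this wins overall depends on the hit rate: the pool trades young-gen allocation pressure for hashing and an extra live map.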


This is very interesting. Could you share more details on this particular issue in glibc? Jar files get mapped so I'm really interested where glibc failed to release memory.


Not the OP, but we had a similar issue: our service was leaking when allocating native memory via JNI. We onboarded jemalloc since it has better debugging capabilities, but the leak disappeared and performance improved. We never got around to root-causing the original leak.


It's probably the same thing prestodb encountered: https://github.com/prestodb/presto/issues/8993


For performance reasons, glibc may not return freed memory to the OS. You can encourage it to do so by reducing MALLOC_ARENA_MAX to 2. https://github.com/prestodb/presto/issues/8993
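For example, set the variable in the environment that launches the JVM (`server.jar` is a placeholder):

```shell
# Cap glibc at two malloc arenas so freed memory is returned to the OS sooner,
# at the cost of more contention on the remaining arenas.
export MALLOC_ARENA_MAX=2
java -jar server.jar
```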


I was under the impression that most builds of the JVM used jemalloc by default.


Why is this? I thought the JVM already did somewhat decent JIT compilation ...

If I understand the article correctly, you're preempting all possibly unoptimized/expensive code paths (reflection) by attempting to literally execute all of them? While it's a cool experiment, isn't it a bit error-prone (besides being a lot of effort of course, but playing Minecraft on the side does sound pretty fun!)?


The JVM is likely to beat AOT-compiled Java code in almost all cases - but because Graal makes a closed-world assumption (e.g. no unknown class can be loaded, so a non-final class knows that it won't be overridden, allowing for better optimizations; limited reflection allows for storing less metadata on classes; etc.), it does allow for significant memory reduction. Escape analysis is also easier in an offline manner.


can't that all be done speculatively with de-optimization /s


JIT compilation requires additional CPU and memory resources at run-time, which AOT compilation can avoid. This also means that for a native executable, the compilation work only needs to be done once at build-time and not per process.
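As a sketch of that build-once workflow (assuming GraalVM's `native-image` tool is on the PATH; `app.jar` and `app` are placeholder names):

```shell
# AOT-compile the jar once, at build time.
native-image -jar app.jar app
# Every subsequent process start reuses the precompiled code:
# no JIT compiler threads, no profiling overhead, no warmup.
./app
```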


This is the first time I've seen someone bring up extra CPU and memory usage as a downside of JIT. It might matter in the embedded world, but it's Java we're talking about, so the cost is minuscule compared to what you're getting for it.


You’re not wrong, but it is funny how we got here from Gosling’s Oak addressing set top boxes.

The thing was built to address the burgeoning embedded-with-a-little-horsepower market, with its variety of hardware and OSes.

Now it runs Enterprise server software… and Minecraft.


Well, it does make sense - a controlled runtime failure is much better than a segfault, or worse, a silent failure corrupting the heap. Pair that with decent performance even back then, increased developer productivity, and the best observability tools, which are again helped by the VM semantics.


Those are usually pretty trivial as they are judiciously handed out based on hot code paths by the JVM.

There are certainly pathological cases where it could cause major issues.

AOT suffers from not having runtime information, so anything involving dynamic dispatch (which is REALLY heavily used in Java) will be a lot harder to optimize. JITs get to cheat because they know that the `void foo(Collection bar)` method is always or usually called with an `ArrayList`. PGO is the AOT world's answer to this problem, but it generally explodes build times and requires real-world usage.

In Java land, there's also the option of "AppCDS", which can cut down a large portion of that compilation time between processes.
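A tiny hypothetical example of such a call site (all names made up):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedList;

// Illustration of a virtual call site. A JIT that observes `sizeOf`
// being called almost exclusively with ArrayList can devirtualize and
// inline `size()`, guarded by a cheap type check; an AOT compiler
// without profiles must keep the fully generic dispatch.
public class DispatchDemo {
    static int sizeOf(Collection<Integer> bar) {
        return bar.size(); // receiver type unknown until run time
    }

    public static void main(String[] args) {
        Collection<Integer> a = new ArrayList<>();
        a.add(1);
        a.add(2);
        Collection<Integer> b = new LinkedList<>();
        b.add(3);
        System.out.println(sizeOf(a) + sizeOf(b));
    }
}
```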


GraalVM does have a better optimizer than C2 in the vanilla JDK under certain conditions, which can lead to better performance. Basically, the only way to know whether GraalVM will give you better performance is to try it and/or benchmark your code.

https://www.graalvm.org/22.2/examples/java-performance-examp...


Is there any benefit to simply running/JIT-compiling the client and server on GraalVM instead of the stock JVM?


I don't think the students looked into that at all, but I guess it depends on what the Java client uses for drawing. GraalVM Native Image currently doesn't support AWT on Linux/JDK 17+, but we are working on fixing that soon.


> it depends on what the Java client uses for drawing

AFAIK, the Java client uses LWJGL, which is a native library.


Thanks for the info! Seems like it's worth trying to compile the Java client with GraalVM Native Image then, given that this exists: https://github.com/chirontt/lwjgl3-helloworld-native


Apparently, someone has managed to compile the Minecraft client to native: https://medium.com/@kb1000/what-youve-done-with-the-server-i...


Do you have any feedback on how we could improve the docs? If so, please let us know.

I believe the easiest way to start a new Truffle language implementation is to fork SimpleLanguage [1] and turn it into your language. Did you try to do that?

[1] https://github.com/graalvm/simplelanguage


Yes please.

My major blockers were -

1. Starting post-AST generation was abrupt. An end-to-end tutorial, from parsing to a working compiler, for a tiny language like Lua, Lox, or Wren would be very helpful. You need not deep-dive into the parsing part; just give enough that we can follow along. Another option could be to continue an existing tutorial series, like Lox from "Crafting Interpreters". That way you don't have to focus on parts that aren't Truffle-specific, yet users can follow along.

2. Just going through the existing Java code of SimpleLanguage was extremely difficult for a newbie to Java like me. I would much prefer a readable tutorial that explains all the concepts in more detail.

3. More language examples, please. As I said before, if possible, add a couple more languages like Lua. I believe Lua is already a Truffle language; just an accompanying tutorial is missing. I remember when I tried to read through the code of Ruby, Lua, and SimpleLanguage, they all started off very differently and I just got lost.


Adding 2 more points -

4. More tutorials. Outside of the main docs, I found only one comprehensive tutorial. I think it would be great if the key members made it a priority to add smallish tutorials on things like Forth, Brainfuck, etc. in other blogs and articles.

5. Tutorials in Kotlin! I am new to the JVM, but I am digging Kotlin as a saner alternative. I think having some tutorials in Kotlin would be a great help.


I think the issue here is that the Truffle docs:

a. Point people to the papers to understand how the 'magic' happens.

b. Are really intended for people who already know how to build language VMs from scratch.

If you've never implemented a language interpreter at all, then you're going to struggle. Arguably not Truffle's fault, but hey, you can't drop the cost of making SOTA language runtimes by a couple orders of magnitude and then be surprised when a whole lot of newbies show up wanting to try their hand at it :)


I agree! Our work on TruffleSqueak [1] and on Polyglot Live Programming [2] has shown that it's possible to build language-agnostic live programming tools with GraalVM. If Espresso implements the required Truffle APIs correctly, our tools should also just work for Java. :)

[1] https://github.com/hpi-swa/trufflesqueak/ [2] https://github.com/hpi-swa/polyglot-live-programming


Espresso's README.md contains some interesting details, for example:

> ... it already passes >99.99% of the Java Compatibility Kit ...

> Running HelloWorld on three nested layers of Espresso takes ~15 minutes.

Link: https://github.com/oracle/graal/tree/6dd83cd94763b8736b42063...


Regarding performance: "Note that current raw performance of Java on Truffle isn’t representative of what it will be capable of in the near future. The peak performance is several times lower than running the same code in the usual JIT mode. The warmup also hasn’t been optimized yet. We focused in this initial release entirely on the functionality, compatibility, and making Java on Truffle open source available for a broader community. ... Expect performance, both warmup and peak performance to increase rapidly in each of our upcoming 21.x releases."


> Running HelloWorld on three nested layers of Espresso takes ~15 minutes.

It’s like how time works in “Inception”.



Normally that would make us mark this one as a dupe, but the posts are subtly different enough, and the material so interesting, that maybe we can leave this one up. Hopefully people will investigate the actual contents of said zoo and comment on those, rather than just generically about Smalltalk.

