Last week I posted Part 1 of this series of blogposts about GraalVM.
We looked at GraalVM and what it can do. We dove into the native-image
command and transformed a simple HelloWorld application into a native application (running in Docker).
This time I want to go beyond “Hello World” and build something useful (despite all the limitations listed below). A CRUD microservice with REST API and database access.
TL;DR: If you’re just interested in the end result, go right to the results.
Limitations
So what are some of the limitations that GraalVM currently has? (using version 1.0.0 RC6 at time of writing)
Well, the Java VM is a very complete and dynamic system able to handle a lot of dynamic changes. This is something a native application can’t do as easily. GraalVM needs to analyse your application up front and discover all these things. Therefore, things like reflection and dynamic classloading are very hard to do (but not impossible).
To ‘emulate’ a dynamic running JVM the GraalVM project is shipped with Substrate VM.
Substrate VM is a framework that allows ahead-of-time (AOT) compilation of Java applications under closed-world assumption into executable images or shared objects (ELF-64 or 64-bit Mach-O).
The Java VM, for example, has a garbage collector. When you eliminate the JVM, you’ll still need to free your objects. This is something Substrate VM (written in Java!) does for you.
To read about the limitations of Substrate VM, look at this LIMITATIONS.md.
Spring Boot
We want to do more than just ‘HelloWorld’, how about an entire microservice?
If you talk about microservices in Java, most people will immediately say: Spring Boot. While you can certainly discuss the micro-part, it is by far the most popular way of writing (enterprise) Java applications at the moment. The problem is that most Spring Boot applications quickly grow: Docker images of 600+ MB and runtime memory usage of 250+ MB are to be expected.
So it seems this is a perfect candidate to turn into a native application.
But there is some instant bad news: It won’t work
At least, at this moment in time. The developers of Spring Boot are working very hard with the developers of GraalVM to fix all the problems. A lot of work has already been done; you can check out the progress in this Spring issue.
Micronaut.io
A framework that also covers the entire spectrum and does work with GraalVM is micronaut.io. If you want something that works out of the box, check out their entire stack.
But I’d like to do it entirely myself, find the pitfalls and learn about the limitations of GraalVM at the moment. This is very useful to understand what you can and can’t do. I’m going to build my own GraalVM stack!
Web alternative: SparkJava
Instead of turning to Spring Boot or micronaut, let’s keep our application really micro and use something else. Spark Framework is a small web framework. And it works like a charm using native-image.
First I created a simple Maven project. To use the native-image command I needed all the Maven dependencies as a JAR file on the class path after compilation. To have this I added the following plugin to the pom.xml:
Next I added the following dependency:
Now we can write our code and run the HelloWorld web application:
Building everything is the same as in Part 1.
There are two differences though. First, the base image can no longer be FROM scratch because we need access to networking. Second, we need to expose the port of the web application to the outside using EXPOSE 4567.
Also we need to add the following option to native-image:
This option gets some problems during the analysis phase out of the way.
Running this Dockerfile results in a “Hello World” in the browser at http://localhost:4567/hello, using just 4 MB of runtime memory (!).
Dependency Injection: Google Guice
Another big problem at the moment using GraalVM is trying to use dependency injection. First I tried to use Google Guice. It is marketed as a ‘low overhead’ dependency injection framework. When I fired up the native-image command I got the following exception (amongst many others):
It seems that Google Guice internally calls Integer.parseInt using reflection, and GraalVM doesn’t understand this. But luckily, we can help GraalVM a bit.
To fix this first problem I added the following file to my project:
And during the build of native-image I pass the following option:
Now we have instructed Substrate VM/GraalVM that our application will do a reflection lookup to Integer.parseInt (amongst other calls). It now understands this and loads everything.
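To make the problem concrete, here is a minimal sketch of the kind of reflective lookup involved. It is not Guice's actual code, and the class name ReflectiveParse is made up for illustration. The point is that the target method is only named in a string, so a static analysis has nothing to follow unless you tell it about the call.

```java
import java.lang.reflect.Method;

// Illustrative only: ReflectiveParse is a hypothetical class, not Guice code.
final class ReflectiveParse {

    // Looks up Integer.parseInt(String) by name at run time and invokes it.
    // Nothing in the bytecode references parseInt directly.
    static int parseViaReflection(String text) {
        try {
            Method parseInt = Integer.class.getMethod("parseInt", String.class);
            // parseInt is static, so the receiver argument is null.
            return (Integer) parseInt.invoke(null, text);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseViaReflection("42"));
    }
}
```

On a regular JVM this behaves exactly like a direct call to Integer.parseInt; the difference only shows up when the method has to be known ahead of time.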
For some reason though, for each dependency I kept getting the following exception:
The application works great running it from java but not after using native-image. Time for something different!
Dependency Injection: Dagger 2
Instead of Google Guice or Spring I decided to go with another framework: Dagger 2
This framework has one big advantage over all the others: it works at compile time.
Sure, setting it up takes a little bit more time. You’ll need to include a Maven plugin that does all the magic during compilation. But it is a perfect solution (currently) for GraalVM’s native-image. All the injection-magic is done during compilation, so when running the application everything is already nicely wired up and static.
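To show what “wired up and static” means, here is a hand-written sketch in plain Java. It uses no Dagger API and is not Dagger's generated output; all type names (AppComponent, GreetingService, GreetingRepository) are hypothetical. It only illustrates the shape of the result: ordinary constructor calls that a static analysis can follow.

```java
// Hypothetical types, for illustration only; no Dagger API is used here.

// Stands in for the wiring a compile-time framework would generate:
// plain constructor calls, no reflection, no classpath scanning.
final class AppComponent {
    static GreetingService greetingService() {
        return new GreetingService(new InMemoryGreetingRepository());
    }

    public static void main(String[] args) {
        System.out.println(greetingService().greet());
    }
}

interface GreetingRepository {
    String findGreeting();
}

final class InMemoryGreetingRepository implements GreetingRepository {
    @Override
    public String findGreeting() {
        return "Hello World";
    }
}

final class GreetingService {
    private final GreetingRepository repository;

    // Constructor injection: the dependency is passed in, not looked up.
    GreetingService(GreetingRepository repository) {
        this.repository = repository;
    }

    String greet() {
        return repository.findGreeting();
    }
}
```

Because every dependency is reachable through a normal `new` expression, nothing has to be declared in a reflection configuration file.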
Database access (Hibernate/Oracle)
Finally, to complete my CRUD application I tried to access our Oracle database. GraalVM and the database are both created and maintained by Oracle so I hoped this would work out of the box…. but (spoiler): It didn’t.
The main problem here is the code in Oracle’s JDBC driver. It turned out to be very hard to get working and took about an entire day!
First off, there are a lot of static initializer blocks and some of those are starting Threads. This is something the Substrate VM analyzer can’t handle (again: see LIMITATIONS.md).
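For readers who haven't run into this pattern, a small sketch follows. The classes are hypothetical and this is not Oracle driver code. EagerWorker starts a thread as a side effect of class initialization, which is the problematic shape; LazyWorker shows how the same work can be deferred when you own the source (which I don't in the driver's case).

```java
// Hypothetical classes, for illustration only; this is not Oracle driver code.
final class StaticInitDemo {

    // Problematic shape: the thread starts as a side effect of class
    // initialization, i.e. as soon as the class is first initialized.
    static final class EagerWorker {
        static final Thread WORKER = new Thread(() -> { }, "eager-worker");

        static {
            WORKER.setDaemon(true);
            WORKER.start();
        }
    }

    // Deferred shape: nothing happens until get() is called explicitly.
    static final class LazyWorker {
        private static Thread worker;

        static synchronized Thread get() {
            if (worker == null) {
                worker = new Thread(() -> { }, "lazy-worker");
                worker.setDaemon(true);
                worker.start();
            }
            return worker;
        }
    }

    public static void main(String[] args) {
        System.out.println(LazyWorker.get().getName());
    }
}
```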
It was throwing errors like:
Again, like before, the exception itself does provide a solution. The issue here is static initializer blocks starting threads during analysis, but this can be countered by delaying the class initialization. This isn’t trivial here because I don’t have access to the code in Oracle’s JDBC driver. But in the end I managed to get it working by adding the following parameters to the native-image command:
The next problem was getting the persistence.xml file to load. Hibernate uses ClassLoader.getResources() for this at runtime and for some reason I couldn’t get this to work. I knew there was a way to add resources into the native image but I struggled to get it working. The flag is called -H:IncludeResources= and you can add a regex here.
It wasn’t until I browsed the Substrate VM source code and extracted the parsing from ResourcesFeature.java that I understood why. Running this code locally showed me everything I had tried was wrong.
Things I tried that didn’t work:
This finally worked (including all XSD’s, properties and everything in META-INF):
It turns out the listing goes through all the JAR files and directories and matches them against a relative path which in our case is:
- /logging.properties
- /META-INF/persistence.xml
- etc
So having META-INF/persistence.xml or logging.properties isn’t enough. This cost me way more time than it should have. A nice feature to add to GraalVM would be to list all the resources being added because at some point I was convinced it should just work and the problem was in my code somewhere.
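The effect of that leading slash is easy to reproduce with plain java.util.regex. The helper below is hypothetical and is not Substrate VM code. It assumes the include pattern has to match the whole relative path, which is my assumption rather than something taken from the GraalVM sources, and the second regex is only an illustration, not the flag value I ended up using.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not Substrate VM code. Assumption: the include
// pattern must match the entire resource path, leading slash included.
final class ResourcePatternCheck {

    static boolean included(String regex, String resourcePath) {
        return Pattern.compile(regex).matcher(resourcePath).matches();
    }

    public static void main(String[] args) {
        String[] regexes = {
            "META-INF/persistence.xml",       // no leading slash in the pattern
            ".*/META-INF/persistence\\.xml"   // tolerates a leading slash
        };
        String path = "/META-INF/persistence.xml";
        for (String regex : regexes) {
            System.out.println(regex + " matches " + path + ": " + included(regex, path));
        }
    }
}
```

Printing the candidate paths and the match result like this is roughly the kind of resource listing I would have liked native-image to give me.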
Next problem: Xerces
This library gave me nightmares before as a Java developer, but luckily this time the problems could be fixed easily by adding more reflection exceptions to our -H:ReflectionConfigurationFiles=reflection.json.
Also Xerces needed a resource bundle to load:
Sigh, okay, still making progress.
Now again a problem with Hibernate and resources. I got the following StringIndexOutOfBoundsException running the application:
It turns out due to the way GraalVM labels its resources we get to a point where Hibernate is confused. It calls the following code with the wrong parameters:
Input:
- url: “META-INF/persistence.xml”
- entry: “/META-INF/persistence.xml”
With this input, entry.length() = 25 is bigger than url.length() = 24, resulting in file.substring(0, -1), sigh.
To fix this I’ve created a so-called shadowed class. This is a class with the exact same signature that I am compiling and adding to the classpath. Because my version of the class is loaded before the Hibernate version, my version is used and effectively overrides the Hibernate one. This is obviously very ugly, but it does the job surprisingly well!
I used Math.max(0, file.length() - entry.length()) to fix getting a ‘-1’ in the substring:
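The failure and the guard can be reproduced in isolation. The sketch below is a hypothetical reconstruction, not Hibernate's ArchiveHelper source; the method names are made up, and only the two input strings and the Math.max expression come from the description above.

```java
// Hypothetical reconstruction for illustration; not Hibernate's ArchiveHelper.
final class SubstringGuard {

    // Unguarded: the end index goes negative when entry is longer than file.
    static String stripEntryUnguarded(String file, String entry) {
        return file.substring(0, file.length() - entry.length());
    }

    // Guarded with Math.max so the end index never drops below zero.
    static String stripEntryGuarded(String file, String entry) {
        return file.substring(0, Math.max(0, file.length() - entry.length()));
    }

    public static void main(String[] args) {
        String file = "META-INF/persistence.xml";   // 24 characters
        String entry = "/META-INF/persistence.xml"; // 25 characters
        try {
            stripEntryUnguarded(file, entry);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unguarded call failed: " + e.getClass().getSimpleName());
        }
        System.out.println("guarded result is empty: " + stripEntryGuarded(file, entry).isEmpty());
    }
}
```

The guard only stops the exception; it returns an empty prefix in this case, which happened to be good enough for my purposes.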
And of course, a new problem pops up. Again with the resources: GraalVM seems to have put resource as the protocol of all the loaded resources. Opening the resource using a java.net.URL caused more problems in ArchiveHelper because GraalVM doesn’t recognise ‘resource’ as a valid protocol (huh?). This meant I needed to make another small patch in the shadowed ArchiveHelper:
The next big problem was getting oracle.jdbc.driver.OracleDriver to accept the fact that GraalVM doesn’t support JMX (and might never support it). The driver tried to load MXBeans and MBeans from a static initializer block, which caused major headaches…. but in the end I managed to solve this again by shadowing another class:
Still the initial static initializer block in oracle.jdbc.driver.OracleDriver wouldn’t load and broke the native compilation. Browsing the decompiled code I noticed the following lines which might cause a problem:
This class isn’t on the classpath oddly enough, so I decided to create a dummy/shadow class again, just in case:
The next problem is Hibernate’s dynamic runtime proxies. This is something GraalVM can’t handle so we need to make sure Hibernate’s magic is done before we start our application. Luckily there is a Maven plugin which does just that:
Now we have everything in place and we can use the EntityManager to access our database and execute queries…. right?
Well, it turns out the application does start, and it gets a long way. But there is one thing I wasn’t able to fix: loading the definitions.
Hibernate has two ways of loading/mapping the actual class to the database:
- Hbm files
- Annotations
First I tried to use annotations, but this failed because at runtime the native (pre-loaded) classes don’t have any knowledge left of the annotations they once had.
The second method is using an HBM xml file, but this too failed. Reading the XML file again needs support for annotations and JAXB failed on me.
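Annotation-based mapping depends on the mapping metadata still being readable through reflection when the application runs. The sketch below shows that lookup on a regular JVM. The annotation MappedTo is a hypothetical stand-in for something like JPA's @Table, and this is not how Hibernate itself is implemented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Illustrative only: MappedTo and Person are hypothetical.
final class AnnotationLookup {

    // Stand-in for a mapping annotation; RUNTIME retention keeps it
    // available to reflection on a regular JVM.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface MappedTo {
        String table();
    }

    @MappedTo(table = "PERSON")
    static final class Person { }

    // Returns the table name, or null when no annotation is visible.
    static String tableFor(Class<?> type) {
        MappedTo mapping = type.getAnnotation(MappedTo.class);
        return mapping == null ? null : mapping.table();
    }

    public static void main(String[] args) {
        System.out.println(tableFor(Person.class));
    }
}
```

If getAnnotation comes back empty, as it effectively did for my entities in the native image, a mapper has nothing to work with.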
So we’ll have to stop here. The JDBC driver was working, so probably plain old SQL would work perfectly. Hibernate for now eludes me.
Update: Previously I mentioned Hibernate was working, this was a mistake on my part!
Docker: Multi-stage build
After Part 1 some people suggested my Dockerfiles could be made cleaner with a so-called ‘multistage’ build. More information can be found here: Docker multistage
After some changes I now have a single Dockerfile with two FROM sections in it. The first section is the builder-part, the second part is the host-part. My docker build command now uses that first image to build, passes everything to the second image and builds our resulting Docker container. Really nice, I’m learning new techniques every day.
Compilation speed
One thing I noticed during this entire process: the analysis time used by native-image grew dramatically. The HelloWorld application from Part 1 took just a couple of seconds, but look at the following output:
It now took a whopping 60 minutes to compile to native! And during the creation of this project I needed to add a single line to the reflection.json and restart the entire build a lot of times. Transforming a project to work with native-image right now is a really time-consuming endeavour.
When compiling on my MacBook for macOS, the compilation time is much shorter, just a couple of minutes. The problem here seems to be Docker and building for Ubuntu.
Final result: a working native microservice
I’m proud to say I now have a simple CRUD native microservice, written in Java, using Java libraries…. that is almost working. With a bit more work on the GraalVM/SubstrateVM side I’m pretty sure this could work in the near future.
It uses:
- SparkJava
- GSON
- Dagger 2
- SLF4J + java.util.logging
- Hibernate/Oracle (almost working…)
This allows me to serve REST/JSON objects from the database with some Java code in between, what most microservices do.
All the sources are on GitHub: check it out
Check out all the code on GitHub and try it out for yourself. To get it up and running all you need is to set up a database and put the connection information in persistence.xml.
Start up time
The microservice has a very fast startup time:
Compare this to the Java 8 version (same code):
With 486ms compared to 3406ms the native version starts 7x faster.
Memory consumption
The Java version consumes 267 MB of memory, while the native version takes just 20.7 MB, so it is about 13x smaller.
Conclusion
At the moment GraalVM is still in its infancy and there are a lot of areas that can use some improvement. Most problems I’ve encountered have to do with resource loading or reflection. The build/analysis cycle becomes pretty long when you add more and more classes, and if you need to add reflection-configuration exceptions on each build, the process is quite cumbersome.
Frameworks and libraries will start to notice GraalVM (once it gains more traction) and they will change their code to work better with Graal. For example the team behind Spring Boot is already actively working together with the GraalVM team to get their framework working.
Now some people are shouting to their laptops:
Why not just use Go/Rust/some other language that compiles to native by default!?
That is a very good point. If you want to use Go or Rust, go ahead!
Java is the most popular programming language in the world (according to the TIOBE index). It is the language with most libraries and frameworks (except for JavaScript probably). Changing entire development teams to learn Go and/or Rust, changing companies to a new language, is very hard to do. Using native-image might be a more accessible way of transitioning to native backends IMO.
I’ll for sure be keeping track of GraalVM, not just because of the native-image capabilities, but also because of the amazing speed of their VM.
Did you know the Oracle Database has GraalVM support? You can create queries which use JavaScript functions or Java methods!
Did you know there is a Graal AOT compiler inside your JDK right now? (see: JEP-295)
Sources: All the code from this blogpost can be found here on GitHub.