In the previous post I showed a nice snippet/trick at the bottom of the post to find all the neighbors in a 2D array.
Last week I used this snippet to solve a word search puzzle a colleague created for us as a “Back to work in 2019”-challenge:
(Note: Some fellow developer colleagues think that solving the puzzle with code is unfair… heh. really?!)
Reading the lexicon
Here is the complete code:
First we’ll need to read in all the possible words we’re looking for. And using the new Java streams this is very easy. Just download one of the many online word-lists and do the following:
It was never easier in Java to do File IO and manipulate collections. Next I embedded the entire puzzle and turned it into an array of Strings.
Finding all the words
Finally we need to traverse all the possible combinations a word can be made and check against the Set of target words. I’m doing this the dumbest way possible.
First I go over each possible start location by enumerating all x/y combinations:
Next I check each direction using the neighbor algorithm from the previous post.
Now we start building the possible word, using two new pointers (tx and ty):
While we are within the bounds of the puzzle we must travel in the current ‘direction’. This is done by adding the direction offset (from previous post) to the tx and ty pointers.
If the word in-progress matches one we’re searching for, add it to the found words list!
Finally let’s close everything off, for developers with OCD like me:
This prints the words, sorted, just as we want. Q.E.D..
The way we now iterate over all the possible word hiding places is very very dumb.
There are some obvious improvements to be made here, for example one could place the entire lexicon in a tree-like structure instead of a set of words and stop iterating if the certain branch doesn’t exist.
However, for this puzzle, this challenge, the speed it runs at is good enough!
You can download the complete listing here: PuzzleSolver.java
Pathfinding for Tower Defense
Last week though I was reading this tutorial, Pathfinding for Tower Defense.
He explains a method for creating a single vector/flow field for game AI’s to follow. This allows a single calculation to be used by a lot of AI bots, which is great for a Tower Defense game.
A new game…
“What is the longest path possible? / What is the highest number I can reach?”
After I found out I can move the start position around, I came up with the following configuration:
The maze is 113 long… but it turns out this is far from optimal. After coding up a little solver and playing around I found the following configuration:
This is a solution that goes up to 124 (near the top left corner). I’ve also found one later with a score of 125.
It seems that diagonals are the best way to pack the longest maze into there, but the problem is, one diagonal cuts the field in two smaller parts. Zig-zagging works great, but there is no ‘easiest’ way to zig-zag.
It doesn’t seem to me that there is a trivial solution? I haven’t found one at least, is there? Are there simple optimal patterns for given boxes, NxN?
Searching for a solution seems to be NP-hard and even 20x10 is out of the question, the amount of possible configurations of walls and starting positions is huge and grows exponentially. Coding up a solver is pretty easy though, using some simple heuristics and some random searches I got my 125 solution.
Maybe this would be a fun puzzle for a future AZsPCs contest…?
Can you get a better configuration than 125? I’d love to know.
When I wrote a score checker for the puzzle above I needed a way to get the neighbors in an 2D array. I’ve gotten used to code 2D mazes/puzzles and I always use the following way to calculate all the neighbors.
I find this to be easier and more readable that the alternatives, going over all the -1/+1 combinations or having two nested loops.
How does this work?
First we list all directions with a single integer, 0 to 9, and we’ll need to skip the middle number four, this points to ourself.
From this single direction we’re able to easily calculate both the new row value (n_row) and the new col value (n_col). To do this we’ll need to divide by three and use modulo three.
Neighbor row value:
To find the new row value we start our by using modulo 3. The only thing left to do is to subtract 1 and we’ll get the new row value for each direction:
Neighbor col value:
The same thing can be done for the col value. Instead of using modulo 3, we’ll be using divide by 3. And the same as with the row value, we’ll also need to subtract 1 to get the following new column values:
And there you go, list all neighbors in a 2D array without using two nested loops and dirty checks.
First of all, best wishes for 2019 from me.
Time for me to reflect on the last couple of months.
Last year has been a crazy busy year for me. I’ve been to and spoken at to J-Spring, J-Fall, Devoxx Belgium, Devoxx Poland and JavaOne… errr… Oracle CodeOne on various topics. From quantum computing to software architecture.
Here is a list of all my conference talks.
I’m a big fan of sharing knowledge and getting people together to have fun. This is why I’ve decided to create a new JUG (Java User Group) this year in the city of Rotterdam, the RotterdamJUG. In November we had our first meeting and I’ve already got more meetups planned.
The biggest achievement this year is probably being named an Oracle Java Champion. There are no more than 300 Java Champions in the world and it is a great honor to be added to this group.
So what’s next for 2019? I’m going to continue with the RotterdamJUG, we’ve already got a Kotlin-workshop planned for this month and I’m looking for a venue for a February meetup (anyone?).
I’m also going to go back to math and puzzles, I’ve got some idea’s planned and some tricks/algorithms to explain, more on that in the coming days/weeks hopefully!
We looked at GraalVM and what it can do. We dove into the
native-image command and transformed a simple HelloWorld application into a native application (running in Docker).
This time I want to go beyond “Hello World” and build something useful (despite all the limitations listed below). A CRUD microservice with REST API and database access.
TLDR; If you’re just intersted in the end result, go right to the results
So what are some of the limitations that GraalVM currently has? (using version 1.0.0 RC6 at time of writing)
Well, the Java VM is a very complete and dynamic system able to handle a lot of dynamic changes. This is something a native application can’t do as easily. GraalVM needs to analyse your application up front and discover all these things. Therefor, things like reflection and dynamic classloading are very hard to do (but not impossible).
To ‘emulate’ a dynamic running JVM the GraalVM project is shipped with Substrate VM.
Substrate VM is a framework that allows ahead-of-time (AOT) compilation of Java applications under closed-world assumption into executable images or shared objects (ELF-64 or 64-bit Mach-O).
The Java VM for example has a garbage collector, when you eliminate the JVM, you’ll still need to free your objects. This is something Substrate VM (written in Java!) does for you.
To read about the limitations of Substrate VM, look at this LIMITATIONS.md.
We want to do more than just ‘HelloWorld’, how about an entire microservice?
If you talk about microservices in Java, most people will immediately say: Spring Boot. While you can certainly discuss the micro-part, it is by far the most popular way of writing (enterprise) Java applications at the moment. The problem is that most Spring Boot applications quickly grow, having Docker images of 600+ mb and runtime memory usage of 250+ mb is to be expected.
So it seems this is a perfect candidate to turn into a native application.
But there is some instant bad news: It won’t work
At least, at this moment in time. The developers of Spring Boot are working very hard with the developers of GraalVM to fix all the problems. A lot of work has already been done, you can check out the progress in this Spring issue.
A framework that also covers the entire spectrum and does work with GraalVM is micronaut.io. If you want something that works out of the box, check out their entire stack.
But I’d like to do it enirely myself, find the pitfalls and learn about the limitations of GraalVM at the moment. This is very useful to understand what you can and can’t do. I’m going to build my own GraalVM stack!
Web alternative: SparkJava
Instead of turning to Spring Boot or micronaut, let’s keep our application really micro and use something else. Spark Framework is a small web framework. And it works like a charm using
First I created a simple Maven project and to use the
native-image command I needed all the Maven dependencies as a JAR file on the class path after compilation. To have this I added the following plugin to the
Next I added the following dependency:
Now we can write our code and run the HelloWorld web application:
Building everything is the same as in Part 1.
There are two differences though. The base image can no longer be
FROM scratch because we need access to networking. Second, we need to expose the port of the web application to the outside using
Also we need to add the following option to
This option eliminates some problems during the analysis phase out of the way.
Running this Dockerfile results in a “Hello World” in the browser at
http://localhost:4567/hello. Using just 4 mb of runtime memory (!).
Dependency Injection: Google Guice
Another big problem at the moment using GraalVM is trying to use dependency injection. First I tried to use Google Guice. It is marketed as a ‘low overhead’ dependency injection framework. When I fired up the
native-image command I got the following exception (amongst many others):
It seems that Google Guice internally uses a way to call Integer.parseInt using reflection, but GraalVM doesn’t understand this. But luckely, we can help GraalVM a bit.
To fix this first problem I added the following file to my project:
And during the build of
native-image I pass the following option:
Now we have instructed Substrate VM/GraalVM that our application will do a reflection lookup to Integer.parseInt (amongst other calls). It now understands this and loads everything.
For some reason though, for each dependency I kept getting the following exception:
The application works great running it from
java but not after using
native-image. Time for something different!
Dependency Injection: Dagger 2
Instead of Google Guice or Spring I decided to go with another framework: Dagger 2
This framework has one big advantage over all the others: It works compile-time.
Sure, setting it up takes a little bit more time. You’ll need to include a Maven plugin that does all the magic during compilation. But it is a perfect solution (currently) for GraalVM’s native-image. All the injection-magic is done during compilation, so when running the application everything is already nicely wired up and static.
Database access (Hibernate/Oracle)
Finally, to complete my CRUD application I tried to access our Oracle database. GraalVM and the database are both created and maintained by Oracle so I hoped this would work out of the box…. but (spoiler): It didn’t.
The main problem here is the code in Oracle’s JDBC driver, this turned out to be a very hard thing to get working, this took about an entire day!
First off, there are a lot of static initializer blocks and some of those are starting Threads. This is something the SubstrateVM’s analyzer can’t handle (again: see LIMITATIONS.md).
It was throwing errors like:
Again, like before, the exception itself does provide a solution. The issue here are static initializer blocks starting threads during analysis, but this can be countered by delaying the class initialization. This isn’t trivial here because I don’t have access to the code in Oracle’s JDBC driver. But in the end I managed to get it working by adding the following parameters to the
The next problem was getting the
persistence.xml file to load. Hibernate is using ClassLoader.getResources() for this during runtime and for some reason I couldn’t get this to work. I knew there was a way to add resources into the native image but I struggled to get it working, the flag is called
-H:IncludeResources= and you can add a regex here.
It wasn’t until I browsed the Substrate VM source code and extracted the parsing from ResourcesFeature.java. Running this code locally showed me everything I tried was wrong.
Things I tried that didn’t work:
This finally worked (including all XSD’s, properties and everything in META-INF):
It turns out the listing goes through all the JAR files and directories and matches them against a relative path which in our case is:
logging.properties isn’t enough. This cost me way more time than it should have. A nice feature to add to GraalVM would be to list all the resources being added because at some point I was convinced it should just work and the problem was in my code somewhere.
Next problem: Xerces
This library gave me nightmares before as a Java developer, but luckily this time the problems could be fixed easily by adding more reflection exceptions to our -H:ReflectionConfigurationFiles=
Also xerces needed a resource bundle to load:
Sigh, okay, still making progress.
Now again a problem with Hibernate and resources. I got the following
StringIndexOutOfBoundsException running the application:
It turns out due to the way GraalVM labels its resources we get to a point where Hibernate is confused. It calls the following code with the wrong parameters:
- url: “META-INF/persistence.xml”
- entry: “/META-INF/persistence.xml”
With this input
entry.length()=25 is bigger than
url.length()=24 resulting in
file.substring(0, -1), sigh.
To fix this I’ve created a so called shadowed class. This is a class with the exact same signature that I am compiling and adding to the classpath. Because my version of the class is loaded before the Hibernate version of the class, my version is used, I’m overriding the Hibernate version. This is obviously very ugly, but it does the job surprisingly well!
Math.max(0, file.length() - entry.length()) to fix getting a ‘-1’ in the substring:
And of course, a new problem pops up. Again with the resources, GraalVM seems to have put
resource as a protocol of all the loaded resources. Opening the resource using an
java.lang.URL caused more problems in
ArchiveHelper because GraalVM doesn’t recognise ‘resource’ as a valid protocol (huh?). This meant I needed to make another small patch in the shadowed
The next big problem was getting
oracle.jdbc.driver.OracleDriver to accept the fact that GraalVM doesn’t support JMX (and might never support it). The driver tried to load MXBeans and MBeans from a static initializer block, this caused major headaches…. but in the end I managed to solve this again by shadowing another class:
Still the initial static initializer block in
oracle.jdbc.driver.OracleDriver wouldn’t load and broke the native compilation. Browsing the decompiled code I noticed the following lines which might cause a problem:
This class isn’t on the classpath oddly enough, so I decided to create a dummy/shadow class again, just in case:
The next problem is Hibernate’s dynamic runtime proxies. This is something GraalVM can’t handle so we need to make sure Hibernate’s magic is done before we start our application. Luckily there is a Maven plugin which does just that:
Now we have everything in place and we can use the EntityManager to access our database and execute queries…. right?
Well it turns out, the application does start, and it comes a long way. But there is one thing I wasn’t able to fix, loading the definitions.
Hibernate has two ways of loading/mapping the actual class to the database:
- Hbm files
First I tried to use annotations, but this failed because at runtime the native (pre-loaded) classes don’t have any knowledge left of the annotations they once had.
The second method is using an HBM xml file, but this too failed. Reading the XML file again needs support for annotations and JAXB failed on me.
So we’ll have to stop here. The JDBC driver was working, so probably plain old SQL would work perfectly. Hibernate for now eludes me.
Update: Previously I mentioned Hibernate was working, this was a mistake on my part!
Docker: Multi-stage build
After some changes I now have a single Dockerfile with two
FROM sections in it. The first section is the builder-part, the second part is the host-part. My
docker build command now uses that first image to build, passes everything to the second image and builds our resulting Docker container. Really nice, I’m learning new techniques every day.
One thing I noticed during this entire process, the analysis time used by
native-image grew exponentially. The HelloWorld application from Part 1 took just a couple of seconds, but look at the following output:
It now took a whopping 60 minutes to compile to native! And during the creation of this project I needed to add a single line the
reflecion.json and restart the entire build a lot of times. Transforming a project to work with
native-image right now is a really time-consuming endeavour.
When compiling on my MacBook for MacOS, the compilation time is much shorter, just a couple of minutes. The problem here seems to be Docker and building for Ubuntu.
I’m proud to say I now have a simple CRUD native microservice, written in Java, using Java libraries…. that is almost working. With a bit more work on the GraalVM/SubstrateVM side I’m pretty sure this could work in the near future.
- Dagger 2
- SLF4J + java.util.logging
- Hibernate/Oracle (almost working…)
This allows me to serve REST/JSON objects from the database with some Java code in between, what most microservices do.
All the sources are on GitHub: check it out
Check out all the code on GitHub and try it out for yourself. To get it up and running all you need it to set up a database and put the connection information in
Start up time
The microservice has a very fast startup time:
Compare this to the Java 8 version (same code):
With 486ms compared to 3406ms the native version starts 7x faster.
The Java version consumes 267mb of memory, while the native version takes just 20.7mb, so it is 13x smaller.
At the moment GraalVM is still in its infancy stage, there are still a lot of areas that can use some improvement. Most problems I’ve encountered have to do with resource loading or reflection. The build/analysis-cycle becomes pretty long when you add more and more classes and if you need to add reflection-configuration exceptions each build, this process is quite cumbersome.
Frameworks and libraries will start to notice GraalVM (once it gains more traction) and they will change their code to work better with Graal. For example the team behind Spring Boot is already actively working together with the GraalVM team to get their framework working.
Now some people are shouting to their laptops:
Why not just use Go/Rust/some other language that compiles to native by default!?
That is a very good point. If you want to use Go or Rust, go ahead!
native-image might be a more accessible way of transitioning to native backends IMO.
I’ll for sure be keeping track of GraalVM, not just because of the
native-image capabilities, but also because the amazing speed of their VM.
Did you know there is a Graal AOT compiler inside your JDK right now? (see: JEP-295)
Sources: All the code from this blogpost can be found here on GitHub.
One of the most amazing projects I’ve learned about this year is GraalVM.
I’ve learned about this project during Devoxx Poland (a Polish developer conference) at a talk by Oleg Šelajev. If you’re curious about everything GraalVM has to offer, not just the native Java compilation, please watch his video.
GraalVM is a universal/polyglot virtual machine. This means GraalVM can run programs written in:
- Python 3
- JVM-based languages (such as Java, Scala, Kotlin)
- LLVM-based languages (such as C, C++).
In short: Graal is very powerful.
There is also the possibility to mix-and-match languages using Graal, do you want to make a nice graph in R from your Java code? No problem. Do you want to call some fast C code from Python, go ahead.
In this blogpost though we’ll look at another powerful thing Graal can do:
Instead of explaining what it is, let’s just go ahead, install GraalVM and try it out.
To install GraalVM, download and unpack, update PATH parameters and you’re ready to go. When you look in the /bin directory of Graal you’ll see the following programs:
Here we recognise some usual commands, such as ‘javac’ and ‘java’. And if everything is setup correctly you’ll see:
Hello World with native-image
Next up, let’s create a “Hello World” application in Java:
And just like your normal JDK, we can compile and run this code in the Graal virtual machine:
But the real power of Graal becomes clear when we use a third command:
This command takes your Java class(es) and turns them into an actual program, a standalone binary executable, without any virtual machine! The commands you pass to
native-image very similar to what you would pass to
java. In this case we have the classpath and the Main class:
Now we have an executable that prints “Hello World”, without any JVM in between, just 5.6mb. Sure, for this example 5mb isn’t that small, but it is much smaller than having to package and install an entire JVM (400+mb)!
Docker and native-image
So what else can we do? Well, because the resulting program is a binary, we can put it into a Docker image without ANY overhead. To do this we’ll need two different Dockerfile’s, the first is used to compile the program against Linux (instead of MacOS or Windows), the second image is the ‘host’ Dockerfile, used to host our program.
Here is the first Dockerfile:
This image can be created as follows:
Using this image we can create a different kind of executable. Let’s create our application using the just created docker image:
This results in an executable ‘app’, but this is one I can’t start on my MacBook, because it is a statically linked Ubuntu executable. So what do all these commands mean? We’ll let’s break it down:
But we can do something cool with it using the following, surprisingly empty, Dockerfile:
We start with the most empty Docker image you can have,
scratch and we copy in our
app executable and finally we run it. Now we can build our helloworld image:
We’ve now turned our Java application into a very small Docker image with a size of just 6.77MB!
In the next blogpost Part 2 we’ll take a look at Java applications larger than just HelloWorld. How will GraalVM’s native-image handle those applications, and what are the limitations we’ll run into?
Yesterday I compared different JDK versions and OpenJ9 versus HotSpot on memory and speed. The memory part of the test was realistic if you ask me, an actual working Spring Boot application that served REST objects.
The speed/CPU test however was… lacking. Sorting some random arrays, just one specific test.
Today I decided to test OpenJ9 and HotSpot a bit more using an actual benchmark: SPECjvm2008.
SPEC (Standard Performance Evaluation Corporation) has a couple of well defined benchmarks and tests, including an old JVM benchmark called SPECjvm2008. This is an elaborate benchmark testing things like compression, compiling, XML parsing and much more. I decided to download this and give it a spin versus OpenJ9 and HotSpot. This should be a much fairer comparison.
Initially I encountered some issues, some of the tests didn’t work against Java 8 and the tests wouldn’t even start against Java 9+. But eventually I got it working by excluding a couple of benchmarks with the following parameters:
The Docker images used in these tests are both Java 8 with OpenJDK8, but one with HotSpot underneath, the other with OpenJ9:
Again I started the Docker image with a directory linked to the host containing the SPEC benchmark:
- Start Docker:
- Go to the correct directory:
- Run the (working) tests:
After waiting a long time for the benchmark to finish, I’ve got the following results:
The graph is measured in ops/m, higher is better. Results may vary of course depending on hardware.
In most cases HotSpot is faster than OpenJ9, and in two cases HotSpot is much faster, crypto and derby. It appears this is a case where HotSpot is doing something special that J9 isn’t doing (yet?). This might be important to know if you’re working on applications that do a lot of cryptology, for example high performance secured endpoints.
One place where OpenJ9 came out on top is XML validation. Parsing/validation is also an important part in most modern applications, so this could be a case where J9 makes up some lost ground in actual production code.
Is there a real conclusion from this? I don’t think so.
The real lesson here is: Experiment, measure and you’ll know. Never decided anything based on some online benchmark.
If there is anything else you’d love me to test, send me a tweet: royvanrijn
OpenJ9 and IBM J9 are a different JVM implementation from the default Oracle HotSpot JVM. With the modern adoptopenjdk pre-made Docker images it is easy to swap and test different combinations and pick the right JVM for you.
The rumours seem to be true, OpenJ9 seems to blow HotSpot away on memory usage. HotSpot seems to have the edge CPU-wise.
In the Java world most people are familiar with OpenJDK. This is a complete JDK implementation including the HotSpot JVM engine. Not a lot of developers know or try alternatives to HotSpot. Asking around some colleagues remembered the name JRockit, nobody mentioned IBM J9 and/or Eclipse OpenJ9.
I’ve read that OpenJ9 is very good with memory management and is tailered for usage in the cloud/in containers. OpenJ9 is an independent implementation of the JVM. It’s origins are IBM’s Java SDK/IBM J9 which can trace its history back to OTI Technologies Envy Smalltalk (thanks Dan Heidinga!).
With the current rise in microservice usage (and most services are not so micro in Java). I recon this could become a hot topic again!
Before the Docker-era it was relatively hard to compare different JVMs, versions. You needed to download, install, script and run everything. But now a lot of pre-made images are available online.
Here is my idea on how to test the JVMs:
- Create a simple Spring Boot application
- Start the application in various Docker Images
- Measure memory usage after startup and GC
- Measure the time it takes to run a small CPU-intensive test
This is by no means a thorough test or benchmark, but it should give us a basic idea of what we can expect from the virtual machines.
Spring Boot application
The Spring Boot application I created contains the following endpoints:
- A REST endpoint that calls the GC (trying to make it fair)
- A REST endpoint that creates 1000 large random arrays and sorts them, returns the runtime (in ms)
Here is the listing of the CPU-test:
Again, we can argue endlessly about if this test makes sense and is even remotely relevant… but still it should give us some basic idea of what kind of performance we can expect. If the rumoured memory improvements are true, might there be a performance hit? Is there a performance trade-off?
I’ve decided to test the following images.
First we have the (slim) openjdk images for 8/9/10/11:
Next there are the adoptopenjdk images for 8/9/10:
Then we have OpenJ9, again provided by adoptopenjdk for 8, 9 and a nightly build of 9 (see my previous blogpost):
And I decided to include IBM’s own J9 image as well:
Testing with Docker
After building my Spring Boot application I launched each Docker image using the following command:
I’m mapping my “spring-boot-example” project folder to “/apps/spring-boot-example” so I can start the JAR file inside the container. Also I’m forwarding port 8080 back to my host so I can call the endpoints.
Next, inside the container, I launch the Spring Boot application:
After waiting a bit, calling the endpoints a couple of times and performing a GC I measured the memory usage.
After that I called the “/loadtest” endpoint containing the array-sorting test and waited for the results.
Here are the results of the memory used by the simple Spring Boot application:
At first you can see that the memory usage for Java 8 is much higher than for Java 9 and 10, good!
But the biggest shock is how much less memory OpenJ9 and J9 are using, almost 4x less memory if you compare Java 8 with OpenJ9. I’m amazed, how does this even work? Now we can almost call our Spring Boot service micro!
I’ve also experimented with running some production Spring Boot code (not just simple examples) and here I’ve seen improvements up to 40-50% decrease in memory usage.
Online I’ve read that OpenJ9 isn’t as good as HotSpot if you look at CPU intensive tasks. That is why I created a small test for this as well.
1000 arrays with 1000000 random long values being sorted. This takes around 100 seconds, this should give the JVM enough time to adjust and optimize. I’ve called the benchmark twice for each tested image. I’ve recorded the second time trying to eliminate warmup times.
In the chart we can see that indeed the J9 and OpenJ9 images are slower, not by much max 18%. It seems for this particular testcase Java 8 beats most Java 9 implementations (except coupled with OpenJ9).
My current project has a lot more memory issues than CPU issues on production (frequently running out of memory while having 1-2% CPU usage). We are definitely thinking about switching to OpenJ9 in the near future!
We did already encounter some issues during testing:
- Hessian: (binary protocol) has a build-in assumption that System.identityHashCode always returns a positive number. For HotSpot this is true but OpenJ9/J9 can also return negative numbers. This is an open issue and the Hessian project hasn’t fixed this in a couple of years, seems to be dead? Our solution is to move away from Hessian altogether
- Instana: We love our monitoring tool Instana, but it had some problems connecting their agent to OpenJ9/J9. Luckily the people at Instana helped us identify a bug and a fix should be published today (and is automatically updated, w00t!)
Open questions I haven’t looked in to:
- Can you still get/use jmap/hprof information etc in OpenJ9?
- How will it hold up during longer production runs?
- Will we find other weird bugs? It feels tricky…
Have you tried OpenJ9/J9? Let me know in the comments.
Is there anything else you’d love to see tested? The best way to contact me is to send me a tweet royvanrijn.
Java and Docker aren’t friends out of the box. Docker can set memory and CPU limitations that Java can’t automatically detect. Using either Java Xmx flags (cumbersome/duplicated) or the new experimental JVM flags we can solve this issue.
Docker love for Java is in its way in newer versions of both OpenJ9 and OpenJDK 10!
Mismatch in virtualization
The combination of Java and Docker isn’t a match made in heaven, initially it was far from it. For starters, the whole premise of the JVM, Java Virtual Machine, was that having a Virtual Machine makes the underlying hardware irrelevant from the program’s point of view.
So what do we gain by packaging our Java application inside a JVM (Virtual Machine) inside a Docker container? Not a lot, for the most part you are duplicating JVMs and Linux containers, which kills memory usage. This just sounds silly.
It does make it easy to bundle together your program, the settings, a specific JDK, Linux settings and (if needed) an application server and other tools as one ‘thing’. This complete container has a better level of encapsulation from a devops/cloud point of view.
Problem 1: Memory
Most applications in production today are still using Java 8 (or older) and this might give you problems. Java 8 (before update 131) doesn’t play nice with Docker. The problem is that the amount of memory and CPUs available to the JVM isn’t the total amount of memory and CPU of your machine, it is what Docker is allowing you to use (duh).
For example if you limit your Docker container to get only 100MB of memory, this isn’t something ‘old’ Java was aware of. Java doesn’t see this limit. The JVM will claim more and more memory and go over this limit. Docker will then take action into its own hands and kill the process inside the container if too much memory is used! The Java process is ‘Killed’. This is not what we want…
To fix this you will also need to specify to Java there is a maximum memory limit. In older Java versions (before 8u131) you needed to specify this inside your container by setting -Xmx flags to limit the heap size. This feels wrong, you’d rather not want to define these limits twice, nor do you want to define this ‘inside’ your container.
Luckily there are better ways to fix this now. From Java 9 onwards (and from 8u131+ onwards, backported) there are flags added to the JVM:
These flags will force the JVM to look at the Linux cgroup configuration. This is where Docker containers specify their maximum memory settings. Now, if your application reaches the limit set by Docker (500MB), the JVM will see this limit. It’ll try to GC. If it still runs out of memory the JVM will do what it is supposed to do, throw an OutOfMemoryException. Basically this allows the JVM to ‘see’ the limit that has been set by Docker.
From Java 10 onwards (see test below) these experimental flags are the new default and are enabled using the -XX:+UseContainerSupport flag (you can disable this behaviour by providing -XX:-UseContainerSupport).
Problem 2: CPU
The second problem is similar, but it has to do with the CPU. In short, the JVM will look at the hardware and detect the amount of CPU’s there are. It’ll optimize your runtime to use those CPU’s. But again, Docker might not allow you to use all these CPU’s, there is another mismatch here. Sadly this isn’t fixed in Java 8 or Java 9, but was tackled in Java 10.
From Java 10 onwards the available CPUs will be calculated in a different way (by default) fixing this problem (also with UseContainerSupport).
Testing Java and Docker memory handling
As a fun exercise, lets verify and test how Docker handles out of memory using a couple of different JVM versions/flags and even a different JVM.
First we create a test application, one that simply ‘eats’ memory and doesn’t free it.
We can start Docker containers and run this application to see what will happen.
Test 1: Java 8u111
First we’ll start with a container that has an older version of Java 8 (update 111).
We compile and run the MemEat.java file:
As expected, Docker has killed the our Java process. Not what we want (!). Also you can see the output, Java thinks it still has a lot of memory left to allocate.
We can fix this by providing Java with a maximum memory using the -Xmx flag:
After providing our own memory limits, the process is halted correctly, the JVM understands the limits it is operating under. The problem is however that you are now setting these memory limits twice, for Docker AND for the JVM.
Test 2: Java 8u144
As mentioned, with the new flags this has been fixed, the JVM will now follow the settings provided by Docker. We can test this using a newer JVM.
(this OpenJDK Java image currently contains, at the time of writing, Java 8u144)
Next we compile and run the MemEat.java file again without any flags:
The same problem exists. But we can now supply the experimental flags mentioned above:
This time we didn’t set any limits on the JVM by telling it what the limits are, we just told the JVM to look at the correct settings! Much better.
Test 3: Java 10u23
Some people in the comments and on Reddit mentioned that Java 10 solves everything by making the experimental flags the new default. This behaviour can be turned off by disabling this flag: -XX:-UseContainerSupport.
When I tested this it initially didn’t work. At the time of writing the AdoptAJDK OpenJDK10 image is packaged with jdk-10+23. This JVM apparently doesn’t understand the ‘UseContainerSupport’ flag (yet) and the process was still killed by Docker.
Testing the code (and even providing the flag manually):
Test 4: Java 10u46 (Nightly)
I decided to try the latest ‘nightly’ build of AdoptAJDK OpenJDK 10. Instead of Java 10+23 it includes 10+46.
There is a problem in this nightly build though, the exported PATH points to the old Java 10+23 directory, not to 10+46, we need to fix this.
Succes! Without providing any flags Java 10 correctly detected Dockers memory limits.
Test 5: OpenJ9
I’ve also been experimenting with OpenJ9 recently, this free alternative JVM has been open sourced from IBMs J9 and is now maintained by Eclipse.
Read more about OpenJ9 in my next blogpost.
It is fast and is very good with memory management, mindblowlingly good, often using up to 30-50% less memory for our microservices. This almost makes it possible to classify Spring Boot apps as ‘micro’ with a 100-200mb runtime nstead of 300mb+. I’m planning on doing a write-up about this very soon.
To my surprise however, OpenJ9 doesn’t yet have an option similar to the flags currently (backported) in Java 8/9/10+ for cgroup memory limits. For example if we apply the previous testcase to the latest AdoptAJDK OpenJDK 9 + OpenJ9 build:
And we add the OpenJDK flags (which are ignored by OpenJ9) we get:
Oops, the JVM is killed by Docker again.
I really hope a similar option will be added soon to OpenJ9, because I’d love to run this in production without having to specify the maximum memory twice. Eclipse/IBM is working on a fix for this, there are already issues and even pull requests for this issue.
UPDATE: (not recommended hack)
A slightly ugly/hacky way to fix this is using the following composed flag:
In this case the heap size is limited to the memory allocated to the Docker instance, this works for older JVMs and OpenJ9. This is of course wrong because the container itself and other parts of the JVM off the heap also use memory. But it seems to work, appearantly Docker is lenient in this case. Maybe some bash-guru will make a better version subtracting a portion from the bytes for other processes.
Anyway, don’t do this, it might not work.
Test 6: OpenJ9 (Nightly)
Someone suggested using the latest ‘nightly’ build for OpenJ9.
This will get the latest nightly build of OpenJ9, and it has two things:
- Another broken PATH parameter, fix that.
- The JVM has support for the new flag UseContainerSupport (like Java 10 will)
TADAAA, a fix is on the way!
Oddly it seems this flag isn’t enabled by default in OpenJ9 like it is in Java 10 though. Again: Make sure you test this is you want to run Java inside a Docker container.
IN SHORT: Be aware of the mismatch, the limitations. Test your memory settings and JVM flags, don’t assume anything.
If you are running Java inside a Docker container, make sure that you have Docker memory limits AND limits in the JVM or a JVM that understands these limits.
If you’re not able to upgrade your Java version set your own limits using -Xmx.
For Java 8 and Java 9, update to the latest version and use:
For Java 10, make sure it understands the ‘UseContainerSupport’ (update to latest) and just run it.
For OpenJ9 (which I highly recommend for bringing down your memory footprint in production) for now set the limits using -Xmx, but soon there will be a version that understands the ‘UseContainerSupport’ flag.
This week I’ve been stuggling with the following question:
When is the right time to upgrade?
By upgrading, I mean everything: libraries, tools, Java versions, application servers, MQ servers…
My current project uses a reactive upgrade policy, we upgrade for four reasons:
- Something is broken and fixed in a later version
- We need or want to use a new feature
- Support for a version we’re using is being dropped
- The old version we’re using has a known security issue/CVE
The first two reasons are entirely up to the programmers to decide. The third reason is up to the company that gives us support. For the fourth upgrade reason, security, we have some automation in place. We are using the OWASP dependency checker Maven plugin for our libraries.
- Is a reactive update policy good enough?
- Are there any pro-active strategies?
- Do you want to invest in keeping your microservices up-to-date?
- Do you let your services deteriorate and dispose, replace them in the future with new technologies?
More on this in my (kind-of-weekly) vlog:
It seems that the adoptation of Java 9 is slow, very slow. In a Twitter poll this week I asked: “Which version of Java are you using in production?”
The poll got almost 300 replies and to my surprise just 3% of the respondents are using Java 9 at the moment. Most are using Java 8 and there are even more people using Java 6… So what is holding people back? Why is almost no-one using Java 9?
It turns out a lot of people are scared to upgrade to Java 9. They think it’ll be hard to do, libraries and tools will fail them. They are scared for another Jar-hell.
That is why I took one of my larger projects, that has been running in production for a long time, and try to upgrade it to Java 9.
Upgrading: Walk in the park or walk in jar-hell?
Here are my results and findings, in a new format: a vlog
If you like the vlog and follow my website, please subscribe to my channel on YouTube!
Today I got to present at Devoxx Poland on being agile and managing your architecture.
One of the points I made during the talk had to do with Conway’s Law. For those unfamiliar with it, here it is:
organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations M. Conway
This basically means that the code you create is likely to reflect the way the people and teams in your company communicate. For example, banks are usually pretty strict and have a tendency to document things. The code written at banks in turn have the tendency to use SOAP for example, verbose and strict.
The project I’m currently working on has a very agile mindset, we change a lot and have open communication. This reflects in our codebase as well, all the code is easy to understand and easy to change (thankfully!).
But lately we’ve started to move away from our old monolith and we’ve began cutting it up into microservices. That’s when I noticed something odd, which I call the Inverse Conway’s Law (maybe it is Roy’s Law?):
organisations which radically change their system design should expect changes in communication structure Roy van Rijn
What does this mean?
We’ve started to move towards a microservice environment, where each team has a responsibility over a couple of services. They’ve developed it and they own it. This ownership completely changed the culture in our team. Up to now if the monolithic backend was broken or offline, everyone was hurting and everyone wanted to fix this as soon as possible. With microservices, this is gone. The first thing we now do is check which microservice failed. If it isn’t theirs, it is not their problem, it’s someone elses problem now!
This is just one example on how changing the design of our system changed the culture and team dynamic. It is one thing that sounds logical when you think about it, but I didn’t expect this before we started the transition.
Today I got an email with the subject: “Congratulations JavaOne 2016 Rock Star!”. It turns out the JavaOne 2016 Ignite session was voted enough to receive a JavaOne Rock Star Award!
I was so excited, elated even!
But I quickly found out that the JavaOne Rock Star lifestyle isn’t as much fun as it sounds.
- Being 3 hours late for a meeting, even a Rock Star can’t do that.
- Peeing in the corner of your office while holding a bottle of Jack Daniels, still not appreciated.
- Wearing ripped jeans, a leather jacket and other than that just chest hair, is considered too casual.
- The groupies? It is like answering random Stack Overflow questions, non-stop, face to face.
- After each coding session, smashing your MacBook on the floor… quickly becomes expensive.
YouTube, the future of television. I’ve got a lot of subscriptions to YouTube channels that deliver quality content, and those shows are ‘cast’ (using my Chromecast) to my TV. Another thing I often do is look up information, for example I watch talks from programming conferences like Devoxx using YouTube.
This gave me an idea, what if I can take some free information (like Wikipedia, all creative commons) and use that to create content for YouTube? Maybe I’ll even get some views :-)
So this is what I’ve come up with, the following video is generated completely automatically:
Step 1: Wikipedia API
The first step in creating these videos was to get some actual freely available content. For this I chose to use Wikipedia. This can be accessed using JSON/REST, and to quickly hack this into place I decided to use a framework called ‘Unirest’. This makes the while prototype easy:
The resulting JSON (see this example) contains the Wikipedia page in simple processed HTML. From this HTML I used a library called ‘Jsoup’ to extract the plain text. The first alinea of text contains the introduction of the topic, this is what I’ve decided to use as content for the video. Initially I had all the content of the page in the videos, but they quickly turned out to be 30+ minutes long and very boring.
This is the code I used to go from HTML to the first alinea of text:
Step 2: Stunning visuals with Java 2D
We have some text, now how do we turn this into a video? We need visuals! I decided to cut the content into seperate lines, and for each line I have a single ‘slide’. This is an (buffered) image created using Java’s 2D API and saved to file.
Step 3: Text-To-Speech
Instead of using a Java Text-To-Speech (TTS) library, which aren’t very clear or good, I decided to use the build-in OS/X command ‘say’.
On the command line you can execute:
say Hello world!
The ‘say’ command supports different voices and also has the ability to store the output in a file. Which is perfect for us!
Step 4: Putting it all together
What do we have now? A list of sentences from Wikipedia, for each sentence we have an image and we have an audio snippet created with TTS. Now it is time to put everything together and create a video. The easiest way to do this is to use ffmpeg, a powerful command line video tool. To make this easier I created a small method ‘executeCommandLine’ which does the same thing as the snippet with ‘Runtime.getRuntime().exec’ shown above.
Combine the image (png file) and the TTS audio (mp4 file) into a series of MPG files:
We now have a series of small MPG files. They need to be combined into a single video. For this we need a text file which contains a list of all the MPG files. This is stored on disc and next we execute the following command:
Next we convert the video from one large MPG to MP4 file (which is a nicer, compressed format):
And to add a little more ‘production value’ I’ve gone online and found some soothing (free, creative commons) background music. This music is looped into a large audio file of 50 minutes (enough for all videos). Again using ffmpeg the audio is mixed in:
There it is, we have a complete video with text audio, text visuals and background music! Only one thing left to do, uploading to YouTube!
Step 5: Uploading everything to YouTube
To automate the process of uploading, I first needed a place to put the videos, so I created a new YouTube channel named: Vikipedia
Next we can use the Google YouTube Data API. It turns out this is very easy and well documented. I don’t even need to share the code I wrote here because I used the example from Google’s website itself! Check out this amazing documentation. Very detailed and does exactly what I needed it to do.
The example code takes a video file and uploads it. The only things I changed were setting the correct description, I was ready to generate all the content!
Not implemented (yet?)
Things I wanted to implement (but didn’t and probably never will):
- Add a little color to the videos (black and white is boring)
- Add waveform from ffmpeg (just to keep it visually interesing)
- Download (did this) and use (didn’t do this) the images from the Wikipedia page and add those to the video
- Yes, it is pretty easy to hack together a bot to create YouTube videos.
- No, the quality isn’t that good… yet?
- Yes, it can be made in a single afternoon!
- No, I’m not going to automate the spidering of Wikipedia and upload a million videos… (but I could easily do that!)
Up to now all the videos created are based on a small list of ‘hot topics’ I found on Wikipedia and Twitter. I don’t want to automate the entire process and flood YouTube, the content just isn’t good enough for that. But it was a fun project for a lazy afternoon! I’ve learned a lot of new tricks, maybe you can benefit from that as well.
The next thing I want to do with the 400+ YouTube videos I’ve generated is to see how (and if) people are going to watch them. Do they show up in the search results? Which topics are searched and watched? Maybe more on that in a future post!
Do you have any good ideas on how to generate videos, let me know in the comments! I’ve created a bot that generates music some years, maybe I should improve that bot to upload its own YouTube music videos? ;-)
This morning Mark Reinhold submitted three brand new JEPs (JDK Enhancement Proposal).
- JEP 300: Augment Use-Site Variance with Declaration-Site Defaults
- JEP 301: Enhanced Enums
- JEP 302: Lambda Leftovers
These proposals are enhancements to the JDK (Java Development Kit) and OpenJDK. A long term roadmap for the JDK projects, a look into what the future of Java might hold.
Let’s dive right in and take a quick look on what these proposals actually are!
JEP 300: Augment Use-Site Variance with Declaration-Site Defaults
When you currently use Java Generics you probably already know about wildcards. It is possible to set lower and upper bounds to generics using the keywords ? extends and ? super. There are two parts to using wildcards, declaration side and use side, this JEP focusses mainly on making the use side easier and more powerful.
It is possible to set a bound with a wildcard on the declaration side:
In this example we’ve declared that the Shelter has to contain Animals. It is possible to create an Shelter<Cat> or Shelter<Dog> now, but not Shelter<Bike>.
The second way to use wildcards is on the use side. There we can speak of so called in and out-variable wildcards.
The following code uses the ‘extends’ keyword to create the in-variable:
There is also the ‘super’ keyword (not very common) to create an out-variable use side:
So what is JEP 300?
The motivation of the JEP states:
Since invariant uses of these type arguments are less flexible than their wildcard equivalents, while providing no extra power, a reasonable practice is to always use a wildcard when mentioning the type.
It is almost always more powerful and equivalent to use wildcards on the use side. The problem is that this is verbose and adds a lot of noise to your code. The proposal wants to make it possible to declare (at the declaration side) what the default wildcard strategy should be.
For example look at the following code:
The compiler can now automatically treat every use of the type e.g., Function<String, Number> as if it had wildcards Function<? super String, ? extends Number>.
On the use side the proposal says:
Rather than making changes to subtyping, we preprocess the source code so that types like Function<String, Number> are implicitly expanded to Function<? super String, ? extends Number>.
This should make it more powerful and just as readable to using wildcards by default without even noticing them.
JEP 301: Enhanced Enums
The next proposal Mark has submitted is about using Enums with Generics. Look at the following example:
We have an enum with three pets. These pets have an instance of an animal inside them. And we have a way of retrieving them.
But one thing we can’t do it call the method “.bark()” on BELLA, because we don’t know this is a Dog.
The proposal JEP 301 wants to make it possible to correlate a specific type to an enum constant:
Having coupled the Dog type to enum BELLA it should now we possible to call:
JEP 302: Lambda Leftovers
This is turning out to be a rather long blogpost but we’ve reached the final proposed JEP of the day: Lambda Leftovers
One of the problems people are having with the lambdas at the moment, and this JEP wants to improve, are accidental ambiguities.
Why are we having this problem? Both methods are possible candidates of the lambda. If we look further and see the return type ‘boolean’ it should be clear we wanted to call the Predicate method and not the Function (which should return a String). Right now this doesn’t work, but it is something the compiler could figure out for us.
Up to Java 8 it was perfectly acceptable to use the underscore character ‘_’ as a variable name. But this has been changed, all leading up to JEP 302. In most languages with lambda’s it is possible to denote certain inputs as unused. This is done using the underscore. In Java this causes problems because it is right now a valid variable name.
Starting with Java 8 the use of an underscore as lambda argument caused a warning and from Java 9 the use of an underscore became an error. This allows future Java versions beyond 9 to be able to use the underscore for other purposes. And one of those purposes is to bring it back as a default variable name. Not causing collisions if used multiple times and it can’t be used as variable by calling _.toString() for example.
Shadowing of parameters
Another small change that is proposed is allowing to shadow parameters. For example look at the following code:
In both cases ‘key’ is already defined, can’t be used as a parameter or local variable inside the lambda. The proposal wants to lift this restriction, allowing the use of ‘key’ as a parameter or as a local inside the lambda using so called ‘shadowing’.
One possible drawback is readability, and I too think that this might be a problem. For example:
The second variable “theShadowKey” is taking the place of the first “theInitialKey”. This can be quite confusing because if you delete the second declaration nothing would break. The other variable would come from the shadows and take its place. This feels dangerous and confusing to me, not very ‘Java’-like.
Another day, three new JEPs submitted. Java is moving forward and trying to improve the user experience with genetics and lambdas. These proposals look like good, small, low-impact improvements that can really benefit the common programmer.
Interruptions are a software developers worst nightmare. This is what you often hear. There is nothing worse than being interrupted while working hard on a problem, while you’re in the zone. You’ll lose your train of thought and the world collapses.
The idea is that in software development you are sometimes very deep into a problem, analysing code:
This orchestrator calls that service, it is managed by this class and talks to that queue. So this generator has these parameters and it all depends on… “hey I still need your time sheet”
Poof… everything gone. Your mind is now filled with rage and lost the entire train of thought. This is the main reason programmers hate interruptions.
Root cause analysis
During the JavaOne 2016 conference I attended a talk by Holly Cummins called “Euphoria Despite the Despair”. In this talk she went in depth into what makes work fun and stressed the point that interruptions should be avoided at all cost.
She even talked about a team that has a dedicated duo that had ‘interruption duty’. A Slack-bot was created so you could ask which colleagues you could interrupt each day.
But… are we judging too quickly here? Are all interruptions bad? I don’t think they are. Time to dig a little bit deeper.
Sync versus async and good versus bad
Interruptions are synchronous and blocking. Someone walks up to you and stands near your desk trying to get your attention. You might have a few milliseconds to finish what you are doing but you’ll need to reply. Other forms of communication are asynchronous, for example mail or chat clients (Slack). Someone asks a question and when you have the time to reply, you reply.
Our team has evolved to use this to our advantage. Some problems ask for direct action, but most can be handled asynchronous. This is something we talk about and discuss.
For example: If my manager wants a time sheet? Ask me asynchronous, it can wait. If a colleague has a question but is able to do other things first? Ask asynchronous, it can wait. If a colleague is stuck and can’t continue to work? I’d be happy to drop what I’m doing and assist! This is what teams do, they work together and communicate!
Don’t throw all interruptions on one big pile and label them as bad, distinguish between the good and bad interruptions.
Mount the headphones, get in the zone
Sometimes, as a programmer, you just want to concentrate and solve some deep problem alone.
In our team this can be done in two ways:
- Work at home or in a small dedicated room in our office
- Put on your headphones, if they are on, you signal you don’t want to be disturbed
I do this too, mostly using my headphones. Most of the time I don’t even have music playing and I can hear everything the team says (this is important, more on this below!).
In my opinion though we need to rethink this. Is it really the best way to program when you are ‘in the zone’?
I think you are much more likely to solve problems when you openly share and collaborate. Pairing isn’t just for trivial work, instead: the trivial things are things you should do alone! Teams should have no secrets, at each point in time you should have a pretty good idea what each of your team buddies is working on. Actively ask questions, even trivial things.
Near-subliminal team updates
What? “Near-subliminal” team updates? Yes, I’m so sorry, I can’t think of a better name. Can you? Please tell me.
One thing that is very important in software development is the use of what I’ve just called near-subliminal team updates/mumblings. It is the main reason I have headphones on without any music, and why I don’t like working at different locations (at home or in cubicles).
You are working on something and you feel the need to refactor a class. At this moment you say/mumble out loud:
“Hey team, I’m going to refactor classname because reason okay?”
This is what I call a near-subliminal team update, and they are very important. They can solve a lot of problems!
Maybe you don’t agree with reason, this is the perfect time to talk about that. You don’t want to wait until a code review and tell them it was a bad idea later the same day.
Maybe you are also working on something in classname and it might collide with your work. You don’t want to wait until you git pull and need to merge, tackle it right away!
Maybe it isn’t relevant to you at all (which is most of the time)… Well, most likely you don’t even really notice the update. It didn’t contain any trigger words. Terms you are working on or have an opinion about. You aren’t really interrupted and can continue to work.
- Don’t treat all interruptions as evil, they are not…
- If you are annoyed about interrupts, talk about it
- Speak up, mumble, use those near-submliminal team updates!
Did I miss anything? How do you cope or deal with interruptions?
And please: If there is someone with a better name for these short near-subliminal status updates, let me know!