Most software developers have never heard about SAT solvers, but they are amazingly powerful and the algorithms behind them are actually very easy to understand. The whole concept blew my mind when I learned about how they worked. This will be the first post in a series about SAT solving where we’ll end up building a simple solver in Java later on.
In this first part of the series we’ll cover:
- Look at what SAT is
- Explain the SAT DIMACS format
- Solve a puzzle: Peaceable Queens
What is SAT?
SAT is short for ‘satisfiability’, what people mean when they say SAT is: Boolean satisfiability problem.
It turns out you can write a lot of ‘problems’ in a specific way as a special boolean formula. These problems use boolean
variables that can be either
false. When a variable is assigned true or false, it is called a
An example of a boolean expression:
(A or B or ~C) and (B or C) and (~B) and (~A or C)
Conjunctive normal form (CNF)
The statement above is written in a particular way, it is called the
conjunctive normal form (or CNF). All boolean statements can be rewritten/transformed into this form.
First we’ll need to learn some more terminology. The variables have a label here, for example
A. There is also the option to have a NOT A, this is written as
The line above can be broken up into four pieces:
Each line is called a ‘clause’. Those clauses are connected together using
AND statements. A
clause can only contain variables (or negated variables) joined together by
Because each clause will have
OR between the
variables and the end of the clause is always an
AND statement, those things can be removed from the statement, turning it into:
If we do one more step you’ll end up with a format called
DIMACS, which is supported by almost all SAT solvers. The last step is to replace the name of variable
2 etc. The
NOT statement is just writing the number negative, so
Now add a
0 to the end of each line (as separator) and add a header which explains the amount of variables (= 3) and the amount of clauses (= 4) and you get:
This is a valid input file for almost all SAT solvers. You can run this for example in minisat:
And print the output:
Now we can see what a SAT solver actually does. It takes a boolean expression and tries to find a ‘solution’. It tries to find a particular combination of
literals so all the clauses are
true. If it is able to find one, it stops and returns
SAT. When it is unable to find a valid assignment it returns
UNSAT, the problem can’t be solved.
Most modern SAT solvers will not only return
UNSAT, they will also return “proof” of the UNSAT, a listing of all the clauses that conflict with eachother and cause the UNSAT. This is very powerful and can be used to verify the correctness of the solvers.
In our case we have a valid
1 -2 -3.
When we assign
A=true, B=false, C=false we satisfy our entire boolean expression.
How can we use this?
The example above is very abstract with A’s and B’s. Lets look at some other problems that can be solved by SAT solvers.
A lot of ‘problems’ can be turned into big
CNF statements which means a lot of problems can be solved by a general SAT solver! Which is kind of amazing to me, translate a hard problem into SAT/CNF and a (rather simple) general purpose solver can solve it!
Calling SAT solvers ‘rather simple’ isn’t a mistake. Sure, some implementations are very complicated pieces of finely tuned optimized software. But the algorithms and general ideas behind SAT solving are surprisingly simple. In the next coming blogpost(s), once I write them, I’ll show you how to make one in Java.
Most tutorials use a SAT solver to solve Sudoku puzzles. It is an easy problem to understand and it shows the power of SAT. It involves translating a Sudoku puzzle and all its rules into one big CNF file. Next the SAT solver will calculate a solution.
Some other (random) things SAT solvers can be used for:
- Hardware/software verification
- Scheduling (sport events)
- Routing/planning problems
Peaceable queens problem
Instead of doing what most tutorials do (writing a boolean expression for Sudoku’s) I want to do something different.
The challenge: Find solutions for a puzzle from a recent Numberphile video: Peaceable queens.
The problem is:
Given a N x N chess board, what is the highest amount of black and white queens you can place so they don’t attack each other?
How can we encode such a problem into one big
A chess board
First we need a way to describe our board. It is a chess board and can hold queens either white or black. So I decided to encode this using a variable for each square, first NxN white queens, next NxN black queens.
We don’t have to put anything in our
DIMACS file yet, but the SAT
output will contain these variables set to TRUE or FALSE.
White or black pieces
Now it is time to encode all the ‘rules’ of this puzzle. First a very simple rule: If a square has a white queen it CAN’T have a black queen.
Let’s take the first square, top left. If
TRUE we have a white queen, if
TRUE we have a black queen there. So we want either 1 or 10, not both. How do we encode a rule for this in our SAT input?
How does this work? For this
clause to be
TRUE we must have ‘NOT 1’ or ‘NOT 10’, so either one of them or both must be
Rows, columns, diagonals
To have a valid solution we must add a lot more rules, for example a row can only have either a white queen or black queen or none.
How do we do this? Let’s look at row
10,11,12 (the top row).
First we’re going to introduce two new
variables, I’ll just continue using free numbers,
21. Those will represent the top row for white and black.
Let’s analyse this. The first
clause says we must have
NOT 1 or the white row
TRUE. Next we state the same for
NOT 2 and
NOT 3. So if we have a white piece on row
1 2 3 means
20 will be
TRUE. The same is done for black, if there is a black piece on the top row
21 will be
Now we can encode the final rule for the top row just like we did with the white/black queen on the same square:
This can be done for each row, column and diagonal.
As you can imagine, this will quickly become too large to write manually. So to enumerate all possibilities I wrote a piece of Java code to generate a
DIMACS file for me instead.
This happens a lot when using SAT solvers, most of the time you’ll want to create a piece of software to output all the rules in
DIMACS for a solver to solve.
At least N-queens
If we run the SAT solver with the generated file above we encounter a small problem. It will solve very quickly by placing NO queens at all. Very ‘peaceable’ indeed, no queens are attacking.
We must tell the solver that we want to have a certain number of queens set to true. This turned out to be harder than I thought. There are algorithms online that do
at most N booleans. I decided to implement the LTseq method from this scientific paper.
Having rules like this quickly explodes the amount of clauses needed. But
LTseq seemed to be the best one.
This is what I’ve ended up with:
How can we use
at most N for this puzzle? Well in case of N=10 we want to have both white and black to have
at least 14 queens.
So I switched
at least 14 of white and black squares
TRUE to be
at most (N*N - 14) of white and black squares
This is still not very good because with (NxN - 14) we have a very large
k, thus making a LOT of clauses. I have the feeling this can be done much better, but up to now I haven’t found a proper way. Perhaps forcing the sequential counter to overflow (instead of protecting it)? This didn’t seem to work for me. If you have SAT experience and know more about encoding problems, let me know!
I ended up doing the thing that works:
And it worked like a charm. I’m now able to ‘solve’ N=10 in under two seconds using modern SAT solver glucose. When I generate my ‘rules’ it comes down to about a staggering
17336 variables in
34518 clauses (far from optimal?).
But glucose is able to find a valid solution for N=10, k=14 in 1.2 seconds.
Below is the output where I’ve omitted most variables, we are only interested in the first 200 (10 x 10 x 2) that represent the board:
And if you change this back into a chess board we can see valid solution for N=10 with 14 queens each!
I’ve uploaded all the code for this series to GitHub and called it: Boolshit
You can find the code to generate the Peaceable Queens puzzle here.
I hope this gives you an idea on how powerful SAT solvers are and how you can encode a problem into
In the next blogpost we’ll write an actual SAT solver from scratch in Java! Although it won’t be a very fast one…
The first day was about clean code and the second day was about architecture.
The main attraction of this conference: Robert C. Martin, also known as Uncle Bob. He is one of the founders of the Agile manifesto and wrote numerous very good books on topics like ‘Clean Code’ and ‘Clean Architecture’. He’s also the main advocate for the SOLID design principles.
This presentation of mine… turned out to be my worst public speaking experience yet.
One thing I want to make clear though: I absolutely don’t want to discredit the Utrecht JUG or Rabobank in any way, they are doing awesome things for the community. I just want to share my personal experience. Also: Robert C. Martin (Uncle Bob) is a great speaker and writer, his talks are inspirational, entertaining and just very very good.
I was asked to present one of the breakout session. Those sessions had a rather unique form:
Two speakers are presenting at the same time on the same stage. The speakers and the entire audience had to wear a silent disco-headset. The audience could switch to whichever side they wanted to listen to. Technically this worked perfectly (kudos to the organisers), I have never seen this kind of setup work as well as it did.
The only other place I’ve seen headsets at a conference was in the overflow room of JavaZone. The overflow room there has a huge screen with all presentations and using a headset you can switch to whichever talk you want, great!
Initially this setup didn’t scare me, I’m not an anxious person, it sounded novel and quite frankly fun/entertaining! I did ask two fellow presenters whom presented the day before about the setup, they said it was a bit weird at first… but they got used to it.
Nobody prepared me for what followed though, my worst speaking experience to date (by far).
The first complication of the day: fever.
I woke up with a splitting headache. My kids had been ill for a couple of days and now it was my turn. Mucus filled sinuses and a bit of fever, I’ll spare the details.
Surely nothing a couple of painkillers couldn’t solve!
When I was setting up my laptop before the presentation a colleague of mine made a joke. A joke about his height and the small wall which was being set up on stage. This wall divided the stage into the two parts for the breakout sessions.
One of the organisers (again: who did a great job, it isn’t easy to organize such an event, I love the Utrecht JUGs effort!) came to me and gave me a firm and surprising warning:
“Roy, one thing, I have to warn you: Don’t mention the wall. Don’t mention any wall, not this wall, not the Trump-wall. Don’t even say the word ‘wall’ in your presentation.
Don’t mention Trump or guns or anything political please. Just don’t joke about it. This was one of the requirements from Uncle Bob. He is very sensitive and clear about this, no mentions or jokes about his political views, we don’t want to upset him.
If you ignore this advice, we’ll switch all the headsets to the other presenter and you are done.”
This… was very suprising to me. Uncle Bob is one of the most outspoken people I have on my Twitter timeline. He often gets into public discussions making his views very clear, engaging in debate, for example:
I voted for Trump because I thought it was the better of two extremely bad options. Trump says crazy things. He’s casual about the truth. But many of the policies he has enacted have been positive. I support the good things. I don’t support the bad.— Uncle Bob Martin (@unclebobmartin) June 18, 2018
And (surprisingly) this:
The forces of political censorship that are gathering on the web are a toxin that must be resisted and eradicated with extreme prejudice.— Uncle Bob Martin (@unclebobmartin) January 9, 2019
I actually wanted to start my talk with a small joke about the wall to break the ice, to point out the irony of having a wall on stage at a conference named after Uncle Bob. Because I’ve seen people attacking him, saying he’s a dumb Trump supporter.
It turns out other breakout-session presenters got a similar warning before their talk, it was not just me.
It was time…
I put on the headset and microphone and waited behind a closed door. The announcer called out my name and that of other presenter, we both made a grand entrance. Him entering the stage from the left, me from the right. We met in the middle for a friendly handshake and…
Ready to rumble!
… silence …
… just a deafening silence.
Normally you hear noise, people clap, cheer or laugh or… at least they breath. Now there was one big nothing. With the headset on I felt so alone, inside my little bubble. It was like recording a webinar or podcast, only my own voice, heavy breathing and my headache. Nothing else.
This was something I just couldn’t cope with. Whenever I made a joke, nothing, just the echo in my sinuses. 400+ people wearing headphones looking at me apathetically. Some of them (with blue-LED headphones) to the stage next to me, some (with green-LED headphones) looking directly at me… still hearing nothing. There was a complete disconnect with the audience…
My style of presenting is entirely based around the audience, around instant feedback and interaction. I want the audience to have a good time, to be entertained and learn something new at the same time. I’m not someone that (re-)plays a recording, throwing informantion into a void.
I just wanted to crawl into a corner and cry.
I’m still not sure what caused this horrible feeling, maybe it was the censorship, maybe it was the headphones/audience disconnect, maybe it was the fever, my exploding sinuses and the numbing painkillers… probably a combination of all those things.
It was a first for me in many ways:
I never did a headphone/silent disco-style presentation… and I’m not sure I want to do this again.
It was also the first time I experienced censorship at a meetup; being warned that some topics and/or jokes were off the table.
A couple of friends warned me: “Are you sure you want to connect your name to Uncle Bob, speaking at his conference”.
I thought about it back then and accepted the invite: Just because I’m speaking at an event bearing his (nick-) name doesn’t mean I agree with him or his political/world views. I’m free to have my own view and opinions.
Now that I know about the weird censorship, I’m not sure I was right.
In the previous post I showed a nice snippet/trick at the bottom of the post to find all the neighbors in a 2D array.
Last week I used this snippet to solve a word search puzzle a colleague created for us as a “Back to work in 2019”-challenge:
(Note: Some fellow developer colleagues think that solving the puzzle with code is unfair… heh. really?!)
Reading the lexicon
Here is the complete code:
First we’ll need to read in all the possible words we’re looking for. And using the new Java streams this is very easy. Just download one of the many online word-lists and do the following:
It was never easier in Java to do File IO and manipulate collections. Next I embedded the entire puzzle and turned it into an array of Strings.
Finding all the words
Finally we need to traverse all the possible combinations a word can be made and check against the Set of target words. I’m doing this the dumbest way possible.
First I go over each possible start location by enumerating all x/y combinations:
Next I check each direction using the neighbor algorithm from the previous post.
Now we start building the possible word, using two new pointers (tx and ty):
While we are within the bounds of the puzzle we must travel in the current ‘direction’. This is done by adding the direction offset (from previous post) to the tx and ty pointers.
If the word in-progress matches one we’re searching for, add it to the found words list!
Finally let’s close everything off, for developers with OCD like me:
This prints the words, sorted, just as we want. Q.E.D..
The way we now iterate over all the possible word hiding places is very very dumb.
There are some obvious improvements to be made here, for example one could place the entire lexicon in a tree-like structure instead of a set of words and stop iterating if the certain branch doesn’t exist.
However, for this puzzle, this challenge, the speed it runs at is good enough!
You can download the complete listing here: PuzzleSolver.java
Pathfinding for Tower Defense
Last week though I was reading this tutorial, Pathfinding for Tower Defense.
He explains a method for creating a single vector/flow field for game AI’s to follow. This allows a single calculation to be used by a lot of AI bots, which is great for a Tower Defense game.
A new game…
“What is the longest path possible? / What is the highest number I can reach?”
After I found out I can move the start position around, I came up with the following configuration:
The maze is 113 long… but it turns out this is far from optimal. After coding up a little solver and playing around I found the following configuration:
This is a solution that goes up to 124 (near the top left corner). I’ve also found one later with a score of 125.
It seems that diagonals are the best way to pack the longest maze into there, but the problem is, one diagonal cuts the field in two smaller parts. Zig-zagging works great, but there is no ‘easiest’ way to zig-zag.
It doesn’t seem to me that there is a trivial solution? I haven’t found one at least, is there? Are there simple optimal patterns for given boxes, NxN?
Searching for a solution seems to be NP-hard and even 20x10 is out of the question, the amount of possible configurations of walls and starting positions is huge and grows exponentially. Coding up a solver is pretty easy though, using some simple heuristics and some random searches I got my 125 solution.
Maybe this would be a fun puzzle for a future AZsPCs contest…?
Can you get a better configuration than 125? I’d love to know.
When I wrote a score checker for the puzzle above I needed a way to get the neighbors in an 2D array. I’ve gotten used to code 2D mazes/puzzles and I always use the following way to calculate all the neighbors.
I find this to be easier and more readable that the alternatives, going over all the -1/+1 combinations or having two nested loops.
How does this work?
First we list all directions with a single integer, 0 to 9, and we’ll need to skip the middle number four, this points to ourself.
From this single direction we’re able to easily calculate both the new row value (n_row) and the new col value (n_col). To do this we’ll need to divide by three and use modulo three.
Neighbor row value:
To find the new row value we start our by using modulo 3. The only thing left to do is to subtract 1 and we’ll get the new row value for each direction:
Neighbor col value:
The same thing can be done for the col value. Instead of using modulo 3, we’ll be using divide by 3. And the same as with the row value, we’ll also need to subtract 1 to get the following new column values:
And there you go, list all neighbors in a 2D array without using two nested loops and dirty checks.
First of all, best wishes for 2019 from me.
Time for me to reflect on the last couple of months.
Last year has been a crazy busy year for me. I’ve been to and spoken at to J-Spring, J-Fall, Devoxx Belgium, Devoxx Poland and JavaOne… errr… Oracle CodeOne on various topics. From quantum computing to software architecture.
Here is a list of all my conference talks.
I’m a big fan of sharing knowledge and getting people together to have fun. This is why I’ve decided to create a new JUG (Java User Group) this year in the city of Rotterdam, the RotterdamJUG. In November we had our first meeting and I’ve already got more meetups planned.
The biggest achievement this year is probably being named an Oracle Java Champion. There are no more than 300 Java Champions in the world and it is a great honor to be added to this group.
So what’s next for 2019? I’m going to continue with the RotterdamJUG, we’ve already got a Kotlin-workshop planned for this month and I’m looking for a venue for a February meetup (anyone?).
I’m also going to go back to math and puzzles, I’ve got some idea’s planned and some tricks/algorithms to explain, more on that in the coming days/weeks hopefully!
We looked at GraalVM and what it can do. We dove into the
native-image command and transformed a simple HelloWorld application into a native application (running in Docker).
This time I want to go beyond “Hello World” and build something useful (despite all the limitations listed below). A CRUD microservice with REST API and database access.
TLDR; If you’re just intersted in the end result, go right to the results
So what are some of the limitations that GraalVM currently has? (using version 1.0.0 RC6 at time of writing)
Well, the Java VM is a very complete and dynamic system able to handle a lot of dynamic changes. This is something a native application can’t do as easily. GraalVM needs to analyse your application up front and discover all these things. Therefor, things like reflection and dynamic classloading are very hard to do (but not impossible).
To ‘emulate’ a dynamic running JVM the GraalVM project is shipped with Substrate VM.
Substrate VM is a framework that allows ahead-of-time (AOT) compilation of Java applications under closed-world assumption into executable images or shared objects (ELF-64 or 64-bit Mach-O).
The Java VM for example has a garbage collector, when you eliminate the JVM, you’ll still need to free your objects. This is something Substrate VM (written in Java!) does for you.
To read about the limitations of Substrate VM, look at this LIMITATIONS.md.
We want to do more than just ‘HelloWorld’, how about an entire microservice?
If you talk about microservices in Java, most people will immediately say: Spring Boot. While you can certainly discuss the micro-part, it is by far the most popular way of writing (enterprise) Java applications at the moment. The problem is that most Spring Boot applications quickly grow, having Docker images of 600+ mb and runtime memory usage of 250+ mb is to be expected.
So it seems this is a perfect candidate to turn into a native application.
But there is some instant bad news: It won’t work
At least, at this moment in time. The developers of Spring Boot are working very hard with the developers of GraalVM to fix all the problems. A lot of work has already been done, you can check out the progress in this Spring issue.
A framework that also covers the entire spectrum and does work with GraalVM is micronaut.io. If you want something that works out of the box, check out their entire stack.
But I’d like to do it enirely myself, find the pitfalls and learn about the limitations of GraalVM at the moment. This is very useful to understand what you can and can’t do. I’m going to build my own GraalVM stack!
Web alternative: SparkJava
Instead of turning to Spring Boot or micronaut, let’s keep our application really micro and use something else. Spark Framework is a small web framework. And it works like a charm using
First I created a simple Maven project and to use the
native-image command I needed all the Maven dependencies as a JAR file on the class path after compilation. To have this I added the following plugin to the
Next I added the following dependency:
Now we can write our code and run the HelloWorld web application:
Building everything is the same as in Part 1.
There are two differences though. The base image can no longer be
FROM scratch because we need access to networking. Second, we need to expose the port of the web application to the outside using
Also we need to add the following option to
This option eliminates some problems during the analysis phase out of the way.
Running this Dockerfile results in a “Hello World” in the browser at
http://localhost:4567/hello. Using just 4 mb of runtime memory (!).
Dependency Injection: Google Guice
Another big problem at the moment using GraalVM is trying to use dependency injection. First I tried to use Google Guice. It is marketed as a ‘low overhead’ dependency injection framework. When I fired up the
native-image command I got the following exception (amongst many others):
It seems that Google Guice internally uses a way to call Integer.parseInt using reflection, but GraalVM doesn’t understand this. But luckely, we can help GraalVM a bit.
To fix this first problem I added the following file to my project:
And during the build of
native-image I pass the following option:
Now we have instructed Substrate VM/GraalVM that our application will do a reflection lookup to Integer.parseInt (amongst other calls). It now understands this and loads everything.
For some reason though, for each dependency I kept getting the following exception:
The application works great running it from
java but not after using
native-image. Time for something different!
Dependency Injection: Dagger 2
Instead of Google Guice or Spring I decided to go with another framework: Dagger 2
This framework has one big advantage over all the others: It works compile-time.
Sure, setting it up takes a little bit more time. You’ll need to include a Maven plugin that does all the magic during compilation. But it is a perfect solution (currently) for GraalVM’s native-image. All the injection-magic is done during compilation, so when running the application everything is already nicely wired up and static.
Database access (Hibernate/Oracle)
Finally, to complete my CRUD application I tried to access our Oracle database. GraalVM and the database are both created and maintained by Oracle so I hoped this would work out of the box…. but (spoiler): It didn’t.
The main problem here is the code in Oracle’s JDBC driver, this turned out to be a very hard thing to get working, this took about an entire day!
First off, there are a lot of static initializer blocks and some of those are starting Threads. This is something the SubstrateVM’s analyzer can’t handle (again: see LIMITATIONS.md).
It was throwing errors like:
Again, like before, the exception itself does provide a solution. The issue here are static initializer blocks starting threads during analysis, but this can be countered by delaying the class initialization. This isn’t trivial here because I don’t have access to the code in Oracle’s JDBC driver. But in the end I managed to get it working by adding the following parameters to the
The next problem was getting the
persistence.xml file to load. Hibernate is using ClassLoader.getResources() for this during runtime and for some reason I couldn’t get this to work. I knew there was a way to add resources into the native image but I struggled to get it working, the flag is called
-H:IncludeResources= and you can add a regex here.
It wasn’t until I browsed the Substrate VM source code and extracted the parsing from ResourcesFeature.java. Running this code locally showed me everything I tried was wrong.
Things I tried that didn’t work:
This finally worked (including all XSD’s, properties and everything in META-INF):
It turns out the listing goes through all the JAR files and directories and matches them against a relative path which in our case is:
logging.properties isn’t enough. This cost me way more time than it should have. A nice feature to add to GraalVM would be to list all the resources being added because at some point I was convinced it should just work and the problem was in my code somewhere.
Next problem: Xerces
This library gave me nightmares before as a Java developer, but luckily this time the problems could be fixed easily by adding more reflection exceptions to our -H:ReflectionConfigurationFiles=
Also xerces needed a resource bundle to load:
Sigh, okay, still making progress.
Now again a problem with Hibernate and resources. I got the following
StringIndexOutOfBoundsException running the application:
It turns out due to the way GraalVM labels its resources we get to a point where Hibernate is confused. It calls the following code with the wrong parameters:
- url: “META-INF/persistence.xml”
- entry: “/META-INF/persistence.xml”
With this input
entry.length()=25 is bigger than
url.length()=24 resulting in
file.substring(0, -1), sigh.
To fix this I’ve created a so called shadowed class. This is a class with the exact same signature that I am compiling and adding to the classpath. Because my version of the class is loaded before the Hibernate version of the class, my version is used, I’m overriding the Hibernate version. This is obviously very ugly, but it does the job surprisingly well!
Math.max(0, file.length() - entry.length()) to fix getting a ‘-1’ in the substring:
And of course, a new problem pops up. Again with the resources, GraalVM seems to have put
resource as a protocol of all the loaded resources. Opening the resource using an
java.lang.URL caused more problems in
ArchiveHelper because GraalVM doesn’t recognise ‘resource’ as a valid protocol (huh?). This meant I needed to make another small patch in the shadowed
The next big problem was getting
oracle.jdbc.driver.OracleDriver to accept the fact that GraalVM doesn’t support JMX (and might never support it). The driver tried to load MXBeans and MBeans from a static initializer block, this caused major headaches…. but in the end I managed to solve this again by shadowing another class:
Still the initial static initializer block in
oracle.jdbc.driver.OracleDriver wouldn’t load and broke the native compilation. Browsing the decompiled code I noticed the following lines which might cause a problem:
This class isn’t on the classpath oddly enough, so I decided to create a dummy/shadow class again, just in case:
The next problem is Hibernate’s dynamic runtime proxies. This is something GraalVM can’t handle so we need to make sure Hibernate’s magic is done before we start our application. Luckily there is a Maven plugin which does just that:
Now we have everything in place and we can use the EntityManager to access our database and execute queries…. right?
Well it turns out, the application does start, and it comes a long way. But there is one thing I wasn’t able to fix, loading the definitions.
Hibernate has two ways of loading/mapping the actual class to the database:
- Hbm files
First I tried to use annotations, but this failed because at runtime the native (pre-loaded) classes don’t have any knowledge left of the annotations they once had.
The second method is using an HBM xml file, but this too failed. Reading the XML file again needs support for annotations and JAXB failed on me.
So we’ll have to stop here. The JDBC driver was working, so probably plain old SQL would work perfectly. Hibernate for now eludes me.
Update: Previously I mentioned Hibernate was working, this was a mistake on my part!
Docker: Multi-stage build
After some changes I now have a single Dockerfile with two
FROM sections in it. The first section is the builder-part, the second part is the host-part. My
docker build command now uses that first image to build, passes everything to the second image and builds our resulting Docker container. Really nice, I’m learning new techniques every day.
One thing I noticed during this entire process, the analysis time used by
native-image grew exponentially. The HelloWorld application from Part 1 took just a couple of seconds, but look at the following output:
It now took a whopping 60 minutes to compile to native! And during the creation of this project I needed to add a single line the
reflecion.json and restart the entire build a lot of times. Transforming a project to work with
native-image right now is a really time-consuming endeavour.
When compiling on my MacBook for MacOS, the compilation time is much shorter, just a couple of minutes. The problem here seems to be Docker and building for Ubuntu.
I’m proud to say I now have a simple CRUD native microservice, written in Java, using Java libraries…. that is almost working. With a bit more work on the GraalVM/SubstrateVM side I’m pretty sure this could work in the near future.
- Dagger 2
- SLF4J + java.util.logging
- Hibernate/Oracle (almost working…)
This allows me to serve REST/JSON objects from the database with some Java code in between, what most microservices do.
All the sources are on GitHub: check it out
Check out all the code on GitHub and try it out for yourself. To get it up and running all you need it to set up a database and put the connection information in
Start up time
The microservice has a very fast startup time:
Compare this to the Java 8 version (same code):
With 486ms compared to 3406ms the native version starts 7x faster.
The Java version consumes 267mb of memory, while the native version takes just 20.7mb, so it is 13x smaller.
At the moment GraalVM is still in its infancy stage, there are still a lot of areas that can use some improvement. Most problems I’ve encountered have to do with resource loading or reflection. The build/analysis-cycle becomes pretty long when you add more and more classes and if you need to add reflection-configuration exceptions each build, this process is quite cumbersome.
Frameworks and libraries will start to notice GraalVM (once it gains more traction) and they will change their code to work better with Graal. For example the team behind Spring Boot is already actively working together with the GraalVM team to get their framework working.
Now some people are shouting to their laptops:
Why not just use Go/Rust/some other language that compiles to native by default!?
That is a very good point. If you want to use Go or Rust, go ahead!
native-image might be a more accessible way of transitioning to native backends IMO.
I’ll for sure be keeping track of GraalVM, not just because of the
native-image capabilities, but also because the amazing speed of their VM.
Did you know there is a Graal AOT compiler inside your JDK right now? (see: JEP-295)
Sources: All the code from this blogpost can be found here on GitHub.
One of the most amazing projects I’ve learned about this year is GraalVM.
I’ve learned about this project during Devoxx Poland (a Polish developer conference) at a talk by Oleg Šelajev. If you’re curious about everything GraalVM has to offer, not just the native Java compilation, please watch his video.
GraalVM is a universal/polyglot virtual machine. This means GraalVM can run programs written in:
- Python 3
- JVM-based languages (such as Java, Scala, Kotlin)
- LLVM-based languages (such as C, C++).
In short: Graal is very powerful.
There is also the possibility to mix-and-match languages using Graal, do you want to make a nice graph in R from your Java code? No problem. Do you want to call some fast C code from Python, go ahead.
In this blogpost though we’ll look at another powerful thing Graal can do:
Instead of explaining what it is, let’s just go ahead, install GraalVM and try it out.
To install GraalVM, download and unpack, update PATH parameters and you’re ready to go. When you look in the /bin directory of Graal you’ll see the following programs:
Here we recognise some usual commands, such as ‘javac’ and ‘java’. And if everything is setup correctly you’ll see:
Hello World with native-image
Next up, let’s create a “Hello World” application in Java:
And just like your normal JDK, we can compile and run this code in the Graal virtual machine:
But the real power of Graal becomes clear when we use a third command:
This command takes your Java class(es) and turns them into an actual program, a standalone binary executable, without any virtual machine! The commands you pass to
native-image very similar to what you would pass to
java. In this case we have the classpath and the Main class:
Now we have an executable that prints “Hello World”, without any JVM in between, just 5.6mb. Sure, for this example 5mb isn’t that small, but it is much smaller than having to package and install an entire JVM (400+mb)!
Docker and native-image
So what else can we do? Well, because the resulting program is a binary, we can put it into a Docker image without ANY overhead. To do this we’ll need two different Dockerfile’s, the first is used to compile the program against Linux (instead of MacOS or Windows), the second image is the ‘host’ Dockerfile, used to host our program.
Here is the first Dockerfile:
This image can be created as follows:
Using this image we can create a different kind of executable. Let’s create our application using the just created docker image:
This results in an executable ‘app’, but this is one I can’t start on my MacBook, because it is a statically linked Ubuntu executable. So what do all these commands mean? We’ll let’s break it down:
But we can do something cool with it using the following, surprisingly empty, Dockerfile:
We start with the most empty Docker image you can have,
scratch and we copy in our
app executable and finally we run it. Now we can build our helloworld image:
We’ve now turned our Java application into a very small Docker image with a size of just 6.77MB!
In the next blogpost Part 2 we’ll take a look at Java applications larger than just HelloWorld. How will GraalVM’s native-image handle those applications, and what are the limitations we’ll run into?
Yesterday I compared different JDK versions and OpenJ9 versus HotSpot on memory and speed. The memory part of the test was realistic if you ask me, an actual working Spring Boot application that served REST objects.
The speed/CPU test however was… lacking. Sorting some random arrays, just one specific test.
Today I decided to test OpenJ9 and HotSpot a bit more using an actual benchmark: SPECjvm2008.
SPEC (Standard Performance Evaluation Corporation) has a couple of well defined benchmarks and tests, including an old JVM benchmark called SPECjvm2008. This is an elaborate benchmark testing things like compression, compiling, XML parsing and much more. I decided to download this and give it a spin versus OpenJ9 and HotSpot. This should be a much fairer comparison.
Initially I encountered some issues, some of the tests didn’t work against Java 8 and the tests wouldn’t even start against Java 9+. But eventually I got it working by excluding a couple of benchmarks with the following parameters:
The Docker images used in these tests are both Java 8 with OpenJDK8, but one with HotSpot underneath, the other with OpenJ9:
Again I started the Docker image with a directory linked to the host containing the SPEC benchmark:
- Start Docker:
- Go to the correct directory:
- Run the (working) tests:
After waiting a long time for the benchmark to finish, I’ve got the following results:
The graph is measured in ops/m, higher is better. Results may vary of course depending on hardware.
In most cases HotSpot is faster than OpenJ9, and in two cases HotSpot is much faster, crypto and derby. It appears this is a case where HotSpot is doing something special that J9 isn’t doing (yet?). This might be important to know if you’re working on applications that do a lot of cryptology, for example high performance secured endpoints.
One place where OpenJ9 came out on top is XML validation. Parsing/validation is also an important part in most modern applications, so this could be a case where J9 makes up some lost ground in actual production code.
Is there a real conclusion from this? I don’t think so.
The real lesson here is: Experiment, measure and you’ll know. Never decided anything based on some online benchmark.
If there is anything else you’d love me to test, send me a tweet: royvanrijn
OpenJ9 and IBM J9 are a different JVM implementation from the default Oracle HotSpot JVM. With the modern adoptopenjdk pre-made Docker images it is easy to swap and test different combinations and pick the right JVM for you.
The rumours seem to be true, OpenJ9 seems to blow HotSpot away on memory usage. HotSpot seems to have the edge CPU-wise.
In the Java world most people are familiar with OpenJDK. This is a complete JDK implementation including the HotSpot JVM engine. Not a lot of developers know or try alternatives to HotSpot. Asking around some colleagues remembered the name JRockit, nobody mentioned IBM J9 and/or Eclipse OpenJ9.
I’ve read that OpenJ9 is very good with memory management and is tailered for usage in the cloud/in containers. OpenJ9 is an independent implementation of the JVM. It’s origins are IBM’s Java SDK/IBM J9 which can trace its history back to OTI Technologies Envy Smalltalk (thanks Dan Heidinga!).
With the current rise in microservice usage (and most services are not so micro in Java). I recon this could become a hot topic again!
Before the Docker-era it was relatively hard to compare different JVMs, versions. You needed to download, install, script and run everything. But now a lot of pre-made images are available online.
Here is my idea on how to test the JVMs:
- Create a simple Spring Boot application
- Start the application in various Docker Images
- Measure memory usage after startup and GC
- Measure the time it takes to run a small CPU-intensive test
This is by no means a thorough test or benchmark, but it should give us a basic idea of what we can expect from the virtual machines.
Spring Boot application
The Spring Boot application I created contains the following endpoints:
- A REST endpoint that calls the GC (trying to make it fair)
- A REST endpoint that creates 1000 large random arrays and sorts them, returns the runtime (in ms)
Here is the listing of the CPU-test:
Again, we can argue endlessly about if this test makes sense and is even remotely relevant… but still it should give us some basic idea of what kind of performance we can expect. If the rumoured memory improvements are true, might there be a performance hit? Is there a performance trade-off?
I’ve decided to test the following images.
First we have the (slim) openjdk images for 8/9/10/11:
Next there are the adoptopenjdk images for 8/9/10:
Then we have OpenJ9, again provided by adoptopenjdk for 8, 9 and a nightly build of 9 (see my previous blogpost):
And I decided to include IBM’s own J9 image as well:
Testing with Docker
After building my Spring Boot application I launched each Docker image using the following command:
I’m mapping my “spring-boot-example” project folder to “/apps/spring-boot-example” so I can start the JAR file inside the container. Also I’m forwarding port 8080 back to my host so I can call the endpoints.
Next, inside the container, I launch the Spring Boot application:
After waiting a bit, calling the endpoints a couple of times and performing a GC I measured the memory usage.
After that I called the “/loadtest” endpoint containing the array-sorting test and waited for the results.
Here are the results of the memory used by the simple Spring Boot application:
At first you can see that the memory usage for Java 8 is much higher than for Java 9 and 10, good!
But the biggest shock is how much less memory OpenJ9 and J9 are using, almost 4x less memory if you compare Java 8 with OpenJ9. I’m amazed, how does this even work? Now we can almost call our Spring Boot service micro!
I’ve also experimented with running some production Spring Boot code (not just simple examples) and here I’ve seen improvements up to 40-50% decrease in memory usage.
Online I’ve read that OpenJ9 isn’t as good as HotSpot if you look at CPU intensive tasks. That is why I created a small test for this as well.
1000 arrays with 1000000 random long values being sorted. This takes around 100 seconds, this should give the JVM enough time to adjust and optimize. I’ve called the benchmark twice for each tested image. I’ve recorded the second time trying to eliminate warmup times.
In the chart we can see that indeed the J9 and OpenJ9 images are slower, not by much max 18%. It seems for this particular testcase Java 8 beats most Java 9 implementations (except coupled with OpenJ9).
My current project has a lot more memory issues than CPU issues on production (frequently running out of memory while having 1-2% CPU usage). We are definitely thinking about switching to OpenJ9 in the near future!
We did already encounter some issues during testing:
- Hessian: (binary protocol) has a build-in assumption that System.identityHashCode always returns a positive number. For HotSpot this is true but OpenJ9/J9 can also return negative numbers. This is an open issue and the Hessian project hasn’t fixed this in a couple of years, seems to be dead? Our solution is to move away from Hessian altogether
- Instana: We love our monitoring tool Instana, but it had some problems connecting their agent to OpenJ9/J9. Luckily the people at Instana helped us identify a bug and a fix should be published today (and is automatically updated, w00t!)
Open questions I haven’t looked in to:
- Can you still get/use jmap/hprof information etc in OpenJ9?
- How will it hold up during longer production runs?
- Will we find other weird bugs? It feels tricky…
Have you tried OpenJ9/J9? Let me know in the comments.
Is there anything else you’d love to see tested? The best way to contact me is to send me a tweet royvanrijn.
Java and Docker aren’t friends out of the box. Docker can set memory and CPU limitations that Java can’t automatically detect. Using either Java Xmx flags (cumbersome/duplicated) or the new experimental JVM flags we can solve this issue.
Docker love for Java is in its way in newer versions of both OpenJ9 and OpenJDK 10!
Mismatch in virtualization
The combination of Java and Docker isn’t a match made in heaven, initially it was far from it. For starters, the whole premise of the JVM, Java Virtual Machine, was that having a Virtual Machine makes the underlying hardware irrelevant from the program’s point of view.
So what do we gain by packaging our Java application inside a JVM (Virtual Machine) inside a Docker container? Not a lot, for the most part you are duplicating JVMs and Linux containers, which kills memory usage. This just sounds silly.
It does make it easy to bundle together your program, the settings, a specific JDK, Linux settings and (if needed) an application server and other tools as one ‘thing’. This complete container has a better level of encapsulation from a devops/cloud point of view.
Problem 1: Memory
Most applications in production today are still using Java 8 (or older) and this might give you problems. Java 8 (before update 131) doesn’t play nice with Docker. The problem is that the amount of memory and CPUs available to the JVM isn’t the total amount of memory and CPU of your machine, it is what Docker is allowing you to use (duh).
For example if you limit your Docker container to get only 100MB of memory, this isn’t something ‘old’ Java was aware of. Java doesn’t see this limit. The JVM will claim more and more memory and go over this limit. Docker will then take action into its own hands and kill the process inside the container if too much memory is used! The Java process is ‘Killed’. This is not what we want…
To fix this you will also need to specify to Java there is a maximum memory limit. In older Java versions (before 8u131) you needed to specify this inside your container by setting -Xmx flags to limit the heap size. This feels wrong, you’d rather not want to define these limits twice, nor do you want to define this ‘inside’ your container.
Luckily there are better ways to fix this now. From Java 9 onwards (and from 8u131+ onwards, backported) there are flags added to the JVM:
These flags will force the JVM to look at the Linux cgroup configuration. This is where Docker containers specify their maximum memory settings. Now, if your application reaches the limit set by Docker (500MB), the JVM will see this limit. It’ll try to GC. If it still runs out of memory the JVM will do what it is supposed to do, throw an OutOfMemoryException. Basically this allows the JVM to ‘see’ the limit that has been set by Docker.
From Java 10 onwards (see test below) these experimental flags are the new default and are enabled using the -XX:+UseContainerSupport flag (you can disable this behaviour by providing -XX:-UseContainerSupport).
Problem 2: CPU
The second problem is similar, but it has to do with the CPU. In short, the JVM will look at the hardware and detect the amount of CPU’s there are. It’ll optimize your runtime to use those CPU’s. But again, Docker might not allow you to use all these CPU’s, there is another mismatch here. Sadly this isn’t fixed in Java 8 or Java 9, but was tackled in Java 10.
From Java 10 onwards the available CPUs will be calculated in a different way (by default) fixing this problem (also with UseContainerSupport).
Testing Java and Docker memory handling
As a fun exercise, lets verify and test how Docker handles out of memory using a couple of different JVM versions/flags and even a different JVM.
First we create a test application, one that simply ‘eats’ memory and doesn’t free it.
We can start Docker containers and run this application to see what will happen.
Test 1: Java 8u111
First we’ll start with a container that has an older version of Java 8 (update 111).
We compile and run the MemEat.java file:
As expected, Docker has killed the our Java process. Not what we want (!). Also you can see the output, Java thinks it still has a lot of memory left to allocate.
We can fix this by providing Java with a maximum memory using the -Xmx flag:
After providing our own memory limits, the process is halted correctly, the JVM understands the limits it is operating under. The problem is however that you are now setting these memory limits twice, for Docker AND for the JVM.
Test 2: Java 8u144
As mentioned, with the new flags this has been fixed, the JVM will now follow the settings provided by Docker. We can test this using a newer JVM.
(this OpenJDK Java image currently contains, at the time of writing, Java 8u144)
Next we compile and run the MemEat.java file again without any flags:
The same problem exists. But we can now supply the experimental flags mentioned above:
This time we didn’t set any limits on the JVM by telling it what the limits are, we just told the JVM to look at the correct settings! Much better.
Test 3: Java 10u23
Some people in the comments and on Reddit mentioned that Java 10 solves everything by making the experimental flags the new default. This behaviour can be turned off by disabling this flag: -XX:-UseContainerSupport.
When I tested this it initially didn’t work. At the time of writing the AdoptAJDK OpenJDK10 image is packaged with jdk-10+23. This JVM apparently doesn’t understand the ‘UseContainerSupport’ flag (yet) and the process was still killed by Docker.
Testing the code (and even providing the flag manually):
Test 4: Java 10u46 (Nightly)
I decided to try the latest ‘nightly’ build of AdoptAJDK OpenJDK 10. Instead of Java 10+23 it includes 10+46.
There is a problem in this nightly build though, the exported PATH points to the old Java 10+23 directory, not to 10+46, we need to fix this.
Succes! Without providing any flags Java 10 correctly detected Dockers memory limits.
Test 5: OpenJ9
I’ve also been experimenting with OpenJ9 recently, this free alternative JVM has been open sourced from IBMs J9 and is now maintained by Eclipse.
Read more about OpenJ9 in my next blogpost.
It is fast and is very good with memory management, mindblowlingly good, often using up to 30-50% less memory for our microservices. This almost makes it possible to classify Spring Boot apps as ‘micro’ with a 100-200mb runtime nstead of 300mb+. I’m planning on doing a write-up about this very soon.
To my surprise however, OpenJ9 doesn’t yet have an option similar to the flags currently (backported) in Java 8/9/10+ for cgroup memory limits. For example if we apply the previous testcase to the latest AdoptAJDK OpenJDK 9 + OpenJ9 build:
And we add the OpenJDK flags (which are ignored by OpenJ9) we get:
Oops, the JVM is killed by Docker again.
I really hope a similar option will be added soon to OpenJ9, because I’d love to run this in production without having to specify the maximum memory twice. Eclipse/IBM is working on a fix for this, there are already issues and even pull requests for this issue.
UPDATE: (not recommended hack)
A slightly ugly/hacky way to fix this is using the following composed flag:
In this case the heap size is limited to the memory allocated to the Docker instance, this works for older JVMs and OpenJ9. This is of course wrong because the container itself and other parts of the JVM off the heap also use memory. But it seems to work, appearantly Docker is lenient in this case. Maybe some bash-guru will make a better version subtracting a portion from the bytes for other processes.
Anyway, don’t do this, it might not work.
Test 6: OpenJ9 (Nightly)
Someone suggested using the latest ‘nightly’ build for OpenJ9.
This will get the latest nightly build of OpenJ9, and it has two things:
- Another broken PATH parameter, fix that.
- The JVM has support for the new flag UseContainerSupport (like Java 10 will)
TADAAA, a fix is on the way!
Oddly it seems this flag isn’t enabled by default in OpenJ9 like it is in Java 10 though. Again: Make sure you test this is you want to run Java inside a Docker container.
IN SHORT: Be aware of the mismatch, the limitations. Test your memory settings and JVM flags, don’t assume anything.
If you are running Java inside a Docker container, make sure that you have Docker memory limits AND limits in the JVM or a JVM that understands these limits.
If you’re not able to upgrade your Java version set your own limits using -Xmx.
For Java 8 and Java 9, update to the latest version and use:
For Java 10, make sure it understands the ‘UseContainerSupport’ (update to latest) and just run it.
For OpenJ9 (which I highly recommend for bringing down your memory footprint in production) for now set the limits using -Xmx, but soon there will be a version that understands the ‘UseContainerSupport’ flag.
This week I’ve been stuggling with the following question:
When is the right time to upgrade?
By upgrading, I mean everything: libraries, tools, Java versions, application servers, MQ servers…
My current project uses a reactive upgrade policy, we upgrade for four reasons:
- Something is broken and fixed in a later version
- We need or want to use a new feature
- Support for a version we’re using is being dropped
- The old version we’re using has a known security issue/CVE
The first two reasons are entirely up to the programmers to decide. The third reason is up to the company that gives us support. For the fourth upgrade reason, security, we have some automation in place. We are using the OWASP dependency checker Maven plugin for our libraries.
- Is a reactive update policy good enough?
- Are there any pro-active strategies?
- Do you want to invest in keeping your microservices up-to-date?
- Do you let your services deteriorate and dispose, replace them in the future with new technologies?
More on this in my (kind-of-weekly) vlog:
It seems that the adoptation of Java 9 is slow, very slow. In a Twitter poll this week I asked: “Which version of Java are you using in production?”
The poll got almost 300 replies and to my surprise just 3% of the respondents are using Java 9 at the moment. Most are using Java 8 and there are even more people using Java 6… So what is holding people back? Why is almost no-one using Java 9?
It turns out a lot of people are scared to upgrade to Java 9. They think it’ll be hard to do, libraries and tools will fail them. They are scared for another Jar-hell.
That is why I took one of my larger projects, that has been running in production for a long time, and try to upgrade it to Java 9.
Upgrading: Walk in the park or walk in jar-hell?
Here are my results and findings, in a new format: a vlog
If you like the vlog and follow my website, please subscribe to my channel on YouTube!
Today I got to present at Devoxx Poland on being agile and managing your architecture.
One of the points I made during the talk had to do with Conway’s Law. For those unfamiliar with it, here it is:
organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations M. Conway
This basically means that the code you create is likely to reflect the way the people and teams in your company communicate. For example, banks are usually pretty strict and have a tendency to document things. The code written at banks in turn have the tendency to use SOAP for example, verbose and strict.
The project I’m currently working on has a very agile mindset, we change a lot and have open communication. This reflects in our codebase as well, all the code is easy to understand and easy to change (thankfully!).
But lately we’ve started to move away from our old monolith and we’ve began cutting it up into microservices. That’s when I noticed something odd, which I call the Inverse Conway’s Law (maybe it is Roy’s Law?):
organisations which radically change their system design should expect changes in communication structure Roy van Rijn
What does this mean?
We’ve started to move towards a microservice environment, where each team has a responsibility over a couple of services. They’ve developed it and they own it. This ownership completely changed the culture in our team. Up to now if the monolithic backend was broken or offline, everyone was hurting and everyone wanted to fix this as soon as possible. With microservices, this is gone. The first thing we now do is check which microservice failed. If it isn’t theirs, it is not their problem, it’s someone elses problem now!
This is just one example on how changing the design of our system changed the culture and team dynamic. It is one thing that sounds logical when you think about it, but I didn’t expect this before we started the transition.
Today I got an email with the subject: “Congratulations JavaOne 2016 Rock Star!”. It turns out the JavaOne 2016 Ignite session was voted enough to receive a JavaOne Rock Star Award!
I was so excited, elated even!
But I quickly found out that the JavaOne Rock Star lifestyle isn’t as much fun as it sounds.
- Being 3 hours late for a meeting, even a Rock Star can’t do that.
- Peeing in the corner of your office while holding a bottle of Jack Daniels, still not appreciated.
- Wearing ripped jeans, a leather jacket and other than that just chest hair, is considered too casual.
- The groupies? It is like answering random Stack Overflow questions, non-stop, face to face.
- After each coding session, smashing your MacBook on the floor… quickly becomes expensive.
YouTube, the future of television. I’ve got a lot of subscriptions to YouTube channels that deliver quality content, and those shows are ‘cast’ (using my Chromecast) to my TV. Another thing I often do is look up information, for example I watch talks from programming conferences like Devoxx using YouTube.
This gave me an idea, what if I can take some free information (like Wikipedia, all creative commons) and use that to create content for YouTube? Maybe I’ll even get some views :-)
So this is what I’ve come up with, the following video is generated completely automatically:
Step 1: Wikipedia API
The first step in creating these videos was to get some actual freely available content. For this I chose to use Wikipedia. This can be accessed using JSON/REST, and to quickly hack this into place I decided to use a framework called ‘Unirest’. This makes the while prototype easy:
The resulting JSON (see this example) contains the Wikipedia page in simple processed HTML. From this HTML I used a library called ‘Jsoup’ to extract the plain text. The first alinea of text contains the introduction of the topic, this is what I’ve decided to use as content for the video. Initially I had all the content of the page in the videos, but they quickly turned out to be 30+ minutes long and very boring.
This is the code I used to go from HTML to the first alinea of text:
Step 2: Stunning visuals with Java 2D
We have some text, now how do we turn this into a video? We need visuals! I decided to cut the content into seperate lines, and for each line I have a single ‘slide’. This is an (buffered) image created using Java’s 2D API and saved to file.
Step 3: Text-To-Speech
Instead of using a Java Text-To-Speech (TTS) library, which aren’t very clear or good, I decided to use the build-in OS/X command ‘say’.
On the command line you can execute:
say Hello world!
The ‘say’ command supports different voices and also has the ability to store the output in a file. Which is perfect for us!
Step 4: Putting it all together
What do we have now? A list of sentences from Wikipedia, for each sentence we have an image and we have an audio snippet created with TTS. Now it is time to put everything together and create a video. The easiest way to do this is to use ffmpeg, a powerful command line video tool. To make this easier I created a small method ‘executeCommandLine’ which does the same thing as the snippet with ‘Runtime.getRuntime().exec’ shown above.
Combine the image (png file) and the TTS audio (mp4 file) into a series of MPG files:
We now have a series of small MPG files. They need to be combined into a single video. For this we need a text file which contains a list of all the MPG files. This is stored on disc and next we execute the following command:
Next we convert the video from one large MPG to MP4 file (which is a nicer, compressed format):
And to add a little more ‘production value’ I’ve gone online and found some soothing (free, creative commons) background music. This music is looped into a large audio file of 50 minutes (enough for all videos). Again using ffmpeg the audio is mixed in:
There it is, we have a complete video with text audio, text visuals and background music! Only one thing left to do, uploading to YouTube!
Step 5: Uploading everything to YouTube
To automate the process of uploading, I first needed a place to put the videos, so I created a new YouTube channel named: Vikipedia
Next we can use the Google YouTube Data API. It turns out this is very easy and well documented. I don’t even need to share the code I wrote here because I used the example from Google’s website itself! Check out this amazing documentation. Very detailed and does exactly what I needed it to do.
The example code takes a video file and uploads it. The only things I changed were setting the correct description, I was ready to generate all the content!
Not implemented (yet?)
Things I wanted to implement (but didn’t and probably never will):
- Add a little color to the videos (black and white is boring)
- Add waveform from ffmpeg (just to keep it visually interesing)
- Download (did this) and use (didn’t do this) the images from the Wikipedia page and add those to the video
- Yes, it is pretty easy to hack together a bot to create YouTube videos.
- No, the quality isn’t that good… yet?
- Yes, it can be made in a single afternoon!
- No, I’m not going to automate the spidering of Wikipedia and upload a million videos… (but I could easily do that!)
Up to now all the videos created are based on a small list of ‘hot topics’ I found on Wikipedia and Twitter. I don’t want to automate the entire process and flood YouTube, the content just isn’t good enough for that. But it was a fun project for a lazy afternoon! I’ve learned a lot of new tricks, maybe you can benefit from that as well.
The next thing I want to do with the 400+ YouTube videos I’ve generated is to see how (and if) people are going to watch them. Do they show up in the search results? Which topics are searched and watched? Maybe more on that in a future post!
Do you have any good ideas on how to generate videos, let me know in the comments! I’ve created a bot that generates music some years, maybe I should improve that bot to upload its own YouTube music videos? ;-)