This morning a friend (@vgrgic) tweeted the following quote from Martin Fowler:
“So if you can keep your system simple enough to avoid the need for microservices: do.”
I think this is absolutely true: microservices are far from the silver bullet some people want you to believe. They obviously have a place in software architecture, but there is always an ‘it depends’.
A lot of microservices ‘experts’ I’ve met are creationists. When they start designing an application (the last two years), they break the requirements down into pieces. Once they start coding they instantly create small services and connect them. But in my experience, this has a major drawback. When doing this, you’ll need to make quite a lot of assumptions upfront on what the future of the service will hold and how it will grow. Once you’ve created small services they have very strong boundaries, those boundaries really dictate and limit how your application can and will grow. This can work well, only if all your assumptions turn out to be true, but really, how often does that happen?
I’ve labelled myself as a design evolutionist. Always start with a monolith. Properly separate the concerns in your code, keep it really clean and tidy. Always be aware of what each piece of code does and if it belongs where you just wrote it. In the best scenario, you’ll end up with a simple, well-built, working monolith! This is also what Martin Fowler describes in his piece on MicroservicePremium. There is a point where the monolith becomes a burden to manage. Once you reach that point, and you’ve paid proper attention to the separation of concerns, it should be trivial to cut loose a well-defined microservice from your code.
Rant about the hype
Warning: The following paragraph is full of nonsense.
Until recently I couldn’t understand the microservices hype. Why is it suddenly this popular? But recently I’ve seen the light. I now know why so many people like it. Microservices has a wide appeal for many wrong reasons.
- Architects (like me) like it because it solves one kind of problem really well, managing and maintaining a huge application.
- The DevOps community likes it because they have a lot to do when you have heaps of services.
- Large SOA/ESB companies like it, they almost went bankrupt and out of fashion, but now they can sell their crappy tooling again as it ‘enables microservices’.
- The OSGi community likes it, they too can use the term microservices to sell OSGi tooling and knowledge.
- Application servers like it because they can create tooling to enable microservices.
- Monitoring companies found a new gap in the market, they can now sell microservices monitoring tools…
- Etc etc et fucking cetera!
In this post I’ll try to convince you that all the conferences are doing it wrong. JavaOne, Devoxx, and all other conferences should stop using conference program committees to select their speakers, there is a much better way.
Call for Papers
Some of you know I enjoy speaking in public and I’ve done so at many conferences. To get selected to speak at a conference you’ll need to supply a ‘paper’ to the Call for Papers (CfP). In this paper you outline the content of your talk and provide some information about yourself as speaker.
Once all these papers are submitted some people from the community will look at all the proposals and they have the daunting task of selecting the right ones. This is very hard to do, I know this for a fact since I’ve been on several program committees, for example for J-Fall.
The honor from hell
When I got asked to be on a program committee it felt like a real honor and a lot of fun. But in reality it is a painstaking task! You’ll get a lot of papers to review, sometimes up to or over a hundred. And you have to rate them, all of them! Once you are halfway through you can’t really give an honest rating compared to the first papers you’ve reviewed, plainly because you’ve seen too much and can’t remember most of them. There is something very wrong about this system.
Another popular topic these days is gender bias. Until recently in IT a lot of these program committees were dominated by males. Research has shown that an all-male committee is much more likely to pick male speakers than female speakers for example. This is why currently most conferences try to pick a committee that is as mixed as possible. But is that really solving the problem?
A third problem is the flooding. Some people have found out that submitting papers has a bit of chance/luck to it. If you write down the right words people will fall in love with it and you get picked. So instead of submitting one paper, they’ll just submit five or even ten papers, sometimes about the same topic but with other buzzwords. This actually improves their chances of getting in instead of getting a penalty!
Telescope Time assignment
This brings me to a Numberphile YouTube video called: Telescope Time without Tears. It turns out they had exactly the same problems we’re facing with program committees. There are only a couple of powerful telescopes in the world and a lot more researchers that want to get time on these telescopes. So previously they had selection committees and call for papers. They ran into exactly the same problems as outlined above, so they came up with a much better way to do this.
The solution: distributed peer review
I’ll try to explain what the Telescope committee did to solve their problems, but I really suggest you take a look at the video first. It explains everything much better than I possibly can.
Their idea is simple, if you submit a paper, you’ll receive a number of papers in return to review, for example six other papers. This is really what peer-reviewing is about, getting reviewed by actual peers. This instantly solves the problem of flooding, if you submit five papers, you’ll receive thirty (!) papers to review, a real penalty.
This way of reviewing is also much fairer: more people will vote and they only have to rate six papers each. So, now I know what you are thinking: this can’t be right, we can game this method! For example, if I see a paper that is a lot like my own paper, but better… I give it a very poor rating. There is a solution for that as well. You can do this of course, but only you will do this; others will still give it a fair, higher rating. This information can then be used to give a penalty to your own submitted paper. On the other hand, if your scoring is much like that of the other voters, you’ve done a fair and good job and your paper gets a small bonus. So people are motivated to really take a good look at the few papers they need to rank.
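To make the bonus/penalty idea a bit more concrete, here is a minimal sketch in Java. It is my own simplification and not the method from the telescope paper: I assume a 1-10 score, use the plain average per paper as the ‘consensus’, and the thresholds in `adjustment` are made-up numbers. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PeerReviewSketch {

    /** One review: `reviewer` gives `score` (1-10, an assumed scale) to `paper`. */
    static final class Rating {
        final String reviewer;
        final String paper;
        final double score;

        Rating(String reviewer, String paper, double score) {
            this.reviewer = reviewer;
            this.paper = paper;
            this.score = score;
        }
    }

    /** Average score per paper, used here as the "consensus". */
    static Map<String, Double> consensus(List<Rating> ratings) {
        Map<String, Double> sum = new HashMap<>();
        Map<String, Integer> count = new HashMap<>();
        for (Rating r : ratings) {
            sum.merge(r.paper, r.score, Double::sum);
            count.merge(r.paper, 1, Integer::sum);
        }
        Map<String, Double> mean = new HashMap<>();
        for (Map.Entry<String, Double> e : sum.entrySet()) {
            mean.put(e.getKey(), e.getValue() / count.get(e.getKey()));
        }
        return mean;
    }

    /** Mean absolute distance between one reviewer's scores and the consensus. */
    static double deviation(String reviewer, List<Rating> ratings) {
        Map<String, Double> mean = consensus(ratings);
        double total = 0;
        int reviewed = 0;
        for (Rating r : ratings) {
            if (r.reviewer.equals(reviewer)) {
                total += Math.abs(r.score - mean.get(r.paper));
                reviewed++;
            }
        }
        return reviewed == 0 ? 0 : total / reviewed;
    }

    /** Hypothetical thresholds: small bonus near the consensus, penalty far from it. */
    static double adjustment(double deviation) {
        if (deviation <= 1.0) return 0.5;
        if (deviation >= 3.0) return -1.0;
        return 0.0;
    }

    /** Five honest reviewers rate a talk 8, one rival rates it 2. */
    static List<Rating> sample() {
        List<Rating> ratings = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            ratings.add(new Rating("honest" + i, "talk-A", 8));
        }
        ratings.add(new Rating("rival", "talk-A", 2));
        return ratings;
    }

    public static void main(String[] args) {
        List<Rating> ratings = sample();
        for (String reviewer : new String[] {"honest1", "rival"}) {
            double d = deviation(reviewer, ratings);
            System.out.println(reviewer + ": deviation " + d + ", adjustment " + adjustment(d));
        }
    }
}
```

In this sample the rival ends up far from the consensus and would get the penalty on their own paper, while the honest reviewers get the small bonus. A real system would need something more robust than a plain average.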
I really like the idea of getting rid of program committees. For some reason it always feels like there is this group of in-crowd people and they have the tendency to pick friends and colleagues, the same speakers as always. I’m not blaming them, I’m sometimes part of that too! We don’t do it on purpose, it happens automatically! It is the system that is flawed.
So, for the next JavaOne, Devoxx or J-Fall, instead of selecting a committee and giving them the painstaking task of rating all the papers… maybe we should try distributed peer reviews? It just sounds like a much better approach.
The main YouTube video: Telescope Time without Tears - Numberphile
The ‘extra footage’ YouTube video with more details: Telescope Time (extra footage) - Deep Sky Videos
The actual research paper (containing the electoral theory): Telescope time without tears – a distributed approach to peer review
The album cover of Unknown Pleasures by Joy Division has to be one of the most iconic covers of the 70’s/early 80’s:
The artwork has been re-used and reprinted a lot and is still abundantly present on T-shirts at most music festivals.
But every time I saw this image I wondered… what do those white lines actually represent (if anything)? It turns out the story behind the white lines makes the artwork much cooler!
At Cambridge University in 1967 a 24 year old student named Jocelyn Bell was analysing a huge printout from a radio telescope. This telescope was built by Anthony Hewish and his group of students to look for quasars. Quasars are the extremely bright centers of distant galaxies, powered by supermassive black holes; they emit lots of radio waves and visible light. During the analysis Jocelyn noticed something strange: there appeared to be a signal repeating very quickly, about every 1.337 seconds. Surely this had to be some interference from Earth, because this isn’t something a quasar could produce.
After some research they could rule out a lot of earth-noise, microwaves, French radio signals, police radio and much more. They found out the signal also didn’t follow ‘earth time’, the normal 24 hours we follow, but rather the sidereal time (star time). In short, when they pointed the telescope to the object again (after earth had rotated) it would be in the same position after 23.9344699 hours. This is a key indicator the signal didn’t come from earth but rather from deep space. Nothing they had previously seen or read about in space could produce such a signal. Jokingly they initially named the signal ‘LGM1’, short for ‘Little Green Men #1’.
The discovery sparked interest from all over the world, and a year later (1968) scientists had come to the conclusion the signal had to come from something called a neutron star. When a star (larger than our sun) collapses it explodes (this is called a supernova), expelling a lot of material into space. This material, stardust, is literally what we are made of! No other process known in space can produce ‘heavy’ elements like iron, nickel, gold, silver and platinum. Initially people thought this exploding star didn’t have a ‘corpse’ and all matter was blown into space.
In 1934 however, just two years after the neutron was discovered, two scientists named Walter Baade and Fritz Zwicky predicted the existence of a neutron star. When large stars collapse they expel stardust but also leave behind a dense neutron star. This star, made of super heavy material, spins around producing radio waves. This is what we now call a pulsar. The signal from Bell and Hewish was the first observed pulsar, now named PSR B1919+21.
If we look at the Cambridge Encyclopedia of Astronomy there is a page on the discovered pulsar:
The image on the left page is the same as the album cover, it is the signal of our pulsar: PSR B1919+21
Update Apr 22, 2016: I’ve made a video which explains most of what is in this blogpost:
Yes, this blogpost is about paper. This has absolutely nothing to do with programming, but there is a beautiful piece of math involved.
There are two major standards of paper sizes in the world. The most widely used is ISO-216, more commonly known as the A-series, as in ‘A4’ size. This system is used almost everywhere in the world; the main exceptions are the United States and Canada. In the United States they use their own US Letter standard.
The standard US Letter has a size of 216 mm x 279 mm (8.5 by 11 inches) and a ratio of about 1.294 (11 / 8.5).
(I’ll explain why this ratio is important in the math part below!).
The origin of this size is very vague and lost in history. Most sources say (from Wikipedia):
The 11” length of the standard paper being about a quarter of “the average maximum stretch of an experienced vatman’s arms.”
Basically US Letter is a standard because it is a standard, don’t ask questions, just deal with it.
ISO-216, the A-standard
The letter size of the ISO standard is A4. The size is 210 mm × 297 mm and has a ratio of √2 (math!).
The ratio is what makes this standard superior. This of course isn’t a coincidence. Smart people have thought about these sizes, it became a standard because it was superior. The ‘magic’ property the A4 paper size has is that it consists of exactly two A5 papers side-by-side. In turn, two of these A4 papers make up the larger A3 paper. This makes it very easy for example to make a booklet in A5 format consisting of folded A4 papers. Also if you scale up a document in A5 it’ll exactly fit A4. If you want to use a copier and print two A5 pages on A4 paper, go ahead! This is impossible to do with US Letter, you’ll end up with white spaces or stretched documents!
Aspect ratio math
How does this work mathematically? The magic hides in the fact that 2 / √2 = √2. Imagine we have a piece of paper with long side A and short side B. If we fold the long side and create a new paper size with B and C, what is the aspect ratio?
When you start with the aspect ratio of √2, the resulting folded aspect ratio is again √2. You can keep doing this.
Let’s take the largest standard paper size, A0: 841 mm × 1189 mm. What happens when we fold it? Well: 841 becomes the long side, 1189/2 = 594.5 the short side. This is indeed the A1 paper size (594 mm x 841 mm, rounded down to whole millimetres). It will keep the magic √2 aspect ratio!
Size of A0 paper
Now that we can explain the aspect ratio, we still don’t know how we ended up with the magic numbers 210 mm × 297 mm. These numbers follow simply from the fact that the A0 paper size is defined as having an aspect ratio of √2 and a surface area of 1 m².
This is all you need to know, √2 and 1 m².
The first formula we need is:
area = (diagonal²) / (ratio + (1/ratio))
We started with just two simple numbers, ratio: √2 and area: 1 m², and now we’ve ended up with the A0 paper, 841 mm by 1189 mm!
To calculate the rest of the A-series you just take:
Mathematically folding it in half results in: 594 mm x 841 mm = exactly the A1 paper size!
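As a quick check of the numbers above, here is a small Java sketch that derives A0 from nothing but the ratio √2 and the area of 1 m², and then folds its way down the A-series. Note that it does not go through the diagonal formula; it uses the simpler relation area = ratio × short side², which gives the same result. The class and method names are my own, and rounding down to whole millimetres when halving is an assumption that happens to match the published sizes.

```java
class PaperSizes {

    /** Returns {shortSideMm, longSideMm} of A0, derived from ratio sqrt(2) and area 1 m². */
    static long[] a0() {
        double ratio = Math.sqrt(2.0);
        double areaM2 = 1.0;
        // area = short * long = short * (ratio * short)  =>  short = sqrt(area / ratio)
        double shortM = Math.sqrt(areaM2 / ratio);
        double longM = shortM * ratio;
        return new long[] { Math.round(shortM * 1000), Math.round(longM * 1000) };
    }

    /** Folds a sheet in half: the old short side becomes the long side. */
    static long[] fold(long[] size) {
        // Integer division rounds the halved long side down to whole millimetres.
        return new long[] { size[1] / 2, size[0] };
    }

    /** Returns {shortSideMm, longSideMm} of An by folding A0 n times. */
    static long[] aSeries(int n) {
        long[] size = a0();
        for (int i = 0; i < n; i++) {
            size = fold(size);
        }
        return size;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            long[] size = aSeries(n);
            System.out.println("A" + n + ": " + size[0] + " mm x " + size[1] + " mm");
        }
    }
}
```

Folding four times lands on 210 mm x 297 mm, the familiar A4.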
Keep on folding
When we take an A4 paper and keep folding it, we get the following aspects (aspects fluctuate a little bit due to rounding):
But what do you end up with folding the US Letter? Let’s see:
This fluctuating ratio results in so much wasted space, ink, paper! And it is just ugly! If you decide to scale something up from A5 to A4 it instantly fits, if you scale something in US Letter… you’ll either have to crop or stretch or leave unprinted paper.
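If you want to reproduce the folding yourself, the following Java sketch computes the aspect ratio after a number of folds. It works with exact halves instead of rounded millimetres, so the A4 ratio wobbles only in the fourth decimal while US Letter jumps between roughly 1.29 and 1.55. The names are hypothetical.

```java
import java.util.Locale;

class FoldRatios {

    /** Aspect ratio (long / short) after folding a sheet in half `folds` times. */
    static double ratioAfterFolds(double shortSide, double longSide, int folds) {
        for (int i = 0; i < folds; i++) {
            double newShort = longSide / 2; // half of the old long side
            longSide = shortSide;           // the old short side is now the long side
            shortSide = newShort;
        }
        return longSide / shortSide;
    }

    public static void main(String[] args) {
        for (int folds = 0; folds <= 4; folds++) {
            System.out.println(String.format(Locale.ROOT,
                    "folds=%d  A4: %.4f  US Letter: %.4f",
                    folds,
                    ratioAfterFolds(210, 297, folds),   // A4 in mm
                    ratioAfterFolds(8.5, 11, folds)));  // US Letter in inches
        }
    }
}
```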
US Letter is bad for the environment!
Ban the US Letter standard, hurray for math!
What happened to your beard?
Have you lost a bet?
Who are you? How did you get past security?
The same guy as last week and the months before that, I shaved, sorry to shock you, it happens.
Why did you grow a beard in the first place?
Because I felt like it.
Why did you shave?
Same reason I decided to grow a beard, I felt like it.
But why this Sunday?
Stop it, I shaved, it isn’t special, most men do it a couple of times a week, get over it!
It looks much better this way!
Thanks for the compliment, don’t get too attached, the beard will probably return.
You look so much younger!
It looks much worse without a beard!
Why do people comment so much on facial hair, I’m not judging your hair/clothing/weight etc.
Anyway, to quote The Dude Lebowski: Yeah, well that’s just your opinion, man.
(I’m trying to follow the philosophy and lifestyle of dudeism)
Three years ago my life was pretty normal, I had a house, a wife, a car and a decent job. Everything was nice and easy, very mundane. And then, my daughter was born.
Impact of kids
Every parent will tell you two things:
- After having kids your life as you knew it… gone
- But the life you have afterwards, is much more satisfying!
When I didn’t have kids and I heard people say this I thought they were just making it sound better than it was. But now I know, it is true.
The thing they never tell you is *WHY* their life is better now, and I think I have the answer.
Ups and downs
In life you need ups and downs. If you have a life where you have all the money in the world and no setbacks… it can still be a boring, unsatisfying life. Also, I’ve seen people losing their job and they still rate their life as pretty satisfying. It all depends on your frame of reference! If you only have ups, it is dull, you need those mood swings.
When I didn’t have kids, these things made me happy:
- Heard a good joke in the coffee corner
- Won a volleyball match
And the worst things that happened to me:
- Missing a train…
- Running out of coffee
That is pretty much it, again: very mundane and a bit dull.
Kids amplify your life
Now that I have kids, all the things above seem like nuances. This is what my ups and downs now look like:
- Was an hour late at work because kid was screaming: “Today, no pants!”
- The daily *catch me* kick in the nuts
- Kid broke something valuable, again
- Constant fear of kids getting hurt, disappointed etc
And of course, the obvious *I have kids* ups!
- Kid walks/talks/does something for the first time
- Good night kisses
- “Daddy! I’ve missed you!” after being gone for an hour
- Many more moments…
Kids cause greater daily downs and thankfully also bigger up moments, they amplify your feelings and make you feel alive!
A lot of times you hear experienced programmers talking about ‘smelly’ code. These ‘code smells’ are things that just look or feel wrong. Often programmers don’t immediately have a clear idea on how to fix it, but it ‘smells’! These smells often happen when the code ‘rots’.
Let’s do a quick check:
- Are there pieces of code you’d rather not change?
- Are (parts of) the application sometimes scrapped and rebuilt from scratch (over and over)?
- Do pieces of code exist that turned out much more complex than you had initially imagined?
- Do you have pieces of code that feel out of place?
- Is it hard to break up some large classes and/or methods?
- Do you have a hard time coming up with names for certain classes?
If you’ve answered yes to one or more questions, you are probably suffering from code smells and maybe even advanced code rot.
What is a code smell?
Most of the time a piece of code has a smell when the design underneath is wrong. Often this is not visible at first, but slowly appears when the component gets larger. Classes get too large, methods are hard to break up, some classes or methods feel out of place. Even something as simple as struggling to name a new class is a sign there is something wrong. All of these are code smells and experienced programmers have developed a nose for them. Most of the time this is a sign there is a larger problem. It isn’t something a bit of code can fix, there is probably something wrong in the design. Is this component doing the right thing? Does it have the right responsibilities? Take a step back and look at the complete picture. What problem are these classes/methods trying to solve?
What is code rot?
When code smells, more often than not, people will continue working on it, adding functionality. Maybe they don’t notice the smell or they don’t take the much needed step back to investigate the problem. This leads to faster code rot. So what exactly is code rot you might ask? Every piece of code starts to ‘rot’ the moment it is written. Eventually the code becomes too difficult to maintain and needs to be replaced. Hopefully this happens in the far future, 20-30 years from now. Unfortunately this is often not the case: parts are being scrapped and rebuilt while the application is still being developed.
Each time new code is written, added or changed, the new code starts out as rotten as the code that it depends on. This way a tiny bit of rotten/smelly code can cause an entire application to be trashed! It is highly infectious.
And the answer is… refactoring!
Refactoring is the magic word here, but… it sounds easier than it actually is. At first people will deny there is a problem, sometimes they don’t see it, they don’t share your disgust. Eventually it’ll continue to rot and more people will notice the smell. And once this smell has become unbearable drastic measures seem to be needed. This brings us to rule #1 about code smell:
1. Refactor early, refactor often.
The sooner you refactor and remove some rotten code, the less likely it is to spread and the easier it is to remove. Don’t seek approval, do something about it. But remember, take a step back first. Sometimes you are rewriting a piece of code, and the result is just as bad as before. The reason is that a design flaw is lurking in the shadows. If the responsibilities between two components are wrong you can scrap a piece of code and rebuild it, but the same problems will keep surfacing. So let’s make this rule #2:
2. Before refactoring, take a step back, eliminate possible design flaws.
You’ve taken a step back, looked at the complete picture, fixed the responsibilities, time to do the refactoring! No!! There is a third *very* important rule. Refactoring means changing a piece of code without changing its behavior. How do we do this? We write tests! Only if you have proper testing in place you can start thinking about refactoring. How else will you be certain that a piece of code still has the same behavior as before?
3. Only refactor when you have proper tests.
If you follow these three simple rules, nothing can go wrong. You’re detecting and fixing code smells early and often, you’ll stop the rot as soon as possible to make sure it doesn’t spread, keeping the application healthy. There isn’t a possible design flaw hiding in the shadows, we’ve taken a step back and eliminated that. And finally: We have tests in place that ensure we don’t change any behavior that was painstakingly added to the rotting code.
If you follow these rules you’ll end up with code that is readable, easier to maintain, easy to change (agile code!).
In short: Healthy code!
This week I’ve started work on a new project. This project has strict rules regarding documentation and testing. First of all, everything needs to be modelled in EA (Enterprise Architect), from use cases to the REST API fields. Next we need to perform the actual programming, and the testers write down their tests.
What is in the documentation?
- A description of who inputs what and what the results should be
What is in the code?
- A description of who inputs what and what the results should be
What is in the test?
- A description of who inputs what and what the results should be
Come to think of it… programming IS testing IS documenting, we are all doing exactly the same thing! This sounds a bit wasteful doesn’t it?
Is this (triple) duplication a bad thing? Well, not exactly, it is a good thing the tests and the code are written twice, this greatly reduces the number of bugs and errors that always happen when writing something. Thinking and writing those things twice is actually a good thing!
There is a major problem with having duplication: divergence.
What happens when a programmer decides to change the fields in the REST API? The moment he does that the documentation isn’t correct. In the best case all three, documentation, tests and code, get (manually) updated. But this just never happens in real life… Tests start lagging behind, documentation ends up being wrong, code ends up doing things we don’t expect. Mostly because we are human and don’t like doing things manually, we are focused on one thing, the code, or the test, or EA.
Another problem is that all three documents are ‘alive’. During design sessions people will change the documentation in EA, but this isn’t yet implemented in the code. Also the test isn’t yet updated. So at every possible moment in time none of the documents hold the entire truth. If you read the documentation you can never rely on it actually being implemented in the code.
Verify and automate!
If all three things (documentation, tests, code) do the same thing (namely ‘describing how the program should behave’) why don’t we automatically verify this?
If you have a REST API description with all fields and types in Enterprise Architect, why not verify this with the actual code?
A use case describes all possible paths and the expected outcomes, this is 100% the same as your (automated) tests should be!
We could (and should!) automate and verify this.
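As a rough illustration of what such a check could look like, here is a Java sketch that compares a documented list of fields with the fields of the actual class using reflection. The big assumption is that the field names and types can somehow be exported from the modelling tool into a simple map; I’m not claiming Enterprise Architect offers that out of the box. `Customer` and all other names are hypothetical stand-ins.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SpecCheckSketch {

    /** Hypothetical REST payload class standing in for the real code. */
    static class Customer {
        String name;
        int age;
    }

    /** Lists every difference between the documented fields and the actual class. */
    static List<String> differences(Map<String, String> documented, Class<?> type) {
        // Collect "field name -> simple type name" from the code.
        Map<String, String> actual = new LinkedHashMap<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler/tool generated fields
            }
            actual.put(field.getName(), field.getType().getSimpleName());
        }

        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> doc : documented.entrySet()) {
            String actualType = actual.get(doc.getKey());
            if (actualType == null) {
                problems.add("missing in code: " + doc.getKey());
            } else if (!actualType.equals(doc.getValue())) {
                problems.add("type mismatch: " + doc.getKey());
            }
        }
        for (String name : actual.keySet()) {
            if (!documented.containsKey(name)) {
                problems.add("missing in documentation: " + name);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumed to come from an export of the documentation.
        Map<String, String> documented = new LinkedHashMap<>();
        documented.put("name", "String");
        documented.put("age", "int");
        documented.put("email", "String");

        for (String problem : differences(documented, Customer.class)) {
            System.out.println(problem);
        }
    }
}
```

Run as part of the build, a non-empty list would fail the build the moment documentation and code start to diverge.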
This morning the architect showed us three random examples from pieces of documentation in EA (to teach us how to work with the tool). But NONE of the three examples were complete and/or correct. Not a single example was the same in the documentation as implemented in the code. If this is the case, why even bother writing the documentation? It will keep diverging and become more and more useless.
In a previous project I did, we didn’t have any real documentation. We ‘only’ had Fitnesse tests! But those tests were just as readable as documentation.
The big advantage was that we had a complete set of tests that:
- are readable as documentation (like use cases)
- are executable and verify our code does what we expect (and vice versa)
- ensure the documentation, tests and code are the same
This fixes all the problems regarding divergence and testing. When we write down the specification we instantly have our use cases. How do we know the code isn’t ready yet? When this documentation is executed, it fails! Writing this initial test text should be done together with programmers, testers and the product owner.
So is Fitnesse the best tool for the job? Probably not, it isn’t as elaborate as Enterprise Architect (thankfully?) but maybe it could use some more structure. The big advantage of EA is the re-use of (partial) use cases when we create a new part of the application, you just drag and link parts together. In my experience people don’t treat Fitnesse with the proper respect (it is ‘just’ test code). For some reason Fitnesse code always ends up as spaghetti code, isn’t reusable and becomes a mess, while it should be considered the entire complete and only truth!
Most tools seem to be very developer focussed, Fitnesse and Cucumber. What other tools can help us accomplish this goal of automating and verifying this trinity? Are there better alternatives?
A couple of weeks ago was Devoxx in Antwerp again, the largest annual European Java conference. As always I was there, with my camera, to capture the amazing atmosphere:
We (me and my colleagues) had a great time, and learned some new things. But most of all, we met great people and received a lot of inspiration. This year I’ve done a short 5 minute Ignite talk on Mutation testing. This was my first ‘Ignite’ session and it is hard! The format of an Ignite session is: 20 slides, 5 minutes, auto forward every 15 seconds. This means timing is everything. A 50 minute talk is much easier, you decide when you’ve told enough and press next.
Also, my J-Fall presentation on mutation testing is now available online (50 minutes, in Dutch).
⋆Brag⋆ It was voted as best session of the conference with 61 people! Also of all the visitors it was voted second ‘most popular’, surpassed by one vote, but the other session had more people in the room!
Maybe it’ll be part of JDK 9, maybe it won’t… but people are working hard on creating a REPL tool/environment for the Java Development Kit (JDK). More information on the code and the project is available as part of OpenJDK: project Kulla.
Some of you might not be familiar with the terminology ‘REPL’, it is short for Read Evaluate Print Loop. It is hard to explain exactly what it does, but very easy to demonstrate:
The idea is that you have a place where you can enter Java and run it, without a main method or class structure. You can build instances, and alter them while you type. For example you’ll be able to do Swing development while you type:
Now we have a visible frame, you can drag it around, resize it etc.
Suddenly our frame has a panel, and the panel has an empty button! You can prototype, do live coding and you have instant feedback.
Now the button has text, but pressing it still does nothing…
And there we go, a final simple lambda creates a working “Hello World!”-button.
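For reference, the steps described above look roughly like this when written as a plain Java program. This is my own reconstruction, not the original REPL session: in the REPL you would type the statements one by one, without the class and the main method. The names, the frame size and the button text are assumptions, and the frame is skipped when no display is available.

```java
import java.awt.GraphicsEnvironment;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JPanel;

class ReplStepsSketch {

    /** Remembers the last message so the effect of a click can be observed. */
    static String lastMessage = "";

    /** The final step: a simple lambda turns the button into a "Hello World!" button. */
    static void wireHello(JButton button) {
        button.addActionListener(event -> {
            lastMessage = "Hello World!";
            System.out.println(lastMessage);
        });
    }

    public static void main(String[] args) {
        // Step 1: a visible frame (only possible when a display is available).
        JFrame frame = null;
        if (!GraphicsEnvironment.isHeadless()) {
            frame = new JFrame("REPL demo");
            frame.setSize(300, 200);
            frame.setVisible(true);
        }

        // Step 2: the frame gets a panel, and the panel an empty button.
        JPanel panel = new JPanel();
        JButton button = new JButton();
        panel.add(button);
        if (frame != null) {
            frame.add(panel);
            frame.revalidate();
        }

        // Step 3: the button gets text, but pressing it still does nothing.
        button.setText("Click me");

        // Step 4: the lambda makes it work.
        wireHello(button);
    }
}
```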
It is also possible to load a repl script from file, allowing you to share, store and run scripts. This is done using the ‘/load’ and ‘/save’ commands. You can even ‘/dump’ all the created classes in a directory.
I’m very curious how people will be using the REPL in the future, some use cases:
If you want to try out Kulla, it took me literally 20 minutes to get up and running on my MacBook. Just follow the instructions on AdoptOpenJDK, but instead use http://hg.openjdk.java.net/kulla/dev/ as codebase. After building the JDK, go to ./langtools/repl and look at the README.
The post below is the content from my 2014 J-Fall and Devoxx Ignite presentations. You can check out the slides here:
We all do testing
In this day and age you aren’t considered a real Java developer if you are not writing proper unit tests.
We all know why this is important:
- Instant verification our code works.
- Automatic future regression tests.
But how do we know we are writing proper tests? Well most people use code coverage to measure this. If the percentage of coverage is high enough you are doing a good job.
What is a test?
First let’s look at what a test actually is:
- Instantiate classes, setup mocks.
- Invoke something.
- Assert and verify the outcome.
Which steps are measured with code coverage? Only steps 1 and 2. And what is the most important thing for a test? It is the third and final step, the assertion, the place where you actually check if the code is working. This is completely ignored by code coverage!
I’ve seen companies where management looks at code coverage reports, they demand the programmers to write 80+ or 90+% coverage because this proves the quality is good enough. What else is a common thing in these organisations? Tests without any real assertion. Tests written purely to boost coverage and please management.
So code coverage says absolutely nothing about the quality of our tests? Well, it does tell you one thing. If you have 0% coverage, you have no tests at all, if you have 100% coverage you might have some very bad tests.
Luckily there is help around the corner, in the form of mutation testing. In mutation testing you create thousands of ‘mutants’ of your codebase. So what is this mutant you might ask? A mutation is a tiny singular change in your codebase.
For each mutant the unit tests are run and there are a couple of possible outcomes:
If you are lucky a test will fail. This means we have ‘killed’ our mutant. This is a positive thing, we’ve actually checked that the mutated line of code is correctly asserted by a test. Now we immediately see the advantage of using mutation testing, we actually verify the assertions in our tests.
Another possible outcome is that our mutant has survived, meaning no test fails. This is scary, it means the logic we’ve changed isn’t verified by a test. If someone would (accidentally) make this change in your codebase, the automatic build won’t break.
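A tiny hand-made example of both outcomes, written in Java. This only illustrates the idea and is not how a tool like PIT works internally: the ‘mutant’ here is just a second lambda with a changed boundary, the kind of small change mutation tools typically generate. All names are hypothetical.

```java
import java.util.function.IntPredicate;

class MutationSketch {

    /** The original logic: you are an adult from 18 years onwards. */
    static final IntPredicate ORIGINAL = age -> age >= 18;

    /** The mutant: a single tiny change, '>=' became '>'. */
    static final IntPredicate MUTANT = age -> age > 18;

    /** Weak test: executes the code, but only asserts values far from the boundary. */
    static boolean weakTestPasses(IntPredicate isAdult) {
        return isAdult.test(30) && !isAdult.test(5);
    }

    /** Strong test: also asserts the behaviour exactly at the boundary. */
    static boolean strongTestPasses(IntPredicate isAdult) {
        return isAdult.test(18) && !isAdult.test(17);
    }

    public static void main(String[] args) {
        // A mutant survives when the test still passes on the mutated code.
        System.out.println("weak test, mutant survives:   " + weakTestPasses(MUTANT));
        System.out.println("strong test, mutant survives: " + strongTestPasses(MUTANT));
    }
}
```

Both tests give the same code coverage, but only the strong test fails on the mutant and so ‘kills’ it.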
In Java (and other languages as well) there are frameworks for doing mutation testing. One of the most complete and modern frameworks for doing mutation testing in Java is called PIT. The generation of mutants and the process running the tests is fully automated and easy to use, just as easy as code coverage. There are Maven, Ant, Gradle, Sonarqube, Eclipse and IntelliJ plugins available!
What about performance?
Using mutation testing isn’t a silver bullet and it doesn’t come without any drawbacks. The major disadvantage is performance. This is the reason it never took off in the 1980s. At that time it would take an entire evening to run all your unit tests, so people could only dream of creating thousands of mutants and running the tests again. Luckily CPUs have become a lot faster, and PIT has other tricks to speed up the process.
One thing PIT does is that it uses code coverage! Not as a measurement of the quality of your tests but as a tool. Before creating the mutants PIT calculates the code coverage of all unit tests. Now when PIT creates a mutant of a particular line of code it looks at the tests covering that line. If a line is only covered by three unit tests, it only runs those three tests. This greatly decreases the number of tests that need to run for each mutation.
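The selection step can be sketched in a few lines of Java. This is only an illustration of the idea under my own assumptions (coverage available as a map from test name to covered line numbers), not PIT’s actual implementation; all names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TestSelectionSketch {

    /** Turns "test -> lines it executes" into "line -> tests that execute it". */
    static Map<Integer, Set<String>> testsPerLine(Map<String, Set<Integer>> linesPerTest) {
        Map<Integer, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Set<Integer>> entry : linesPerTest.entrySet()) {
            for (int line : entry.getValue()) {
                result.computeIfAbsent(line, key -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return result;
    }

    /** Only the tests covering the mutated line need to run for that mutant. */
    static Set<String> testsForMutant(Map<Integer, Set<String>> testsPerLine, int mutatedLine) {
        return testsPerLine.getOrDefault(mutatedLine, Collections.emptySet());
    }

    public static void main(String[] args) {
        // Made-up coverage data for three tests.
        Map<String, Set<Integer>> coverage = new HashMap<>();
        coverage.put("testA", new HashSet<>(Arrays.asList(10, 11)));
        coverage.put("testB", new HashSet<>(Arrays.asList(11, 12)));
        coverage.put("testC", new HashSet<>(Arrays.asList(40)));

        Map<Integer, Set<String>> index = testsPerLine(coverage);
        System.out.println("mutant on line 11 -> run " + testsForMutant(index, 11));
        System.out.println("mutant on line 99 -> run " + testsForMutant(index, 99));
    }
}
```

A mutant on a line that no test covers doesn’t need any test run at all: it survives by definition.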
There are other tricks as well, for example PIT can track the changes in your codebase. It doesn’t need to create mutants if the code isn’t edited.
Code coverage is a horrible way of measuring the quality of your tests. It only says something about the invocations but nothing about the actual assertions. Mutation testing is much better, it gives an accurate report on the quality and you can’t ‘game’ the statistics. The only way to fake mutation coverage is to write real tests with good assertions.
Check it out now: http://pitest.org
Below is a build log on how to build Commander Keen: Keen Dreams (which was recently released on GitHub) on OS X using DOSBox and a shareware version found online.
Download and install DOSBox
Create a new folder, this will be your DOSBox mount-point.
Copy all the contents of the directories DISK1/DISK2/DISK3 from tasm5.zip to \TEMP
Install Borland C++ 3:
Copy all the contents to \BORLANDC.
Copy all the source files from Commander Keen to \KEEN.
Fire up DOSBox, mount the mount folder to C.
Add the following paths to PATH:
Go into C:\TEMP and run the installer to install TASM.
Go into directory C:\KEEN\STATIC and run ‘make.bat’
Go into the directory C:\KEEN and run ‘BC’ to start Borland C++.
To change the Borland directories to the correct path go to: Options > Directories and change the paths to C:\BORLANDC\*.
Compile and run! For me this creates the binary KDREAMS.EXE.
But sadly, when I run the executable it says “Can’t open KDREAMS.MAP!” :-(
It turns out you’ll need to own the game’s actual content before you can run this source code.
Thankfully a shareware version can be downloaded here: keendm.zip. This corresponds with the 1.01S version of the released source code, which is also released here.
Copy the missing files (SHL, MAP, AUD, etc) from the shareware version and play your own compiled Commander Keen!
Pretty interesting to see, I’ve already worked with him on improving the algorithm to generate these mosaics in the past. But next he set me a challenge: Find your own game thumbnail, it is in there somewhere!
This is the screenshot of my game, used as thumbnail:
So I went through the thumbnails, one time… and a second time… then I decided to solve it like a real programmer:
How does it work? Well it is pretty simple:
- Input #1: mosaic.jpg
- Input #2: Number of thumbnails across the width and height
- Input #3: screenshot.png
- The program resizes my game screenshot to the thumbnail size.
- Next it loops over all sub-images of the mosaic.
- For every sub-thumbnail: Calculate the error (error += Math.abs(mosaicPixelValue - screenshotPixelValue), for each color, for each pixel)
- Store the location of the thumbnail with the smallest error!
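The steps above could look roughly like this in Java. This is a reconstruction from the description, not the original program: the class and method names are my own, and it assumes the mosaic is a regular grid whose tiles all have the same size.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

class MosaicSearch {

    /** Returns {column, row} of the tile that differs least from the screenshot. */
    static int[] findBestTile(BufferedImage mosaic, int columns, int rows,
                              BufferedImage screenshot) {
        // Assumption: every tile has the same size and the grid fills the mosaic.
        int tileWidth = mosaic.getWidth() / columns;
        int tileHeight = mosaic.getHeight() / rows;
        BufferedImage thumb = resize(screenshot, tileWidth, tileHeight);

        long bestError = Long.MAX_VALUE;
        int[] best = {-1, -1};
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < columns; col++) {
                long error = tileError(mosaic, col * tileWidth, row * tileHeight, thumb);
                if (error < bestError) {
                    bestError = error;
                    best = new int[] {col, row};
                }
            }
        }
        return best;
    }

    // Sum of absolute differences over every pixel and every color channel.
    static long tileError(BufferedImage mosaic, int offsetX, int offsetY,
                          BufferedImage thumb) {
        long error = 0;
        for (int y = 0; y < thumb.getHeight(); y++) {
            for (int x = 0; x < thumb.getWidth(); x++) {
                int a = mosaic.getRGB(offsetX + x, offsetY + y);
                int b = thumb.getRGB(x, y);
                error += Math.abs(((a >> 16) & 0xFF) - ((b >> 16) & 0xFF)); // red
                error += Math.abs(((a >> 8) & 0xFF) - ((b >> 8) & 0xFF));   // green
                error += Math.abs((a & 0xFF) - (b & 0xFF));                 // blue
            }
        }
        return error;
    }

    // Scale the screenshot down (or up) to the size of one mosaic tile.
    static BufferedImage resize(BufferedImage source, int width, int height) {
        BufferedImage scaled = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        g.drawImage(source, 0, 0, width, height, null);
        g.dispose();
        return scaled;
    }
}
```

The two images can be loaded with ImageIO.read(new File("mosaic.jpg")) and ImageIO.read(new File("screenshot.png")), after which findBestTile returns the column and row of the closest match.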
That is it, solved it in 10 minutes of coding!
(and another 10 minutes to make the visual confirmation and the animated gif).
Last weekend the 30th Ludum Dare competition took place. For those of you unfamiliar with Ludum Dare, this is a very short international game programming contest. You are allowed to use any tool or language but there are strict rules:
- The theme is revealed at the start (and the game must match this theme).
- You get 48 hours, nothing more or less.
- Every image, sprite, song and/or sound effect in the game should be made within these 48 hours.
- The result is open source (but you pick the license).
- You work alone.
(There is also a ‘Jam’ version where you can work in teams, can do closed source, can use images/sounds and you have 72 hours)
The theme this year was ‘Connected Worlds’. This is a pretty broad term, so I started to think. How about a world where the main character is on one planet, and his love is on another planet. The planets are tantalisingly close (nearly touching) but out of reach. Our hero has to build a rocket to reach his love.
Drawing drawing drawing…
The decision to make a point-n-click game has a huge impact on how I get to spend my time during the contest. This type of game needs a lot of images, sprites and of course fun puzzles. In the end I think I’ve been busy drawing 90% of the time (on paper and in GIMP) and maybe 10% actual programming. After 48 hours my hands were cramped up from all the drawing instead of typing heh (I need a digital draw-tablet).
In the future somebody will inevitably invent a teleporter, no doubt about that.
But how will it work?
The most likely way to teleport would be to digitalise yourself. Some yet undiscovered very high resolution MRI/CT scanner will scan every atom in your body and send this over to the receiver. This atom printer will build up your entire body again.
However, during a work lunch discussion, I came up with some scary fundamental problems with teleportation.
What would happen if, during transmission, we get a failure? We don’t want to end up with a failed teleport… which would mean the person getting teleported is dead.
That is why we need to have some kind of two-phase commit. First we digitalise the person, we send this over the line, build the person up on the other side. Once this process has been completed, we ‘delete’ the original copy. Because we don’t want to end up with thousands of clones.
Wait… what? Delete?
What would it feel like stepping into the teleporter? First a copy of you is made, and this copy eventually walks out of the receiving end. But what happens to you? You’ll step into a machine, which makes a copy, and disposes of you! You’ll be exterminated, killed, pushing up the daisies, your metabolic processes will be history, kicking the bucket, you’ll be an ex-parrot.
Let’s not dwell too long on the loss of the old you. Of course YOU are also the one walking out of the teleporter, where nothing has happened but a successful teleport.
But is that really… you?
What defines you?
Are you just a selection of atoms clumped together?
If we make an exact copy, is that still you?
Did you know that (according to some research) every year almost 98% of the atoms currently in your body will be replaced? That would mean that a year from now, you will just be 2%… you!
Conclusion: I hope they won’t invent a teleporter while I’m alive.