Testing the Database Layer


There’s one thing we can agree on when it comes to database tests: they ain’t easy. In this JAX Magazine preview, testing guru and JAX London speaker Colin Vipurs runs through the strengths and weaknesses of common approaches to testing databases.

Over my many years of software development I’ve had to perform various levels of testing against many different database instances and types, including RDBMS and NoSQL, and one thing remains constant: it’s hard. There are a few approaches that can be taken when testing the database layer of your code, and I’d like to go over some of them, pointing out the strengths and weaknesses of each.

Mocking

This is a technique that I have used in the past but would strongly recommend against now. In my book “Tests Need Love Too” I discuss why you should never mock any third-party interface, but just in case you haven’t read it (you really should!) I’ll go over it again.

As with mocking any code you don’t own, what you’re validating is that you’re calling the third-party code in the way you think you should. But, and here’s the important part, that assumption might be incorrect. Unless you have higher-level tests covering your code, you’re not going to know until it hits production. In addition to this, mocking raw JDBC is hard, like really hard. Take for example the test code snippet in Listing 1.

[Listing 1: jMock-based test of raw JDBC code]

Within this test, not only is there a huge number of expectations to set up, but in order to verify that all the calls happen in the correct order, jMock “states” are used extensively. Because of the way JDBC works, this test also violates the guideline of never having mocks return mocks, and in fact goes several levels deep! Even if you manage to get all of this working, something as simple as a typo in your SQL can mean that although your tests are green, the code will still fail when it goes to production.
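To make the typo problem concrete, here is a minimal sketch. It is not the article’s listing: instead of jMock it hand-rolls stubs with java.lang.reflect.Proxy so that it runs on a plain JDK (9 or later), and the findName method, the users table and the class name are all hypothetical. Note how each stub has to return the next one (DataSource, Connection, PreparedStatement, ResultSet), and how the misspelled SELECT goes completely unnoticed.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;
import javax.sql.DataSource;

public class MockedJdbcSketch {

    // Hypothetical data-access code under test. The SQL contains a typo ("SELCT").
    static String findName(DataSource dataSource, long id) throws SQLException {
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement =
                 connection.prepareStatement("SELCT name FROM users WHERE id = ?")) {
            statement.setLong(1, id);
            try (ResultSet resultSet = statement.executeQuery()) {
                return resultSet.next() ? resultSet.getString("name") : null;
            }
        }
    }

    // Creates a stub for one interface. Calls are answered by method name only;
    // methods without a canned answer (setLong, close, ...) return null.
    static <T> T stub(Class<T> type, Map<String, Object> answers) {
        Object proxy = Proxy.newProxyInstance(
            MockedJdbcSketch.class.getClassLoader(),
            new Class<?>[] {type},
            (instance, method, args) -> answers.get(method.getName()));
        return type.cast(proxy);
    }

    // The "test": four levels of stubs, each one handing out the next.
    static String lookUpNameThroughStubs() {
        ResultSet resultSet =
            stub(ResultSet.class, Map.of("next", true, "getString", "Alice"));
        PreparedStatement statement =
            stub(PreparedStatement.class, Map.of("executeQuery", resultSet));
        Connection connection =
            stub(Connection.class, Map.of("prepareStatement", statement));
        DataSource dataSource =
            stub(DataSource.class, Map.of("getConnection", connection));
        try {
            // No stub ever parses the SQL, so the typo cannot make this fail.
            return findName(dataSource, 42L);
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A test asserting that lookUpNameThroughStubs() returns "Alice" passes, while the same statement would be rejected by any real database.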

A final note on mocking: no sane developer these days would be using raw JDBC rather than one of the higher-level abstractions available, and the same rules apply to these. Imagine a suite of tests set up to mock against JDBC, and your code then switches to Spring JdbcTemplate, jOOQ or Hibernate. Your tests will now have to be rewritten to mock against those frameworks instead, which is not an ideal solution.

Testing Against a Real Database

It may sound silly, but the best way to verify that your database interaction code works as expected is to actually have it interact with a database! As well as ensuring you’re using your chosen API correctly, this technique can verify things that mocking never can, for example that your SQL is syntactically correct and does what you hope.

In-Memory Databases:

One of the easiest and quickest ways to get set up with a database to test against is to use one of the in-memory versions available, e.g. H2, HSQLDB or Derby. If you’re happy introducing a Spring dependency into your code, then the test setup can be as easy as this (Listing 2).
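The original listing is not reproduced here, so what follows is a minimal sketch of what such a setup could look like, assuming Spring’s EmbeddedDatabaseBuilder (from spring-jdbc) with H2 on the classpath. The class and method names are hypothetical, and both script files are assumed to sit at the root of the test classpath.

```java
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabase;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

public class InMemoryDatabaseFactory {

    // Builds an H2 instance inside the JVM and runs both scripts in order.
    // EmbeddedDatabase extends javax.sql.DataSource; call shutdown() when the test is done.
    static EmbeddedDatabase createTestDatabase() {
        return new EmbeddedDatabaseBuilder()
            .setType(EmbeddedDatabaseType.H2)
            .addScript("schema.sql")      // table definitions
            .addScript("test-data.sql")   // rows the tests rely on
            .build();
    }
}
```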

This code will create an instance of the H2 database, load the schema defined in schema.sql and any test data in test-data.sql. The returned object implements javax.sql.DataSource, so it can be injected directly into any class that requires it.

One of the great benefits of this approach is that it is fast. You can spin up a new database instance for each and every test requiring it, giving you a cast-iron guarantee that the data is clean. You also don’t need any extra infrastructure on your development machine, as it’s all done within the JVM. This mechanism isn’t without its drawbacks, though.

Unless you’re deploying against the same in-memory database that you’re using in your tests, you will inevitably run up against compatibility issues that won’t surface until you hit higher-level testing or, god forbid, production. Because you’re using a different DataSource from your production instance, it can be easy to miss configuration options required to make the driver operate correctly. Recently I came across such a setup, where H2 was configured with a DATETIME column that needed millisecond precision. The same schema definition was used on a production MySQL instance, which not only required the column to be declared as DATETIME(3) but also needed the useFractionalSeconds=true parameter to be provided to the driver. This issue was only spotted after the tests were migrated from H2 to a real MySQL instance.
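As a rough illustration of the two pieces that had to change, the sketch below keeps them side by side. The events table and the class name are hypothetical, and the parameter name is taken from the description above, so check it against the documentation of the driver version you actually use.

```java
import java.util.Properties;

public class MySqlMillisecondConfig {

    // Hypothetical table. DATETIME(3) stores three fractional digits (milliseconds);
    // a plain DATETIME column in MySQL stores whole seconds only.
    static final String CREATE_EVENTS_TABLE =
        "CREATE TABLE events (id BIGINT PRIMARY KEY, created_at DATETIME(3) NOT NULL)";

    // Driver properties in the form accepted by DriverManager.getConnection(url, properties).
    static Properties driverProperties(String user, String password) {
        Properties properties = new Properties();
        properties.setProperty("user", user);
        properties.setProperty("password", password);
        // Parameter named in the text above; an assumption for any other driver or version.
        properties.setProperty("useFractionalSeconds", "true");
        return properties;
    }
}
```

Neither setting is needed for the H2 tests to pass, which is exactly why the mismatch stayed hidden.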

Real Databases:

Where possible, I would highly recommend testing against a database that’s as close as possible to the one being run in your production environment. A variety of factors can make this difficult or even impossible, such as commercial databases requiring a license fee, which makes installing one on each and every developer machine prohibitively costly. A classic way to get around this problem is to have a single development database available for everyone to connect to. This in itself can cause a different set of problems, not least of which are performance (these databases always seem to get installed on the cheapest and oldest hardware) and test repeatability. The issue with sharing a database with other developers is that two or more people running the tests at the same time can lead to inconsistent results and data shifting in unexpected ways. As the number of people using the database grows, this problem gets worse. Throw the CI server into the mix and you can waste a lot of time re-running tests and trying to find out whether anyone else is running tests right now, just to get a clean build.