An introduction to Mocks, Spies and Fakes in unit testing

When we’re unit testing, in addition to knowing the boundary conditions, it is important to know what classification our dependencies fall into, in testing terms. These classifications are:

  • Mocks
  • Spies
  • Fakes

Mocks

As the name suggests, a mock is a proxy or placeholder value that can be programmed to respond just like a real object would under any or matched conditions. A mock should be used where the real object dependency we have is very complicated to deal with during testing. This could be (but is not limited to):

  • Third-party interactions (database/queue/API/files)
  • Classes that do really complicated, time-consuming work
  • Anything that relies on randomness
  • Time-based work

Mocking dependencies is a three-phase approach:

  1. Create the mock of the dependency based on the class or interface.
  2. “Arm the mock” by preparing its responses to requests, based on any or matched interactions
  3. Set the mock on the test subject
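To make the three phases concrete, here is a minimal hand-rolled sketch that uses only the JDK's java.lang.reflect.Proxy rather than a mocking framework. The ExchangeRates interface, the PriceConverter test subject and the mockRates helper are all hypothetical names invented for this illustration; a real framework such as Mockito does far more than this.

```java
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

class MockPhasesSketch {

    /** Hypothetical dependency that the test subject relies on. */
    interface ExchangeRates {
        double rateFor(String currency);
    }

    /** Hypothetical test subject; receives its dependency via the constructor. */
    static final class PriceConverter {
        private final ExchangeRates rates;

        PriceConverter(ExchangeRates rates) {
            this.rates = rates;
        }

        double convert(double amount, String currency) {
            return amount * rates.rateFor(currency);
        }
    }

    /** Phases 1 and 2: create a proxy-backed stand-in and arm it with canned answers. */
    static ExchangeRates mockRates(Map<String, Double> cannedRates) {
        return (ExchangeRates) Proxy.newProxyInstance(
                ExchangeRates.class.getClassLoader(),
                new Class<?>[] {ExchangeRates.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("rateFor")) {
                        // Matched argument: canned answer. Anything else: 0.0.
                        return cannedRates.getOrDefault((String) args[0], 0.0);
                    }
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) {
        Map<String, Double> canned = new HashMap<>();
        canned.put("EUR", 2.0);
        ExchangeRates rates = mockRates(canned);              // phases 1 and 2
        PriceConverter converter = new PriceConverter(rates); // phase 3
        System.out.println(converter.convert(10.0, "EUR"));
    }
}
```

The test subject never learns that it is talking to a stand-in, which is the whole point of phase three.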

Creating the mock could be the easiest or the hardest of the three phases because it depends on what you’re mocking and how you’re mocking it. If the test subject relies on a class directly rather than an interface, mocking could become a bit trickier depending on how that class was implemented. This is because when you’re mocking a class, the mocking framework first has to create a new instance of that class (by calling its constructor and, by extension, constructing all its ancestors). If at any point one of the ancestors is doing something funky with one of its dependencies within the constructor, you’ll have to mock that out. This requires some level of knowledge or understanding of the entire class hierarchy, which is why using interfaces for your dependencies is preferred.

When mocking an interface, Mockito uses JDK Proxy (as opposed to CGLib when mocking a class) which plays much more nicely with everything else. Also, because interfaces cannot be instantiated, there is no hierarchy to follow and the JDK Proxy wraps around it without any problems (or worry about dependencies). Note, though, that I have yet to personally test how this works with interfaces that carry implementation code of their own (default methods, available since Java 8; Java 9 added private interface methods).

Arming the mock is relatively straightforward and involves three steps:

  • Knowing which part of the class to “arm”: which method are you mocking?
  • Knowing what to return (or do): what should that method return/do?
  • Knowing what to match: under which parameters should this mock work?

In most mocking frameworks, you can mock methods and have them either call the real thing (if it’s a spy; covered in the next section) or return a custom value. In addition to this, you can specify the circumstances under which the mock should do what you’ve configured it to do. This is where the matchers come in. I am not going to cover how the matchers work in this post as that is a whole topic of its own. You can find out more here:

https://static.javadoc.io/org.mockito/mockito-all/1.9.0/org/mockito/Matchers.html

Once you’ve armed the mock, it is time to make it available to the test subject. Ideally, this should be done in the least intrusive way possible, so that the test subject isn’t even aware of the mocking. Here are a few ways you can do this, in decreasing order of intrusion:

  • Setter methods with default access modifier
  • Constructor args
  • Inversion of control – field level dependency Injection magic
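As a rough sketch of the first two options, the snippet below shows a package-private setter and a constructor argument side by side. The Clock interface and both classes are hypothetical names made up for this example; field-level injection is left out because it needs a DI container to do the wiring.

```java
class InjectionSketch {

    /** Hypothetical dependency; a single method keeps the stand-ins short. */
    interface Clock {
        long now();
    }

    /** Option 1: a setter with default (package-private) access, for tests in the same package. */
    static final class SetterInjected {
        private Clock clock = System::currentTimeMillis;

        void setClock(Clock clock) {
            this.clock = clock;
        }

        long stamp() {
            return clock.now();
        }
    }

    /** Option 2: the dependency arrives as a constructor argument. */
    static final class ConstructorInjected {
        private final Clock clock;

        ConstructorInjected(Clock clock) {
            this.clock = clock;
        }

        long stamp() {
            return clock.now();
        }
    }

    public static void main(String[] args) {
        SetterInjected viaSetter = new SetterInjected();
        viaSetter.setClock(() -> 42L); // stand-in set after construction
        ConstructorInjected viaConstructor = new ConstructorInjected(() -> 42L);
        System.out.println(viaSetter.stamp() == viaConstructor.stamp());
    }
}
```

The setter variant leaves a test-only method on the class, which is why it counts as the more intrusive of the two.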

Obviously it’s great to aim for point three, but sometimes it’s harder to get DI involved in a project that has never had it, so my suggestion here would be to work backwards from that.

When testing, while mocks are great at controlling the behaviour of the dependencies that your implementation relies on, it is also useful to verify whether or not a dependency is used in certain flows. With Mockito, for example, you can verify whether a particular mocked method was called, how many times it was called and, in certain cases, even that it wasn’t called at all!
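In Mockito this kind of check is spelled verify(mock, times(n)) or verify(mock, never()). To show the idea without pulling in the library, here is a small hand-rolled sketch built on java.lang.reflect.Proxy; Recorder, Mailer and OrderService are hypothetical names used only for illustration.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class VerifySketch {

    /** Hypothetical dependency. */
    interface Mailer {
        void send(String recipient);
    }

    /** Hypothetical test subject: only sends mail when the order total is positive. */
    static final class OrderService {
        private final Mailer mailer;

        OrderService(Mailer mailer) {
            this.mailer = mailer;
        }

        void placeOrder(String customer, int total) {
            if (total > 0) {
                mailer.send(customer);
            }
        }
    }

    /** Creates stand-ins that remember the name of every method invoked on them. */
    static final class Recorder {
        final List<String> calls = new ArrayList<>();

        <T> T mock(Class<T> type) {
            Object proxy = Proxy.newProxyInstance(
                    type.getClassLoader(),
                    new Class<?>[] {type},
                    (p, method, args) -> {
                        calls.add(method.getName());
                        return null; // good enough for void methods like send
                    });
            return type.cast(proxy);
        }

        long timesCalled(String methodName) {
            return calls.stream().filter(methodName::equals).count();
        }
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        OrderService service = new OrderService(recorder.mock(Mailer.class));
        service.placeOrder("alice", 0); // should not trigger a mail
        service.placeOrder("bob", 5);   // should trigger exactly one
        System.out.println(recorder.timesCalled("send"));
    }
}
```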

Spies

Spies are like mocks, the only difference being that there is a real object under there. This is not the “real” object that CGLib wraps around, it is the one that you create to then set a spy on. Generally, spies are used in cases where you want to track the interactions that your test subject is making. This is nasty in my opinion and should only be used where you have to mock one of the test subject’s own methods (because they are being called from one of its other methods) or when you want to verify that a certain method is called.

Why is it nasty? Well it is because technically after setting a spy on the test subject object, it is no longer the real test subject. In addition to that spying on the test subject makes it easy to accidentally leave a mock on one of the methods of the test subject’s spy without realising and have the test(s) accidentally pass!

Spying generally involves three steps:

  1. Create real object
  2. Create Spy around the real object
  3. Set (or use) the spy
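The three steps can be sketched without a framework as a proxy that records each call and then forwards it to the real object. Greeter, RealGreeter and the spy helper below are hypothetical names for this illustration only; Mockito's own spy works on classes as well as interfaces and is considerably more capable.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class SpySketch {

    /** Hypothetical interface implemented by the real object. */
    interface Greeter {
        String greet(String name);
    }

    /** The real object that ends up underneath the spy. */
    static final class RealGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    /** Wraps a real object: every call is logged, then forwarded to the real thing. */
    static <T> T spy(Class<T> type, T real, List<String> log) {
        Object proxy = Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] {type},
                (p, method, args) -> {
                    log.add(method.getName());
                    try {
                        return method.invoke(real, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // surface the real object's own exception
                    }
                });
        return type.cast(proxy);
    }

    public static void main(String[] args) {
        Greeter real = new RealGreeter();               // 1. create the real object
        List<String> log = new ArrayList<>();
        Greeter spied = spy(Greeter.class, real, log);  // 2. create the spy around it
        System.out.println(spied.greet("Ada"));         // 3. use the spy
        System.out.println(log);
    }
}
```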

Fakes

Faking is not mocking. Fakes are real objects but they have fake values in them. My suggestion here is to fake POJOs and other “simple” classes. The general rule of thumb that I think everybody should follow is: if an external dependency is more work to mock than to fake, then fake it. Good candidates for faking include, but are not limited to:

  • Data structures (Map, Queue, List)
  • POJOs
  • Builder classes

Compared with the three phases of mocking, using a fake involves just two simple steps:

  1. Create the fake with required values
  2. Set the fake
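As a small illustration of those two steps, the sketch below fakes a POJO and a List and hands them to the code under test. OrderLine and totalQuantity are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class FakeSketch {

    /** Hypothetical POJO; cheap to construct, so there is no need to mock it. */
    static final class OrderLine {
        private final String sku;
        private final int quantity;

        OrderLine(String sku, int quantity) {
            this.sku = sku;
            this.quantity = quantity;
        }

        String getSku() {
            return sku;
        }

        int getQuantity() {
            return quantity;
        }
    }

    /** Hypothetical code under test. */
    static int totalQuantity(List<OrderLine> lines) {
        int total = 0;
        for (OrderLine line : lines) {
            total += line.getQuantity();
        }
        return total;
    }

    public static void main(String[] args) {
        // 1. Create the fake with the required values: real objects, made-up data.
        List<OrderLine> fakeLines = new ArrayList<>();
        fakeLines.add(new OrderLine("SKU-1", 2));
        fakeLines.add(new OrderLine("SKU-2", 3));

        // 2. Set the fake, here by passing it straight to the code under test.
        System.out.println(totalQuantity(fakeLines));
    }
}
```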

As always, be mindful of the inheritance hierarchy here to ensure the fake doesn’t become unreliable and make our lives harder than they should be. An unreliable fake is a marker for something that should really be mocked.

Of course, all three classifications above can be used in conjunction with each other. For example, recently I had to spy on the test subject, mock one of its methods to return a mock object which then in turn returns a fake object when one of its methods is called!

Finding biggest files and folders on Linux/Unix systems

I was happily doing some builds on my Jenkins slave server at home and then suddenly boom! it broke. It took all the queued up builds with it because they all started failing. When I looked, it was a disk space issue.

Normally my builds don’t take up much disk space, so I started investigating. I needed to find the files that were occupying the most space on the disk. After some trial and error, the following command came to the rescue:

This was great, but then I wanted to find folders that were the biggest. No problem!

Notice the -type d above which differentiates it from the first command. Also note that the disk usage this command reports for each folder includes its subfolders, so generally the largest entries will be the top-level folders and, as you go up the list (it’s in ascending order by default), you’ll find the subfolders with their sizes. You can always pipe the whole thing through grep to only look for the folders you want, like so:

While this is great and all, sometimes you just want to go up to a certain depth within the file tree. Say no more!

Notice the -maxdepth 2 flag which sets the max depth to 2.

However, you could be one of those people who don’t like the find command at all. Maybe a past feud, or just a dislike. Well, the du command has your back!

The -m flag makes it print file sizes in megabytes, -d 2 sets the max depth to 2 and --all tells it to work with files as well as directories. It’s actually quite comprehensive: with flags like -I to provide a mask for files and directories to ignore (that is the BSD du spelling; GNU du uses --exclude) and -L to follow symbolic links (they are not followed by default), you can get quite a lot out of it. Also, a quick note before ending this post: you can switch the block size from -m for megabytes (example above) to -k for kilobytes or, on BSD du, -g for gigabytes (GNU du uses --block-size=1G instead). You can use -h for human-readable output, where it automatically chooses the closest unit, but this will confuse a plain numeric sort because it doesn’t take the size suffix into account and only compares the numeric values. GNU sort’s -h option does understand those suffixes.

The above commands were tested on a Mac, where they were installed as part of the GNU CoreUtils Homebrew package.

Download Java Cryptography Extension (JCE) jars using curl

Normally, in order to acquire the JCE jars, you have to:

  1. Go to google
  2. Search JCE Jars
  3. Navigate to oracle website from search results
  4. Accept some agreement agreeing to sell your soul to Oracle
  5. Finally download the jars.

While this is great, it doesn’t work well for when you want JCE jars in your docker container. I mean, yeah you can get it by downloading it into a static directory so that it is available to docker during build time but to be honest, that is a bit lame. Lamer than just fetching it from a URL.

Now, I can’t remember the exact source but after much head banging and googling, I found the following command to download Java Cryptography Extension jars on the fly using curl.

At this point in time, I don’t have links to other versions of this. I needed it for Java 8 and I found it for Java 8. Regardless, I can’t imagine why it wouldn’t work if you’d just change the version from 8 to 7 in the above command.

How to delete all entries from Java JKS Keystore

I had to deal with this recently. After much trial and error, here’s the command that you can use to wipe your Java JKS Keystore of all its entries:

Here, the variable KEYSTORE is the path to your Java keystore and the variable KEYSTORE_PASS is the keystore’s password. If you are not comfortable passing the keystore password in plain text on the command line, I’d suggest an alternative version that reads the password from a file or from a named environment variable instead. This keeps the password out of your shell history. You can do this by suffixing the -storepass argument with :file or :env, so that it effectively becomes -storepass:file <path/to/file> or -storepass:env <ENV_NAME_WITHOUT_$>. Here are some examples:

In the above, notice how the ${KEYSTORE_PASS} environment variable has changed to ${KEYSTORE_PASS_FILE}. Use this to provide a path to the file containing your keystore password.

Similar to the previous one, this version has been slightly modified to use the -storepass:env flag with the ${KEYSTORE_PASS_ENV} environment variable instead.

Generating random bytes in Java

I recently needed to generate a bit of randomness in Java in order to produce a secret. Java comes built in with Random and SecureRandom classes which can help you do this properly but as with all things, there are multiple ways of doing things.

Two of these stood out to me.

Generate a long random number as String, convert to hex and then convert it to bytes.
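A minimal sketch of what this first approach could look like is below. It is my reading of the description rather than a definitive implementation: the class and method names are hypothetical, and the choice of SecureRandom as the source of the long is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

class HexSecretSketch {

    /** Random long -> hex string -> bytes of that string (one byte per hex character). */
    static byte[] randomHexBytes(SecureRandom random) {
        String hex = Long.toHexString(random.nextLong());
        return hex.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] secret = randomHexBytes(new SecureRandom());
        // A long has 64 bits, so the hex string is at most 16 characters long.
        System.out.println(secret.length);
    }
}
```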

You could potentially improve this method by not just converting the long number to hex string but also maybe base64 encoding it. You could potentially go further by adding additional entropy to it by filling in random bytes in random indices.

But as it stands above, the benefit of this method is that it is very quick to run. This may vary based on your random seed but I have found the above method to consistently produce byte arrays of size 14-16. This might be good enough in most cases, especially because it’s fast, but in times where you might need very high amounts of entropy, the second approach might suit you.

My personal discontent with this method is that the bytes it produces are chunked by hex character, because they come from a hex string. This, arguably, is not very random. Arguably, because although the individual hex characters are themselves in random order, each generated byte is just the encoding of one hex character, so it can only ever take one of 16 values rather than the full 256. The second approach resolves this problem in a simpler way.

Another issue I have with this approach is that the total size is limited to the maximum value a Long can support (2^63 - 1). The size of Long data structure limits the length of hex string that gets generated which in turn limits total number of bytes produced.

The approach below resolves most of these issues.

Generate an array of random size composed of bytes and then let SecureRandom fill in those bytes.

Here, we’re creating a byte array of random size and then using SecureRandom to fill that byte array with random bytes. It’s simple and elegant. However, as with all things, this has pros and cons of its own.

In my experiments, I have found the range of the byte array to be truly random. In a test that I ran, once it created a byte array of size 8890 while in another time, it created an array so large, I lost my patience and had to quit the process.

This poses a problem where your program could take a long time to generate your secret. Furthermore, it could potentially even run out of heap memory (arrays live on the heap, so you would see an OutOfMemoryError) if the array becomes too large.

You can resolve this issue by setting a bound on the random.nextInt call which is being used to determine the size of the byte array. I cannot tell you what size here is most optimal because it really depends on the capabilities of the processor you’re running on, your heap size as well as the algorithm you’re using to initialise SecureRandom. An implementation limiting the size of the byte array may look like the following:

Here the size of the byte array will be between 0 and 20.

Also, please note that without the Math.abs the value for byteArraySize could be negative. If you initialise your array with a negative number for size, you will get NegativeArraySizeException.
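One way to sidestep the sign problem altogether is Random.nextInt(int bound), which only ever returns values from 0 (inclusive) up to the bound (exclusive). The sketch below takes that route; the class name, the method name and the limit of 20 are assumptions chosen to mirror the text above, not a recommendation. It is also worth knowing that Math.abs(Integer.MIN_VALUE) is itself negative, so Math.abs alone is not a complete guard.

```java
import java.security.SecureRandom;

class BoundedSecretSketch {

    /** Largest array size we are willing to allocate (illustrative value only). */
    static final int MAX_SIZE = 20;

    static byte[] randomBytes(SecureRandom random) {
        // nextInt(bound) returns 0 <= n < bound, so the size is 0..MAX_SIZE and never negative.
        int size = random.nextInt(MAX_SIZE + 1);
        byte[] bytes = new byte[size];
        random.nextBytes(bytes); // fills the whole array with random bytes
        return bytes;
    }

    public static void main(String[] args) {
        byte[] secret = randomBytes(new SecureRandom());
        System.out.println(secret.length);
    }
}
```

Since a size of 0 is possible here, a real secret generator would probably also want a lower bound.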