Finding biggest files and folders on Linux/Unix systems

I was happily doing some builds on my Jenkins slave server at home and then suddenly boom! it broke. It took all the queued up builds with it because they all started failing. When I looked, it was a disk space issue.

Normally my builds don’t take up much disk space, so I started investigating. I needed to find the files that were taking up the most space on the disk. After some trial and error, the following command came to the rescue:
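A minimal sketch of such a command, wrapped in a small function so the directory and the number of results can be changed. The function name `largest_files` and its defaults are made up for illustration, and it assumes `find`, `du`, `sort` and `tail` are available:

```shell
# largest_files [DIR] [COUNT]  (hypothetical helper, not a standard tool)
# Prints the COUNT largest regular files under DIR, smallest first,
# with their disk usage in kilobytes.
largest_files() {
  dir="${1:-.}"     # directory to scan, defaults to the current one
  count="${2:-10}"  # number of results to keep, defaults to 10
  # -type f restricts find to regular files, du -k reports each in KB,
  # sort -n orders numerically (ascending), tail keeps the biggest ones.
  find "$dir" -type f -exec du -k {} + | sort -n | tail -n "$count"
}
```

Running `largest_files . 20` would then list the 20 largest files below the current directory, with the biggest on the last line.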

This was great, but then I wanted to find folders that were the biggest. No problem!
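A sketch along the same lines for folders. Again, `largest_dirs` is a hypothetical name, and the choice of `du -sk` per directory is an assumption about how the totals are produced:

```shell
# largest_dirs [DIR]  (hypothetical helper)
# Prints every directory under DIR with its total size in kilobytes,
# sorted in ascending order.
largest_dirs() {
  dir="${1:-.}"  # directory to scan, defaults to the current one
  # -type d restricts find to directories. du -s prints one total per
  # directory, including everything below it. Running du once per
  # directory (\;) keeps each total independent of the others.
  find "$dir" -type d -exec du -sk {} \; | sort -n
}
```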

Notice the -type d above, which differentiates it from the first command. Also note that the disk usage reported for each folder includes its subfolders, so the largest entries will generally be the top-level folders. Because the output is sorted in ascending order by default, those sit at the bottom of the list, and the subfolders with their sizes appear as you go up. You can always pipe the whole thing through grep to show only the folders you care about, like so:
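For instance, a sketch with the filter added. The function name and its arguments are illustrative and not taken from the original command:

```shell
# largest_dirs_matching DIR PATTERN  (hypothetical helper)
# Same directory listing as before, but only lines matching PATTERN survive.
largest_dirs_matching() {
  dir="$1"      # directory to scan
  pattern="$2"  # grep pattern applied to each "size<TAB>path" line
  # "--" stops grep from treating a pattern that starts with "-" as an option.
  find "$dir" -type d -exec du -sk {} \; | sort -n | grep -- "$pattern"
}
```

Something like `largest_dirs_matching . workspace` would keep only the folders with "workspace" somewhere in their path.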

While this is great and all, sometimes you just want to go up to a certain depth within the file tree. Say no more!
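A sketch of the depth-limited variant, assuming a `find` that supports -maxdepth (GNU and BSD find both do). The helper name is hypothetical:

```shell
# largest_dirs_to_depth [DIR] [DEPTH]  (hypothetical helper)
# Lists directory sizes in KB, descending at most DEPTH levels below DIR.
largest_dirs_to_depth() {
  dir="${1:-.}"    # directory to scan, defaults to the current one
  depth="${2:-2}"  # how deep to descend, defaults to 2
  # -maxdepth is placed before -type because find treats it as a
  # global option and GNU find warns when it follows other tests.
  find "$dir" -maxdepth "$depth" -type d -exec du -sk {} \; | sort -n
}
```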

Notice the -maxdepth 2 flag which sets the max depth to 2.

However, you could be one of those people who don’t like the find command at all. Maybe a past feud, or just a dislike. Well, the du command has your back!
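A minimal sketch of the du version, assuming GNU du (the long --all option is GNU-specific). The wrapper name `du_report` is made up for this example:

```shell
# du_report [DIR]  (hypothetical helper, assumes GNU du)
# Lists files and directories up to two levels deep with sizes in MB,
# sorted in ascending order.
du_report() {
  dir="${1:-.}"  # directory to scan, defaults to the current one
  # -m: sizes in megabytes, -d 2: descend at most two levels,
  # --all: report files as well as directories.
  du -m -d 2 --all "$dir" | sort -n
}
```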

The -m flag makes it print sizes in megabytes, -d 2 sets the max depth to 2, and --all tells it to report files as well as directories. It’s actually quite comprehensive: with a few more flags you can get a lot out of it, such as -L to follow symbolic links (they are not followed by default) and a pattern of files and directories to ignore (--exclude=PATTERN in GNU du, -I mask in BSD du).

Also, a quick note before ending this post: you can switch the block size from -m for megabytes (as in the example above) to -k for kilobytes, or to gigabytes with --block-size=1G in GNU du (-g in BSD du). You can use -h for human-readable output, where it automatically chooses the closest unit, but this will confuse a plain numeric sort because sort -n ignores the unit suffix and compares only the numbers. GNU sort’s -h option understands these suffixes, so du -h piped into sort -h orders correctly.

The above commands were tested on a Mac, where the tools were installed as part of the GNU coreutils Homebrew package.

Download Java Cryptography Extension (JCE) jars using curl

Normally, in order to acquire the JCE jars, you have to:

  1. Go to Google
  2. Search for JCE jars
  3. Navigate to the Oracle website from the search results
  4. Accept some agreement agreeing to sell your soul to Oracle
  5. Finally download the jars.

While this is great, it doesn’t work well when you want the JCE jars in your Docker container. I mean, yeah, you can download them into a static directory so that they are available to Docker at build time but, to be honest, that is a bit lame. Lamer than just fetching them from a URL.

Now, I can’t remember the exact source but after much head banging and googling, I found the following command to download Java Cryptography Extension jars on the fly using curl.

At this point in time, I don’t have links to other versions of this. I needed it for Java 8 and I found it for Java 8. Regardless, I can’t imagine why it wouldn’t work if you’d just change the version from 8 to 7 in the above command.

How to delete all entries from Java JKS Keystore

I had to deal with this recently. After much trial and error, here’s the command that you can use to wipe your Java JKS Keystore of all its entries:
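A sketch of how such a command can be put together, not necessarily the exact one used here. It assumes `keytool -list -v` prints one `Alias name: ...` line per entry (the English-locale output), and the function names `extract_aliases` and `wipe_keystore` are hypothetical:

```shell
# Reads `keytool -list -v` output on stdin and prints one alias per line.
# Assumes the English "Alias name: " prefix.
extract_aliases() {
  sed -n 's/^Alias name: //p'
}

# Deletes every entry from the keystore at ${KEYSTORE},
# using the password held in ${KEYSTORE_PASS}.
wipe_keystore() {
  keytool -list -v -keystore "${KEYSTORE}" -storepass "${KEYSTORE_PASS}" \
    | extract_aliases \
    | while IFS= read -r alias; do
        # </dev/null stops keytool from consuming the alias list on stdin.
        keytool -delete -alias "$alias" \
          -keystore "${KEYSTORE}" -storepass "${KEYSTORE_PASS}" </dev/null
      done
}
```

With both variables exported, calling `wipe_keystore` removes the entries one alias at a time. Try it on a copy of the keystore first.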

Here, the variable KEYSTORE is the path to your Java keystore and the variable KEYSTORE_PASS is the keystore’s password. If you are not comfortable passing the keystore password in plain text on the command line, I’d suggest an alternative version that reads the password from a file or from an environment variable instead. This keeps the password out of your shell history. You can do this by suffixing the -storepass argument with :file or :env, so that it effectively becomes -storepass:file <path/to/file> or -storepass:env <ENV_NAME_WITHOUT_$>. Here are some examples:
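A sketch of the file-based variant, under the same assumption about the `Alias name:` lines in `keytool -list -v` output. The function name is hypothetical:

```shell
# Deletes every entry from the keystore at ${KEYSTORE}, reading the
# password from the file whose path is in ${KEYSTORE_PASS_FILE}.
wipe_keystore_with_pass_file() {
  keytool -list -v -keystore "${KEYSTORE}" \
      -storepass:file "${KEYSTORE_PASS_FILE}" \
    | sed -n 's/^Alias name: //p' \
    | while IFS= read -r alias; do
        # </dev/null stops keytool from consuming the alias list on stdin.
        keytool -delete -alias "$alias" -keystore "${KEYSTORE}" \
          -storepass:file "${KEYSTORE_PASS_FILE}" </dev/null
      done
}
```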

In the above, notice how the ${KEYSTORE_PASS} environment variable has changed to ${KEYSTORE_PASS_FILE}. Use this to provide a path to the file containing your keystore password.
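And a sketch of the environment-variable variant. Here `${KEYSTORE_PASS_ENV}` is assumed to hold the name of the variable that contains the password (without the `$`), not the password itself. The function name is hypothetical:

```shell
# Deletes every entry from the keystore at ${KEYSTORE}. keytool looks up
# the password in the environment variable named by ${KEYSTORE_PASS_ENV}.
wipe_keystore_with_pass_env() {
  keytool -list -v -keystore "${KEYSTORE}" \
      -storepass:env "${KEYSTORE_PASS_ENV}" \
    | sed -n 's/^Alias name: //p' \
    | while IFS= read -r alias; do
        # </dev/null stops keytool from consuming the alias list on stdin.
        keytool -delete -alias "$alias" -keystore "${KEYSTORE}" \
          -storepass:env "${KEYSTORE_PASS_ENV}" </dev/null
      done
}
```

For example, with the password exported as `MY_STORE_PASSWORD` (a made-up name), you would set `KEYSTORE_PASS_ENV=MY_STORE_PASSWORD` before calling the function.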

Similar to the previous one, this has been slightly modified to use the -storepass:env flag with the ${KEYSTORE_PASS_ENV} environment variable instead.

Generating random bytes in Java

I recently needed to generate a bit of randomness in Java in order to produce a secret. Java comes with the built-in Random and SecureRandom classes, which can help you do this properly but, as with all things, there are multiple ways of doing it.

Two of these stood out to me.

Generate a random long, convert it to a hex string and then convert that string to bytes.

You could potentially improve this method by not just converting the long number to hex string but also maybe base64 encoding it. You could potentially go further by adding additional entropy to it by filling in random bytes in random indices.

But as it stands above, the benefit of this method is that it is very quick to run. This may vary based on your random seed, but I have found the above method to consistently produce byte arrays of size 14-16. This might be good enough in most cases, especially since it’s fast, but when you need very high amounts of entropy, the second approach might suit you better.

My personal discontent with this method is that the bytes it produces are chunked by hex character, because they come from a hex string. This, arguably, is not very random. Arguably, because although the individual hex characters are in random order, the bytes generated from them will have identifiable chunks representing each hex character. The second approach resolves this problem in a simpler way.

Another issue I have with this approach is that the total size is limited by the maximum value a Long can hold (2^63 - 1). The size of the Long type limits the length of the hex string that gets generated, which in turn limits the total number of bytes produced.

The approach below resolves most of these issues.

Generate an array of random size and then let SecureRandom fill it with random bytes.

Here, we’re creating a byte array of random size and then using SecureRandom to fill that byte array with random bytes. It’s simple and elegant. However, as with all things, this has pros and cons of its own.

In my experiments, I have found the size of the byte array to be truly random. In one test run it created a byte array of size 8890, while in another it created an array so large that I lost my patience and had to quit the process.

This poses a problem: your program could take a long time to generate your secret. Furthermore, it could even run out of heap memory (an OutOfMemoryError) if the array becomes too large.

You can resolve this issue by passing a bound to the random.nextInt call that determines the size of the byte array. I cannot tell you what size is optimal here because it really depends on the capabilities of the processor you’re running on, your heap size, and the algorithm you’re using to initialise SecureRandom. An implementation limiting the size of the byte array may look like the following:

Here the size of the byte array will be between 0 and 20.

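Also, please note that without the Math.abs the value for byteArraySize could be negative. If you initialise your array with a negative size, you will get a NegativeArraySizeException. Bear in mind that Math.abs(Integer.MIN_VALUE) is itself still negative, whereas random.nextInt(bound) never returns a negative value.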

Making an executable jar using Maven

I was trying something out the other day and wanted to write a really simple application. So I created a simple application backed by Maven.

Now, I could run the jar file that Maven built by passing the main class on the java command line (java -cp path/to/app.jar followed by the fully qualified class name), but all that is too mainstream. I wanted Maven to handle that for me.

After doing some googling, I found a plugin provided by Codehaus. This one allowed me to run the application through Maven. As you can see below, the configuration is quite simple.

Make sure you update the value inside mainClass with the fully qualified name of the class that you want to run. This class must have a public static void main method in it.

Once you are happy with the configuration, you can run your application by executing the following in your terminal:

While this is great, I cannot run the jar on its own on a server somewhere. If I wanted to, I’d have to get the source code along with Maven and then run it using the above command. That’s sub-optimal. So I went googling again.

Finally, I found this wonderful plugin provided by our friends at Apache: the maven-jar-plugin. It is a standard jar plugin, but one of its features is the ability to specify a mainClass attribute, just like the Codehaus plugin. Unlike the Codehaus plugin, though, it lets me run the jar as a standalone application without needing to pass in any other flags or parameters indicating the main class.

Here’s the Maven build plugin configuration to use the maven-jar-plugin.

As is standard with Maven, package up your application using:

If the build was successful, just run your jar file using the standard java -jar path/to/app.jar command. In my case, I ran: