Bulk edit filenames using shell

Unix/Linux command to remove specific matching text from all filenames. In this case, I had a list of files that had a -defaults.properties suffix. I wanted to rename them all so that they had just a .properties suffix instead. For example, if a file was named connection-defaults.properties, I wanted it to be renamed to connection.properties.

As usual, rather than wasting my time doing it manually 100 times, I set out to find an automated way of doing it. And sure enough, I found it!

Here it is:
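A minimal sketch of such a command, assuming a `find`, `sed` and `mv` pipeline as described in this post. The scratch directory and the sample file names are assumptions added so the example is safe to run; on real files, point `find` at your own directory instead.

```shell
# Work in a scratch directory so the demo cannot touch real files (assumption).
demo_dir=$(mktemp -d)
touch "$demo_dir/connection-defaults.properties" "$demo_dir/cache-defaults.properties"

# Find every file ending in -defaults.properties, compute the new name by
# stripping "-defaults" with sed, then rename the file with mv.
find "$demo_dir" -type f -name '*-defaults.properties' | while IFS= read -r file; do
  new_name=$(printf '%s\n' "$file" | sed 's/-defaults\.properties$/.properties/')
  mv "$file" "$new_name"
done

ls "$demo_dir"
```

Anchoring the sed pattern at the end of the name keeps it from touching a directory that happens to contain "-defaults" in its path.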

The above command finds all files that end with -defaults.properties. For each file that matches that criterion, we then rename it to a name that has -defaults stripped out. To strip it out of the name we use the sed command, and to rename we use the mv command.

Tested on a Mac running oh-my-zsh.

Documenting project information with Maven

In most projects I’ve worked on, the project information is captured in some sort of README.md file. While this has its strengths, such as the ability to write free-form text and embed images, it doesn’t quite travel with the project: when the artifact is produced at build time, the README is not checked into the artifact repository along with it. So how do I know who the contributors were for that specific artifact version? How do I know where the project was hosted at the time?

Well, today I learned that Maven provides the ability to capture some of those attributes, and more! I like this because it means that when the artifact gets checked in (along with its pom.xml), the details about the project are captured for that version, frozen in time.

The great thing about these attributes is that when you run mvn site, they are used in the project information HTML page that Maven produces. You can then upload this to any static hosting location, all versioned up and ready to be explored.

OK, so let’s begin exploring these attributes. The first three attributes are the basic ones:
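As an illustration, assuming the three basic attributes meant here are name, description and url (all standard children of the project element), a pom.xml fragment might look like this. Every value is a hypothetical placeholder.

```xml
<!-- Fragment of pom.xml, placed directly under <project>. Values are hypothetical. -->
<name>Example Project</name>
<description>A short summary of what the project does.</description>
<url>https://example.com/example-project</url>
```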

For people working in corporations and companies, the following might be useful, especially when open sourcing the project:
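A sketch of what this could look like, assuming the element in question is organization, which the POM supports for recording the owning company. The company name and URL are hypothetical.

```xml
<!-- Fragment of pom.xml, placed directly under <project>. Values are hypothetical. -->
<organization>
  <name>Example Corp</name>
  <url>https://example.com</url>
</organization>
```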

Additionally, the licenses tag can be used to describe some licensing information. How many times have you come across a project that has changed the licensing information half way through versions? This is a life saver (at least legally)!
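For example, a licenses block could look like the sketch below. The Apache license is only a sample choice here, not necessarily what any given project uses.

```xml
<!-- Fragment of pom.xml, placed directly under <project>. The license shown is a sample. -->
<licenses>
  <license>
    <name>Apache License, Version 2.0</name>
    <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
    <distribution>repo</distribution>
  </license>
</licenses>
```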

Some source control information is also useful; however, what’s more useful is knowing where to go in order to raise issues. In most companies, people use source control to store code but then use a separate tool like Jira or Trello to manage issues. The scm and issueManagement tags are useful here in clarifying such information:
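A minimal sketch of the two tags together. The repository and tracker URLs are hypothetical placeholders.

```xml
<!-- Fragment of pom.xml, placed directly under <project>. URLs are hypothetical. -->
<scm>
  <connection>scm:git:https://example.com/example/example-project.git</connection>
  <developerConnection>scm:git:ssh://git@example.com/example/example-project.git</developerConnection>
  <url>https://example.com/example/example-project</url>
</scm>
<issueManagement>
  <system>Jira</system>
  <url>https://jira.example.com/browse/EXAMPLE</url>
</issueManagement>
```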

Developer information on projects is very useful. Companies don’t usually like this because, well, developers are supposed to be expendable, however, in my humble opinion, it is useful to list developers who were working on the project at the time. This way the versions can be tracked, at least to the lead working on it at the time. Use it thusly:
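A sketch of a developers block. The person, email and role are made up for illustration.

```xml
<!-- Fragment of pom.xml, placed directly under <project>. The developer is hypothetical. -->
<developers>
  <developer>
    <id>jdoe</id>
    <name>Jane Doe</name>
    <email>jane.doe@example.com</email>
    <roles>
      <role>Lead Developer</role>
    </roles>
    <timezone>Europe/London</timezone>
  </developer>
</developers>
```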

For contributors, people who have worked on the project, although not necessarily part of it exclusively, use:
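A sketch of a contributors block, which takes the same fields as a developer entry except for id. The person is again hypothetical.

```xml
<!-- Fragment of pom.xml, placed directly under <project>. The contributor is hypothetical. -->
<contributors>
  <contributor>
    <name>John Smith</name>
    <email>john.smith@example.com</email>
    <roles>
      <role>Documentation</role>
    </roles>
  </contributor>
</contributors>
```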

Lastly, it is useful to capture some information about the pipeline that was used to build the project. For this, Maven has the ciManagement tag:

The above is just a small example of what you can do with these attributes; there are many more sub-attributes that you can explore. I hope this is a good introduction and kicks off some ideas of what you can do with this information.
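A minimal sketch of the tag. Jenkins is used only because it appears elsewhere on this blog, and the URL is hypothetical.

```xml
<!-- Fragment of pom.xml, placed directly under <project>. URL is hypothetical. -->
<ciManagement>
  <system>Jenkins</system>
  <url>https://jenkins.example.com/job/example-project/</url>
</ciManagement>
```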

Finding classes in jar/war files

Recently I was out looking for classes that were present in my Web Application Archive (WAR) file. Why? Well, I was out combating Jar Hell. As I am allergic to doing things manually, here’s a small command I constructed to help me find classes within jar files:
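A sketch of such a command, assuming `unzip` is installed and that the WAR has already been unpacked so its jars are visible on disk. The function name `find_class_in_jars` and the sample pattern are hypothetical.

```shell
# Print "jar: class-entry" for every .class entry matching a pattern
# in any jar found under a directory.
# Usage: find_class_in_jars <dir> <pattern>
find_class_in_jars() {
  find "$1" -type f -name '*.jar' | while IFS= read -r jar; do
    # unzip -l lists the archive entries; the entry path is the last column.
    unzip -l "$jar" |
      awk -v jar="$jar" -v pat="$2" '$NF ~ /\.class$/ && $NF ~ pat { print jar ": " $NF }'
  done
}

# Example: find_class_in_jars . 'StringUtils'
```

If the same class shows up under more than one jar, you have likely found a duplicate that is worth untangling.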

The above command prints out class file as well as the name of the jar file that class is present in. If you’re looking just for the path where the classes are present, you can just run:
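A sketch of that shorter variant, under the same assumptions (`unzip` installed, jars available on disk). The function name `find_jars_with_class` is hypothetical.

```shell
# Print only the path of each jar that contains an entry matching the pattern.
# Usage: find_jars_with_class <dir> <pattern>
find_jars_with_class() {
  find "$1" -type f -name '*.jar' | while IFS= read -r jar; do
    # grep -q stops at the first match and prints nothing itself.
    if unzip -l "$jar" | grep -q "$2"; then
      printf '%s\n' "$jar"
    fi
  done
}

# Example: find_jars_with_class . 'org/apache/commons/lang3/StringUtils'
```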


Update all pom versions at once using maven versions

Most of the projects I work on are multi-module projects, so updating the versions of pom files manually is a bit of a pain. As always, here’s a command you can run that will update all pom versions in one go:

Make sure all your modules are discoverable. You can do this by enabling all your profiles, in case some of your sub-modules are not visible from the main pom.
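A sketch based on the `versions:set` goal of the Versions Maven Plugin, wrapped in a small helper. The function name `set_all_pom_versions`, the version number and the profile names are hypothetical.

```shell
# Set the version of the root pom and all of its modules in one go.
# Usage: set_all_pom_versions <new-version> [comma-separated-profiles]
set_all_pom_versions() {
  if [ -n "${2:-}" ]; then
    # -P activates the listed profiles so modules declared inside them are included.
    mvn versions:set -DnewVersion="$1" -DgenerateBackupPoms=false -P "$2"
  else
    mvn versions:set -DnewVersion="$1" -DgenerateBackupPoms=false
  fi
}

# Example: set_all_pom_versions 1.2.0-SNAPSHOT
# Example: set_all_pom_versions 1.2.0-SNAPSHOT integration,docs
```

With `-DgenerateBackupPoms=false` no pom.xml.versionsBackup files are written. Leave it out if you would rather keep the backups and finish with `mvn versions:commit` or undo with `mvn versions:revert`.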

Source: https://stackoverflow.com/a/5726412

Finding biggest files and folders on Linux/Unix systems

I was happily doing some builds on my Jenkins slave server at home and then suddenly boom! it broke. It took all the queued up builds with it because they all started failing. When I looked, it was a disk space issue.

Normally my builds don’t take up much disk space, so I started investigating. I needed to find the files that were taking up the most space on the disk. After some trial and error, the following command came to the rescue:
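A sketch of such a command, assuming a `find`, `du` and `sort` pipeline. The function name `biggest_files` and the default of 20 results are hypothetical choices.

```shell
# List the largest files under a directory, sizes in kilobytes, ascending.
# Usage: biggest_files <dir> [count]
biggest_files() {
  # du -k prints "<size-in-KB> <path>" per file; sort -n orders numerically,
  # so the largest files end up at the bottom and tail keeps the last N.
  find "$1" -type f -exec du -k {} + | sort -n | tail -n "${2:-20}"
}

# Example: biggest_files /var/lib/jenkins 10
```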

This was great, but then I wanted to find folders that were the biggest. No problem!
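A sketch of the folder variant, again assuming a `find`, `du` and `sort` pipeline. The function name `biggest_dirs` is hypothetical.

```shell
# List the largest directories under a directory, sizes in kilobytes, ascending.
# Usage: biggest_dirs <dir> [count]
biggest_dirs() {
  # -type d selects directories only; du -s prints one total per directory.
  # du runs once per directory (\;) so each directory gets its own total.
  find "$1" -type d -exec du -sk {} \; | sort -n | tail -n "${2:-20}"
}

# Example: biggest_dirs /var/lib/jenkins 10
```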

Notice the -type d above, which differentiates it from the first command. Also note that the size reported for each folder includes its subfolders, so the top-level folders will generally be the largest. Because the list is sorted in ascending order by default, those appear at the bottom, with the smaller subfolders above them. You can always pipe the whole thing through grep to show only the folders you want, like so:

While this is great and all, sometimes you just want to go up to a certain depth within the file tree. Say no more!
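A sketch of the filtered version. The function name `dirs_matching` and the `workspace` pattern are hypothetical.

```shell
# List directory sizes (KB, ascending), keeping only paths that match a pattern.
# Usage: dirs_matching <dir> <pattern>
dirs_matching() {
  find "$1" -type d -exec du -sk {} \; | sort -n | grep "$2"
}

# Example: dirs_matching /var/lib/jenkins 'workspace'
```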
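A sketch of the depth-limited version. The function name `biggest_dirs_to_depth` is hypothetical.

```shell
# List directory sizes (KB, ascending) down to a maximum depth.
# Usage: biggest_dirs_to_depth <dir> [depth]
biggest_dirs_to_depth() {
  # -maxdepth goes before -type so find applies it as a global option.
  find "$1" -maxdepth "${2:-2}" -type d -exec du -sk {} \; | sort -n
}

# Example: biggest_dirs_to_depth /var/lib/jenkins 2
```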

Notice the -maxdepth 2 flag which sets the max depth to 2.

However, you could be one of those people who don’t like the find command at all. Maybe a past feud, or just a dislike. Well, the du command has your back!
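A sketch of the `du`-only approach. The function name `du_tree` is hypothetical, and the short flag `-a` is used here as the equivalent of GNU's `--all` so the example also runs with BSD `du`.

```shell
# List sizes in megabytes for files and directories, down to depth 2, ascending.
# Usage: du_tree <dir>
du_tree() {
  # -m: megabyte blocks, -d 2: maximum depth of 2, -a: include files too.
  du -m -d 2 -a "$1" | sort -n
}

# Example: du_tree /var/lib/jenkins
```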

The -m flag makes it print sizes in megabytes, -d 2 sets the max depth to 2, and --all tells it to report files as well as directories. It’s actually quite comprehensive: with a few more flags you can get a lot out of it, such as -L to follow symbolic links (they are not followed by default) and a mask for files and directories to ignore (-I in BSD du, --exclude=PATTERN in GNU du). Also, a quick note before ending this post: you can switch the block size from -m for megabytes (example above) to -k for kilobytes or, for gigabytes, -g in BSD du or --block-size=1G in GNU du. You can use -h for human-readable output, where it automatically picks the closest unit, but this will confuse a plain numeric sort, which ignores the unit suffix and compares only the numbers. GNU sort’s -h option sorts such values correctly.

The above commands have been tested on a Mac, where they were installed as part of the GNU coreutils Homebrew package.