Useful unix file editing commands

Find and replace text matching a regular expression in a single file

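The above command searches for text matching the regular expression (something\-[\da-zA-Z]+) and replaces each match with ToReplace. The parentheses form a capture group; they are not strictly needed here, because the replacement never refers back to the group and the whole match is replaced either way. The g at the end applies the substitution to every match on a line rather than only the first, and sed processes every line of the file supplied as the path/to/file.txt argument. The -i parameter followed by '' tells sed to edit the file in place without keeping a backup copy; this is the BSD/macOS form, whereas GNU sed takes -i with no separate argument. Note that sed does not treat \d as a digit class, so [0-9a-zA-Z] is the safer spelling.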

Find and replace text matching a regular expression in files matching name

The above command finds files within /base/directory whose names match the *.txt pattern. We then pipe that output to xargs, which appends each line of output (the path to a matching file) to the sed command that follows it. The -n 1 argument tells xargs to pass one path per invocation, so sed runs once for each file.

Find and replace text matching a regular expression in files whose contents match a regular expression

In the above command we use the basic find command to get a list of files that we want to search in, and then use grep to search the contents of those files. Strictly speaking, the -r flag for grep is unnecessary here, because find already hands grep individual file paths; -r only matters when grep is given a directory to descend into. The -l flag makes grep print only the paths of the files whose contents match TextToFindInFiles, rather than the matching lines it usually prints. That list of files is then piped to xargs, which runs the sed command on each of them.


Documenting project information with Maven

In most projects I’ve worked on, the project information is captured in some sort of README file. While this has its strengths, such as the ability to write free-form text and embed images, it doesn’t quite fit with the project, because when the artifact is produced at build time, the README is not checked into the artifact repository along with it. So how do I know who the contributors were for that specific artifact version? How do I know where the project was hosted at the time?

Well, today I learned that Maven provides the ability to capture some of those attributes and some more! I like this because it means that when the artifact gets checked in (along with its pom.xml) the details about the project are captured for that version, frozen in time.

The great thing about these attributes is that when you run mvn site they are used in the project information HTML page that Maven produces. Technically, you can then upload this to any static hosting location, all versioned up and ready to be explored.

OK, so let’s begin exploring these attributes. The first three attributes are the basic ones:

For people working in corporations and companies, the following might be useful, especially when open sourcing the project:

Additionally, the licenses tag can be used to describe some licensing information. How many times have you come across a project that has changed its licensing halfway through its versions? This is a life saver (at least legally)!

Some source control information is also useful; however, what’s more useful is knowing where to go in order to raise issues. In most companies, people use source control to store code but then use an alternative mechanism like Jira or Trello to manage issues. The scm and issueManagement tags are useful here in clarifying such information:

Developer information on projects is very useful. Companies don’t usually like this because, well, developers are supposed to be expendable, however, in my humble opinion, it is useful to list developers who were working on the project at the time. This way the versions can be tracked, at least to the lead working on it at the time. Use it thusly:

For contributors, people who have worked on the project, although not necessarily part of it exclusively, use:

Lastly, it is useful to capture some information about the pipeline that was used to build the project. For this, Maven has the ciManagement tag:

The above is just a small example of what you can do with these attributes; there are many more sub-attributes that you can explore. I hope this is a good introduction and kicks off some ideas of what you can do with this information.

Finding classes in jar/war files

Recently I was out looking for classes that were present in my Web Application Archive (WAR) file. Why? Well, I was out combating Jar Hell. As I am allergic to doing things manually, here’s a small command I constructed to help me find classes within jar files:

The above command prints each class file along with the name of the jar file that the class is present in. If you’re only looking for the paths where the classes are present, you can just run:


Update all pom versions at once using maven versions

Most of the projects I work on are multi-module projects, so updating the versions of pom files manually is a bit of a pain. As always, here’s a command you can run that will update all pom versions in one go:

Make sure all your modules are discoverable. You can do this by enabling all your profiles in case some of your sub-modules are not visible from the main pom.


An introduction to Mocks, Spies and Fakes in unit testing

When we’re unit testing, in addition to knowing the boundary conditions, it is important to know what classification our dependencies fall into, in testing terms. These classifications are:

  • Mocks
  • Spies
  • Fakes


As the name suggests, a mock is a proxy or placeholder object that can be programmed to respond just like a real object would, either under any conditions or only under matched ones. A mock should be used where the real dependency is very complicated to deal with during testing. This could be (but is not limited to):

  • Third party interactions (database/queue/api/files)
  • Classes that do really complicated work that takes a lot of time
  • Things that rely on randomness
  • Time based work

Mocking dependencies is a three-phase process:

  1. Create the mock of the dependency based on the class or interface.
  2. “Arm the mock” by preparing its responses to requests, based on any or matched interactions.
  3. Set the mock on the test subject.

Creating the mock could be the easiest or the hardest of the three phases, because it depends on what you’re mocking and how you’re mocking it. If the test subject relies on a class directly rather than an interface, mocking can become trickier depending on how that class was implemented. This is because, depending on how the framework instantiates it, mocking a class may first require creating a new instance of that class (by calling its constructor and, by extension, the constructors of all its ancestors). If at any point one of the ancestors is doing something funky with one of its dependencies within the constructor, you’ll have to mock that out too. This requires some knowledge of the entire class hierarchy, and this is why the use of interfaces for your dependencies is preferred.

When mocking an interface, a mocking framework can use a JDK dynamic proxy (as opposed to a generated subclass, via CGLib or similar, when mocking a class), which plays much more nicely with everything else. Also, because interfaces cannot be instantiated, there is no constructor chain to follow and the JDK proxy wraps around the interface without any problems (or worries about dependencies). Note, though, that I have yet to personally test how this works with newer Java versions, which allow interfaces to carry implementation code (default methods since Java 8, private methods since Java 9).
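To make the three phases concrete, here is a minimal hand-rolled sketch built directly on java.lang.reflect.Proxy. It is not how Mockito or any other framework is implemented; the ExchangeRates interface, the PriceConverter test subject and the mock helper are all hypothetical names invented for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical dependency: imagine the real one calls a remote service.
interface ExchangeRates {
    double rateFor(String currency);
}

// Hypothetical test subject that only knows about the interface.
class PriceConverter {
    private final ExchangeRates rates;

    PriceConverter(ExchangeRates rates) {
        this.rates = rates;
    }

    double convert(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

class ProxyMockSketch {
    // Phase 1: create a proxy for the interface. No constructor of any
    // real implementation runs, because there is no real implementation.
    @SuppressWarnings("unchecked")
    static <T> T mock(Class<T> type, Map<String, Object> answers) {
        // Every call is answered by looking up the method name.
        InvocationHandler handler =
                (proxy, method, args) -> answers.get(method.getName());
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    static double demo() {
        Map<String, Object> answers = new HashMap<>();
        ExchangeRates rates = mock(ExchangeRates.class, answers);

        // Phase 2: arm the mock with a canned answer for rateFor(...).
        answers.put("rateFor", 2.0);

        // Phase 3: set the mock on the test subject.
        PriceConverter converter = new PriceConverter(rates);
        return converter.convert(10.0, "EUR");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 20.0
    }
}
```

Calling an unarmed method that returns a primitive would fail in this sketch, because the handler would return null; a real framework supplies sensible defaults instead.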

Arming the mock is relatively straightforward and involves three steps:

  • Knowing which part of the class to “arm”: which method are you mocking?
  • Knowing what to return (or do): what should that method return or do?
  • Knowing what to match: under which parameters should this mock work?
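These three questions can be sketched without any framework. The example below is a hand-written stand-in, not Mockito's matcher API: UserDirectory, ArmedUserDirectory and its when method are hypothetical names, and the rule that the first matching entry wins is an assumption of this sketch only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical dependency with a single method to arm.
interface UserDirectory {
    String emailFor(String username);
}

// A hand-written, programmable stand-in for the dependency.
class ArmedUserDirectory implements UserDirectory {
    private static final class Rule {
        final Predicate<String> matcher; // what to match
        final String answer;             // what to return

        Rule(Predicate<String> matcher, String answer) {
            this.matcher = matcher;
            this.answer = answer;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Registers "when the argument matches, return this answer".
    ArmedUserDirectory when(Predicate<String> matcher, String answer) {
        rules.add(new Rule(matcher, answer));
        return this;
    }

    // Which method is armed: emailFor. The first matching rule wins.
    @Override
    public String emailFor(String username) {
        for (Rule rule : rules) {
            if (rule.matcher.test(username)) {
                return rule.answer;
            }
        }
        return null; // nothing armed for this argument
    }
}

class ArmingSketch {
    static UserDirectory armed() {
        return new ArmedUserDirectory()
                .when("alice"::equals, "alice@example.com") // exact match
                .when(name -> true, "unknown@example.com"); // any argument
    }

    public static void main(String[] args) {
        UserDirectory directory = armed();
        System.out.println(directory.emailFor("alice")); // alice@example.com
        System.out.println(directory.emailFor("bob"));   // unknown@example.com
    }
}
```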

In most mocking frameworks, you can mock methods and have them either call the real thing (if it’s a spy; covered in the next point) or have them return a custom value. In addition to this, you can specify the circumstances under which the mock should do what you’ve configured it to do. This is where the matchers come in. I am not going to cover how the matchers work in this post, as that is a whole thing of its own; you can find out more here:

Once you’ve armed the mock, it is time to make it available to the test subject. Ideally, this is done in the least intrusive way possible, so that the test subject isn’t even aware of the mocking. Here are a few ways you can do this, in decreasing order of intrusion:

  • Setter methods with default access modifier
  • Constructor args
  • Inversion of control – field level dependency Injection magic

Obviously it’s great to aim for point three, but it can be hard to introduce DI into a project that has never had it, so my suggestion is to work backwards from there.

When testing, mocks are great at controlling the behaviour of the dependencies that your implementation relies on, but it is just as useful to verify whether a dependency is actually used in certain flows. With Mockito, for example, you can verify whether a particular mocked method was called, how many times it was called, and even that it wasn’t called at all!
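The three options can be sketched side by side. In this sketch the TimeSource interface and ReportService class are hypothetical, and plain reflection stands in for what a DI container would do when it populates a field; no specific DI framework is implied.

```java
import java.lang.reflect.Field;

// Hypothetical dependency: a source of the current time.
interface TimeSource {
    long now();
}

// Hypothetical test subject offering all three injection styles.
class ReportService {
    // A DI container would normally populate this field.
    private TimeSource timeSource = System::currentTimeMillis;

    ReportService() {
    }

    // Option 2: constructor argument.
    ReportService(TimeSource timeSource) {
        this.timeSource = timeSource;
    }

    // Option 1: setter with default (package-private) access,
    // visible to tests in the same package but not to other callers.
    void setTimeSource(TimeSource timeSource) {
        this.timeSource = timeSource;
    }

    String stamp() {
        return "generated at " + timeSource.now();
    }
}

class InjectionSketch {
    static String viaSetter(TimeSource fixed) {
        ReportService service = new ReportService();
        service.setTimeSource(fixed);
        return service.stamp();
    }

    static String viaConstructor(TimeSource fixed) {
        return new ReportService(fixed).stamp();
    }

    // Option 3: field-level injection, imitated here with reflection.
    static String viaField(TimeSource fixed) {
        try {
            ReportService service = new ReportService();
            Field field = ReportService.class.getDeclaredField("timeSource");
            field.setAccessible(true);
            field.set(service, fixed);
            return service.stamp();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TimeSource fixed = () -> 42L;
        System.out.println(viaSetter(fixed));      // generated at 42
        System.out.println(viaConstructor(fixed)); // generated at 42
        System.out.println(viaField(fixed));       // generated at 42
    }
}
```

Only the field-level variant leaves the class without any test-only entry point, which is why it counts as the least intrusive of the three.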
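The idea of verification can be illustrated with a hand-written recording double. This is only a sketch of the concept, not Mockito's verify API; Mailer, RecordingMailer and SignupService are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical dependency that would normally send real email.
interface Mailer {
    void send(String to, String body);
}

// Records every interaction so that a test can inspect it afterwards.
class RecordingMailer implements Mailer {
    private final List<String> recipients = new ArrayList<>();

    @Override
    public void send(String to, String body) {
        recipients.add(to);
    }

    // How many times was send(...) called for this recipient?
    int timesSentTo(String to) {
        return Collections.frequency(recipients, to);
    }
}

// Hypothetical test subject.
class SignupService {
    private final Mailer mailer;

    SignupService(Mailer mailer) {
        this.mailer = mailer;
    }

    void register(String email, boolean wantsNewsletter) {
        mailer.send(email, "Welcome!");
        if (wantsNewsletter) {
            mailer.send(email, "Here is your first newsletter.");
        }
    }
}

class VerificationSketch {
    static RecordingMailer run() {
        RecordingMailer mailer = new RecordingMailer();
        SignupService service = new SignupService(mailer);
        service.register("ann@example.com", true);
        service.register("bob@example.com", false);
        return mailer;
    }

    public static void main(String[] args) {
        RecordingMailer mailer = run();
        System.out.println(mailer.timesSentTo("ann@example.com"));   // 2
        System.out.println(mailer.timesSentTo("bob@example.com"));   // 1
        System.out.println(mailer.timesSentTo("carol@example.com")); // 0
    }
}
```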


Spies are like mocks, the only difference being that there is a real object under there. This is not the “real” object that CGLib wraps around, it is the one that you create to then set a spy on. Generally, spies are used in cases where you want to track the interactions that your test subject is making. This is nasty in my opinion and should only be used where you have to mock one of the test subject’s own methods (because they are being called from one of its other methods) or when you want to verify that a certain method is called.

Why is it nasty? Well, because technically, once you set a spy on the test subject, the object under test is no longer the real test subject. In addition to that, spying on the test subject makes it easy to accidentally leave one of its methods mocked without realising, and have the test(s) accidentally pass!

Spying generally involves three steps:

  1. Create real object
  2. Create Spy around the real object
  3. Set (or use) the spy
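These three steps can be sketched with a JDK proxy that records each call and then forwards it to the real object. This is an illustration of the idea under my own assumptions, not how Mockito builds its spies; Counter, RealCounter and the spy helper are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface and real implementation to spy on.
interface Counter {
    void increment();

    int value();
}

class RealCounter implements Counter {
    private int count;

    @Override
    public void increment() {
        count++;
    }

    @Override
    public int value() {
        return count;
    }
}

class SpySketch {
    // Step 2: wrap the real object. Each call is recorded by method
    // name and then delegated to the real object underneath.
    @SuppressWarnings("unchecked")
    static <T> T spy(Class<T> type, T real, List<String> calls) {
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] {type},
                (proxy, method, args) -> {
                    calls.add(method.getName());
                    try {
                        return method.invoke(real, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow what the real object threw
                    }
                });
    }

    static String demo() {
        // Step 1: create the real object.
        Counter real = new RealCounter();
        List<String> calls = new ArrayList<>();
        Counter spied = spy(Counter.class, real, calls);

        // Step 3: use the spy in place of the real object.
        spied.increment();
        spied.increment();
        int value = spied.value();
        return calls + "=" + value;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [increment, increment, value]=2
    }
}
```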


Faking is not mocking. Fakes are real objects, but they have fake values in them. My suggestion here is to fake POJOs and other “simple” classes. The general rule of thumb that I think everybody should follow is that if an external dependency is more work to mock than to fake, then fake it. Faking should extend to, but not be limited to:

  • Data structures (Map, Queue, List)
  • POJOs
  • Builder classes

Like mocks, using a fake involves two simple steps:

  1. Create the fake with required values
  2. Set the fake
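As a small illustration of those two steps, the sketch below fakes a POJO and a Map with made-up values and hands them to a test subject. Customer, ShippingCalculator and the values used are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical POJO: simple enough to construct directly.
class Customer {
    private final String name;
    private final String country;

    Customer(String name, String country) {
        this.name = name;
        this.country = country;
    }

    String getName() {
        return name;
    }

    String getCountry() {
        return country;
    }
}

// Hypothetical test subject that reads rates from a plain Map.
class ShippingCalculator {
    private final Map<String, Integer> ratesByCountry;

    ShippingCalculator(Map<String, Integer> ratesByCountry) {
        this.ratesByCountry = ratesByCountry;
    }

    int costFor(Customer customer) {
        // Falls back to a flat rate of 99 for unknown countries.
        return ratesByCountry.getOrDefault(customer.getCountry(), 99);
    }
}

class FakeSketch {
    static int demo() {
        // Step 1: create the fakes, real objects holding fake values.
        Customer fakeCustomer = new Customer("Test User", "NL");
        Map<String, Integer> fakeRates = new HashMap<>();
        fakeRates.put("NL", 5);

        // Step 2: set the fakes on the test subject and exercise it.
        ShippingCalculator calculator = new ShippingCalculator(fakeRates);
        return calculator.costFor(fakeCustomer);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 5
    }
}
```

Neither object needs arming or matchers here, which is the point of the rule of thumb: building them is less work than mocking them.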

As always, be mindful of the inheritance hierarchy here to ensure the fake doesn’t become unreliable and make our lives harder than they should be. An unreliable fake is a marker for something that should really be mocked.

Of course, all three classifications above can be used in conjunction with each other. For example, recently I had to spy on the test subject and mock one of its methods to return a mock object, which in turn returns a fake object when one of its methods is called!