Blog

Introduction

I use Vim quite often in my day-to-day activities. I do not use it to write code, but I often find myself logged into a Linux host where I need to edit or update files. Vim is my editor of choice for doing that.

I am not really an expert on Vim, and I tend to use a particular set of commands. Hence I decided to summarize the commands and shortcuts I use most often and keep this blog post as a cheatsheet for whenever I want to refresh my memory.

VIM Command Categories

I tried to group the commands I use into categories like Navigation, Edit, Search and Options.

Navigating in VIM can be done using the arrow keys, which move the cursor wherever the user wants. However, as VIM is optimized for doing everything from the keyboard in the most efficient way, navigation can also be achieved with the keys 'h' (left), 'j' (down), 'k' (up) and 'l' (right). Those keys sit right under the user's fingers, so there is no need to move a hand to reach the arrows.

Another thing I find myself doing very often is jumping to a specific line. Line numbers can be switched on with :set number (see the Options section). Moving directly to a line is as simple as ${lineNumber}G, which can be read as 'go to line ${lineNumber}'.

Additionally, moving to absolute positions like the file's start or end, or to the beginning or end of a line, is also fairly common.

Moving one word at a time, or one whitespace-delimited block at a time (for example a dotted name like foo.bar, which 'w' treats as several words), can be achieved with 'w' and 'W' respectively.

Edit

I have grouped actions like inserting characters, deleting characters, copying and pasting under the Edit category.

Operations like copying and pasting are fairly common when editing files on a Unix host.

Search commands are very useful both in VIM and in less. Fortunately, both tools use the same mechanism for searching.

VIM can highlight all matches of a search term when the option :set hlsearch is set (see the Options section).

Options

VIM is highly customizable. A user can define a profile with the desired properties and VIM behaviour in the .vimrc file. The options I personally use most often are listed below.

Python UnPickling in Java

Sometimes an application needs to interact with data serialized by a different language. This usually happens in the persistence layer. Ideally, the format chosen for persistence should be cross-platform (e.g. protobufs), but in reality the developer sometimes has no control over it, or needs to deal with third-party systems that use a serialized form his/her language of choice does not support.

This article is about such a scenario and presents, in simple steps, how someone can deserialize (unpickle) a Python pickled object in Java [see: Python Pickle].

The main steps someone has to follow are outlined below:

  • Add a dependency on Jython, which will be called from inside Java
  • Create a Java interface, which will act as a proxy between Python/Jython and Java
  • Create a stub class of the Python object that the serialized form represents, and make sure it extends the Java interface defined in step two. This Python class will also have to implement all the methods of that interface
  • From your Java code, use the cPickle Jython API to unpickle the serialized object into your Java interface

The below example demonstrates the steps defined above.

The Python object

Let's assume the Python object that gets pickled is the one below:

Knowing what the Python object looks like makes it a lot easier to define the Jython stub. If the object is not known, the developer will have to reverse engineer the serialized pickle and create his/her stub from that.

If an instance of the above object gets pickled, it will look similar to:

Jython Dependency

If using Maven, add a dependency on Jython.

Java Interface acting as a proxy

The next step is to define a Java interface which acts as a proxy between Jython and Java. You can read more about this here.

The interface we defined is shown below:

Jython stub

We need to create a stub of the original object, which will extend the Java interface we have defined. That stub is the Jython object that will hold the deserialized object after unpickling and before it is returned to Java.

In our case this can look like:

Note here that we import the actual Java interface we have defined, and that the Python object implements that interface.

Be aware

One minor caveat to be aware of at this step is that this Python object needs to be on your Jython's classpath or, even better, in its module directory. This can be achieved in various ways, but the easiest would be one of the following:

  • Add the module directory to the environment variable JYTHONPATH
  • Programmatically from Java, add the directory where the module resides to python.path (e.g. System.setProperty( "python.path", "/pythonModuleDir" ) )

UnPickle the object

The final step is to unpickle and use the object in Java. Sample code to do that is:

Java 9 Process API

In a previous blog post I wrote about one of my favourite features of Java 9, the JShell. In this post I will write about another feature I am excited about: the new Java 9 Process API. I will also present some code showing how powerful and intuitive it is.

The new API adds greater flexibility to spawning, identifying and managing processes. As an example, before Java 9 someone would need to do the following in order to retrieve the PID of a running process:
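The original snippet is not reproduced here. One widely used pre-Java 9 workaround, assumed here purely for illustration, was to parse the name of the runtime MXBean, which on HotSpot JVMs has the form "pid@hostname". That format is an implementation detail rather than something the specification guarantees, and the class and method names below are hypothetical.

```java
import java.lang.management.ManagementFactory;

final class LegacyPidSketch {

    /** Extracts the PID from a "pid@hostname" string (a HotSpot convention, not a spec guarantee). */
    static long pidFromRuntimeName(String runtimeName) {
        int at = runtimeName.indexOf('@');
        if (at <= 0) {
            throw new IllegalStateException("Unexpected runtime name: " + runtimeName);
        }
        return Long.parseLong(runtimeName.substring(0, at));
    }

    /** Returns the PID of the current JVM by parsing the runtime MXBean name. */
    static long currentPid() {
        return pidFromRuntimeName(ManagementFactory.getRuntimeMXBean().getName());
    }

    public static void main(String[] args) {
        System.out.println("PID (pre-Java 9 style): " + currentPid());
    }
}
```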

The above is not intuitive and seems like a hack. One would expect a Java process to at least expose its own PID easily.

Moreover, quite a few times I have needed to spawn new child processes from inside a Java process and manage them. Doing so is very cumbersome. A reference to the child process has to be kept throughout the program's execution if the developer wishes to destroy that process later. Not to mention that getting the PIDs of the child processes is also a pain.

Fortunately, Java 9 fixes those issues and provides a clean API for interacting with processes. More specifically, two new interfaces have been added to the JDK:

  1. java.lang.ProcessHandle
  2. java.lang.ProcessHandle.Info

The two new interfaces add quite a few methods. The first one offers methods for retrieving a PID, for listing all the processes running on the system, and for navigating the relationships between processes. The second one mainly provides meta information about a process.

As one would expect, most of the methods have native, platform-specific implementations. The OpenJDK's implementation of ProcessHandle can be found here. The Unix-specific implementation can be seen here.

I have created a very simple program which makes use of most of the features of this new Process API. The program does the below:

  • Can retrieve the running process' PID
  • Can start a long running process
  • Can start a short running process, which terminates about 5 seconds after starting
  • Can list all child processes that were spawned by the parent one
  • Can kill all child processes that were spawned by the parent one
  • Attaches a callback when a child process exits. This is done using the onExit() method of the ProcessHandle
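The capabilities listed above can be sketched with the standard Java 9 API as follows. This is not the sample class from the post, just a minimal illustration: the class and method names are hypothetical, and the "short running process" is assumed to be the current JVM binary invoked with -version so that the sketch does not depend on a particular operating system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

final class ProcessApiSketch {

    /** PID of the JVM running this code. */
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    /** Starts a short-lived child process: the current java binary with -version. */
    static Process startShortLivedChild() {
        String java = ProcessHandle.current().info().command().orElse("java");
        try {
            return new ProcessBuilder(java, "-version")
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** PIDs of the direct children that are currently alive. */
    static List<Long> childPids() {
        return ProcessHandle.current().children()
                .map(ProcessHandle::pid)
                .collect(Collectors.toList());
    }

    /** Asks every direct child to terminate. */
    static void killAllChildren() {
        ProcessHandle.current().children().forEach(ProcessHandle::destroy);
    }

    public static void main(String[] args) {
        System.out.println("Current PID: " + currentPid());
        Process child = startShortLivedChild();
        System.out.println("Children: " + childPids());
        // onExit() returns a CompletableFuture that completes when the child terminates.
        child.toHandle().onExit()
                .thenAccept(handle -> System.out.println("Child " + handle.pid() + " exited"))
                .orTimeout(30, TimeUnit.SECONDS)
                .join();
        killAllChildren();
    }
}
```

Because onExit() hands back a CompletableFuture, the exit callback composes with the rest of the java.util.concurrent API instead of requiring a dedicated waiting thread.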

The sample class is provided below. For the entire example please see here:

JShell

As of now, the official Java 9 release date is 27.07.2017. According to the OpenJDK mailing list, the push back was due to the most anticipated feature of Java 9: the modularisation of the JDK, commonly known as Project Jigsaw.

I am not as excited about this feature as I am about the brand new JShell. Many people criticise the language's verbosity and the amount of code that is sometimes required to get things done. I do not disagree that this is true in many cases. But what I was mainly missing from Java was the ability to quickly evaluate an expression, an algorithm or a piece of code.

For example, many times I find myself needing to try something quick which involves reading a file or reading something from the web and performing some manipulation on it. Or sometimes I just want to test a lambda expression to see how it behaves. Up to this point, actions like that were a bit cumbersome, as they involved creating a class with a main method and executing that program.

JShell was introduced to solve problems like that and more. Also known as Project Kulla, JShell is an extremely useful tool. It is a REPL (Read Evaluate Print Loop) tool. Similar ones exist in various other languages like Python, Perl and even Scala.

To use JShell, someone needs to download JDK 9. Then all she/he has to do is navigate to the JDK's bin directory and execute the jshell command.

Firstly, JShell itself prompts the user to type /help intro

JShell comes with auto-completion features, so the user can press Tab and see a list of commands depending on the first letter she/he typed:

A list of the available help commands is printed by typing /help.

As in any Java program, classes have to be imported before the user can reference them by their simple names. JShell comes with a pre-defined set of common imports already in place:

A user can import any JDK class, or even her/his own classes by adding them to the classpath:

As you may notice, it is not mandatory to add semicolons at the end of statements. However, they are mandatory inside the body of a class or a method.

Each expression the user writes on the console is evaluated and printed on the standard output. If an expression returns a value, that value is automatically assigned to a variable that the shell creates on the fly. Of course, later on the user can make use of that variable as normal:

The user can define methods outside of classes. Additionally, classes can be defined and referenced as normal:

A very nice feature is the fact that the user does not need any try{}catch{} blocks for methods which declare checked exceptions:

Finally, the user can see the defined methods, types and variables and reset her/his session:

To conclude, I believe JShell will be a nice tool to have. By exploring it, I am pretty sure people will come up with some interesting uses for it.
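The same behaviour can also be observed from Java code. The sketch below is not the console session from the post; it assumes the jdk.jshell API that ships with JDK 9 and later, and the class and method names are hypothetical. It evaluates an expression, reads back the name of the variable the shell created on the fly, and then reuses that variable in a second expression.

```java
import java.util.ArrayList;
import java.util.List;
import jdk.jshell.JShell;
import jdk.jshell.SnippetEvent;
import jdk.jshell.VarSnippet;

final class ScratchVariableSketch {

    /** Evaluates one expression and returns "name = value" for the variable the shell created. */
    static String evalToVariable(JShell shell, String source) {
        List<SnippetEvent> events = shell.eval(source);
        SnippetEvent event = events.get(0);
        // A plain expression is stored in a shell-generated variable.
        VarSnippet variable = (VarSnippet) event.snippet();
        return variable.name() + " = " + event.value();
    }

    /** Evaluates an expression, then a second one that reuses the generated variable. */
    static List<String> demo() {
        List<String> lines = new ArrayList<>();
        try (JShell shell = JShell.create()) {
            lines.add(evalToVariable(shell, "1 + 2;"));
            lines.add(evalToVariable(shell, "$1 * 10;"));
        }
        return lines;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```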

JDK Evolution

I know for a fact that many people (especially in the financial technology industry) are very skeptical when a new version of Java is released. People actually resist updating their Java version (even the JRE version) for many years. There are a lot of places that are still using Java 6! Even though this reluctance is valid in some cases, especially in the early stages of a new release, I personally find it wrong. Admittedly, upgrading the version of Java that a piece of software uses is in most cases not simple or easy. Lots of testing needs to be done to ensure, at the very least, that the application's performance has not degraded. Additionally, more testing is needed when the application is doing something very tailored, like calling native code in-process.

In my opinion, upgrading to the newest version is advisable for many reasons. The one I would like to mention today is the JDK evolution, meaning that in most cases a software developer will get some gains for free without, in principle, doing anything. The code inside the JDK changes slightly between releases. This is done to fix bugs, to improve performance or, even better, to follow hardware trends. CPUs often introduce a new instruction which solves a problem down at the silicon level, meaning faster processing. The Java engineers, and in particular the people involved in the OpenJDK project, have lots of mechanical sympathy.

A good example is the commonly used java.util.concurrent.atomic.AtomicInteger class. There is a huge difference in the implementation of a couple of methods in this class between Java 7 and Java 8. The difference is presented below:

Java 7

    public final int getAndSet(int newValue) {
        for (;;) {
            int current = get();
            if (compareAndSet(current, newValue))
                return current;
        }
    }

Java 8

    public final int getAndSet(int newValue) {
        return unsafe.getAndSetInt(this, valueOffset, newValue);
    }

There is a very important difference. Java 8 delegates to code inside the Unsafe class, whereas Java 7 performs a compare-and-set retry loop. unsafe.getAndSetInt is an intrinsic function, which means the JIT compiler can replace the call with a dedicated atomic CPU instruction where the platform provides one. Java's intrinsic functions can be found here.

This is a very simple but very important reason why someone should consider regularly upgrading his/her Java version. Simple things like that, which are spread across the newer implementations, can have a positive impact on every application.
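To make the difference concrete, here is a minimal sketch that re-implements the Java 7 style retry loop on top of the public AtomicInteger API and runs it next to the built-in getAndSet. The class and method names are hypothetical, and the sketch only shows that both variants have the same observable behaviour; it does not measure performance.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class GetAndSetSketch {

    /** Java 7 style: retry compareAndSet until no other thread interferes. */
    static int getAndSetWithCasLoop(AtomicInteger value, int newValue) {
        for (;;) {
            int current = value.get();
            if (value.compareAndSet(current, newValue)) {
                return current;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(1);
        // Both calls return the previous value and leave the new one in place.
        System.out.println("CAS loop returned: " + getAndSetWithCasLoop(counter, 2));
        System.out.println("getAndSet returned: " + counter.getAndSet(3));
        System.out.println("Final value: " + counter.get());
    }
}
```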

Log4j2 vs Log4j

Log4j2 is the evolution not only of Log4j but also of Logback, as it takes Logback's features one step further. The main selling point is the improved performance, in terms of both message throughput and latency, which is reportedly a huge leap forward compared to Log4j and also Logback.

Other interesting Log4j2 features are:

  • Automatic reloading of logging configurations
  • Property Support: Log4j2 loads the system's properties and they can be evaluated even at the configuration level
  • Java 8 lambdas and lazy evaluation: It provides an API for wrapping a log message inside a lambda expression, which only gets evaluated if truly needed
  • Garbage free: An interesting architectural feature, as Log4j2 produces no garbage, or very little (in the case of web apps). You can read more about that here.
  • Async loggers using the LMAX Disruptor: The Disruptor is a very interesting technology and it is always thought-provoking to examine use cases where it is put under strain
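The lazy evaluation idea from the list above can be illustrated without any extra dependency. The sketch below deliberately uses java.util.logging from the JDK, whose Supplier-based methods follow the same principle; it is not Log4j2's API, and the class, method and logger names are hypothetical. It counts how often an "expensive" message is actually built.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LazyLoggingSketch {

    private static final AtomicInteger EVALUATIONS = new AtomicInteger();

    /** Stands in for a message that is costly to build. */
    static String expensiveMessage() {
        EVALUATIONS.incrementAndGet();
        return "expensive message";
    }

    /** Logs at FINE through a Supplier and reports how many times the message was built. */
    static int evaluationsWhenLoggingAt(Level loggerLevel) {
        Logger logger = Logger.getLogger("lazy-logging-sketch");
        logger.setUseParentHandlers(false); // keep the console quiet
        logger.setLevel(loggerLevel);
        int before = EVALUATIONS.get();
        // The supplier is only invoked if FINE is enabled for this logger.
        logger.fine(LazyLoggingSketch::expensiveMessage);
        return EVALUATIONS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("Built at INFO: " + evaluationsWhenLoggingAt(Level.INFO));
        System.out.println("Built at FINE: " + evaluationsWhenLoggingAt(Level.FINE));
    }
}
```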

I played around with Log4j2 and in general I was very happy with its API, its implementation (it actually separates the API from the implementation, even though that means the developer needs to add two Maven dependencies), its configuration simplicity and finally its performance.

Even though measuring a logger's performance with JMH is not advisable, I tried to compare its performance (using async and sync loggers) against the old Log4j. The performance (average time and throughput) was indeed better, and in the edge cases up to about 15 ops/ms (roughly 15K ops/s) faster. Having said that, you should take this with a pinch of salt because, as mentioned earlier, JMH is not the right tool to measure and compare the performance of the two logging implementations.

For reference, the simple Java program used to perform the various tests can be found on GitHub.

Some indicative results, from three runs for each logger, can be seen below.

Log4j2 Async Logger:
Benchmark                      Mode  Cnt   Score     Error  Units
#1
Log4JBenchmarking.logMessage  thrpt   20  84.875 ±   6.383  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.015 ±   0.001   ms/op
#2
Log4JBenchmarking.logMessage  thrpt   20  87.430 ±   9.362  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.012 ±   0.001   ms/op
#3
Log4JBenchmarking.logMessage  thrpt   20  79.753 ±  13.381  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.013 ±   0.001   ms/op
-----------------------------------------------------------------
Log4j2 Logger:
Benchmark                      Mode  Cnt   Score     Error  Units
#1
Log4JBenchmarking.logMessage  thrpt   20  75.881 ±  10.960  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.012 ±   0.002   ms/op
#2
Log4JBenchmarking.logMessage  thrpt   20  79.698 ±  12.290  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.012 ±   0.002   ms/op
#3
Log4JBenchmarking.logMessage  thrpt   20  87.428 ±   6.678  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.012 ±   0.001   ms/op
-----------------------------------------------------------------
Log4j Logger:
Benchmark                      Mode  Cnt   Score     Error  Units
#1
Log4JBenchmarking.logMessage  thrpt   20  72.490 ±   8.350  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.014 ±   0.002   ms/op
#2
Log4JBenchmarking.logMessage  thrpt   20  84.169 ±   9.227  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.012 ±   0.001   ms/op
#3
Log4JBenchmarking.logMessage  thrpt   20  72.599 ±  10.801  ops/ms
Log4JBenchmarking.logMessage   avgt   20   0.012 ±   0.001   ms/op

Being a polyglot developer

Throughout my professional career I have mainly used object oriented, statically typed programming languages. The likes of Java and C# are ideal for large scale projects. Such languages have a massive active community behind them, developing frameworks and tools, writing articles and responding to questions in online forums. Hence it is really hard to advocate against using them. It is rare to stumble upon a problem that someone else hasn't solved already. It was not until recently that I had to switch to a different paradigm of language: a dynamically typed language which can be used as an object oriented one, but which also has functional abilities.

This blog post is not about advocating which style is better, as I believe there is no silver bullet out there. Every problem is different and can be solved in multiple ways.

However, this blog post is about the benefits of being a polyglot developer, or of using different styles of programming languages. At the beginning, I found myself struggling to get used to the new one. I tended to try to make it work using the approach I already knew, which was not the right mentality. After a couple of weeks, as I got more familiar with the language, its features and its 'mentality', I found myself trying to approach problems from a different perspective. I was combining the various programming styles and nice ideas were coming out of that. Even when I went back to solving problems in the object oriented way, I found that I could apply new techniques which made the solution much more elegant and more fun to program.

I believe that being open to different languages is a great thing for a software engineer. It opens new perspectives, adds new fundamental knowledge and gives agility in how someone approaches a problem. Having said that, I was looking for a new programming language to learn, or at least get a bit familiar with. I have heard really positive things about Haskell, a functional, statically typed language. Even though I know I might not use such a language for a project at my workplace, I believe that I will benefit a lot from at least spending a couple of months playing with it.

To conclude, I would urge everyone to try a programming language different from the one he/she is used to. The benefits are enormous and it is really fun too.

Hibernate Tools - JPA Entity generation

Recently I was reviewing and trying some examples using the Hibernate Tools. More specifically, I was trying their latest version (5.0.0.CR1) in order to generate some JPA entity POJOs out of a database schema.

Hibernate Tools can be used either programmatically through their Java API, or through their pre-defined ANT tasks. The examples below demonstrate the programmatic way and a Mavenized way, which invokes ANT from within Maven.

I used an in-memory HSQL database with two very simple tables: a Users table with an ID and a name, and an Address table with an ID, some fields and a foreign key to the Users table, mimicking a many-to-one dependency.

The code that starts up the HSQL server and creates the tables can be found in GitHub.

As mentioned above, the Hibernate Tools can be invoked programmatically. Initially I found it a bit tricky, as I hadn't realized I needed to invoke the JDBC configuration step before I invoked the POJO generation step. Probably this is needed in order for the tool to read the Hibernate configuration file and identify the database and its schema. The configuration that is needed is actually rather trivial:

  • Set the destination folder
  • Point the tool to the Hibernate configuration file, in order to pick up the database details
  • Invoke the JDBCConfigurationTask in order to identify the database schema
  • Invoke the Hbm2JavaGenerationTask in order to generate the JPA entities out of the above database schema

Sample code that does the above is shown below:

The Java code that is generated for the two database tables is shown below:

The whole process can be made part of a Maven compilation step. This is done using the ANT tasks that are provided. The relevant section of the pom.xml file is shown below. Additionally, using the Maven helper plugin, the generated classes can automatically be added to the project's classpath, bulletproofing the application against future changes to the database schema (and automating the tedious task of re-generating the entities).

The complete example can be found in GitHub.

Java Enum as a class

Recently I was asked a fairly simple question: "Can you extend an enum?". My reaction to that was "Why would you want to do that?". But, on second thought, I realized that I didn't really know the answer. Of course I knew that in Java enums are treated as classes, but I had no clue what they look like inside the JVM, or whether they are made final or not. I could of course try to extend an enum in IntelliJ and see whether the IDE would give me an error or not.

However, a more reliable way is to inspect what the class looks like once it is disassembled from its bytecode. This can be done using the javap utility, which comes along with the JDK. For example, imagine we have the following enum:

Using the javap utility we can disassemble the .class file, which will not give us back the source above.

[bash] javap Weekdays.class [/bash]

The class that the JVM knows is:

Finally we got our answer. Enums are indeed represented as classes inside the JVM and those classes are final, hence we cannot extend them.
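The same conclusion can be cross-checked with reflection instead of javap. This is a minimal sketch: the enum name Weekdays comes from the javap command above, but its constants and the helper class are assumptions made for illustration.

```java
import java.lang.reflect.Modifier;

final class EnumIsFinalSketch {

    /** True if the class is declared final at the bytecode level. */
    static boolean isFinal(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println("Weekdays is final: " + isFinal(Weekdays.class));
        System.out.println("Weekdays extends: " + Weekdays.class.getSuperclass().getName());
    }
}

// Hypothetical constants; the original enum body is not shown in the post.
enum Weekdays { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY }
```

One caveat worth knowing: an enum whose constants declare their own class bodies is not marked final, because each such constant is compiled to a subclass. The language still forbids extending it explicitly.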

Deadlock

This article will present a deadlock and some tools to examine and identify it.

A deadlock happens when two or more threads are each waiting to acquire the monitor of an object that is already locked by one of the other competing threads. Hence the threads will wait forever if there are no detection and prevention strategies.

The following little code snippet simulates the occurrence of a deadlock between two competing threads.

In the above situation, the thread named 'Left-1' acquires the monitor of the object named 'left'. It then sleeps for a couple of seconds and tries to acquire the monitor of the object named 'right', but the 'Right-1' thread has already done so and is in turn waiting for 'left'. The two threads have no back-out logic, hence the program will freeze forever.
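The original snippet is not reproduced here, so the following is only a sketch of the same situation. The names 'left', 'right', 'Left-1' and 'Right-1' come from the description above; everything else is an assumption. In particular, a CountDownLatch replaces the sleep so that the deadlock is deterministic, the threads are daemons so that the JVM can still exit, and ThreadMXBean.findMonitorDeadlockedThreads() is used to confirm the deadlock programmatically.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

final class DeadlockSketch {

    /** Starts two daemon threads that deadlock, then returns how many threads the JVM reports as deadlocked. */
    static int createAndDetectDeadlock() {
        Object left = new Object();
        Object right = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        startDaemon("Left-1", () -> lockInOrder(left, right, bothHoldFirstLock));
        startDaemon("Right-1", () -> lockInOrder(right, left, bothHoldFirstLock));

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] deadlocked = threads.findMonitorDeadlockedThreads();
            if (deadlocked != null) {
                return deadlocked.length;
            }
            pause(50); // give the threads time to block on their second lock
        }
        return 0;
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirstLock) {
        synchronized (first) {
            bothHoldFirstLock.countDown();
            try {
                bothHoldFirstLock.await(); // wait until the other thread holds its first lock too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) { // blocks forever: the other thread owns this monitor
                System.out.println(Thread.currentThread().getName() + " acquired both locks");
            }
        }
    }

    private static void startDaemon(String name, Runnable task) {
        Thread thread = new Thread(task, name);
        thread.setDaemon(true);
        thread.start();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + createAndDetectDeadlock());
    }
}
```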

Detecting a deadlock

Although in the above example the program is trivial and we can immediately understand where and why the deadlock is happening, in a real-world application that might be a bit tricky. The easiest way is to get a thread dump and analyze it.

  • Using an IDE

If you are running the application locally from your IDE, chances are that your IDE already has the ability to do so. I am mainly using IntelliJ. You can find that functionality in the 'Run' window as shown below.

[Image: IntelliJ_dump_threads]

That will dump all the threads to your standard output, along with their stacks and the state they are in.

[bash highlight="4,5,14,15"]
"Right-1" #13 prio=5 os_prio=31 tid=0x00007f8444219800 nid=0x5503 waiting for monitor entry [0x000070000134f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nikoskatsanos.deadlock.Deadlocked.lambda$start$1(Deadlocked.java:44)
    - waiting to lock <0x00000007970c0328> (a java.lang.Object)
    - locked <0x00000007970c05c0> (a java.lang.Object)
    at com.nikoskatsanos.deadlock.Deadlocked$$Lambda$2/1241276575.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"Left-1" #12 daemon prio=5 os_prio=31 tid=0x00007f8443944800 nid=0x530f waiting for monitor entry [0x000070000124c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.nikoskatsanos.deadlock.Deadlocked.lambda$start$0(Deadlocked.java:33)
    - waiting to lock <0x00000007970c05c0> (a java.lang.Object)
    - locked <0x00000007970c0328> (a java.lang.Object)
    at com.nikoskatsanos.deadlock.Deadlocked$$Lambda$1/1022308509.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
[/bash]

The above is part of the thread dump created by IntelliJ, which includes the stacks of our two deadlocked threads. By analyzing the above snippet, we can see that both threads are in the 'BLOCKED' state and both of them are waiting to lock an object. If we look closer, the object that each thread is trying to lock is the one already locked by the other thread. This indication, together with a look at our source code to confirm that those locks will never be released, is enough for us to come to our conclusion.

  • Using a tool

Another way to analyze and detect a deadlock would be to use a more sophisticated tool. There are plenty out there, some of them commercial and some of them shipped with your JDK. Three of the most popular are JConsole, JVisualVM and Java Mission Control.

Those tools are very easy to use and all of them are quite similar. JConsole is probably the simplest. Using it requires launching JConsole and connecting to the process running the application you want to analyze. Once connected, the user can find a tab named 'Threads'. That screen will give the user everything he/she needs. The user can examine the existing threads. The information is actually the same as that produced by IntelliJ above, and we will see the reason further below. But most importantly, the user can notice a 'Detect Deadlock' button at the bottom. Using that button makes it extremely easy to find out whether a deadlock is present in the application. It will look like the screenshot below, which indicates that the two threads on the left hand side are in a deadlock.

[Image: jconsole_deadlock_screen]

  • Using jstack

Finally, in many cases the application might be running on a server and the only way to interact with it is a shell. In such cases the user needs to use the command line utilities provided by the JDK itself, more specifically jstack. All three approaches ultimately read the same thread information from the JVM, which is why their output looks alike.

In order to do that the user needs to find the process' PID. That can be done either with an OS level command or by just using the jps command, which also comes with the JDK. Once the user has the PID, he/she can invoke the jstack command in order to get an output similar to that of the above tools.

[bash] jstack -l ${PID} [/bash]

The full source code for the example can be found in GitHub.