Developing new algorithms: Ivo Filot at Penn State University, part 1

In the summer of 2016, Ivo Filot went to Penn State University (USA) to stay for two months in Adri van Duin’s research group. The goal of his stay was to develop a new algorithm for fitting ReaxFF potentials. In this blog, kept especially for MCEC, Ivo tells you all about his research, his professional as well as his personal findings, and the perks and peculiarities of working halfway across the world.

Week 1: Training set
(Monday July 4th – Sunday July 10th)

I arrive in the US on the 4th of July, a national holiday, but since I’m still at Detroit Airport that day, the party’s lost on me. Instead, when I arrive at Penn State University, I dive straight into my research. Every working week starts with setting a few goals for myself. Initially, for this first week at Penn State, my only goal was to get acquainted with my new environment. I figured I needed some time to get to know the campus, to get through the organizational bureaucracy, and to meet my new temporary colleagues. But in the end, not only did I manage to get myself registered as a foreign researcher, I actually overcame a computational hurdle as well. I’ll tell you about it in a minute.

Materials Research Institute, Penn State

But first things first. Why ReaxFF?

Using ReaxFF potentials can help us develop novel computational routines to describe mesoscale catalytic phenomena. And that’s most certainly what we aim for, since ReaxFF calculations cost only a fraction of the computational time of (standard) DFT calculations. DFT is much more accurate, true, but ReaxFF potentials make it far more feasible to simulate larger length scales or longer time scales.

Before I left, I had already developed a very simple version of the ReaxFF fitting program. After a few meetings, however, it appeared that the algorithms were most likely working, but that the parameters were not being trained for the things we as chemists are interested in. So what was wrong?


From a DFT perspective, it makes sense to construct a training set in which you simply compare the DFT binding energy with the ReaxFF energy for a set of relevant systems. For instance, if you have a Co FCC crystal, you calculate the difference in energy between that particular (bulk) crystal structure and a single Co atom. This number can be directly compared to the ReaxFF energy for a Co FCC crystal, as ReaxFF calculates binding energies instead of electronic energies. The training set is then composed of a large set of systems that are relevant for constructing your potential.
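
Written out for a bulk cell containing N Co atoms, the comparison can be sketched as follows. The symbols are notation chosen here for illustration and do not come from the ReaxFF documentation.

```latex
E_\mathrm{bind}^\mathrm{DFT}
  = E^\mathrm{DFT}(\mathrm{Co}_N,\ \mathrm{FCC\ bulk})
  - N \, E^\mathrm{DFT}(\mathrm{Co\ atom})
\qquad \text{is compared with} \qquad
E^\mathrm{ReaxFF}(\mathrm{Co}_N,\ \mathrm{FCC\ bulk})
```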

What is wrong with this approach is that you are no longer fitting the kind of chemical phenomena you want to use your potential for in the first place. A better approach is thus to construct your training set in such a way that quantities describing chemical phenomena, such as adsorption energies, expansion energies or reaction energies, are calculated and fitted, rather than the underlying binding energies.

As it turned out, my program wasn’t ready to handle this kind of training set, as it could not handle any kind of ‘human-written mathematical expressions’. What I mean by that is the following. Imagine you are using Microsoft Excel and you want to calculate the product of two cells. In yet another cell you would type something like “=A1 * A2”, and the product of cells A1 and A2 would be calculated in that cell. What Excel does under the hood is interpret the formula you have written, connect the numerical values to the variables, and calculate the answer. That kind of functionality was not yet implemented in my program.
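
To make concrete what ‘interpreting a formula’ involves, here is a small hand-rolled C++ sketch of an expression evaluator. It is purely illustrative and is not how Excel or the fitting program works internally. It assumes only `+`, `-`, `*`, `/`, parentheses, numbers and variable names, and the names `ExpressionEvaluator` and `evaluate_expression` are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Minimal recursive-descent evaluator. Variable names such as "A1" are
// looked up in a table of known values; a leading '=' is optional.
class ExpressionEvaluator {
public:
    ExpressionEvaluator(const std::string& text,
                        const std::map<std::string, double>& variables)
        : text_(text), vars_(variables), pos_(0) {}

    double evaluate() {
        accept('=');  // spreadsheet-style prefix, ignored if absent
        const double value = parse_sum();
        skip_spaces();
        if (pos_ != text_.size()) throw std::runtime_error("unexpected input");
        return value;
    }

private:
    void skip_spaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }

    // Consumes c and returns true if it is the next non-space character.
    bool accept(char c) {
        skip_spaces();
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // sum := product (('+' | '-') product)*
    double parse_sum() {
        double value = parse_product();
        for (;;) {
            if (accept('+')) value += parse_product();
            else if (accept('-')) value -= parse_product();
            else return value;
        }
    }

    // product := factor (('*' | '/') factor)*
    double parse_product() {
        double value = parse_factor();
        for (;;) {
            if (accept('*')) value *= parse_factor();
            else if (accept('/')) value /= parse_factor();
            else return value;
        }
    }

    // factor := '-' factor | '(' sum ')' | variable | number
    double parse_factor() {
        if (accept('-')) return -parse_factor();
        if (accept('(')) {
            const double value = parse_sum();
            if (!accept(')')) throw std::runtime_error("missing ')'");
            return value;
        }
        if (pos_ >= text_.size()) throw std::runtime_error("unexpected end");
        const unsigned char c = static_cast<unsigned char>(text_[pos_]);
        if (std::isalpha(c)) {  // variable: connect the name to its value
            const std::size_t start = pos_;
            while (pos_ < text_.size() &&
                   std::isalnum(static_cast<unsigned char>(text_[pos_]))) ++pos_;
            const std::string name = text_.substr(start, pos_ - start);
            const auto it = vars_.find(name);
            if (it == vars_.end())
                throw std::runtime_error("unknown variable: " + name);
            return it->second;
        }
        if (std::isdigit(c) || c == '.') {  // numeric literal
            std::size_t used = 0;
            const double value = std::stod(text_.substr(pos_), &used);
            pos_ += used;
            return value;
        }
        throw std::runtime_error("unexpected character");
    }

    std::string text_;
    std::map<std::string, double> vars_;
    std::size_t pos_;
};

double evaluate_expression(const std::string& text,
                           const std::map<std::string, double>& variables) {
    return ExpressionEvaluator(text, variables).evaluate();
}
```

With a table holding A1 = 3 and A2 = 4, `evaluate_expression("=A1 * A2", table)` returns 12. Even this toy version needs operator precedence, parentheses and error handling, which hints at how much work a complete implementation would be.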

How I implemented this functionality in the end is quite simple: I used an external library. An external library is simply another piece of software from which you use some functions in your own program. It is good practice in programming to check whether such a library exists before you start writing your own code and end up reinventing the wheel. For my programming I had been using C++. As an external library, I chose libLUA. Lua is a programming language in itself, but much simpler than C++. The C++ part of my program now reads in the expression, passes it to libLUA and lets libLUA calculate the answer. The answer is then used further in my program. Problem solved, and no need to whip something up myself.

With that out of the way, I could start the parallelization of my program. I’ll tell you about it in my next blog.