Introduction to NanoLanguage

Introduction

The release of Atomistix ToolKit (ATK) version 2.1 introduced a major advance in the way electronic structure calculations are performed on a computer. By introducing the so-called NanoLanguage, QuantumWise is laying the foundation for a new architecture on which we will be able to build new functionality at a faster pace, with better transparency, and in a much more flexible way. Above all, NanoLanguage will also enable our products to become a platform on which other developers and companies can build applications and extend the functionality.

A key concept of NanoLanguage is that it is not a proprietary QuantumWise product, but rather a seed to what hopefully will evolve into a standard method for handling numerical atomic-scale simulations. The vision is that it will be possible to integrate various codes with one common interface, and that these codes can exchange both input and output data through NanoLanguage.

In this document we will try to explain what this actually means, and which effects it will have for the users of QuantumWise products.

What is NanoLanguage?

Put briefly, NanoLanguage is a new way of thinking about scientific computing, combining the strength of flexible object-oriented scripting interfaces (known from Mathematica and MATLAB) with sophisticated high-performance scientific computing algorithms. The goal is to enable scientists to efficiently extend, specialize and combine methods to calculate nanoscale properties of matter, including density functional theory, semi-empirical tight-binding, classical potentials, k.p and various quantum-chemical methods.

NanoLanguage allows for both high-level and low-level control of the computer simulations. At the high level, it offers a common interface for setting up complex atomic-scale simulations and analyzing the results. At the low level, it provides an interface to the underlying functionality in ATK.

NanoLanguage is built on top of Python, a powerful and well-established interpreted programming language, and thus it includes basic elements such as loops over simulation control parameters, plus support for efficient manipulations of e.g. numerical array data. It is therefore an ideal tool for automating series of simulations where geometric, material, or other parameters are to be optimized.

NanoLanguage will allow scientists to express models of nature in a common language without the need to re-implement already developed algorithms, and it will allow for third-party development of new functionality on top of the ATK platform. Such functionality may consist of new atomic-scale modeling methodologies, tailored semi-empirical methods, or complex post-processing methods for calculating new quantities from the fundamental simulation results.

Why the change?

In ATK 2.0, as in earlier versions of TranSIESTA-C, the input to the program was given as a simple text file, containing a set of well-defined keywords. There were several disadvantages of this old keyword approach:

  • A lot of overhead for handling different file formats.

  • Very limited ability to control the execution path of the program.

  • Difficulties in creating a common interface for different methods.

  • Unsuitable format for assigning properties to objects, such as atoms.

To address these issues, we chose to develop our software such that all functionality is accessed through an object-oriented application programming interface (API). Among several possibilities, the best combination of desired features was provided by Python, a well-known interpreted programming language that is gaining popularity, in particular within scientific communities.

Hence, as of ATK version 2.1 the user instead instructs ATK through a NanoLanguage script, which is nothing but a short (or long) program, written in Python. On the surface, one may regard this as merely a cosmetic difference, just a new format for the input file. In reality, however, the implications of this change are nothing short of a shift in paradigm.

Benefits of NanoLanguage

With a Python interface, the user is immediately offered all kinds of extended functionality out of the box. This list could be made very long, and below we will just highlight some of the most important advantages and features of NanoLanguage. First, however, it is necessary to clearly define where “NanoLanguage” fits into this concept. That is, how is NanoLanguage related to, or different from, Python itself?

From a programming language point of view, ATK is just a Python interpreter. That is, the syntax is the same as in Python, and the entire behavior of the ATK application should be as close as possible to any other Python interpreter.

NanoLanguage further extends Python with concepts and objects relevant for quantum physics and chemistry. That is, in NanoLanguage there is a periodic table containing elements, units such as Rydberg and Angstrom, methods for calculating the density-functional spectrum of a molecule, constructors for creating molecules (as class objects), Bravais lattices, and so on. True to its heritage, NanoLanguage will in its inaugural shape primarily be intended for electronic transport calculations. However, it will quickly be extended with more and more methods, making it applicable in new areas of physics and chemistry.

Among many other things, NanoLanguage offers you:

Control

Using a programming language gives the user control of the program execution path, as opposed to running ATK as a black box. Although this functionality was limited in ATK 2.1, this advantage will be further developed in each release until almost the entire work-flow in the calculation has been opened up and modularized. At that time, users will for example have access to individual steps in the self-consistent mixing algorithm, and are hence given the possibility to e.g. change the way the mixing is performed.

Transparency

By delivering parts of the end-user functionality in the form of open scripts, NanoLanguage offers software transparency, giving the user increased insight into what actually goes on inside the program. As a consequence, the user will also be able to tweak the behavior, or define new functionality within a given framework. Also, no operations are performed unless the user requests them. Obviously this saves time, but it also makes the scripts much more self-documenting. The same principle extends to low-level details: for example, all physical quantities must be defined with a unit. Again, this is related to transparency; nothing is implicit (and hence subject to misunderstandings or oversights).
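The requirement that every physical quantity carries a unit can be illustrated with a minimal plain-Python sketch. The `Energy` class and the `EV_PER_UNIT` table below are hypothetical illustrations and not the unit system shipped with ATK; the conversion factors are rounded.

```python
from dataclasses import dataclass

# Rounded conversion factors to electron volts (illustrative table).
EV_PER_UNIT = {"eV": 1.0, "Rydberg": 13.605693, "Hartree": 27.211386}


@dataclass(frozen=True)
class Energy:
    """Hypothetical energy value that cannot exist without a unit."""
    value: float
    unit: str

    def __post_init__(self):
        # Reject unknown units up front, so nothing is left implicit.
        if self.unit not in EV_PER_UNIT:
            raise ValueError(f"unknown energy unit: {self.unit!r}")

    def to(self, unit):
        """Return the same energy expressed in another unit."""
        if unit not in EV_PER_UNIT:
            raise ValueError(f"unknown energy unit: {unit!r}")
        in_ev = self.value * EV_PER_UNIT[self.unit]
        return Energy(in_ev / EV_PER_UNIT[unit], unit)


mesh_cutoff = Energy(150.0, "Rydberg")
mesh_cutoff_hartree = mesh_cutoff.to("Hartree")
print(mesh_cutoff_hartree)
```

Because the unit is part of the object, a reader of the script never has to guess whether a number such as 150 means Rydberg, Hartree, or electron volts.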

Extendable functionality

Given the native capabilities of Python, users have the chance to extend the functionality of NanoLanguage. On the simplest level, this can be in the form of small functions performing routine tasks, or loops for simplifying automation (calculating the properties of a bulk sample for a range of lattice constants, for instance). A more advanced application could be to define entire classes to represent complex geometries (such as a nanotube).
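The lattice-constant scan mentioned above might look roughly like the following sketch. The function `total_energy` is a hypothetical stand-in (a simple parabola) for a real total-energy calculation, so that the script runs on its own; it is not an ATK function.

```python
def total_energy(lattice_constant):
    """Hypothetical stand-in for a total-energy calculation (eV).

    A parabola with its minimum at 4.05 Angstrom replaces the real
    simulation so that the example is self-contained.
    """
    return 0.8 * (lattice_constant - 4.05) ** 2 - 3.4


# Lattice constants from 3.80 to 4.30 Angstrom in steps of 0.05.
lattice_constants = [3.80 + 0.05 * i for i in range(11)]

# Run one "calculation" per lattice constant and keep the results together.
results = [(a, total_energy(a)) for a in lattice_constants]

# Pick the lattice constant with the lowest total energy.
best_a, best_energy = min(results, key=lambda pair: pair[1])
print(f"Lowest energy {best_energy:.3f} eV at a = {best_a:.2f} Angstrom")
```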

Access

NanoLanguage gives access to data, such as the electron density, in native Python types (basically, numerical arrays). This, for example, enables you to perform integration over the plane perpendicular to the transport direction to plot the effective potential along the transport axis in a two-probe system, or make cut-lines in the data. Later on, it will also be possible to access individual elements of the density matrix and other core constituents of the calculation through native Python types. This will allow for a multitude of user-defined analysis options, which in many cases can be defined and delivered as open scripts, instead of core functionality with a rigid interface.
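As a rough sketch of this kind of post-processing, assume the effective potential is available as a three-dimensional NumPy array indexed as (x, y, z), with z as the transport direction. The array below is synthetic and its layout is an assumption made for illustration; the planar average is used here, which differs from the plane integral only by a constant area factor on a regular grid.

```python
import numpy as np

# Synthetic stand-in for an effective potential on a regular grid.
# Assumed layout: axes 0 and 1 span the plane, axis 2 is the transport axis.
nx, ny, nz = 8, 8, 40
z = np.linspace(0.0, 20.0, nz)                          # Angstrom
along_z = np.sin(2 * np.pi * z / 20.0)                  # variation along transport
lateral = 0.1 * np.cos(2 * np.pi * np.arange(nx) / nx)  # averages to zero over x
potential = (along_z[None, None, :]
             + lateral[:, None, None]
             + np.zeros((nx, ny, nz)))

# Average over the plane perpendicular to the transport direction.
profile = potential.mean(axis=(0, 1))

# A cut-line along z through the middle of the cell.
cut_line = potential[nx // 2, ny // 2, :]

print(profile.shape, cut_line.shape)
```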

Python

In addition to the features of NanoLanguage mentioned above, we would also like to highlight some of the specific advantages offered by using Python as the platform for ATK.

  • Python is a fully programmable, mature, modern object-oriented programming language.

  • The Python syntax is very powerful, yet surprisingly simple to learn. It is similar to other well-known languages such as C, but requires less unnecessary typing (in particular compared to e.g. Perl) and gives a better structural overview via its indentation scheme.

  • There are several standard modules for Python which are of particular relevance for scientific applications. As the most notable example we can mention the fast handling of large data arrays obtained via the NumPy package.

  • Since Python is inherently modularized, it will be much easier to modularize ATK. In particular, this offers great simplifications in order for third-party players to develop extensions to the basic ATK package.

  • Python is an interpreted language, which could be used as an argument against it, for performance reasons. However, ATK takes advantage of the rich possibilities to couple Python with efficient code written in C, C++, or Fortran, and all performance-critical operations are carried out in compiled libraries, including the possibility to run in parallel. Therefore, Python, as an interpreted language, gains an advantage as it can be used interactively yet with the power and speed of a compiled program.

  • Since the language is interpreted, the source code is, with minor exceptions, automatically cross-platform compatible.

  • Python comes with “batteries included”, meaning that you can perform a very large variety of operations out of the box.

  • On the more technical level, dynamic type checking and name resolution are major benefits for the flexibility of the code.

For more information about Python, see the Python website. Also, we highly recommend the Wikipedia page on Python which contains a large number of links for further reading.

Community concepts

An important ingredient in extending the functionality of NanoLanguage is that users may construct their own code and scripts, and share them with other users. The modularization and scripting facilities in NanoLanguage will also make it possible for QuantumWise (and other partners) to release functionality before it is completely ready for production, in the form of prototypes, and invite the community to contribute. Conversely, some user-contributed functionality can even become part of the product at a later stage, if it proves to be robust and useful enough (distributed under the proper licenses, of course).

NanoLanguage in QuantumWise products

In the first version, the capabilities of NanoLanguage are focused around setting up the system geometries, performing analysis tasks, and controlling the main path of execution of the calculation. So far, however, the inner workings of ATK are not yet modularized and made transparent, and therefore there are still some key parts of the work-flow that involve black-box functions. The most notable examples are the self-consistent loop and geometry optimizations.

To an increasing extent, however, the high-level functionality of ATK will be transferred into NanoLanguage, and the black boxes will be opened up. At that point, the user will both gain insight into these processes and have the ability to customize the functionality. As a specific example, one might wish to print and/or export the atomic positions and forces in each step of a relaxation, or customize the way the progress information of the self-consistent loop is presented. More important will be the possibility to directly influence the parameters used in the calculation dynamically. One could for instance continuously change the mixing strategy depending on the convergence rate, or even implement a completely different method for obtaining self-consistency.
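To make the idea of a convergence-dependent mixing strategy concrete, here is a toy sketch in plain Python. It is not the ATK self-consistent loop: `update` is a hypothetical stand-in for one self-consistency step acting on a single number, and the rule for adjusting the linear mixing weight is an assumption chosen only for illustration.

```python
import math


def update(density):
    """Hypothetical stand-in for one self-consistency step (input -> output)."""
    return math.cos(density)


def self_consistent_loop(initial, tolerance=1e-10, max_steps=200):
    """Linear mixing whose weight adapts to the convergence rate."""
    density = initial
    mixing = 0.1
    previous_residual = None
    for step in range(1, max_steps + 1):
        output = update(density)
        residual = abs(output - density)
        if residual < tolerance:
            return density, step
        if previous_residual is not None:
            if residual < previous_residual:
                mixing = min(1.0, mixing * 1.5)   # converging: mix more boldly
            else:
                mixing = max(0.05, mixing * 0.5)  # stalling: back off
        # Mix the old input with the new output.
        density = (1.0 - mixing) * density + mixing * output
        previous_residual = residual
    raise RuntimeError("self-consistency not reached")


converged_density, steps = self_consistent_loop(0.0)
print(f"Converged to {converged_density:.8f} in {steps} steps")
```

With an open work-flow, a rule like this would live in the user's script rather than inside a black box, so it could be replaced by a different scheme without touching the core program.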

The more capable NanoLanguage becomes, the easier it will be to deliver new functionality at a much more rapid pace. Some of these features will be deliverable as open scripts, for ultimate transparency, and, as mentioned several times in this document, this will open the door to third-party plug-ins.

Initially, ATK is by definition fully compatible with NanoLanguage. As the language evolves more independently, QuantumWise will always strive to offer the best possible implementation of the standard in terms of performance, features, and code quality.

What can NanoLanguage do for my calculations?

The key benefits of NanoLanguage were already listed above. Here we will present some more specific examples of how NanoLanguage can offer a competitive advantage for the users of QuantumWise software.

Control of parameters

A well-documented interface to the objects used as input parameters lets the user easily control which parameters are used. For example, it is a trivial operation in ATK to:

  • Check the convergence of the simulation results with respect to an important parameter like the mesh cutoff and/or basis set size.

  • Build up a number of different parameter sets and evaluate the accuracy/efficiency of each parameter set with respect to solving a particular problem.

  • Systematically study the properties of a system as a function of some internal geometrical parameter, such as the C-C bond-length in a carbon nanotube, or the lattice constant of a particular crystal.

Control of flow

In addition to giving the user transparent access to the control of the parameters used in the calculations, ATK gives users the ability to control the flow of a simulation.

An example of controlling program flow could be the combined use of a quick method for geometry estimation and an accurate method. The end goal is to use the accurate method. However, using the quick method can often give a reasonable estimation of the geometry that is close to the true answer and will allow the user to find the accurate geometry estimate faster. In ATK, this simply amounts to applying two different total energy methods in sequence and is not in the least bit more complex than being able to apply a single total energy method.
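The two-stage strategy can be sketched with toy functions. Both `quick_energy` and `accurate_energy` are hypothetical one-dimensional stand-ins for total energy methods, and `relax` is a deliberately simple step-halving search rather than a production optimizer.

```python
def quick_energy(bond_length):
    """Hypothetical cheap method: minimum slightly off, at 1.45 Angstrom."""
    return (bond_length - 1.45) ** 2


def accurate_energy(bond_length):
    """Hypothetical expensive method: minimum at 1.42 Angstrom."""
    return (bond_length - 1.42) ** 2


def relax(energy, start, step=0.1, tolerance=1e-6):
    """Minimize a 1D energy by trying neighbours and halving the step."""
    position = start
    current = energy(position)
    evaluations = 1
    while step > tolerance:
        for candidate in (position - step, position + step):
            value = energy(candidate)
            evaluations += 1
            if value < current:
                position, current = candidate, value
                break
        else:
            step *= 0.5  # neither neighbour is lower: refine the step
    return position, evaluations


# Stage 1: the quick method provides a rough geometry estimate.
rough_guess, _ = relax(quick_energy, start=2.0)

# Stage 2: the accurate method starts from that estimate.
final_length, accurate_calls = relax(accurate_energy, start=rough_guess)
print(f"Final bond length {final_length:.4f} Angstrom "
      f"after {accurate_calls} accurate evaluations")
```

The point of the design is that both stages share one interface: the same `relax` call accepts either method, so chaining them costs one extra line.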

The user can also develop algorithms that allow the software to make decisions and branch at different points. For instance, the user can easily develop an algorithm that increases an accuracy parameter like the mesh cutoff until the change in a key result (like the total energy) due to the change between subsequent values of the mesh cutoff is under a certain tolerance.
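A convergence loop of this kind might be sketched as follows. The function `total_energy` is a hypothetical stand-in whose value approaches a limit exponentially as the mesh cutoff grows; the starting value, increment, and tolerance are arbitrary choices for illustration.

```python
import math


def total_energy(mesh_cutoff):
    """Hypothetical stand-in (eV) that converges as the cutoff (Rydberg) grows."""
    return -10.0 + 2.0 * math.exp(-mesh_cutoff / 25.0)


def converge_mesh_cutoff(start=50.0, step=25.0, tolerance=1e-3,
                         max_cutoff=500.0):
    """Raise the cutoff until one more increase changes the energy
    by less than the tolerance, then return the smaller cutoff."""
    cutoff = start
    energy = total_energy(cutoff)
    while cutoff + step <= max_cutoff:
        next_cutoff = cutoff + step
        next_energy = total_energy(next_cutoff)
        if abs(next_energy - energy) < tolerance:
            return cutoff, energy
        cutoff, energy = next_cutoff, next_energy
    raise RuntimeError("energy not converged below max_cutoff")


converged_cutoff, converged_energy = converge_mesh_cutoff()
print(f"Converged at {converged_cutoff:.0f} Rydberg: {converged_energy:.4f} eV")
```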

Control of output

A key feature of ATK is that it produces no output by default. While this may at first seem like a hindrance rather than a help, it is actually quite useful when a user wants to generate a number of specific reports of simulations on a number of trial molecules/systems. The report generator can be written so that the user has complete control over the format of the results, without having to worry about reading files generated by the software or anything else. Of course, integration of these reports in, say, a Word or Excel document may require work, but extensions to Python can readily tackle this.

ATK can also be used to generate a variety of databases. Python supports MySQL databases with a simple API and users familiar with MySQL can easily create records that contain all information they feel is relevant to store about a particular calculation.
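A minimal sketch of storing calculation records is shown below. To keep the example runnable without a database server, it uses the `sqlite3` module from the Python standard library in place of MySQL; a MySQL driver would follow the same general connect, execute, and fetch pattern, although the placeholder syntax differs between drivers. The table layout and the numbers are placeholders, not results of any calculation.

```python
import sqlite3

# In-memory database; a file name could be given instead.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE calculations ("
    "system TEXT, mesh_cutoff_rydberg REAL, total_energy_ev REAL)"
)

# Placeholder records: (system name, mesh cutoff, total energy).
records = [
    ("system_a", 100.0, -10.0),
    ("system_a", 150.0, -10.5),
    ("system_b", 150.0, -12.5),
]

# The connection used as a context manager commits on success.
with connection:
    connection.executemany(
        "INSERT INTO calculations VALUES (?, ?, ?)", records
    )

# Query the record with the lowest total energy.
lowest = connection.execute(
    "SELECT system, total_energy_ev FROM calculations "
    "ORDER BY total_energy_ev ASC LIMIT 1"
).fetchone()
print(lowest)
```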

Popular and powerful scripting-based plotting tools such as gnuplot and the Python module matplotlib (pylab) can easily be controlled from within the same NanoLanguage script that performs the calculation and generates the data. Thus a single script can both define the atomic geometry, execute the self-consistent quantum-mechanical calculation, and export and plot the results.

Programming

Using Python as an input format gives users access to a true programming language with syntax checking, modularization, object-oriented concepts, etc. Using the total energy methods in the ATK module as a basis, users can develop arbitrarily complex:

  • Default constructions via the import statement and creation of objects

  • Screening protocols: This is nothing more than assigning a score to each target structure. With deep-level access to all quantum-mechanical quantities in ATK, the scores can be based upon arbitrarily detailed information without significant increase in programming complexity.

  • Ionic dynamics modules: A critical part of simulation is to be able to estimate geometries and/or molecular dynamics trajectories. Power users and third parties can easily extend the functionality of ATK by defining routines to manipulate the atomic structures of the system based on information about, say, the total energy and forces on a system.
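A screening protocol of the kind described in the list above can be sketched in a few lines. The scoring rule, the property names, and the candidate values are all made up for illustration; in practice the properties would come from the calculations themselves.

```python
TARGET_BAND_GAP = 1.5  # eV, hypothetical design target


def score(candidate):
    """Hypothetical score: band gap close to the target and a low
    formation energy both raise the score."""
    gap_penalty = abs(candidate["band_gap"] - TARGET_BAND_GAP)
    return -gap_penalty - 0.5 * candidate["formation_energy"]


# Made-up candidate structures with made-up properties (eV).
candidates = [
    {"name": "A", "band_gap": 1.4, "formation_energy": 0.2},
    {"name": "B", "band_gap": 0.3, "formation_energy": -0.1},
    {"name": "C", "band_gap": 1.6, "formation_energy": 0.6},
]

# Rank the candidates from best to worst score.
ranked = sorted(candidates, key=score, reverse=True)
for candidate in ranked:
    print(candidate["name"], round(score(candidate), 3))
```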

Integration of third-party software

An important benefit of having a well-defined interface to all the relevant results available from simulations is that the integration with third-party software can be more easily accomplished. The definition of the interface gives a framework in which to store, manipulate and visualize data from any atomistic simulation package.

Summary

To summarize, NanoLanguage combines the power of the advanced quantum-chemical methods implemented in ATK with the flexibility of a high-level object-oriented programming language, namely Python. The features of Python itself dramatically improve the possibilities to extend ATK with custom functionality, but also offer great benefits for the implementation of the core features of ATK itself, and the rate at which these can be developed.

Some of the perspectives envisioned in this article lie a bit further into the future than others. NanoLanguage is still a brand new concept, and will naturally take some time to mature. We do however strongly believe that even the very first iteration offers users of QuantumWise software products a vast range of new functionality, flexibility, transparency and control. In the long run, the hope is that NanoLanguage will evolve into a standard interface for numerical atomic-scale simulations.

We hope that all users of ATK will enjoy the freedom and power of NanoLanguage!