# Input & Output Files

## Overview

There are two main input files, input.nn and input.data, which have to be present in all running modes of RuNNer. If a fit shall be restarted using a preoptimized set of weights, additional input files weights.XXX.data and/or weightse.XXX.data must be present for the short range and the electrostatic NN, respectively. Further, if Kalman matrices of a previous fit shall be used for a restart, the files kalman.short.XXX.data and/or kalman.elec.XXX.data must be available. In the prediction mode, the files scaling.data and weights.XXX.data (and in case of electrostatics the corresponding files scalinge.data and weightse.XXX.data) must be in the running directory.

The main output of RuNNer in all modes is sent to standard output (the screen). It can additionally be saved to a file by typing

./RuNNer.x | tee runner.out


It is customary but not obligatory to name the output files of the three modes mode1.out, mode2.out, and mode3.out.

Depending on the mode, a number of additional output files are generated. In mode 1 (the construction of the symmetry functions), for the short range NN the files

• function.data,
• testing.data,
• trainstruct.data,
• teststruct.data,
• trainforces.data and
• testforces.data

are written. For the electrostatic NN additionally the files

• functione.data,
• testinge.data,
• trainforcese.data and
• testforcese.data

are generated. In the fitting mode, in each epoch the scaling and weight files

• scaling.data,
• scalinge.data,
• YYYYYY.short.XXX.out and
• YYYYYY.ewald.XXX.out

are written. The files optweights.XXX.out and optweightse.XXX.out contain the sets of weights with the lowest overall testing error. Further, if requested, the files kalman.short.XXX.data and/or kalman.elec.XXX.data, trainpoints.YYYYYY.out, testpoints.YYYYYY.out, traincharges.YYYYYY.out, testcharges.YYYYYY.out, trainforces.YYYYYY.out and testforces.YYYYYY.out are also written. In this overview, YYYYYY denotes the epoch number and XXX the nuclear charge of the element; the per-file sections use XXXXXX and YYY for the same two quantities.

In the case of 4G-HDNNPs, not only are the weight files generated but also the optimized hardness (YYYYYY.hardness.XXX.out). The file opthardness.XXX.out contains the values with the lowest overall testing error.

Friendly reminder:

Users should use weight and hardness files that belong to the same epoch, or the set of files from the epoch with the lowest overall testing error.

## Input and Output Files

### XXXXXX.short.YYY.out

Mode 1: ---  Mode 2: Output  Mode 3: ---

This file is written in the fitting mode. It contains the short range NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY.

### XXXXXX.ewald.YYY.out

Mode 1: ---  Mode 2: Output  Mode 3: ---

This file is written in the fitting mode. It contains the electrostatic NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY. In the 4G-HDNNP case, it contains the electronegativity NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY.

### energy.out

Mode 1: ---  Mode 2: ---  Mode 3: Output

This file contains the total energy of the system in Hartree. In case only the short range part or only the electrostatic part is used, the total energy just contains this part.

### opthardness.XXX.out

Mode 1: ---  Mode 2: Output  Mode 3: Input

This file contains a single hardness value of the epoch with the smallest error of the test set. The value is identical to the hardness in the corresponding XXXXXX.hardness.YYY.out file.

### function.data

Mode 1: Output  Mode 2: Input  Mode 3: ---

This file contains the short range symmetry function values of all structures in the training set and is written by RuNNer in runner_mode 1. It is a mandatory input file in runner_mode 2 in case of a short-range fit. For each structure, the first line gives the number of atoms in the structure. Then, for each atom there is one line starting with the nuclear charge, followed by all symmetry function values characterizing this atom's environment. For each structure, the final line contains the total charge, the total energy, the short range energy and the electrostatic energy. Please note that all energy contributions, the total energy and the total charge are normalized per atom here. This normalization is required because, with unnormalized target quantities, larger systems would receive a higher fitting weight, since they typically have larger errors.
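
The layout described above can be read with a few lines of Python. The following is only a minimal sketch, not part of RuNNer: the function name `parse_function_data`, the dictionary keys, and the sample numbers are illustrative assumptions, and the nuclear charge is assumed to be the first whitespace-separated token of each atom line. The sketch also shows how a per-atom energy is converted back to a total energy by multiplying with the number of atoms.

```python
def parse_function_data(text):
    """Parse function.data-style text into a list of dictionaries.

    Assumed layout per structure (taken from the description above):
      line 1            : number of atoms
      next natoms lines : nuclear charge, then symmetry function values
      last line         : total charge, total energy, short range energy,
                          electrostatic energy (all normalized per atom)
    """
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    structures = []
    i = 0
    while i < len(lines):
        natoms = int(lines[i][0])
        atoms = []
        for tokens in lines[i + 1 : i + 1 + natoms]:
            nuclear_charge = int(float(tokens[0]))
            symfuncs = [float(t) for t in tokens[1:]]
            atoms.append((nuclear_charge, symfuncs))
        charge, e_total, e_short, e_elec = (
            float(t) for t in lines[i + 1 + natoms][:4]
        )
        structures.append(
            {
                "natoms": natoms,
                "atoms": atoms,
                "charge_per_atom": charge,
                "energy_per_atom": e_total,
                "short_energy_per_atom": e_short,
                "elec_energy_per_atom": e_elec,
            }
        )
        i += natoms + 2  # header line + atom lines + final line
    return structures


# Made-up sample with two atoms and two symmetry functions per atom.
SAMPLE = """\
2
30 0.10 0.20
8 0.30 0.40
0.0 -927.084685 -927.084685 0.0
"""

structure = parse_function_data(SAMPLE)[0]
# Undo the per-atom normalization to recover the total energy.
total_energy = structure["natoms"] * structure["energy_per_atom"]
```

The same sketch should apply to testing.data, which is described as having the same structure as function.data.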

### functione.data

Mode 1: Output  Mode 2: Input  Mode 3: ---

This file contains the electrostatic symmetry function values of all structures in the training set and is written by RuNNer in runner_mode 1. It is a mandatory input file in runner_mode 2 in case of a charge fit. For each structure, the first line gives the number of atoms in the structure. Then, for each atom there is one line starting with the nuclear charge, followed by all symmetry function values characterizing this atom's environment. For each structure, the final line contains the total charge, the total energy, the short range energy and the electrostatic energy. Please note that all energy contributions, the total energy and the total charge are normalized per atom here. This normalization is required because, with unnormalized target quantities, larger systems would receive a higher fitting weight, since they typically have larger errors.

### input.data

Mode 1: Mandatory Input  Mode 2: ---  Mode 3: Mandatory Input

The input.data file contains one or more structures. In runner_mode 1 the full reference data set is provided. In runner_mode 3 all structures destined for prediction are provided. Each structure in the input.data file is framed by a pair of a begin and an end keyword. There can be an arbitrary number of structures one after the other in a single input.data file. The order of the lines in between begin and end is arbitrary and each line is free-formatted. The following information can be provided:

• For each structure it is possible to add comment lines starting with c or comment. They can be used, for instance, to label the data or to give information about the settings of the electronic structure calculations (DFT code, basis set etc.) and about the author of the data.

• For periodic structures there must be three lines starting with the keyword lattice followed by the x, y, and z coordinates of the respective lattice vectors. The unit is Bohr.

• For each atom in the system there is one line starting with the keyword atom, followed by three numbers specifying the Cartesian coordinates (in Bohr). Then the element symbol is given, followed by a number for the atomic charge (e.g. a Mulliken or Hirshfeld charge), the atomic energy (this is not used at the moment, please always put 0.0), and three numbers giving the x, y, and z components of the atomic forces in Hartree/Bohr.

• For each structure there must be a line starting with the keyword energy specifying the total energy of the system in Hartree.

• For each structure there must be a line starting with the keyword charge specifying the total charge (in most cases 0.0, but RuNNer can also handle systems with net charge) in units of the proton charge.

#### Example

begin
comment This is an arbitrary comment line
lattice 10.00 0.00 0.00
lattice 0.00 10.00 0.00
lattice 0.00 0.00 10.00
atom 0.000 0.000 0.000 Zn 0.32171 0.00000 0.00000 0.00000 0.02218
atom 0.000 0.000 5.499 O -0.32172 0.00000 0.00000 -0.00000 -0.02218
energy -1854.16937000
charge 0.00000000
end
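
To illustrate how the keywords map onto a data structure, here is a minimal Python sketch of a reader for this format. It is not part of RuNNer; the names `Atom`, `Structure`, and `parse_input_data` are hypothetical, and the sketch assumes well-formed atom lines with exactly the fields listed above (three coordinates, element symbol, atomic charge, atomic energy, three force components).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Atom:
    position: tuple  # Cartesian coordinates in Bohr
    element: str
    charge: float    # atomic charge, e.g. Mulliken or Hirshfeld
    energy: float    # atomic energy (currently unused, normally 0.0)
    force: tuple     # force components in Hartree/Bohr


@dataclass
class Structure:
    comments: list = field(default_factory=list)
    lattice: list = field(default_factory=list)  # empty if non-periodic
    atoms: list = field(default_factory=list)
    energy: Optional[float] = None  # total energy in Hartree
    charge: Optional[float] = None  # total charge


def parse_input_data(text):
    """Return one Structure per begin/end block; line order is arbitrary."""
    structures = []
    current = None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        key = tokens[0]
        if key == "begin":
            current = Structure()
        elif current is None:
            continue  # ignore anything outside a begin/end pair
        elif key == "end":
            structures.append(current)
            current = None
        elif key in ("c", "comment"):
            current.comments.append(" ".join(tokens[1:]))
        elif key == "lattice":
            current.lattice.append(tuple(float(t) for t in tokens[1:4]))
        elif key == "atom":
            x, y, z = (float(t) for t in tokens[1:4])
            q, e, fx, fy, fz = (float(t) for t in tokens[5:10])
            current.atoms.append(Atom((x, y, z), tokens[4], q, e, (fx, fy, fz)))
        elif key == "energy":
            current.energy = float(tokens[1])
        elif key == "charge":
            current.charge = float(tokens[1])
    return structures


EXAMPLE = """\
begin
comment This is an arbitrary comment line
lattice 10.00 0.00 0.00
lattice 0.00 10.00 0.00
lattice 0.00 0.00 10.00
atom 0.000 0.000 0.000 Zn 0.32171 0.00000 0.00000 0.00000 0.02218
atom 0.000 0.000 5.499 O -0.32172 0.00000 0.00000 -0.00000 -0.02218
energy -1854.16937000
charge 0.00000000
end
"""

structures = parse_input_data(EXAMPLE)
```

A structure is treated as periodic in this sketch when its `lattice` list holds three vectors. Since output.data is described as using the same format as input.data, the same reader should work for it.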


### input.nn

Mode 1: Input  Mode 2: Input  Mode 3: Input

The input.nn file is the main control file of RuNNer. The keywords can be given in arbitrary order, and blank lines and commented lines (starting with #) are permitted. If keywords are not specified, reasonable defaults are assumed where possible and written to the output for information. If an essential keyword is missing, RuNNer will stop with an error message and ask the user to specify the keyword.

All keywords are documented in the reference section.

The file input.nn is read twice: first by the subroutine getdimensions.f90 to get the dimensions of some arrays, then by the subroutine readinput.f90, which reads all input options. It contains a set of mandatory and optional keywords, which are listed in the reference section.

### nnatoms.out

Mode 1: ---  Mode 2: ---  Mode 3: Output

This file contains the atomic charges (in e) and atomic energies (in Hartree) of the system from the neural network potential and from the reference method. The file contains 7 columns: configuration (Conf.), atom id (atom), element type (element), reference charge (Ref. charge), reference energy (Ref. energy), neural network charge (NN charge), and neural network energy (NN energy).

### nnforces.out

Mode 1: ---  Mode 2: ---  Mode 3: Optional Output

This file is written in runner_mode 3 if the keyword calculate_forces is used. It contains the force vectors acting on all atoms in Ha/Bohr from the neural network potential and from the reference method. The file contains 8 columns: configuration (Conf.), atom id, the reference atomic force along the x, y, and z directions (Ref. $$F_{\mathrm{x}}$$, Ref. $$F_{\mathrm{y}}$$, Ref. $$F_{\mathrm{z}}$$), and the neural network atomic force along the x, y, and z directions (NN $$F_{\mathrm{x}}$$, NN $$F_{\mathrm{y}}$$, NN $$F_{\mathrm{z}}$$).
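
As an example of how the eight columns can be used, the following Python sketch computes the root mean squared error of the force components between the reference method and the NN. It is an illustration under assumptions, not a RuNNer tool: the function name `force_rmse` is hypothetical, and the sketch assumes that any header or comment lines can be recognized because they do not consist of eight whitespace-separated fields with six numeric force values.

```python
import math


def force_rmse(text):
    """RMSE over all force components (Ha/Bohr) in nnforces.out-style text.

    Assumed column order, following the description above:
    Conf., atom id, Ref. Fx, Ref. Fy, Ref. Fz, NN Fx, NN Fy, NN Fz
    """
    squared_sum = 0.0
    count = 0
    for raw in text.splitlines():
        tokens = raw.split()
        if len(tokens) != 8:
            continue  # not a data line
        try:
            values = [float(t) for t in tokens[2:]]
        except ValueError:
            continue  # header line with eight non-numeric fields
        reference, predicted = values[:3], values[3:]
        squared_sum += sum((r - p) ** 2 for r, p in zip(reference, predicted))
        count += 3
    if count == 0:
        raise ValueError("no force data lines found")
    return math.sqrt(squared_sum / count)


# Made-up sample: one configuration with two atoms.
SAMPLE = """\
1 1 0.00 0.00 0.02 0.00 0.00 0.01
1 2 0.00 0.00 -0.02 0.00 0.00 -0.01
"""

rmse = force_rmse(SAMPLE)
```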

### nnstress.out

Mode 1: ---  Mode 2: ---  Mode 3: Optional Output

This file is written in runner_mode 3 if the keyword calculate_stress is used. It contains the short range stress only; the electrostatic contribution to the stress tensor is currently not implemented. The stress tensor can only be calculated for periodic systems. The file contains 4 columns: configuration (Conf.) and the neural network stress along the x, y, and z directions (NN $$P_{\mathrm{x}}$$, NN $$P_{\mathrm{y}}$$, NN $$P_{\mathrm{z}}$$).

### optweights.XXX.out

Mode 1: ---  Mode 2: Output  Mode 3: ---

This file contains the short range weights of the epoch with the smallest error of the test set. The weights are identical to the weights in the corresponding XXXXXX.short.YYY.out file.

### optweightse.XXX.out

Mode 1: ---  Mode 2: Output  Mode 3: ---

This file contains the electrostatic weights of the epoch with the smallest error of the test set. The weights are identical to the weights in the corresponding XXXXXX.ewald.YYY.out file.

### output.data

Mode 1: ---  Mode 2: ---  Mode 3: Output

This file is written in the prediction mode and contains all data predicted by the NN. The format is the same as that of the file input.data.

### runner.out/standard out

Mode 1: Output  Mode 2: Output  Mode 3: Output

This is the recommended name for the main output file of RuNNer. By default, the output is written to the standard output and needs to be piped to runner.out by the command

RuNNer.serial.x | tee runner.out


Alternatively, the output files of runner_mode 1, 2 and 3 can be named mode1.out, mode2.out, and mode3.out.

### scaling.data

Mode 1: ---  Mode 2: Output & Optional Input  Mode 3: Input

This file is written during the fitting process. It contains the minimum, maximum and average value for each symmetry function for the short range NN. It is a mandatory input file for the prediction of energies for new structures. In runner_mode 2 a scaling.data file can be read using the keyword use_old_scaling. This can be required to keep exactly the same fit (the file scaling.data is part of the fit) when restarting runner_mode 2 with a modified training set.

### scalinge.data

Mode 1: ---  Mode 2: Output & Optional Input  Mode 3: Input

This file is written during the fitting process. It contains the minimum, maximum and average value for each symmetry function for the electrostatic NN. It is a mandatory input file for the prediction of energies for new structures. In runner_mode 2 a scalinge.data file can be read using the keyword use_old_scaling. This can be required to keep exactly the same fit (the file scalinge.data is part of the fit) when restarting runner_mode 2 with a modified training set.

### testcharges.XXXXXX.out

Mode 1: ---  Mode 2: Optional Output  Mode 3: ---

This file is written in runner_mode 2 for electrostatic fits (keyword electrostatic_type 1) and contains a comparison of the atomic charges for DFT and the NN for each structure in the test set. A separate file is written in each epoch.

### testing.data

Mode 1: Output  Mode 2: Mandatory Input  Mode 3: ---

This file contains the symmetry function values for the test set for the short range NN. The file is written in RuNNer mode 1, and is a mandatory input file in the fitting mode (mode 2). Its content has the same structure as the file function.data.

### testinge.data

Mode 1: Output  Mode 2: Mandatory Input  Mode 3: ---

This file contains the symmetry function values for the test set for the electrostatic NN. The file is written in RuNNer mode 1, and is a mandatory input file in the fitting mode (mode 2). Its content has the same structure as the file functione.data.

### testforces.XXXXXX.out

Mode 1: ---  Mode 2: Optional Output  Mode 3: ---

This file is written in runner_mode 2 and contains a comparison of the atomic force components for DFT and the NN for each point in the testing set. A separate file is written in each epoch.

### testpoints.out

Mode 1: ---  Mode 2: Optional Output  Mode 3: ---

This file is written in mode 2 and contains a comparison of the energies for DFT and the NN for each point in the test set. The file is updated in each epoch.

### teststruct.data

Mode 1: Output  Mode 2: Mandatory Input  Mode 3: ---

This file contains the structures of the test set. The structures are needed for the calculation of the electrostatic energies. It is written in RuNNer mode 1, while the symmetry functions are calculated. In the fitting mode it is a mandatory input file. The file contains the following information:

For each structure in the test set, the first line gives the number of that structure in the test set and a logical variable specifying if the structure is periodic (T) or non-periodic (F). For periodic structures the following three lines contain the lattice vectors. Further, for each atom in the structure there is one line containing the nuclear charge, the x, y, and z positions of the atom, the atomic partial charge, the atomic energy, and finally the x, y, and z components of the forces. If an electrostatic NN is used (or has been used in mode 1), then the forces are not identical to the total reference forces, but contain only the short range forces.

### testforces.data

Mode 1: Output  Mode 2: Mandatory Input  Mode 3: ---

This file contains the atomic forces of all structures in the testing set. It is written in runner_mode 1, when the symmetry functions are calculated. In runner_mode 2 it is a mandatory input file.

### traincharges.XXXXXX.out

Mode 1: ---  Mode 2: Optional Output  Mode 3: ---

This file is written in runner_mode 2 for electrostatic fits (keyword electrostatic_type 1) and contains a comparison of the atomic charges for DFT and the NN for each structure in the training set. A separate file is written in each epoch.

### trainforces.XXXXXX.out

Mode 1: ---  Mode 2: Optional Output  Mode 3: ---

This file is written in runner_mode 2 and contains a comparison of the atomic force components for DFT and the NN for each point in the training set. A separate file is written in each epoch.

### trainpoints.XXXXXX.out

Mode 1: ---  Mode 2: Optional Output  Mode 3: ---

This file is written in runner_mode 2 and contains a comparison of the energies for DFT and the NN for each point in the training set. A separate file is written in each epoch.

### trainstruct.data

Mode 1: Output  Mode 2: Mandatory Input  Mode 3: ---

This file contains the structures of the training set. The structures are needed for the calculation of the forces and of the electrostatic energy. It is written in runner_mode 1, when the symmetry functions are calculated. In runner_mode 2 it is a mandatory input file. The file contains the following information:

For each structure in the training set, the first line gives the number of that structure in the training set and a logical variable specifying if the structure is periodic (T) or non-periodic (F). For periodic structures the following three lines contain the lattice vectors. Further, for each atom in the structure there is one line containing the nuclear charge, the x, y, and z positions of the atom, the atomic partial charge, the atomic energy, and finally the x, y, and z components of the forces. If an electrostatic NN is used (or has been used in mode 1), then the forces are not identical to the total reference forces, but contain only the short range forces.
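
The following Python sketch shows one way such a file could be read. It is an assumption-laden illustration rather than RuNNer code: the name `parse_struct_data` and the dictionary keys are hypothetical. Because the description above does not mention an atom count in the header, the sketch recognizes a header line by its two fields (structure number and logical flag) and treats every other line after the optional three lattice lines as an atom line with nine fields.

```python
def parse_struct_data(text):
    """Parse trainstruct.data-style text into a list of dictionaries."""
    structures = []
    current = None
    lattice_left = 0
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if len(tokens) == 2:
            # Header: structure number and periodic flag (T or F).
            periodic = tokens[1].upper().startswith("T")
            current = {
                "index": int(tokens[0]),
                "periodic": periodic,
                "lattice": [],
                "atoms": [],
            }
            structures.append(current)
            lattice_left = 3 if periodic else 0
        elif current is None:
            raise ValueError("data line found before the first header line")
        elif lattice_left > 0:
            current["lattice"].append(tuple(float(t) for t in tokens[:3]))
            lattice_left -= 1
        else:
            x, y, z, q, e, fx, fy, fz = (float(t) for t in tokens[1:9])
            current["atoms"].append(
                {
                    "nuclear_charge": int(float(tokens[0])),
                    "position": (x, y, z),
                    "charge": q,
                    "energy": e,
                    "force": (fx, fy, fz),  # short range part only if an
                                            # electrostatic NN is used
                }
            )
    return structures


# Made-up sample: one periodic and one non-periodic structure.
SAMPLE = """\
1 T
10.0 0.0 0.0
0.0 10.0 0.0
0.0 0.0 10.0
30 0.0 0.0 0.0 0.32171 0.0 0.0 0.0 0.02218
8 0.0 0.0 5.499 -0.32172 0.0 0.0 0.0 -0.02218
2 F
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
"""

parsed = parse_struct_data(SAMPLE)
```

Under the same assumptions, the sketch should also apply to teststruct.data, which is described with the same layout.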

### trainforces.data

Mode 1: Output  Mode 2: Mandatory Input  Mode 3: ---

This file contains the atomic forces of all structures in the training set. It is written in runner_mode 1, when the symmetry functions are calculated. In runner_mode 2 it is a mandatory input file.

### weights.XXX.data

Mode 1: ---  Mode 2: Optional Input  Mode 3: Input

This file contains the weight parameters for the short-range NN. It has the same format as the XXXXXX.short.YYY.out file and is usually a copy of that file. If in runner_mode 2 a short range fit is restarted by using the keyword use_old_weights_short, this file must be present. In runner_mode 3 this is a mandatory input file if a short range NN is used.

### weightse.XXX.data

Mode 1: ---  Mode 2: Optional Input  Mode 3: Input

This file contains the weight parameters for the electrostatic NN. It has the same format as the XXXXXX.ewald.YYY.out file and is usually a copy of that file. If in runner_mode 2 a charge fit is restarted by using the keyword use_old_weights_charge, this file must be present. In runner_mode 3 this is a mandatory input file if an electrostatic NN is used.
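
Since the weight input files are usually copies of the per-epoch output files, preparing a prediction or a restart amounts to copying the files of one epoch for every element. The Python sketch below illustrates this. The function name `install_epoch_weights` is hypothetical, and the zero-padded file names (six digits for the epoch, three digits for the nuclear charge) are an assumption inferred from the XXXXXX and YYY placeholders. Taking all files from a single epoch keeps the set of weights consistent.

```python
import shutil
from pathlib import Path


def install_epoch_weights(run_dir, epoch, nuclear_charges, electrostatic=False):
    """Copy the weight files of one epoch to the names read as input.

    Assumed naming: <epoch:06d>.short.<Z:03d>.out -> weights.<Z:03d>.data
                    <epoch:06d>.ewald.<Z:03d>.out -> weightse.<Z:03d>.data
    Returns the list of file names that were written.
    """
    run_dir = Path(run_dir)
    written = []
    for z in nuclear_charges:
        pairs = [(f"{epoch:06d}.short.{z:03d}.out", f"weights.{z:03d}.data")]
        if electrostatic:
            pairs.append(
                (f"{epoch:06d}.ewald.{z:03d}.out", f"weightse.{z:03d}.data")
            )
        for source, target in pairs:
            # copyfile raises FileNotFoundError if the epoch file is missing,
            # so an incomplete set of weights is noticed immediately.
            shutil.copyfile(run_dir / source, run_dir / target)
            written.append(target)
    return written
```

For example, `install_epoch_weights(".", 12, [8, 30], electrostatic=True)` would copy the epoch 12 files for oxygen and zinc in the current directory, provided the files follow the assumed naming.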

### XXXXXX.short.YYY.out

Mode 1: ---  Mode 2: Output  Mode 3: ---

This file is written in runner_mode 2 for the short range fit. It contains the short range NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY. For readability the file contains a lot of additional information. Only the first column, which contains the weight values, is relevant for RuNNer. The remaining columns contain the following information:

• a or b for weights connecting two nodes or a bias weight, respectively

• A counter for the number of the weight

• Information on the role of the weight in the NN. In case of a weight connecting two nodes, four numbers are given specifying the source layer and node as well as the target layer and node. In case of a bias weight only the target layer and node are given.
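
Because only the first column matters, extracting the weights from such a file is straightforward. The following Python sketch illustrates this under assumptions: the function name `read_weights` and the sample lines are made up, the second column is assumed to hold the a or b marker described above, and lines starting with # are skipped in case the file contains comment lines.

```python
def read_weights(text):
    """Return (weights, n_connection, n_bias) from weight-file-style text.

    weights      : values of the first column, in file order
    n_connection : number of lines marked "a" (weights connecting two nodes)
    n_bias       : number of lines marked "b" (bias weights)
    """
    weights = []
    n_connection = 0
    n_bias = 0
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # skip blank lines and assumed comment lines
        weights.append(float(tokens[0]))
        if len(tokens) > 1:
            if tokens[1] == "a":
                n_connection += 1
            elif tokens[1] == "b":
                n_bias += 1
    return weights, n_connection, n_bias


# Made-up sample: two connecting weights followed by one bias weight.
SAMPLE = """\
-0.5213 a 1 0 1 1 1
 0.1042 a 2 0 2 1 1
 0.0310 b 3 1 1
"""

weights, n_connection, n_bias = read_weights(SAMPLE)
```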

### XXXXXX.ewald.YYY.out

Mode 1: ---  Mode 2: Output  Mode 3: ---

This file is written in runner_mode 2 for charge fits (keyword electrostatic_type 1, 3, or 4). It contains the electrostatic NN weight parameters of epoch XXXXXX for the element of nuclear charge YYY. For readability the file contains a lot of additional information. Only the first column, which contains the weight values, is relevant for RuNNer. The remaining columns contain the following information:

• a or b for weights connecting two nodes or a bias weight, respectively

• A counter for the number of the weight

• Information on the role of the weight in the NN. In case of a weight connecting two nodes, four numbers are given specifying the source layer and node as well as the target layer and node. In case of a bias weight only the target layer and node are given.

### XXXXXX.hardness.YYY.out

Mode 1: ---  Mode 2: Output  Mode 3: ---

This file is written in runner_mode 2 for charge fits (keyword electrostatic_type 4). It contains a single hardness value of epoch XXXXXX for the element of nuclear charge YYY.