HISTOGRAM
This is part of the analysis module

Accumulate the average probability density along a few CVs from a trajectory.

When using this method it is supposed that you have some collective variable \(\zeta\) that gives a reasonable description of some physical or chemical phenomenon. As an example of what we mean by this suppose you wish to examine the following SN2 reaction:

\[ \textrm{OH}^- + \textrm{CH}_3Cl \rightarrow \textrm{CH}_3OH + \textrm{Cl}^- \]

The distance between the chlorine atom and the carbon is an excellent collective variable, \(\zeta\), in this case because this distance is short for the reactant, \(\textrm{CH}_3Cl\), because the carbon and chlorine are chemically bonded, and because it is long for the product state when these two atoms are not chemically bonded. We thus might want to accumulate the probability density, \(P(\zeta)\), as a function of this distance as this will provide us with information about the overall likelihood of the reaction. Furthermore, the free energy, \(F(\zeta)\), is related to this probability density via:

\[ F(\zeta) = - k_B T \ln P(\zeta) \]

Accumulating these probability densities is precisely what this Action can be used to do. Furthermore, the conversion of the histogram to the free energy can be achieved by using the method CONVERT_TO_FES.

We calculate histograms within PLUMED using a method known as kernel density estimation, which you can read more about here:

https://en.wikipedia.org/wiki/Kernel_density_estimation

In PLUMED the value of \(\zeta\) at each discrete instant in time in the trajectory is accumulated. A kernel, \(K(\zeta-\zeta(t'),\sigma)\), centered at the current value, \(\zeta(t)\), of this quantity is generated with a bandwith \(\sigma\), which is set by the user. These kernels are then used to accumulate the ensemble average for the probability density:

\[ \langle P(\zeta) \rangle = \frac{ \sum_{t'=0}^t w(t') K(\zeta-\zeta(t'),\sigma) }{ \sum_{t'=0}^t w(t') } \]

Here the sums run over a portion of the trajectory specified by the user. The final quantity evalulated is a weighted average as the weights, \(w(t')\), allow us to negate the effect any bias might have on the region of phase space sampled by the system. This is discussed in the section of the manual on Analysis.

A discrete analogue of kernel density estimation can also be used. In this analogue the kernels in the above formula are replaced by dirac delta functions. When this method is used the final function calculated is no longer a probability density - it is instead a probability mass function as each element of the function tells you the value of an integral between two points on your grid rather than the value of a (continuous) function on a grid.

Additional material and examples can be also found in the tutorial Belfast tutorial: Analyzing CVs.

Compulsory keywords
STRIDE ( default=1 ) the frequency with which the data should be collected and added to the quantity being averaged
CLEAR ( default=0 ) the frequency with which to clear all the accumulated data. The default value of 0 implies that all the data will be used and that the grid will never be cleared
BANDWIDTH the bandwidths for kernel density esimtation
KERNEL ( default=gaussian ) the kernel function you are using. More details on the kernels available in plumed plumed can be found in kernelfunctions.
GRID_MIN the lower bounds for the grid
GRID_MAX the upper bounds for the grid
Options
SERIAL ( default=off ) do the calculation in serial. Do not parallelize
LOWMEM ( default=off ) lower the memory requirements
TIMINGS ( default=off ) output information on the timings of the various parts of the calculation
UNORMALIZED

( default=off ) output the unaveraged quantity/quantities.

LOGWEIGHTS list of actions that calculates log weights that should be used to weight configurations when calculating averages
ARG the input for this action is the scalar output from one or more other actions. The particular scalars that you will use are referenced using the label of the action. If the label appears on its own then it is assumed that the Action calculates a single scalar value. The value of this scalar is thus used as the input to this new action. If * or *.* appears the scalars calculated by all the proceding actions in the input file are taken. Some actions have multi-component outputs and each component of the output has a specific label. For example a DISTANCE action labelled dist may have three componets x, y and z. To take just the x component you should use dist.x, if you wish to take all three components then use dist.*.More information on the referencing of Actions can be found in the section of the manual on the PLUMED Getting started. Scalar values can also be referenced using POSIX regular expressions as detailed in the section on Regular Expressions. To use this feature you you must compile PLUMED with the appropriate flag. You can use multiple instances of this keyword i.e. ARG1, ARG2, ARG3...
DATA input data from action with vessel and compute histogram
GRID_BIN the number of bins for the grid
GRID_SPACING the approximate grid spacing (to be used as an alternative or together with GRID_BIN)
UPDATE_FROM Only update this action from this time
UPDATE_UNTIL

Only update this action until this time

Examples

The following input monitors two torsional angles during a simulation and outputs a continuos histogram as a function of them at the end of the simulation.

TORSION ATOMS=1,2,3,4 LABEL=r1
TORSION ATOMS=2,3,4,5 LABEL=r2
HISTOGRAM ...
  ARG=r1,r2
  GRID_MIN=-3.14,-3.14
  GRID_MAX=3.14,3.14
  GRID_BIN=200,200
  BANDWIDTH=0.05,0.05
  LABEL=hh
... HISTOGRAM

DUMPGRID GRID=hh FILE=histo

The following input monitors two torsional angles during a simulation and outputs a discrete histogram as a function of them at the end of the simulation.

TORSION ATOMS=1,2,3,4 LABEL=r1
TORSION ATOMS=2,3,4,5 LABEL=r2
HISTOGRAM ...
  ARG=r1,r2
  KERNEL=DISCRETE
  GRID_MIN=-3.14,-3.14
  GRID_MAX=3.14,3.14
  GRID_BIN=200,200
  LABEL=hh
... HISTOGRAM

DUMPGRID GRID=hh FILE=histo

The following input monitors two torsional angles during a simulation and outputs the histogram accumulated thus far every 100000 steps.

TORSION ATOMS=1,2,3,4 LABEL=r1
TORSION ATOMS=2,3,4,5 LABEL=r2
HISTOGRAM ...
  ARG=r1,r2
  GRID_MIN=-3.14,-3.14
  GRID_MAX=3.14,3.14
  GRID_BIN=200,200
  BANDWIDTH=0.05,0.05
  LABEL=hh
... HISTOGRAM

DUMPGRID GRID=hh FILE=histo STRIDE=100000

The following input monitors two torsional angles during a simulation and outputs a separate histogram for each 100000 steps worth of trajectory. Notice how the CLEAR keyword is used here and how it is not used in the previous example.

TORSION ATOMS=1,2,3,4 LABEL=r1
TORSION ATOMS=2,3,4,5 LABEL=r2
HISTOGRAM ...
  ARG=r1,r2 CLEAR=100000
  GRID_MIN=-3.14,-3.14
  GRID_MAX=3.14,3.14
  GRID_BIN=200,200
  BANDWIDTH=0.05,0.05
  GRID_WFILE=histo
  LABEL=hh
... HISTOGRAM

DUMPGRID GRID=hh FILE=histo STRIDE=100000