Convergence Analysis Tools

A collection of tools for assessing statistical error and convergence, found in the Packages/Convergence/ directory:

assign_frames

Given a trajectory and a set of fiducial structures (histogram centers), assign each frame in the trajectory to a histogram bin. Part of the workflow for computing the effective sample size. (See effsize.pl)
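
To make the binning step concrete, here is a minimal sketch of assigning frames to fiducials. It is not the LOOS implementation: it assumes each frame goes to the nearest fiducial, and it uses a plain coordinate RMSD with no superposition. The names `rmsd` and `assign_to_fiducials` are hypothetical.

```python
import math

def rmsd(a, b):
    """Plain coordinate RMSD between two structures (no alignment; an assumption)."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def assign_to_fiducials(frames, fiducials):
    """Return, for each frame, the index of the closest fiducial structure."""
    return [
        min(range(len(fiducials)), key=lambda j: rmsd(frame, fiducials[j]))
        for frame in frames
    ]

# Toy data: one-atom "structures" given as lists of (x, y, z) tuples.
fiducials = [[(0.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
frames = [[(1.0, 0.0, 0.0)], [(9.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)]]
print(assign_to_fiducials(frames, fiducials))
```

The resulting list of bin indices is the kind of per-frame assignment that the later steps of the effective sample size workflow consume.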

avgconv

Computes the RMSD between the average structure for time i and i+1 for a trajectory. The "locally optimal" flag determines whether the trajectory is globally aligned first or whether each block of frames used in the average is aligned prior to averaging.

bcom

Implements the block covariance overlap method. The idea is analogous to block averaging: the trajectory is broken up into contiguous blocks of a given size, a PCA is computed for each block, and the covariance overlap is calculated between the block's PCA and the PCA for the entire trajectory. This is then repeated for increasing block sizes. A Z-score for the bcom result can also be calculated (using the --zscore=1 flag and optionally setting the number of "tries" to use).

block_average

Reads a simple columnated text file, and computes the block-averaged standard error as a function of block size. The plateau value is the best estimate for the true standard error. Reference: Flyvbjerg, H. & Petersen, H. G. J. Chem. Phys., 1989, 91, 461-466
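
The block-averaging procedure can be sketched as follows. This is an illustration of the general technique, not the tool's own code; the function names and the AR(1) test series are assumptions made for the example.

```python
import math
import random

def block_standard_error(values, block_size):
    """Standard error of the mean estimated from contiguous block averages."""
    n_blocks = len(values) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two complete blocks")
    means = [
        sum(values[b * block_size:(b + 1) * block_size]) / block_size
        for b in range(n_blocks)
    ]
    grand = sum(means) / n_blocks
    # Sample variance of the block means, then the standard error of their mean.
    var = sum((m - grand) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(var / n_blocks)

def correlated_series(n, phi=0.9, seed=1):
    """Hypothetical test data: an AR(1) series with lag-1 correlation phi."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

if __name__ == "__main__":
    data = correlated_series(20000)
    for size in (1, 4, 16, 64, 256):
        print(size, round(block_standard_error(data, size), 4))
```

For correlated data the estimate grows with block size until the blocks are longer than the correlation time, and then levels off; that plateau is the value to report.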

block_avgconv

Block-averaging of RMSD between average structures for a trajectory. "Range" in this case is the range of block sizes and not strictly which frames of the trajectory to use.

bootstrap_overlap.pl

Perl program to compute the bcom and bootstrapped bcom for a trajectory, generating a plot of their ratio and an exponential fit. It also generates a plot of the residual error in the fit. Use the "--help" option for more details. Note that the number of block sizes used is somewhat conservative, so it's probably a good idea to use a low number of block sizes initially to get a quick idea of how good or bad the sampling is, and then use the higher number of blocks for a more detailed analysis. Plotting requires gnuplot. If you do not have gnuplot installed (or do not like gnuplot), use the "--noplot" flag to disable plotting.

boot_bcom

Bootstrapped bcom is similar to bcom above, but rather than using contiguous blocks, it uses a bootstrap procedure, randomly selecting frames from the trajectory to build decorrelated blocks. If no seed for the random number generator is given, LOOS will pick a default (based on the current system clock). The --replicates option determines how many blocks are generated for a given size.
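
The difference between the two block constructions can be sketched in terms of frame indices alone. The code below is illustrative only: the function names are hypothetical, and sampling with replacement is an assumption about the bootstrap step rather than something confirmed from the tool.

```python
import random

def contiguous_blocks(n_frames, block_size):
    """Frame indices for consecutive, non-overlapping blocks (bcom-style)."""
    return [
        list(range(start, start + block_size))
        for start in range(0, n_frames - block_size + 1, block_size)
    ]

def bootstrap_blocks(n_frames, block_size, replicates, seed=None):
    """Frame indices for randomly drawn blocks (assumed: with replacement)."""
    # With seed=None, Python seeds the generator from an OS entropy source
    # or the clock, so results differ from run to run.
    rng = random.Random(seed)
    return [
        [rng.randrange(n_frames) for _ in range(block_size)]
        for _ in range(replicates)
    ]

print(contiguous_blocks(10, 4))
print(bootstrap_blocks(10, 4, replicates=3, seed=42))
```

Passing an explicit seed makes the random blocks reproducible, which is useful when comparing runs.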

chist

Calculates either a cumulative histogram (where each output row is the histogram up to that point), or a windowed histogram.
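
The two modes can be illustrated with a short sketch. The function names, the stride and window parameters, and the use of raw (unnormalized) counts are all assumptions made for the example; they do not describe the tool's actual options or output format.

```python
def histogram(values, lo, hi, nbins):
    """Raw counts of values falling in [lo, hi), split into nbins equal bins."""
    counts = [0] * nbins
    width = (hi - lo) / nbins
    for v in values:
        if lo <= v < hi:
            # min() guards against floating-point rounding at the upper edge.
            counts[min(int((v - lo) / width), nbins - 1)] += 1
    return counts

def cumulative_histograms(values, lo, hi, nbins, stride):
    """One histogram per row, each built from all data up to that point."""
    return [
        histogram(values[:end], lo, hi, nbins)
        for end in range(stride, len(values) + 1, stride)
    ]

def windowed_histograms(values, lo, hi, nbins, window, stride):
    """One histogram per row, each built from a fixed-length window of data."""
    return [
        histogram(values[start:start + window], lo, hi, nbins)
        for start in range(0, len(values) - window + 1, stride)
    ]

series = [0.1, 0.2, 0.6, 0.7, 0.9, 0.3]
print(cumulative_histograms(series, 0.0, 1.0, nbins=2, stride=2))
print(windowed_histograms(series, 0.0, 1.0, nbins=2, window=2, stride=2))
```

In a well-sampled series the cumulative rows stop changing shape as more data are added, while the windowed rows look alike from one window to the next.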

coscon

Computes the cosine content for varying windows of a trajectory, based on Hess, B. "Convergence of sampling in protein simulations." Phys Rev E (2002) 65(3):031910
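
As a rough illustration of the quantity involved, the sketch below evaluates a discretized form of the cosine content for a single time series of projections onto one principal component. It assumes the projection has already been computed elsewhere; the function name and the midpoint discretization are choices made for this example, not taken from the tool.

```python
import math

def cosine_content(projection, mode=1):
    """Discretized cosine content of a PC projection time series.

    mode is the 1-based index of the principal component; a value near 1
    means the projection closely resembles a cosine with mode half-periods.
    """
    n = len(projection)
    num = sum(
        math.cos(mode * math.pi * (k + 0.5) / n) * p
        for k, p in enumerate(projection)
    )
    den = sum(p * p for p in projection)
    if den == 0.0:
        return 0.0
    return (2.0 / n) * num * num / den

n = 200
half_cosine = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
full_cosine = [math.cos(2 * math.pi * (k + 0.5) / n) for k in range(n)]
print(cosine_content(half_cosine, mode=1))
print(cosine_content(full_cosine, mode=1))
```

A projection shaped like half a cosine period scores close to 1 for the first mode, the signature of random diffusion, whereas a series that does not match that shape scores near 0.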

decorr_time

Decorrelation time as computed by structural histogram analysis. The default values for the range of N-values, repetitions, and bin fraction are taken from the paper below and may need to be changed, particularly if you are using a trajectory you suspect is undersampled. Reference: Lyman & Zuckerman, J Phys Chem B (2007) 111:12876-82

effsize.pl

Perl front-end to the effective sample size tools (ufidpick, assign_frames, hierarchy, neff). If you want to apply the Zuckerman-style effective sample size method (see the entry for neff, below), you should probably use this script instead of the individual tools. It automates the whole process: picking fiducial structures (the frames that will be the centers of your histogram bins), assigning the frames from the trajectories to those bins, working out the mean first passage time between bins, and computing the effective sample size. Reference: Lyman & Zuckerman, Biophys J (2006) 91:164-72

fidpick

Picks fiducial structures for structural histograms. Reference: Lyman & Zuckerman, Biophys J (2006) 91:164-72

hierarchy

Given a trajectory whose structures have been binned into states via reference structures, computes the mean first passage time between states and then constructs a hierarchy of states based on exchange rates. Used to generate input for neff. Based on Zhang, Bhatt, and Zuckerman; JCTC, DOI: 10.1021/ct1002384 and code provided by the Zuckerman Lab (http://www.ccbb.pitt.edu/Faculty/zuckerman/software.html)

neff

Computes effective sample size given an assignment and state file (from hierarchy). Based on Zhang, Bhatt, and Zuckerman; JCTC, DOI: 10.1021/ct1002384 and code provided by the Zuckerman Lab (http://www.ccbb.pitt.edu/Faculty/zuckerman/software.html)

qcoscon

Computes a "quick" cosine content using the entire trajectory for the top few modes, based on Hess, B. "Convergence of sampling in protein simulations." Phys Rev E (2002) 65(3):031910

sortfids

Sorts fiducials (from fidpick) in order of decreasing bin population.

ufidpick

Picks a set of fiducial structures from a trajectory using a uniform distribution. Reference: Lyman & Zuckerman, J Phys Chem B (2007) 111:12876-82