af2rave.spib module

af2rave.spib.SPIBProcess

The SPIB data analysis module.

class SPIBProcess(traj: str | list[str], **kwargs)[source]

Bases: object

This is the af2rave wrapper for SPIB. To initialize, provide the list of Colvar files for SPIB to process.

Parameters:

traj (str | list[str]) – A trajectory (Colvar) file, or a list of such files, to process.
init (str) – How the initial labels are assigned. The default is “split”, which splits each trajectory in half. Other available options include “tica:n”, where n is the number of clusters.
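The “split” initialization described above can be illustrated with a small NumPy sketch. This is not the af2rave implementation: the helper name split_initial_labels, the label ordering, and the handling of odd-length trajectories are assumptions made for illustration only.

```python
import numpy as np

def split_initial_labels(traj_lengths):
    """Assign two labels per trajectory: one for each half (illustrative only)."""
    labels = []
    for i, n_frames in enumerate(traj_lengths):
        half = n_frames // 2
        # First half of trajectory i gets label 2*i, second half gets 2*i + 1.
        labels.append(np.full(half, 2 * i))
        labels.append(np.full(n_frames - half, 2 * i + 1))
    return np.concatenate(labels)

# Two hypothetical trajectories with 6 and 4 frames.
labels = split_initial_labels([6, 4])
print(labels)  # [0 0 0 1 1 1 2 2 3 3]
```

With two trajectories this yields four distinct initial labels, matching the 2 * n_traj count quoted for n_input_labels.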

run(time_lag: int, **kwargs)[source]

Run SPIB on the loaded data.

Parameters: time_lag (int) – The time lag for SPIB.
Returns: SPIBResult object.
Return type: SPIBResult
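A minimal sketch of how the two calls documented above might be chained. The wrapper name run_spib is hypothetical, and passing init as a keyword argument is an assumption based on the parameter list; only SPIBProcess(...) and run(time_lag) come from this page.

```python
def run_spib(colvar_files, time_lag, init="split"):
    """Run SPIB on one or more Colvar files and return the SPIBResult."""
    # Imported lazily so this sketch can be loaded without af2rave installed.
    from af2rave.spib import SPIBProcess

    process = SPIBProcess(colvar_files, init=init)
    return process.run(time_lag)
```

A call might look like run_spib(["colvar_0.dat", "colvar_1.dat"], time_lag=10), where the file names and the lag value are placeholders.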

af2rave.spib.SPIBResult

Container class for SPIB results.

class SPIBResult(prefix: str, postfix: str, n_traj: int, **kwargs)[source]

Bases: object

The container class for SPIB results. An SPIBProcess creates an instance of it. This object provides a variety of methods to retrieve and analyze the results.

Example usage:

result = SPIBResult.from_file("spib_result.pkl")
n = result.n_converged_states

This will compute and return the number of converged states in the SPIB model.

property apparent_bias: ndarray[tuple[Any, ...], dtype[_ScalarT]]: The apparent bias of the model. This is the bias when directly applied to the CVs. Shape: 2 x 1

property apparent_weight: ndarray[tuple[Any, ...], dtype[_ScalarT]]: The apparent weight of the model. This is the weight when directly applied to the CVs. Shape: 2 x n_input_dims

property bias: ndarray[tuple[Any, ...], dtype[_ScalarT]]

The bias of the linear encoder.

This bias is applied to the normalized input data, so it is useful for understanding the projection. To apply to raw input data, use SPIBResult.apparent_bias.

property dt: float: Time lag used for this run.

classmethod from_file(filename: str) → SPIBResult[source]

Load a SPIBResult object from a binary pickle file.

Parameters: filename (str) – The filename to load.
Returns: The SPIBResult object.
Return type: SPIBResult
Raises: FileNotFoundError – If the file does not exist.

get_free_energy(nbins=200) → tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]][source]

Get the free energy as the negative logarithm of the probability distribution. Unit: kT

Parameters: nbins (int, (int, int), optional, default=200) – The number of bins for the histogram. The format is the same as for np.histogram2d.
Returns: x, y, f. x and y are the bin edges, f is the free energy.

Example:

Plot in matplotlib

plt.pcolor(*result.get_free_energy(), cmap="RdBu_r", shading="auto")

The sequence of the return value allows a direct input to the pcolor function.
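The relation between the histogram and the free energy can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's code: the helper name free_energy_2d is hypothetical, empty bins are left at infinity, and the transpose is an assumption made so that the array lines up with the (x, y, C) argument order of pcolor.

```python
import numpy as np

def free_energy_2d(z, nbins=200):
    """Return (x_edges, y_edges, f) with f = -ln p, in units of kT.

    z is a latent representation of shape (2, n_frames).
    """
    hist, x_edges, y_edges = np.histogram2d(z[0], z[1], bins=nbins, density=True)
    # Empty bins have p = 0, so f becomes +inf there; silence the log warning.
    with np.errstate(divide="ignore"):
        f = -np.log(hist)
    # histogram2d indexes its output as [x, y]; pcolor expects [y, x].
    return x_edges, y_edges, f.T

# Synthetic latent coordinates standing in for real SPIB output.
rng = np.random.default_rng(0)
z = rng.normal(size=(2, 10_000))
x, y, f = free_energy_2d(z, nbins=(40, 50))
```

Passing a pair such as (40, 50) gives 41 and 51 bin edges along the two axes, which is the same convention np.histogram2d uses.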

get_latent_representation(traj_idx: list[int] | int = None) → ndarray[tuple[Any, ...], dtype[_ScalarT]][source]

Return the latent representation of one, several, or all trajectories. If no index is provided, return all trajectories.

Parameters: traj_idx (int, list[int], optional) – The index or indices of the trajectories. If None, return all trajectories.
Returns: The latent representation. shape: 2 x n_frames * n_traj
Return type: np.ndarray

Example:

Get the latent representation of the first and second trajectory

result.get_latent_representation([0, 1])

Get the latent representation of all trajectories

result.get_latent_representation()

get_probability_distribution(nbins=200) → tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]][source]

Get the probability distribution in the latent space.

Parameters: nbins (int, (int, int), optional, default=200) – The number of bins for the histogram. The format is the same as for np.histogram2d.
Returns: x, y, h. x and y are the bin edges, h is the histogram.

Example:

Plot in matplotlib

plt.pcolor(*result.get_probability_distribution(), cmap="RdBu_r", shading="auto")

The sequence of the return value allows a direct input to the pcolor function.

get_state_label(traj_idx: int = None)[source]

Return the state label of the trajectory. If no index is provided, return all trajectories.

Parameters: traj_idx (int, optional) – The index of the trajectory. If None, return all trajectories.
Returns: The state label as a number from 0 to n_states. shape: n_frames
Return type: np.ndarray
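One common way to summarize a state label array is to count the frames assigned to each state. The sketch below uses a made-up label array in place of the output of get_state_label(); it does not claim to reproduce how af2rave computes n_converged_states.

```python
import numpy as np

# Hypothetical state labels for 10 frames; in practice this array would
# come from result.get_state_label().
state_labels = np.array([0, 0, 2, 2, 2, 0, 3, 3, 2, 0])

# Occupied states and the number of frames assigned to each.
states, counts = np.unique(state_labels, return_counts=True)
populations = counts / counts.sum()

print(states)       # [0 2 3]
print(counts)       # [4 4 2]
```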

get_traj_label(traj_idx: int = None)[source]

Return the trajectory label of the trajectory, i.e. the index of the trajectory each frame belongs to. If no index is provided, return all trajectories. When an index is provided, the return value is the same as [index] * nframes.

Parameters: traj_idx (int, optional) – The index of the trajectory. If None, return all trajectories.
Returns: The trajectory label. shape: n_frames
Return type: np.ndarray
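The behaviour described above can be mimicked in a few lines, which also shows how a trajectory label array can be used as a mask on the latent representation. The helper traj_labels and the toy arrays are assumptions for illustration, not part of af2rave.

```python
import numpy as np

def traj_labels(traj_lengths, traj_idx=None):
    """Per-frame trajectory index, for one trajectory or all of them."""
    if traj_idx is not None:
        # Equivalent to [traj_idx] * n_frames, as a NumPy array.
        return np.full(traj_lengths[traj_idx], traj_idx)
    return np.concatenate([np.full(n, i) for i, n in enumerate(traj_lengths)])

lengths = [3, 2]                       # two hypothetical trajectories
all_labels = traj_labels(lengths)      # one label per frame, concatenated

# Stand-in for a latent representation of shape (2, total number of frames).
latent = np.arange(10.0).reshape(2, 5)
# Keep only the frames that belong to trajectory 1.
latent_traj1 = latent[:, all_labels == 1]
```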

property n_converged_states: int: The number of remaining converged states.

property n_input_labels: int: The number of initial states/input labels. With default initialization, this is 2 * n_traj.

property n_traj: int: The number of input trajectories.

project(X) → ndarray[tuple[Any, ...], dtype[_ScalarT]][source]

Project the input data to the latent space.

Parameters: X (np.ndarray) – The input data to project. Dimension: n_input_dims x n_frames
Returns: The projected data. Dimension: 2 x n_frames
Return type: np.ndarray

project_colvar(X: Colvar) → Colvar[source]

Project the input colvar into the latent space.

Parameters: X (Colvar) – The input colvar to project.
Returns: The projected colvar.
Return type: Colvar

project_state_label(X: ndarray[tuple[Any, ...], dtype[_ScalarT]])[source]

Project an arbitrary coordinate in the latent space to the most probable state. NaN is returned if the coordinate is outside the convex hull of the latent space.

This algorithm interpolates a one-hot vector representation of state labels over the latent space with a 2D Clough-Tocher interpolator as implemented in scipy.

Parameters: X (np.ndarray) – The coordinate, shape: 2 x n_points
Returns: The state label.
Return type: np.ndarray
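The interpolation idea described above can be sketched with scipy.interpolate.CloughTocher2DInterpolator. This is an illustration on synthetic data, not the af2rave source: the function name interpolate_state_labels, the argmax step used to pick the most probable state, and the two-cluster test data are all assumptions.

```python
import numpy as np
from scipy.interpolate import CloughTocher2DInterpolator

def interpolate_state_labels(latent, labels, query):
    """latent: (2, n_frames), labels: (n_frames,) ints, query: (2, n_points)."""
    n_states = labels.max() + 1
    one_hot = np.eye(n_states)[labels]            # (n_frames, n_states)
    interpolator = CloughTocher2DInterpolator(latent.T, one_hot)
    scores = interpolator(query.T)                # NaN rows outside the convex hull
    outside = np.isnan(scores).any(axis=1)
    result = np.full(query.shape[1], np.nan)
    # Inside the hull, take the state with the largest interpolated score.
    result[~outside] = np.argmax(scores[~outside], axis=1)
    return result

# Two well-separated synthetic clusters standing in for two SPIB states.
rng = np.random.default_rng(0)
cluster0 = rng.uniform([-2.0, -0.5], [-1.0, 0.5], size=(200, 2))
cluster1 = rng.uniform([1.0, -0.5], [2.0, 0.5], size=(200, 2))
latent = np.vstack([cluster0, cluster1]).T        # (2, 400)
labels = np.repeat([0, 1], 200)

# Query one point inside each cluster and one far outside the data.
query = np.array([[-1.5, 1.5, 10.0],
                  [0.0, 0.0, 10.0]])
predicted = interpolate_state_labels(latent, labels, query)
```

The result array is floating point because NaN has no integer representation; points inside the hull carry whole-number state indices.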

to_file(filename: str) → None[source]

Dump the object into a binary pickle file for future use.

Parameters: filename (str) – The filename to save.

property weight: ndarray[tuple[Any, ...], dtype[_ScalarT]]

The weight of the linear encoder.

This weight is applied to the normalized input data, so it provides understanding of the projection. To apply to raw input data, use SPIBResult.apparent_weight.
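The distinction between the encoder parameters and their apparent counterparts can be made concrete with a short NumPy check. The sketch assumes the input is normalized by a per-dimension mean and standard deviation (z-scoring); the page does not state which normalization af2rave uses, so the formulas below are an assumption and every array is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input_dims, n_frames = 5, 100
X = rng.normal(loc=3.0, scale=2.0, size=(n_input_dims, n_frames))  # raw CVs

# Assumed normalization: per-dimension z-scoring.
mean = X.mean(axis=1, keepdims=True)   # (n_input_dims, 1)
std = X.std(axis=1, keepdims=True)     # (n_input_dims, 1)

# Stand-ins for the linear encoder parameters.
weight = rng.normal(size=(2, n_input_dims))
bias = rng.normal(size=(2, 1))

# Projection applied to normalized data.
z_normalized = weight @ ((X - mean) / std) + bias

# Fold the normalization into the parameters to act on raw data directly.
apparent_weight = weight / std.T                 # (2, n_input_dims)
apparent_bias = bias - apparent_weight @ mean    # (2, 1)
z_raw = apparent_weight @ X + apparent_bias

# Both routes give the same latent coordinates, shape (2, n_frames).
assert np.allclose(z_normalized, z_raw)
```

Under this assumption the apparent parameters have the shapes quoted on this page: 2 x n_input_dims for the weight and 2 x 1 for the bias.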