tfep.io.dataset.traj.TrajectorySubset

class tfep.io.dataset.traj.TrajectorySubset(dataset, indices)[source]

Bases: object

A subset of a TrajectoryDataset.

Provides the same functionality as the PyTorch class torch.utils.data.Subset, which is to provide a subset of the main dataset, but for TrajectoryDataset.

Unlike torch.utils.data.Subset, TrajectorySubset can also be constructed from a filter function rather than only from a list of indices.

The class exposes the same interface as TrajectoryDataset, with the exception of TrajectoryDataset.subsample(). This method is omitted so that users cannot inadvertently leave an object in an undesired state, since the indices of TrajectorySubset might be meaningless after the subsampling.
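The risk behind this exception can be illustrated with a minimal sketch in plain Python. It does not use tfep: parent_frames, subset_indices, and the stride-based slicing are hypothetical stand-ins for a dataset, the indices held by a subset, and a subsampling step, and they show only one way stored indices could become stale.

```python
# Hypothetical stand-in: a parent dataset exposing 10 trajectory frames.
parent_frames = list(range(10))

# A subset stores positions in the parent dataset, not the frames themselves.
subset_indices = [7, 8, 9]
selected_before = [parent_frames[i] for i in subset_indices]

# Subsampling the parent (here: keep every second frame) shrinks it to 5 samples.
parent_frames = parent_frames[::2]

# The stored indices now point past the end of the parent dataset.
stale_indices = [i for i in subset_indices if i >= len(parent_frames)]
```

In this toy case every stored index is out of range after the subsampling, which is the kind of undesired state the missing subsample() method is meant to prevent.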

Parameters:
  • dataset (TrajectoryDataset or TrajectorySubset) – The trajectory dataset.

  • indices (array_like) – A list of indices of the dataset elements forming the subset.

Examples

First, we create the main TrajectoryDataset.

>>> import os
>>> import MDAnalysis
>>> test_data_dir_path = os.path.join(os.path.dirname(__file__), '..', '..', 'tests', 'data')
>>> pdb_file_path = os.path.join(test_data_dir_path, 'chloro-fluoromethane.pdb')
>>> universe = MDAnalysis.Universe(pdb_file_path, dt=5)  # ps
>>> trajectory_dataset = TrajectoryDataset(universe)

We can then create a subset from a list of indices.

>>> len(trajectory_dataset)
5
>>> trajectory_subset = TrajectorySubset(trajectory_dataset, indices=[0, 2, 4])
>>> len(trajectory_subset)
3

Alternatively, the subset can be created from a filter function that takes as input the index of the sample and an MDAnalysis Timestep object and returns True if the sample must be included in the subset and False otherwise. The following trivial example keeps all samples for which the distance between two atoms is greater than 3 Angstrom.

>>> import numpy as np
>>> filter_func = lambda idx, ts: np.linalg.norm(ts.positions[1] - ts.positions[0]) > 3
>>> trajectory_subset = TrajectorySubset.from_filter(trajectory_dataset, filter_func)
>>> len(trajectory_subset)
2

The TrajectorySubset can be used as a normal TrajectoryDataset.

>>> trajectory_subset.n_atoms
6
>>> trajectory_subset.select_atoms('index 0:2 or index 4')

__init__(dataset, indices)[source]

Methods

__init__(dataset, indices)

from_filter(dataset, filter_func)

Static constructor creating a subset based on a boolean filter function.

get_timestep(item)

Return the MDAnalysis Timestep object of the frame with the given index.

iterate_as_timestep()

Iterate over MDAnalysis Timestep objects.

select_atoms(selection)

Select a subset of atoms.

Attributes

n_atoms

Number of selected atoms in the dataset.

return_dataset_sample_index

Whether to return the keyword "dataset_sample_index" in the batch sample.

return_trajectory_sample_index

Whether to return the keyword "trajectory_sample_index" in the batch sample.

trajectory_sample_indices

Indices of the dataset samples in the trajectory (before subsampling).

universe

The MDAnalysis Universe object encapsulated by the dataset.

classmethod from_filter(dataset, filter_func)[source]

Static constructor creating a subset based on a boolean filter function.

Parameters:
  • dataset (TrajectoryDataset) – The trajectory dataset.

  • filter_func (Callable) – A function taking as input (in this order) the index of the sample in the original dataset and the MDAnalysis Timestep object, and returning True if the sample must be included in the subset and False otherwise.

Returns:

subset – A new TrajectorySubset object.

Return type:

TrajectorySubset
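The contract of filter_func can be sketched without tfep or MDAnalysis. In the following plain-Python illustration, build_subset_indices, frames, and far_apart are hypothetical names: each frame is a list of (x, y, z) tuples standing in for the positions of a Timestep, and the helper only mimics the idea of keeping the indices for which the filter returns True. It is not the library's implementation.

```python
import math

def build_subset_indices(frames, filter_func):
    """Return the indices of the frames accepted by filter_func(index, frame)."""
    return [idx for idx, frame in enumerate(frames) if filter_func(idx, frame)]

# Hypothetical frames: each one is a list of (x, y, z) positions in Angstrom.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.5, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
]

def far_apart(idx, frame):
    # Keep the frame if the first two atoms are more than 3 Angstrom apart.
    return math.dist(frame[0], frame[1]) > 3

kept = build_subset_indices(frames, far_apart)
```

The filter receives the sample index first and the frame second, matching the argument order described for filter_func.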

get_timestep(item)[source]

Return the MDAnalysis Timestep object of the frame with the given index.

See also TrajectoryDataset.get_timestep().

iterate_as_timestep()[source]

Iterate over MDAnalysis Timestep objects.

See also TrajectoryDataset.iterate_as_timestep().

property n_atoms

Number of selected atoms in the dataset.

property return_dataset_sample_index

Whether to return the keyword "dataset_sample_index" in the batch sample.

property return_trajectory_sample_index

Whether to return the keyword "trajectory_sample_index" in the batch sample.

select_atoms(selection)[source]

Select a subset of atoms.

See also TrajectoryDataset.select_atoms().

property trajectory_sample_indices

Indices of the dataset samples in the trajectory (before subsampling).

trajectory_sample_indices[i] is the index of the i-th sample in self.dataset.trajectory.
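As a rough illustration of what such a mapping means, the sketch below composes two index lists in plain Python. The names parent_traj_indices and subset_indices are hypothetical, and the assumption that a subset's trajectory indices are obtained by looking up its indices in the parent's mapping is made for illustration only; it is not taken from the tfep source.

```python
# Hypothetical parent: its i-th sample is trajectory frame parent_traj_indices[i].
parent_traj_indices = [0, 2, 4, 6, 8]

# Indices passed to the subset refer to positions in the parent dataset.
subset_indices = [1, 3, 4]

# Compose the two mappings to express the subset in trajectory frame numbers.
subset_traj_indices = [parent_traj_indices[i] for i in subset_indices]
```

Under this assumption, the first sample of the subset corresponds to trajectory frame 2, the second to frame 6, and the third to frame 8.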

property universe

The MDAnalysis Universe object encapsulated by the dataset.