tfep.io.log.TFEPLogger

class tfep.io.log.TFEPLogger(save_dir_path='tfep_logs', data_loader=None, train_subdir_name='train', eval_subdir_name='eval')

Bases: object

Store and retrieve potential energies and CVs during training/evaluations.

The user can use this to easily store and retrieve arbitrary per-sample quantities such as potential energies and CV values by epoch, batch, or step.

Warning

Currently, this class is not multi-process or thread safe.

Current database format

Currently, the data is stored in compressed NumPy format, with a layout that depends on whether the data was generated during training or evaluation. In both cases, the data is stored as an .npz compressed archive of named 1D NumPy arrays. However, training and evaluation data differ in array dimensions and file naming.

For training, each .npz file is saved in a train/ subdirectory with name epoch-X.npz, where X is the index of the training epoch that generated the data. Each array in the archive has length n_samples_per_epoch, a value that takes into account whether drop_last is set in the PyTorch DataLoader. In each array, archive['name'][i] is the quantity corresponding to data point i % batch_size of batch i // batch_size.
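
The index arithmetic above can be sketched in a few lines. This is only an illustration of the stated mapping; the helper name locate_sample is hypothetical and not part of tfep.

```python
def locate_sample(i, batch_size):
    """Return (batch_idx, position_in_batch) for flat index i of an epoch array."""
    # divmod(i, batch_size) == (i // batch_size, i % batch_size)
    batch_idx, position = divmod(i, batch_size)
    return batch_idx, position


# With batch_size=4, flat index 10 is the data point at position 2 of batch 2.
print(locate_sample(10, batch_size=4))  # (2, 2)
```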

For the evaluation data, each .npz file is saved in an eval/ subdirectory with name step-X.npz and holds the quantities evaluated using the neural network optimized for X steps.

Finally, a JSON file is used to store metadata about the experiment such as batch and epoch sizes.
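
To make the layout concrete, the following sketch writes a mock archive with the naming scheme described above and reads it back with plain NumPy. It does not use TFEPLogger: the array names potential and cv, the sizes, and the values are made up for illustration, and archives written by the logger itself may contain other arrays.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical values, chosen only for illustration.
save_dir = Path(tempfile.mkdtemp()) / "tfep_logs"
epoch_idx = 0
n_samples_per_epoch = 6

# Mimic the described layout: one compressed archive per training epoch.
train_dir = save_dir / "train"
train_dir.mkdir(parents=True)
epoch_file = train_dir / f"epoch-{epoch_idx}.npz"
np.savez_compressed(
    epoch_file,
    potential=np.linspace(-1.0, 1.0, n_samples_per_epoch),  # made-up energies
    cv=np.arange(n_samples_per_epoch, dtype=float),  # made-up CV values
)

# Read the archive back: each entry is a named 1D array.
with np.load(epoch_file) as archive:
    names = sorted(archive.files)
    shapes = {name: archive[name].shape for name in names}

print(names)   # ['cv', 'potential']
print(shapes)  # {'cv': (6,), 'potential': (6,)}
```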

__init__(save_dir_path='tfep_logs', data_loader=None, train_subdir_name='train', eval_subdir_name='eval')

Constructor.

Parameters:
  • save_dir_path (str, optional) – The main directory in which to save the training and evaluation data.

  • data_loader (torch.utils.data.DataLoader, optional) – The data loader used for training, wrapping a tfep.io.dataset.TrajectoryDataset. It must be passed when a new logger is created, as it is used to determine epoch, batch, and trajectory dimensions. If save_dir_path points to an existing logger, it is ignored.

  • train_subdir_name (str, optional) – The name of the subdirectory where the training data is stored.

  • eval_subdir_name (str, optional) – The name of the subdirectory where the evaluation data is stored.

Methods

__init__([save_dir_path, data_loader, ...])

Constructor.

read_eval_tensors([names, step_idx, ...])

Read the tensors generated with the NN model trained for the given number of epochs/batches/steps.

read_train_tensors([names, step_idx, ...])

Read the tensors saved with save_train_tensors.

save_eval_tensors(tensors[, step_idx, ...])

Save the tensors generated with the NN model trained for the given number of epochs/batches/steps.

save_train_tensors(tensors[, step_idx, ...])

Save the tensors generated during the given epoch/batch/step of training.

Attributes

INDEX_NAMES

MASK_NAME

METADATA_FILE_NAME

VERSION

batch_size

The batch size of the training dataset.

n_batches_per_epoch

The number of batches per training epoch.

n_samples_per_epoch

The number of samples per training epoch.

save_dir_path

The path to the main directory where the data is stored.

property batch_size

The batch size of the training dataset.

property n_batches_per_epoch

The number of batches per training epoch.

property n_samples_per_epoch

The number of samples per training epoch.

This equals the dataset size unless the drop_last option is set in DataLoader, in which case the samples of the last incomplete batch (if any) are excluded.
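
As a minimal sketch of how drop_last affects these quantities, the functions below reproduce the standard PyTorch DataLoader arithmetic. They are standalone helpers written for illustration, not the properties of TFEPLogger.

```python
import math


def n_batches_per_epoch(dataset_size, batch_size, drop_last):
    """Number of batches yielded by a DataLoader in one epoch."""
    if drop_last:
        # The trailing incomplete batch is discarded.
        return dataset_size // batch_size
    return math.ceil(dataset_size / batch_size)


def n_samples_per_epoch(dataset_size, batch_size, drop_last):
    """Number of samples seen in one epoch."""
    if drop_last:
        return (dataset_size // batch_size) * batch_size
    return dataset_size


# 10 samples in batches of 4: the last batch holds only 2 samples.
print(n_batches_per_epoch(10, 4, drop_last=False))  # 3
print(n_samples_per_epoch(10, 4, drop_last=False))  # 10
print(n_batches_per_epoch(10, 4, drop_last=True))   # 2
print(n_samples_per_epoch(10, 4, drop_last=True))   # 8
```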

read_eval_tensors(names=None, step_idx=None, epoch_idx=None, batch_idx=None, remove_nans=False, sort_by=None, as_numpy=False)

Read the tensors generated with the NN model trained for the given number of epochs/batches/steps.

Either step_idx or both epoch_idx and batch_idx must be passed.
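
The two ways of addressing the same point in training can be related as in the sketch below. It assumes zero-based indices and exactly one optimization step per batch; this convention is an assumption and is not stated in this page, so it should be checked against the library before being relied upon. Both helper names are hypothetical.

```python
def epoch_batch_to_step(epoch_idx, batch_idx, n_batches_per_epoch):
    """Assumed convention: steps are counted across epochs, one per batch."""
    return epoch_idx * n_batches_per_epoch + batch_idx


def step_to_epoch_batch(step_idx, n_batches_per_epoch):
    """Inverse of epoch_batch_to_step under the same assumption."""
    return divmod(step_idx, n_batches_per_epoch)


# With 3 batches per epoch, batch 1 of epoch 2 corresponds to step 7.
print(epoch_batch_to_step(2, 1, n_batches_per_epoch=3))  # 7
print(step_to_epoch_batch(7, n_batches_per_epoch=3))     # (2, 1)
```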

Parameters:
  • names (List[str], optional) – If given, only the tensors saved with the names in this list are returned. Otherwise, all the saved tensors for this step/epoch/batch are returned.

  • step_idx (int, optional) – If given, the tensors for this optimization step are returned.

  • epoch_idx (int, optional) – If given, the tensors for this epoch are returned. If step_idx is passed, this is ignored.

  • batch_idx (int, optional) – If given together with epoch_idx, the tensors for this epoch/batch are returned. If step_idx is passed, this is ignored.

  • remove_nans (bool or str, optional) – If True, only the indices corresponding to non-NaN entries are returned. If a string, only the indices corresponding to non-NaN values of tensors[remove_nans] are returned.

  • sort_by (str, optional) – If given, all the returned tensors are sorted based on the tensor with this name (useful if sort_by is 'trajectory_sample_index'). The new data order is also stored on disk, so subsequent calls return the data in the same order as long as no new data is saved in between.

  • as_numpy (bool, optional) – If True, the tensors are returned as NumPy arrays rather than PyTorch tensors.

Returns:

tensors – A dictionary mapping the name of the saved tensors to their values.

Return type:

Dict[str, torch.Tensor]
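
A possible way to combine these options is sketched below. The wrapper read_sorted_eval is hypothetical; it only forwards the documented keyword arguments and assumes that a tensor named 'trajectory_sample_index' was saved for the requested step. Note that, as described above, passing sort_by also changes the order of the data stored on disk.

```python
def read_sorted_eval(logger, step_idx, names=None):
    """Read evaluation tensors for one step as NumPy arrays in trajectory order.

    `logger` is expected to be a TFEPLogger, or any object exposing the same
    read_eval_tensors signature.
    """
    return logger.read_eval_tensors(
        names=names,
        step_idx=step_idx,
        remove_nans=True,  # drop NaN entries before returning
        sort_by='trajectory_sample_index',  # assumes this tensor was saved
        as_numpy=True,
    )
```

The returned object is the dictionary produced by read_eval_tensors, so a call such as read_sorted_eval(logger, step_idx=100, names=['potential']) would be indexed by the saved tensor names ('potential' here is a made-up name).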

read_train_tensors(names=None, step_idx=None, epoch_idx=None, batch_idx=None, remove_nans=False, as_numpy=False)

Read the tensors saved with save_train_tensors.

At least one of step_idx and epoch_idx must be passed. Note that only the data for the batches that have been saved is returned. As a consequence, the returned tensors may hold fewer entries than the number of samples per epoch if training was interrupted before the end of the epoch.

Parameters:
  • names (List[str], optional) – If given, only the tensors saved with the names in this list are returned. Otherwise, all the saved tensors for this step/epoch/batch are returned.

  • step_idx (int, optional) – If given, the tensors for this optimization step are returned.

  • epoch_idx (int, optional) – If given, the tensors for this epoch are returned. If step_idx is passed, this is ignored.

  • batch_idx (int, optional) – If given together with epoch_idx, the tensors for this epoch/batch are returned. If step_idx is passed, this is ignored.

  • remove_nans (bool or str, optional) – If True, only the indices corresponding to non-NaN entries are returned. If a string, only the indices corresponding to non-NaN values of tensors[remove_nans] are returned.

  • as_numpy (bool, optional) – If True, the tensors are returned as NumPy arrays rather than PyTorch tensors.

Returns:

tensors – A dictionary mapping the name of the saved tensors to their values.

Return type:

Dict[str, torch.Tensor]
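
Because an interrupted epoch yields shorter tensors, one way to detect it is to compare the returned length with n_samples_per_epoch. The helper saved_fraction below is a hypothetical sketch built only on the documented method and property; how the logger behaves for an epoch with no saved data at all is not covered here.

```python
def saved_fraction(logger, epoch_idx):
    """Fraction of the epoch's samples that have been saved so far.

    `logger` is expected to be a TFEPLogger, or any object exposing
    read_train_tensors and n_samples_per_epoch.
    """
    tensors = logger.read_train_tensors(epoch_idx=epoch_idx, as_numpy=True)
    if not tensors:
        return 0.0
    # All saved tensors are per-sample, so any of them gives the sample count.
    n_saved = len(next(iter(tensors.values())))
    return n_saved / logger.n_samples_per_epoch
```

A value below 1.0 would indicate that training stopped before the end of that epoch.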

property save_dir_path

The path to the main directory where the data is stored.

save_eval_tensors(tensors, step_idx=None, epoch_idx=None, batch_idx=None, update=False)

Save the tensors generated with the NN model trained for the given number of epochs/batches/steps.

Either step_idx or both epoch_idx and batch_idx must be passed.

Currently, saving only a subset of the tensors already on disk is not supported. In other words, the tensor names saved in the first call to save_eval_tensors for a given step must be present in all subsequent calls for that step.

Warning

By default, no check is performed for data written twice for the same sample indices, and new data is simply appended to the existing data. The check is performed only if update is set to True, in which case the existing data is overwritten. The check is based on the tensors named (in order of priority) 'trajectory_sample_indices' and 'dataset_sample_indices'. Note that this can be an expensive operation and should be used only when necessary.

Parameters:
  • tensors (Dict[str, torch.Tensor]) – A dictionary mapping the name of the saved tensors to their values. All tensors must have shape (batch_size,) unless only epoch_idx is provided, in which case they must have shape (n_samples_per_epoch,).

  • step_idx (int or None) – If given, the tensors for this optimization step are saved.

  • epoch_idx (int or None) – If given, the tensors for this epoch are saved.

  • batch_idx (int or None) – If given together with epoch_idx, the tensors for this epoch/batch are saved. Otherwise, the data is assumed to be for the entire epoch.

  • update (bool, optional) – If True, data points corresponding to already stored sample indices are updated (this slows down the method). If False, this check is not performed and all tensors are simply added to the logger.
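
The difference between appending and updating can be illustrated with plain Python lists. This is a conceptual sketch of the behavior described above, not the logger's implementation; the function append_or_update and the tensor name potential are hypothetical.

```python
def append_or_update(stored, new, index_name, update=False):
    """Merge `new` into `stored`; both map tensor names to equal-length lists."""
    if stored and set(stored) != set(new):
        raise ValueError('all calls must save the same tensor names')
    if not stored:
        return {name: list(values) for name, values in new.items()}

    merged = {name: list(values) for name, values in stored.items()}
    # The lookup of already stored sample indices is built only on request.
    row_of = {idx: row for row, idx in enumerate(merged[index_name])} if update else {}
    for i, idx in enumerate(new[index_name]):
        row = row_of.get(idx)
        for name in merged:
            if row is None:
                merged[name].append(new[name][i])  # unseen sample: append
            else:
                merged[name][row] = new[name][i]  # known sample: overwrite
        if update and row is None:
            row_of[idx] = len(merged[index_name]) - 1
    return merged


stored = {'trajectory_sample_indices': [0, 1], 'potential': [1.0, 2.0]}
new = {'trajectory_sample_indices': [1, 2], 'potential': [2.5, 3.0]}

appended = append_or_update(stored, new, 'trajectory_sample_indices')
updated = append_or_update(stored, new, 'trajectory_sample_indices', update=True)
print(appended['potential'])  # [1.0, 2.0, 2.5, 3.0]
print(updated['potential'])   # [1.0, 2.5, 3.0]
```

With update=False, sample 1 ends up stored twice; with update=True, its earlier value is replaced, at the cost of the extra index lookup.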

save_train_tensors(tensors, step_idx=None, epoch_idx=None, batch_idx=None)

Save the tensors generated during the given epoch/batch/step of training.

At least one of step_idx and epoch_idx (optionally together with batch_idx) must be passed.

Parameters:
  • tensors (Dict[str, torch.Tensor]) – A dictionary mapping the name of the saved tensors to their values. All tensors must have shape (batch_size,) unless only epoch_idx is provided, in which case they must have shape (n_samples_per_epoch,).

  • step_idx (int or None) – If given, the tensors for this optimization step are saved.

  • epoch_idx (int or None) – If given, the tensors for this epoch are saved.

  • batch_idx (int or None) – If given together with epoch_idx, the tensors for this epoch/batch are saved. Otherwise, the data is assumed to be for the entire epoch.
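
A typical use is to save per-sample quantities once per batch inside the training loop. The helper below is a hypothetical sketch that relies only on the documented save_train_tensors signature; how the tensors are computed is left to the caller.

```python
def log_training_epoch(logger, epoch_idx, batches):
    """Save per-sample tensors batch by batch for one training epoch.

    `logger` is expected to be a TFEPLogger, or any object exposing the same
    save_train_tensors signature. `batches` is any iterable yielding dicts
    that map tensor names to 1D tensors of shape (batch_size,).
    """
    n_saved = 0
    for batch_idx, tensors in enumerate(batches):
        logger.save_train_tensors(tensors, epoch_idx=epoch_idx, batch_idx=batch_idx)
        n_saved += 1
    return n_saved
```

Passing both epoch_idx and batch_idx keeps each call at batch granularity, which matches the (batch_size,) shape requirement stated above.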