tfep.io.log.TFEPLogger
- class tfep.io.log.TFEPLogger(save_dir_path='tfep_logs', data_loader=None, train_subdir_name='train', eval_subdir_name='eval')[source]
Bases: object
Store and retrieve potential energies and CVs during training/evaluations.
The user can use this to easily store and retrieve arbitrary per-sample quantities such as potential energies and CV values by epoch, batch, or step.
Warning
Currently, this class is not multi-process or thread safe.
Current database format
Currently, the data is stored in compressed numpy format, with a layout that depends on whether the data was generated during training or evaluation. In both cases, the data is stored as an .npz numpy compressed archive of named numpy 1D arrays. However, training and evaluation data differ in the array dimensions and file naming.
For training, each .npz file is saved in a train/ subdirectory with name epoch-X.npz, where X corresponds to the training epoch index used for the data. Each array in the archive has length n_samples_per_epoch (whose value takes into account whether drop_last is set in the PyTorch DataLoader). In each array in the archive, archive['name'][i] is the quantity corresponding to the i%batch_size data point in the i//batch_size-th batch.
For the evaluation data, each .npz file is saved in an eval/ subdirectory with name step-X.npz, which corresponds to the quantities evaluated using the neural network optimized for X steps.
Finally, a JSON file is used to store metadata about the experiment such as batch and epoch sizes.
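The indexing rule for training archives can be illustrated with plain numpy. This is a minimal sketch rather than the library's own code: the sizes, the array name potential, and the temporary directory are made up for illustration, and only the train/epoch-X.npz naming and the i//batch_size, i%batch_size mapping come from the description above.

```python
import os
import tempfile

import numpy as np

# Hypothetical sizes chosen only for illustration.
batch_size = 4
n_samples_per_epoch = 12

# Write an archive shaped like the training files described above:
# named 1D arrays of length n_samples_per_epoch.
tmp_dir = tempfile.mkdtemp()
train_dir = os.path.join(tmp_dir, 'train')
os.makedirs(train_dir)
file_path = os.path.join(train_dir, 'epoch-0.npz')
np.savez_compressed(file_path, potential=np.arange(n_samples_per_epoch, dtype=float))

# Read the archive back; the array is copied out before the file is closed.
with np.load(file_path) as archive:
    potential = archive['potential']

# Locate entry i: which batch it belongs to and its position in that batch.
i = 9
batch_idx, idx_in_batch = divmod(i, batch_size)  # (i // batch_size, i % batch_size)
```

Here entry 9 is the second data point (position 1) of the third batch (index 2), since each batch holds four samples.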
- __init__(save_dir_path='tfep_logs', data_loader=None, train_subdir_name='train', eval_subdir_name='eval')[source]
Constructor.
- Parameters:
save_dir_path (str, optional) – The main directory where to save the training and evaluation data.
data_loader (torch.utils.data.DataLoader, optional) – The data loader used for training, wrapping a tfep.io.dataset.TrajectoryDataset. This must be passed when a new logger is created as it is used to determine epoch, batch, and trajectory dimensions. If save_dir_path points to an existing logger, then this is ignored.
train_subdir_name (str, optional) – The name of the subdirectory where the training data is stored.
eval_subdir_name (str, optional) – The name of the subdirectory where the evaluation data is stored.
Methods
__init__([save_dir_path, data_loader, ...]) – Constructor.
read_eval_tensors([names, step_idx, ...]) – Read the tensors generated with the NN model trained for the given number of epoch/batch/step.
read_train_tensors([names, step_idx, ...]) – Read the tensors saved with save_train_tensors.
save_eval_tensors(tensors[, step_idx, ...]) – Save the tensors generated with the NN model trained for the given number of epoch/batch/step.
save_train_tensors(tensors[, step_idx, ...]) – Save the tensors generated during the given epoch/batch/step of training.
Attributes
INDEX_NAMES
MASK_NAME
METADATA_FILE_NAME
VERSION
batch_size – The batch size of the training dataset.
n_batches_per_epoch – The number of batches per training epoch.
n_samples_per_epoch – The number of samples per training epoch.
save_dir_path – The path to the main directory where the data is stored.
- property batch_size
The batch size of the training dataset.
- property n_batches_per_epoch
The number of batches per training epoch.
- property n_samples_per_epoch
The number of samples per training epoch.
This may be equal to the dataset size, depending on the value of the drop_last option in DataLoader.
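The effect of drop_last on the epoch size can be worked out with simple arithmetic. The helper below is hypothetical and only mirrors how a PyTorch DataLoader counts batches; it is not taken from TFEPLogger, which may compute these properties differently.

```python
import math


def epoch_sizes(dataset_size, batch_size, drop_last):
    """Return (n_batches_per_epoch, n_samples_per_epoch).

    Hypothetical helper for illustration only.
    """
    if drop_last:
        # The final incomplete batch is discarded.
        n_batches = dataset_size // batch_size
        n_samples = n_batches * batch_size
    else:
        # The final batch is kept even if it is smaller than batch_size.
        n_batches = math.ceil(dataset_size / batch_size)
        n_samples = dataset_size
    return n_batches, n_samples


sizes_keep = epoch_sizes(dataset_size=10, batch_size=4, drop_last=False)
sizes_drop = epoch_sizes(dataset_size=10, batch_size=4, drop_last=True)
```

With 10 samples and a batch size of 4, keeping the last batch gives 3 batches and 10 samples per epoch, while dropping it gives 2 batches and 8 samples.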
- read_eval_tensors(names=None, step_idx=None, epoch_idx=None, batch_idx=None, remove_nans=False, sort_by=None, as_numpy=False)[source]
Read the tensors generated with the NN model trained for the given number of epoch/batch/step.
Either step_idx or both epoch_idx and batch_idx must be passed.
- Parameters:
names (List[str], optional) – If given, only the tensors saved with the names in this list are returned. Otherwise, all the saved tensors for this step/epoch/batch are returned.
step_idx (int, optional) – If given, the tensors for this optimization step are returned.
epoch_idx (int, optional) – If given, the tensors for this epoch are returned. If step_idx is passed, this is ignored.
batch_idx (int, optional) – If given together with epoch_idx, the tensors for this epoch/batch are returned. If step_idx is passed, this is ignored.
remove_nans (bool or str, optional) – If True, only the indices corresponding to non-NaN entries are returned. If a string, only the indices corresponding to non-NaN values of tensors[remove_nans] are returned.
sort_by (str, optional) – If given, all the returned tensors are sorted based on the tensor with this name (useful if sort_by is 'trajectory_sample_index'). The new data order is also stored on disk, so subsequent calls made without saving new data are guaranteed to return the data in the same order.
as_numpy (bool, optional) – If True, the tensors are returned as numpy arrays rather than PyTorch Tensors.
- Returns:
tensors – A dictionary mapping the name of the saved tensors to their values.
- Return type:
Dict[str, torch.Tensor]
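The combined effect of remove_nans and sort_by can be sketched on a plain dictionary of numpy arrays. The function filter_and_sort is a hypothetical stand-in, not part of tfep. It assumes that remove_nans=True drops a sample when any tensor is NaN for it, and that the string form keeps the samples for which the named tensor is not NaN.

```python
import numpy as np


def filter_and_sort(tensors, remove_nans=False, sort_by=None):
    """Hypothetical helper mimicking the documented remove_nans/sort_by options."""
    n_samples = len(next(iter(tensors.values())))
    keep = np.ones(n_samples, dtype=bool)
    if remove_nans is True:
        # Assumption: drop a sample if any tensor holds NaN for it.
        for values in tensors.values():
            keep &= ~np.isnan(values)
    elif isinstance(remove_nans, str):
        # Only the named tensor decides which samples are dropped.
        keep &= ~np.isnan(tensors[remove_nans])
    out = {name: values[keep] for name, values in tensors.items()}
    if sort_by is not None:
        # Apply the same permutation to every tensor.
        order = np.argsort(out[sort_by], kind='stable')
        out = {name: values[order] for name, values in out.items()}
    return out


data = {
    'trajectory_sample_index': np.array([2.0, 0.0, 1.0]),
    'potential': np.array([5.0, np.nan, 7.0]),
}
result = filter_and_sort(data, remove_nans='potential',
                         sort_by='trajectory_sample_index')
```

The sample with a NaN potential is dropped first, and the two remaining samples are then reordered by their trajectory index.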
- read_train_tensors(names=None, step_idx=None, epoch_idx=None, batch_idx=None, remove_nans=False, as_numpy=False)[source]
Read the tensors saved with save_train_tensors.
At least one of step_idx and epoch_idx must be passed. Note that only the data for the batches that have been saved are returned. As a consequence, the returned tensors might be smaller than the number of samples per epoch if the training was interrupted before the end of the epoch.
- Parameters:
names (List[str], optional) – If given, only the tensors saved with the names in this list are returned. Otherwise, all the saved tensors for this step/epoch/batch are returned.
step_idx (int, optional) – If given, the tensors for this optimization step are returned.
epoch_idx (int, optional) – If given, the tensors for this epoch are returned. If step_idx is passed, this is ignored.
batch_idx (int, optional) – If given together with epoch_idx, the tensors for this epoch/batch are returned. If step_idx is passed, this is ignored.
remove_nans (bool or str, optional) – If True, only the indices corresponding to non-NaN entries are returned. If a string, only the indices corresponding to non-NaN values of tensors[remove_nans] are returned.
as_numpy (bool, optional) – If True, the tensors are returned as numpy arrays rather than PyTorch Tensors.
- Returns:
tensors – A dictionary mapping the name of the saved tensors to their values.
- Return type:
Dict[str, torch.Tensor]
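Because the data can be addressed either by step_idx or by epoch_idx/batch_idx, it helps to see how the two could relate. The conversion below is an assumption, not something this page states: it supposes zero-based indices and exactly one optimization step per training batch. Both function names are hypothetical.

```python
def step_to_epoch_batch(step_idx, n_batches_per_epoch):
    """Hypothetical conversion, assuming one step per batch and zero-based indices."""
    # divmod returns (step_idx // n_batches_per_epoch, step_idx % n_batches_per_epoch).
    return divmod(step_idx, n_batches_per_epoch)


def epoch_batch_to_step(epoch_idx, batch_idx, n_batches_per_epoch):
    """Inverse of step_to_epoch_batch under the same assumptions."""
    return epoch_idx * n_batches_per_epoch + batch_idx


n_batches_per_epoch = 3  # made-up value for illustration
epoch_idx, batch_idx = step_to_epoch_batch(7, n_batches_per_epoch)
step_idx = epoch_batch_to_step(epoch_idx, batch_idx, n_batches_per_epoch)
```

Under these assumptions, step 7 with three batches per epoch falls in epoch 2, batch 1, and converting back recovers step 7.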
- property save_dir_path
The path to the main directory where the data is stored.
- save_eval_tensors(tensors, step_idx=None, epoch_idx=None, batch_idx=None, update=False)[source]
Save the tensors generated with the NN model trained for the given number of epoch/batch/step.
Either step_idx or both epoch_idx and batch_idx must be passed.
Currently, saving only some of the tensors already on disk is not supported. In other words, the tensor names saved in the first call to save_eval_tensors for the given step must be present in all subsequent calls.
Warning
By default, no check is performed for data written twice for the same sample indices, and new data is simply appended to the existing data. The check is performed only if update is set to True, in which case the existing data is overwritten. The check is based on the tensors named (in order of priority) 'trajectory_sample_indices' and 'dataset_sample_indices'. Note that this might be an expensive operation and should not be used if not necessary.
- Parameters:
tensors (Dict[str, torch.Tensor]) – A dictionary mapping the name of the saved tensors to their values. All tensors must have shape (batch_size,) unless only epoch_idx is provided, in which case they must have shape (n_samples_per_epoch,).
step_idx (int or None) – If given, the tensors for this optimization step are saved.
epoch_idx (int or None) – If given, the tensors for this epoch are saved.
batch_idx (int or None) – If given together with epoch_idx, the tensors for this epoch/batch are saved. Otherwise, the data is assumed to be for the entire epoch.
update (bool, optional) – If True, data points corresponding to already stored sample indices are updated (this slows down the method). If False, this check is not performed and all tensors are simply added to the logger.
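The difference between appending and updating can be sketched with numpy arrays. The merge function below is hypothetical and only illustrates the documented behaviour. In particular, it drops the stale entries and appends the new ones, which may not match the order the library keeps on disk.

```python
import numpy as np


def merge(stored, new, update=False, index_name='trajectory_sample_indices'):
    """Hypothetical helper illustrating append vs. update semantics."""
    if update:
        # Discard stored samples whose index is about to be rewritten.
        keep = ~np.isin(stored[index_name], new[index_name])
        stored = {name: values[keep] for name, values in stored.items()}
    # Both dictionaries must contain the same tensor names.
    return {name: np.concatenate([stored[name], new[name]]) for name in stored}


stored = {
    'trajectory_sample_indices': np.array([0, 1, 2]),
    'potential': np.array([1.0, 2.0, 3.0]),
}
new = {
    'trajectory_sample_indices': np.array([1, 3]),
    'potential': np.array([20.0, 4.0]),
}
appended = merge(stored, new)               # sample 1 now appears twice
updated = merge(stored, new, update=True)   # sample 1 holds only the new value
```

The membership test over all stored indices is what makes the update path slower than a plain append.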
- save_train_tensors(tensors, step_idx=None, epoch_idx=None, batch_idx=None)[source]
Save the tensors generated during the given epoch/batch/step of training.
At least one of step_idx and epoch_idx/batch_idx must be passed.
- Parameters:
tensors (Dict[str, torch.Tensor]) – A dictionary mapping the name of the saved tensors to their values. All tensors must have shape (batch_size,) unless only epoch_idx is provided, in which case they must have shape (n_samples_per_epoch,).
step_idx (int or None) – If given, the tensors for this optimization step are saved.
epoch_idx (int or None) – If given, the tensors for this epoch are saved.
batch_idx (int or None) – If given together with epoch_idx, the tensors for this epoch/batch are saved. Otherwise, the data is assumed to be for the entire epoch.
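The shape requirement on tensors follows from which indices are passed. The helper below is a hypothetical sketch of that rule, useful as a sanity check before saving; it is not part of tfep and the sizes in the example are made up.

```python
def expected_length(batch_size, n_samples_per_epoch,
                    step_idx=None, epoch_idx=None, batch_idx=None):
    """Hypothetical helper: length each tensor should have for the given indices."""
    if step_idx is None and epoch_idx is None:
        raise ValueError('Pass step_idx or epoch_idx (optionally with batch_idx).')
    if step_idx is None and batch_idx is None:
        # Only epoch_idx was given: the data covers the entire epoch.
        return n_samples_per_epoch
    # A step, or an epoch/batch pair, identifies a single batch.
    return batch_size


whole_epoch = expected_length(4, 12, epoch_idx=0)
single_step = expected_length(4, 12, step_idx=5)
single_batch = expected_length(4, 12, epoch_idx=1, batch_idx=2)
```

With a batch size of 4 and 12 samples per epoch, only the call that passes epoch_idx alone expects full-epoch tensors of length 12; the other two expect length 4.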