tfep.potentials.psi4.Psi4PotentialEnergyFunc
- class tfep.potentials.psi4.Psi4PotentialEnergyFunc(*args, **kwargs)
Bases: Function

PyTorch-differentiable potential energy of a Psi4 molecule.

This is essentially a wrapper of psi4.energy, but it provides additional functionalities:

- Handles batches of coordinate tensors of shape (batch_size, 3*n_atoms).
- Provides sample-specific restart capabilities for samples within the batch.
- Implements the torch.autograd.Function interface to enable the calculation of the potential energy gradients used for backpropagation through psi4.gradient.

For efficiency reasons, by default the function computes and caches the gradient (i.e., the forces) during the forward pass so that it can be used during backpropagation. This gives better performance overall when backpropagation is necessary, as the wavefunction is converged just once. Even when a restart file is provided, the restart wavefunction corresponds only to that of the Hartree-Fock SCF procedure, so if another potential is used (e.g., MP2), it requires another self-consistent calculation. If backpropagation is not necessary, set precompute_gradient to False.

Double backpropagation (sometimes necessary, for example, to train on forces) is supported by estimating the vector-Hessian product with finite differences [1].

By default, the batch of energy/gradient calculations is performed serially, using the native thread parallelization implemented in Psi4. This scheme is, however, not embarrassingly parallel. Thus, the module supports batch parallelization schemes through tfep.utils.parallel.ParallelizationStrategy objects.

Note that, because a psi4 Molecule is not picklable, it cannot be sent to multiple processes for the purpose of batch parallelization (e.g., using tfep.utils.parallel.ProcessPoolStrategy). A workaround is to use an initializer for the multiprocessing.Pool that creates and activates the molecule in each subprocess (see example below).

- Parameters:
ctx (torch.autograd.function._ContextMethodMixin) – A context to save information for the gradient.
batch_positions (torch.Tensor) – A tensor of positions in flattened format (i.e., with shape (batch_size, 3*n_atoms)).
name (str) – The name of the potential to pass to psi4.energy().
molecule (psi4.core.Molecule, optional) – If not None, this will be set as the currently activated molecule in Psi4. Note that the old active molecule is not restored at the end of the execution.
positions_unit (pint.Unit, optional) – The unit of the positions passed. This is used to appropriately convert batch_positions to Psi4 units. If None, no conversion is performed, which assumes that the input positions are in the same units used by Psi4.
energy_unit (pint.Unit, optional) – The unit used for the returned energies (and, as a consequence, forces). This is used to appropriately convert Psi4 energies into the desired units. If None, no conversion is performed, which means that energies and forces will use Psi4 units.
write_orbitals (bool or str or List[str], optional) – This option is passed to psi4.energy to store the wavefunction on disk at each Hartree-Fock SCF iteration, which can later be used to restart the calculation. If a list, it must specify the path to the restart file to write for each batch sample.
restart_file (str or List[str], optional) – A Psi4 restart file path (or a list of restart file paths, one for each batch sample) storing a wavefunction that can be used as a starting point for the Hartree-Fock SCF optimization.
precompute_gradient (bool, optional) – If True, the gradient is computed in the forward pass and saved to be consumed during backward.
parallelization_strategy (tfep.utils.parallel.ParallelizationStrategy, optional) – The parallelization strategy used to distribute batches of energy and gradient calculations. By default, these are executed serially using the thread-based parallelization native in psi4.
on_unconverged (str, optional) – Specifies how to handle the case in which the calculation did not converge. It can have the following values:
    - 'raise': Raise the Psi4 exception.
    - 'nan': Return float('nan') energy and zero forces.
  To treat the calculation as converged and return the latest energy, force, and/or wavefunction, simply set the psi4 global option 'fail_on_maxiter'.
kwargs (dict, optional) – Other keyword arguments to pass to psi4.energy and psi4.gradient.
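As a rough illustration of the two on_unconverged modes, the following sketch wraps a single calculation and applies the chosen policy. ConvergenceError, run_single_point, and unconverged_calculation are hypothetical names introduced here for illustration; they are not part of tfep or Psi4, and the actual implementation may differ.

```python
import math


class ConvergenceError(Exception):
    """Hypothetical stand-in for the exception Psi4 raises on non-convergence."""


def run_single_point(compute, n_coords, on_unconverged='raise'):
    """Run compute() and apply the on_unconverged policy (illustrative only)."""
    if on_unconverged not in ('raise', 'nan'):
        raise ValueError(f"unknown on_unconverged value: {on_unconverged!r}")
    try:
        return compute()
    except ConvergenceError:
        if on_unconverged == 'raise':
            # Propagate the original exception unchanged.
            raise
        # 'nan': report an undefined energy and zero forces for this sample.
        return float('nan'), [0.0] * n_coords


def unconverged_calculation():
    # Simulates a calculation that never converges.
    raise ConvergenceError("SCF did not converge")


energy, forces = run_single_point(
    unconverged_calculation, n_coords=9, on_unconverged='nan')
print(math.isnan(energy), forces)
```

Returning NaN lets the caller mask failed samples in a batch instead of aborting the whole batch.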
- Returns:
potentials – potentials[i] is the potential energy of configuration batch_positions[i].
- Return type:
torch.Tensor
See also
Psi4Potential – Module API for computing potential energies with Psi4.
psi4.energy documentation: More information on the supported keyword arguments.
psi4.gradient documentation: More information on the supported keyword arguments.
Examples
The example sets up a parallelization strategy based on a pool of processes for the calculation of energies and gradients of a water molecule. Note that molecules cannot be sent between processes with pickle, so it is convenient to create the molecule and activate it in the process through an initializer. Note also that the functional syntax psi4_potential_energy() is used rather than Psi4PotentialEnergyFunc.apply(), which does not support keyword arguments.

    import os

    import numpy as np
    import pint
    from torch.multiprocessing import Pool

    # Assumption: these helpers are importable from the same module as this class.
    from tfep.potentials.psi4 import configure_psi4, create_psi4_molecule, psi4_potential_energy
    from tfep.utils.parallel import ProcessPoolStrategy

    def pool_process_initializer(positions):
        # Create a water molecule.
        molecule = create_psi4_molecule(positions=positions, activate=True, elem=['O', 'H', 'H'])

        # Create a scratch directory.
        scratch_dir_path = os.path.join('tmp/', str(os.getpid()))
        os.makedirs(scratch_dir_path, exist_ok=True)

        # Configure psi4 and activate the molecule.
        configure_psi4(
            n_threads=1,
            psi4_output_file_path='quiet',
            psi4_scratch_dir_path=scratch_dir_path,
            active_molecule=molecule,
            global_options=dict(basis='cc-pvtz', reference='RHF'),
        )

    # A batch of size 2 with identical positions.
    ureg = pint.UnitRegistry()
    positions = [
        [-0.2950, -0.2180, 0.1540],
        [-0.0170, 0.6750, 0.4080],
        [0.3120, -0.4570, -0.5630],
    ]
    batch_positions = np.array([positions, positions], dtype=np.double) * ureg.angstrom

    with Pool(2, pool_process_initializer, initargs=[batch_positions[0]]) as p:
        strategy = ProcessPoolStrategy(p)
        # Pass the strategy so that the batch is distributed over the pool.
        energy = psi4_potential_energy(
            batch_positions, name='scf', positions_unit=ureg.angstrom,
            parallelization_strategy=strategy)
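The flattened (batch_size, 3*n_atoms) layout documented for batch_positions can be obtained from per-atom coordinates with a reshape. The following is a minimal sketch in plain PyTorch that reuses the water coordinates of the example; unit handling is left out, and the variable names are chosen here for illustration.

```python
import torch

# Per-atom coordinates of one water molecule (O, H, H), 3 atoms x 3 coordinates.
positions = [
    [-0.2950, -0.2180, 0.1540],
    [-0.0170, 0.6750, 0.4080],
    [0.3120, -0.4570, -0.5630],
]

# A batch of two identical configurations with shape (batch_size, n_atoms, 3).
batch = torch.tensor([positions, positions], dtype=torch.double)

# Flatten each configuration to a row of length 3*n_atoms.
flat_batch = batch.reshape(batch.shape[0], -1)

# The per-atom view can be recovered with the inverse reshape.
per_atom = flat_batch.reshape(-1, 3, 3)

print(flat_batch.shape)
```

In the flattened row, the coordinates of atom i occupy columns 3*i to 3*i+2.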
References
- [1] Putrino A, Sebastiani D, Parrinello M. Generalized variational density functional perturbation theory. The Journal of Chemical Physics. 2000 Nov 1;113(17):7102-9.
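The finite-difference estimate of the vector-Hessian product mentioned in the class description can be sketched with a toy potential. The central-difference scheme, the step size eps, and the helper names below are assumptions made for illustration; the scheme actually used by the class is not specified here. Since the Hessian is symmetric, the vector-Hessian and Hessian-vector products coincide.

```python
import torch


def energy(x):
    # Toy anharmonic potential standing in for a Psi4 energy (hypothetical).
    return 0.25 * (x ** 4).sum() + 0.5 * (x[:-1] * x[1:]).sum()


def gradient(x):
    # First derivative only, playing the role of a psi4.gradient call.
    x = x.detach().requires_grad_(True)
    (g,) = torch.autograd.grad(energy(x), x)
    return g


def vhp_finite_difference(x, v, eps=1e-4):
    # Central difference of the gradient along the direction v.
    return (gradient(x + eps * v) - gradient(x - eps * v)) / (2 * eps)


x = torch.tensor([0.3, -1.2, 0.8], dtype=torch.double)
v = torch.tensor([1.0, 0.5, -2.0], dtype=torch.double)
approx = vhp_finite_difference(x, v)

# Exact reference obtained by differentiating the gradient with autograd.
x_ref = x.clone().requires_grad_(True)
(g_ref,) = torch.autograd.grad(energy(x_ref), x_ref, create_graph=True)
(exact,) = torch.autograd.grad(g_ref, x_ref, grad_outputs=v)

print(torch.allclose(approx, exact, atol=1e-6))
```

The estimate needs only two extra gradient evaluations per vector, which is why it is attractive when second derivatives are not directly available.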
- __init__(*args, **kwargs)

Methods
__init__(*args, **kwargs)
apply(*args, **kwargs)
backward(ctx, grad_output) – Compute the gradient of the potential energy.
forward(ctx, batch_positions, name[, ...]) – Compute the potential energy of the molecule with Psi4.
jvp(ctx, *grad_inputs) – Define a formula for differentiating the operation with forward mode automatic differentiation.
mark_dirty(*args) – Mark given tensors as modified in an in-place operation.
mark_non_differentiable(*args) – Mark outputs as non-differentiable.
mark_shared_storage(*pairs)
maybe_clear_saved_tensors
name
register_hook
register_prehook
save_for_backward(*tensors) – Save given tensors for a future call to backward().
save_for_forward(*tensors) – Save given tensors for a future call to jvp().
set_materialize_grads(value) – Set whether to materialize grad tensors.
setup_context(ctx, inputs, output) – There are two ways to define the forward pass of an autograd.Function.
vjp(ctx, *grad_outputs) – Define a formula for differentiating the operation with backward mode automatic differentiation.
vmap(info, in_dims, *args) – Define the behavior for this autograd.Function underneath torch.vmap().

Attributes
dirty_tensors
generate_vmap_rule
materialize_grads
metadata
needs_input_grad
next_functions
non_differentiable
requires_grad
saved_for_forward
saved_tensors
saved_variables
to_save

- static forward(ctx, batch_positions, name, molecule=None, positions_unit=None, energy_unit=None, write_orbitals=False, restart_file=None, precompute_gradient=True, parallelization_strategy=None, on_unconverged='raise', kwargs=None)
Compute the potential energy of the molecule with Psi4.
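The idea of computing the gradient during the forward pass and reusing it in backward, as described for precompute_gradient, can be sketched with a toy potential. CachedGradientPotential and toy_energy_and_gradient are hypothetical names; the harmonic potential merely stands in for an expensive Psi4 calculation, and this is not the implementation used by the class.

```python
import torch


def toy_energy_and_gradient(batch_positions):
    # Hypothetical stand-in for one expensive energy + gradient evaluation.
    energy = 0.5 * (batch_positions ** 2).sum(dim=1)
    gradient = batch_positions.clone()
    return energy, gradient


class CachedGradientPotential(torch.autograd.Function):
    @staticmethod
    def forward(ctx, batch_positions):
        # Energy and gradient are obtained together, once.
        energy, gradient = toy_energy_and_gradient(batch_positions)
        ctx.save_for_backward(gradient)
        return energy

    @staticmethod
    def backward(ctx, grad_output):
        # Reuse the cached gradient instead of recomputing it.
        (gradient,) = ctx.saved_tensors
        # grad_output has shape (batch_size,); broadcast it over coordinates.
        return grad_output[:, None] * gradient


batch_positions = torch.tensor(
    [[0.1, -0.2, 0.3, 0.4, 0.5, -0.6],
     [1.0, 0.0, -1.0, 0.5, 0.5, 0.5]],
    dtype=torch.double, requires_grad=True)

energies = CachedGradientPotential.apply(batch_positions)
energies.sum().backward()
print(energies.shape, batch_positions.grad.shape)
```

The trade-off is that the gradient is paid for even if backward is never called, which is why the option can be turned off.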
- static jvp(ctx: Any, *grad_inputs: Any) -> Any
Define a formula for differentiating the operation with forward mode automatic differentiation.

This function is to be overridden by all subclasses. It must accept a context ctx as the first argument, followed by as many inputs as forward() got (None will be passed in for non-tensor inputs of the forward function), and it should return as many tensors as there were outputs to forward(). Each argument is the gradient w.r.t. the given input, and each returned value should be the gradient w.r.t. the corresponding output. If an output is not a Tensor or the function is not differentiable with respect to that output, you can just return None as the gradient for that output.

You can use the ctx object to pass any value from the forward to this function.
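A minimal sketch of a custom Function that defines jvp, evaluated with PyTorch's forward-mode AD utilities. The Exp class is a generic illustration unrelated to Psi4; it shows the ctx object carrying a value from forward to jvp.

```python
import torch
import torch.autograd.forward_ad as fwAD


class Exp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        result = torch.exp(x)
        # Values stored on ctx are available later in jvp.
        ctx.result = result
        return result

    @staticmethod
    def jvp(ctx, x_tangent):
        # d/dx exp(x) = exp(x), so the output tangent is x_tangent * exp(x).
        return x_tangent * ctx.result


primal = torch.tensor([0.0, 1.0, 2.0], dtype=torch.double)
tangent = torch.tensor([1.0, 0.5, -1.0], dtype=torch.double)

with fwAD.dual_level():
    dual_input = fwAD.make_dual(primal, tangent)
    dual_output = Exp.apply(dual_input)
    # Copy the tangent so it can be used after leaving the dual level.
    output_tangent = fwAD.unpack_dual(dual_output).tangent.clone()

print(output_tangent)
```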
- mark_dirty(*args: Tensor)
Mark given tensors as modified in an in-place operation.
This should be called at most once, in either the setup_context() or forward() methods, and all arguments should be inputs.

Every tensor that’s been modified in-place in a call to forward() should be given to this function, to ensure correctness of our checks. It doesn’t matter whether the function is called before or after modification.

- Examples::
    >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_AUTOGRAD)
    >>> class Inplace(Function):
    >>>     @staticmethod
    >>>     def forward(ctx, x):
    >>>         x_npy = x.numpy()  # x_npy shares storage with x
    >>>         x_npy += 1
    >>>         ctx.mark_dirty(x)
    >>>         return x
    >>>
    >>>     @staticmethod
    >>>     @once_differentiable
    >>>     def backward(ctx, grad_output):
    >>>         return grad_output
    >>>
    >>> a = torch.tensor(1., requires_grad=True, dtype=torch.double).clone()
    >>> b = a * a
    >>> Inplace.apply(a)  # This would lead to wrong gradients!
    >>>                   # but the engine would not know unless we mark_dirty
    >>> # xdoctest: +SKIP
    >>> b.backward()  # RuntimeError: one of the variables needed for gradient
    >>>               # computation has been modified by an inplace operation
- mark_non_differentiable(*args: Tensor)
Mark outputs as non-differentiable.
This should be called at most once, in either the setup_context() or forward() methods, and all arguments should be tensor outputs.

This will mark outputs as not requiring gradients, increasing the efficiency of backward computation. You still need to accept a gradient for each output in backward(), but it’s always going to be a zero tensor with the same shape as the shape of a corresponding output.

- This is used e.g. for indices returned from a sort. See example::
    >>> class Func(Function):
    >>>     @staticmethod
    >>>     def forward(ctx, x):
    >>>         sorted, idx = x.sort()
    >>>         ctx.mark_non_differentiable(idx)
    >>>         ctx.save_for_backward(x, idx)
    >>>         return sorted, idx
    >>>
    >>>     @staticmethod
    >>>     @once_differentiable
    >>>     def backward(ctx, g1, g2):  # still need to accept g2
    >>>         x, idx = ctx.saved_tensors
    >>>         grad_input = torch.zeros_like(x)
    >>>         grad_input.index_add_(0, idx, g1)
    >>>         return grad_input
- save_for_backward(*tensors: Tensor)
Save given tensors for a future call to backward().

save_for_backward should be called at most once, in either the setup_context() or forward() methods, and only with tensors.

All tensors intended to be used in the backward pass should be saved with save_for_backward (as opposed to directly on ctx) to prevent incorrect gradients and memory leaks, and enable the application of saved tensor hooks. See torch.autograd.graph.saved_tensors_hooks.

Note that if intermediary tensors, tensors that are neither inputs nor outputs of forward(), are saved for backward, your custom Function may not support double backward. Custom Functions that do not support double backward should decorate their backward() method with @once_differentiable so that performing double backward raises an error. If you’d like to support double backward, you can either recompute intermediaries based on the inputs during backward or return the intermediaries as the outputs of the custom Function. See the double backward tutorial for more details.

In backward(), saved tensors can be accessed through the saved_tensors attribute. Before returning them to the user, a check is made to ensure they weren’t used in any in-place operation that modified their content.

Arguments can also be None. This is a no-op.

See extending-autograd for more details on how to use this method.

- Example::
    >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_AUTOGRAD)
    >>> class Func(Function):
    >>>     @staticmethod
    >>>     def forward(ctx, x: torch.Tensor, y: torch.Tensor, z: int):
    >>>         w = x * z
    >>>         out = x * y + y * z + w * y
    >>>         ctx.save_for_backward(x, y, w, out)
    >>>         ctx.z = z  # z is not a tensor
    >>>         return out
    >>>
    >>>     @staticmethod
    >>>     @once_differentiable
    >>>     def backward(ctx, grad_out):
    >>>         x, y, w, out = ctx.saved_tensors
    >>>         z = ctx.z
    >>>         gx = grad_out * (y + y * z)
    >>>         gy = grad_out * (x + z + w)
    >>>         gz = None
    >>>         return gx, gy, gz
    >>>
    >>> a = torch.tensor(1., requires_grad=True, dtype=torch.double)
    >>> b = torch.tensor(2., requires_grad=True, dtype=torch.double)
    >>> c = 4
    >>> d = Func.apply(a, b, c)
- save_for_forward(*tensors: Tensor)
Save given tensors for a future call to jvp().

save_for_forward should be called at most once, in either the setup_context() or forward() methods, and all arguments should be tensors.

In jvp(), saved objects can be accessed through the saved_tensors attribute.

Arguments can also be None. This is a no-op.

See extending-autograd for more details on how to use this method.

- Example::
    >>> # xdoctest: +SKIP
    >>> class Func(torch.autograd.Function):
    >>>     @staticmethod
    >>>     def forward(ctx, x: torch.Tensor, y: torch.Tensor, z: int):
    >>>         ctx.save_for_backward(x, y)
    >>>         ctx.save_for_forward(x, y)
    >>>         ctx.z = z
    >>>         return x * y * z
    >>>
    >>>     @staticmethod
    >>>     def jvp(ctx, x_t, y_t, _):
    >>>         x, y = ctx.saved_tensors
    >>>         z = ctx.z
    >>>         return z * (y * x_t + x * y_t)
    >>>
    >>>     @staticmethod
    >>>     def vjp(ctx, grad_out):
    >>>         x, y = ctx.saved_tensors
    >>>         z = ctx.z
    >>>         return z * grad_out * y, z * grad_out * x, None
    >>>
    >>> a = torch.tensor(1., requires_grad=True, dtype=torch.double)
    >>> t = torch.tensor(1., dtype=torch.double)
    >>> b = torch.tensor(2., requires_grad=True, dtype=torch.double)
    >>> c = 4
    >>>
    >>> with fwAD.dual_level():
    >>>     a_dual = fwAD.make_dual(a, t)
    >>>     d = Func.apply(a_dual, b, c)
- set_materialize_grads(value: bool)
Set whether to materialize grad tensors. Default is True.

This should be called only from either the setup_context() or forward() methods.

If True, undefined grad tensors will be expanded to tensors full of zeros prior to calling the backward() and jvp() methods.

- Example::
    >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_AUTOGRAD)
    >>> class SimpleFunc(Function):
    >>>     @staticmethod
    >>>     def forward(ctx, x):
    >>>         return x.clone(), x.clone()
    >>>
    >>>     @staticmethod
    >>>     @once_differentiable
    >>>     def backward(ctx, g1, g2):
    >>>         return g1 + g2  # No check for None necessary
    >>>
    >>> # We modify SimpleFunc to handle non-materialized grad outputs
    >>> class Func(Function):
    >>>     @staticmethod
    >>>     def forward(ctx, x):
    >>>         ctx.set_materialize_grads(False)
    >>>         ctx.save_for_backward(x)
    >>>         return x.clone(), x.clone()
    >>>
    >>>     @staticmethod
    >>>     @once_differentiable
    >>>     def backward(ctx, g1, g2):
    >>>         x, = ctx.saved_tensors
    >>>         grad_input = torch.zeros_like(x)
    >>>         if g1 is not None:  # We must check for None now
    >>>             grad_input += g1
    >>>         if g2 is not None:
    >>>             grad_input += g2
    >>>         return grad_input
    >>>
    >>> a = torch.tensor(1., requires_grad=True)
    >>> b, _ = Func.apply(a)  # induces g2 to be undefined
- static setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> Any
There are two ways to define the forward pass of an autograd.Function.

Either:

- Override forward with the signature forward(ctx, *args, **kwargs). setup_context is not overridden. Setting up the ctx for backward happens inside the forward.
- Override forward with the signature forward(*args, **kwargs) and override setup_context. Setting up the ctx for backward happens inside setup_context (as opposed to inside the forward).

See torch.autograd.Function.forward() and extending-autograd for more details.
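The second style, in which forward takes no ctx and the context is filled in by setup_context, can be sketched as follows. The Square class is a generic illustration that assumes a PyTorch version supporting setup_context (2.0 or later); it is unrelated to the Psi4 wrapper.

```python
import torch


class Square(torch.autograd.Function):
    @staticmethod
    def forward(x):
        # No ctx argument: forward only computes the output.
        return x ** 2

    @staticmethod
    def setup_context(ctx, inputs, output):
        # inputs is the tuple of arguments that was passed to forward.
        (x,) = inputs
        ctx.save_for_backward(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_output


x = torch.tensor([1.0, 2.0, 3.0], dtype=torch.double, requires_grad=True)
y = Square.apply(x)
y.sum().backward()
print(x.grad)
```

Keeping forward free of ctx separates the computation from the bookkeeping needed for differentiation.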
- static vjp(ctx: Any, *grad_outputs: Any) -> Any
Define a formula for differentiating the operation with backward mode automatic differentiation.

This function is to be overridden by all subclasses. (Defining this function is equivalent to defining the backward() function.)

It must accept a context ctx as the first argument, followed by as many outputs as forward() returned (None will be passed in for non-tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
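A small sketch of how ctx.needs_input_grad can be used to skip gradients that nobody asked for. The Mul class and the seen_flags list are illustrative names introduced here; seen_flags only records the flags so they can be inspected afterwards.

```python
import torch

# Records the needs_input_grad tuples observed in backward (illustration only).
seen_flags = []


class Mul(torch.autograd.Function):
    @staticmethod
    def forward(ctx, a, b):
        ctx.save_for_backward(a, b)
        return a * b

    @staticmethod
    def backward(ctx, grad_output):
        a, b = ctx.saved_tensors
        seen_flags.append(tuple(ctx.needs_input_grad))
        grad_a = grad_b = None
        # Only compute the gradients that are actually required.
        if ctx.needs_input_grad[0]:
            grad_a = grad_output * b
        if ctx.needs_input_grad[1]:
            grad_b = grad_output * a
        return grad_a, grad_b


a = torch.tensor([1.0, 2.0], dtype=torch.double, requires_grad=True)
b = torch.tensor([3.0, 4.0], dtype=torch.double)  # does not require grad

Mul.apply(a, b).sum().backward()
print(a.grad, b.grad, seen_flags)
```

Returning None for inputs that do not need a gradient avoids unnecessary work, which matters when each gradient is expensive.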
- static vmap(info, in_dims, *args)
Define the behavior for this autograd.Function underneath torch.vmap().

For a torch.autograd.Function() to support torch.vmap(), you must either override this static method, or set generate_vmap_rule to True (you may not do both).

If you choose to override this staticmethod, it must accept:

- an info object as the first argument. info.batch_size specifies the size of the dimension being vmapped over, while info.randomness is the randomness option passed to torch.vmap().
- an in_dims tuple as the second argument. For each arg in args, in_dims has a corresponding Optional[int]. It is None if the arg is not a Tensor or if the arg is not being vmapped over; otherwise, it is an integer specifying what dimension of the Tensor is being vmapped over.
- *args, which is the same as the args to forward().

The return of the vmap staticmethod is a tuple of (output, out_dims). Similar to in_dims, out_dims should be of the same structure as output and contain one out_dim per output that specifies if the output has the vmapped dimension and what index it is in.

Please see func-autograd-function for more details.
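The override described above can be sketched with a simple elementwise Function. The Scale class is a generic illustration that assumes PyTorch 2.0 or later and the setup_context style of forward; it is not how the Psi4 wrapper handles vmap.

```python
import torch


class Scale(torch.autograd.Function):
    @staticmethod
    def forward(x, factor):
        return x * factor

    @staticmethod
    def setup_context(ctx, inputs, output):
        _, factor = inputs
        ctx.factor = factor

    @staticmethod
    def backward(ctx, grad_output):
        # No gradient for the non-tensor factor.
        return grad_output * ctx.factor, None

    @staticmethod
    def vmap(info, in_dims, x, factor):
        # in_dims mirrors the arguments; it is None for the float factor.
        x_bdim, _ = in_dims
        # Move the vmapped dimension to the front, then reuse the Function.
        x = x.movedim(x_bdim, 0)
        output = Scale.apply(x, factor)
        # The single output carries the vmapped dimension at index 0.
        return output, 0


x = torch.arange(12, dtype=torch.double).reshape(3, 4)
# Map over dimension 1 of x; the factor is shared by all slices.
out = torch.vmap(Scale.apply, in_dims=(1, None))(x, 2.0)
print(out.shape)
```

For Functions built only from PyTorch operations, setting generate_vmap_rule to True is the simpler alternative to writing the rule by hand.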