tfep.nn.transformers.moebius.MoebiusTransformer

class tfep.nn.transformers.moebius.MoebiusTransformer(dimension: int, max_radius: float = 0.99, unit_sphere: bool = False)[source]

Bases: MAFTransformer

Moebius transformer.

This implements a generalization of the Moebius transformation proposed in [1, 2] to non-unit spheres. The transformer will expand/contract the distribution on the sphere of radius \(r\), where \(r\) is the norm of the input vector.

The transformation has the form

\(y = \frac{||x||^2 - ||w||^2}{||x - w||^2} (x - w) - w\)

where \(y, x, w\) are all vectors with `dimension` components and \(||w|| < ||x||\). The function automatically rescales the w argument, following the same strategy as in [2], so that this condition on the norm is satisfied. Consequently, w vectors of any norm can be passed.

The implementation of the transformation on the unit sphere is slightly more efficient and can be toggled with the unit_sphere argument.
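As an illustration of the formula above, the following pure-Python sketch applies it to a single vector and checks that the output stays on the sphere of radius \(||x||\). The helper names `moebius` and `norm` are hypothetical and are not part of tfep; w is assumed to already satisfy \(||w|| < ||x||\), so no rescaling is performed here.

```python
import math


def moebius(x, w):
    """Apply y = (|x|^2 - |w|^2) / |x - w|^2 * (x - w) - w to one vector.

    Hypothetical helper for illustration; assumes |w| < |x|.
    """
    x_norm2 = sum(xi * xi for xi in x)
    w_norm2 = sum(wi * wi for wi in w)
    diff = [xi - wi for xi, wi in zip(x, w)]
    diff_norm2 = sum(d * d for d in diff)
    scale = (x_norm2 - w_norm2) / diff_norm2
    return [scale * d - wi for d, wi in zip(diff, w)]


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


x = [1.0, 2.0, 2.0]    # |x| = 3
w = [0.5, -1.0, 0.25]  # |w| is about 1.15, which is smaller than |x|
y = moebius(x, w)

# y differs from x, but both norms agree up to floating-point error.
print(norm(x), norm(y))
```

The point moves along the sphere while its radius is unchanged, which is what "expand/contract the distribution on the sphere of radius \(r\)" refers to.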

References

[1] Kato S, McCullagh P. Moebius transformation and a Cauchy family: on the sphere. arXiv preprint arXiv:1510.07679. 2015 Oct 26.
[2] Rezende DJ, Papamakarios G, Racanière S, Albergo MS, Kanwar G, Shanahan PE, Cranmer K. Normalizing Flows on Tori and Spheres. arXiv preprint arXiv:2002.02428. 2020 Feb 6.

__init__(dimension: int, max_radius: float = 0.99, unit_sphere: bool = False)[source]

Constructor.

Parameters:

dimension (int) – The dimensionality of the vectors in x and w.
max_radius (float) – Must be strictly less than 1. The w vectors are rescaled so that their maximum norm is max_radius * |x|.
unit_sphere (bool) – If True, the input vectors x are assumed to be on the unit sphere, which makes the implementation slightly faster.
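To make the role of max_radius concrete, here is a minimal sketch of one rescaling that enforces the bound. The form `max_radius * |x| / (1 + |w|) * w` is an assumption modeled on the rescaling described in [2]; it is not taken from the tfep source, and the helper name `rescale_w` is hypothetical.

```python
import math


def rescale_w(w, x_norm, max_radius=0.99):
    """Map an unconstrained w to a vector with |w| < max_radius * x_norm.

    Assumed form, modeled on [2]; tfep's actual rescaling may differ.
    """
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    factor = max_radius * x_norm / (1.0 + w_norm)
    return [factor * wi for wi in w]


x_norm = 2.0
for w in ([0.0, 0.0], [0.3, -0.4], [300.0, -400.0]):
    w_scaled = rescale_w(w, x_norm)
    w_scaled_norm = math.sqrt(sum(v * v for v in w_scaled))
    # The rescaled norm stays below max_radius * x_norm = 1.98 for any input.
    print(w, "->", w_scaled_norm)
```

Under this assumed form, the rescaled norm equals `max_radius * |x| * |w| / (1 + |w|)`, which approaches but never reaches `max_radius * |x|`, and a zero vector stays zero.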

Methods

`__init__`(dimension[, max_radius, unit_sphere])	Constructor.
`add_module`(name, module)	Add a child module to the current module.
`apply`(fn)	Apply `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`bfloat16`()	Casts all floating point parameters and buffers to `bfloat16` datatype.
`buffers`([recurse])	Return an iterator over module buffers.
`children`()	Return an iterator over immediate children modules.
`compile`(*args, **kwargs)	Compile this Module's forward using `torch.compile()`.
`cpu`()	Move all model parameters and buffers to the CPU.
`cuda`([device])	Move all model parameters and buffers to the GPU.
`double`()	Casts all floating point parameters and buffers to `double` datatype.
`eval`()	Set the module in evaluation mode.
`extra_repr`()	Set the extra representation of the module.
`float`()	Casts all floating point parameters and buffers to `float` datatype.
`forward`(x, parameters)	Apply the transformation.
`get_buffer`(target)	Return the buffer given by `target` if it exists, otherwise throw an error.
`get_degrees_out`(degrees_in)	Returns the degrees associated to the conditioner's output.
`get_extra_state`()	Return any extra state to include in the module's state_dict.
`get_identity_parameters`(n_features)	Return the value of the parameters that makes this the identity function.
`get_parameter`(target)	Return the parameter given by `target` if it exists, otherwise throw an error.
`get_submodule`(target)	Return the submodule given by `target` if it exists, otherwise throw an error.
`half`()	Casts all floating point parameters and buffers to `half` datatype.
`inverse`(y, parameters)	Reverse the transformation.
`ipu`([device])	Move all model parameters and buffers to the IPU.
`load_state_dict`(state_dict[, strict, assign])	Copy parameters and buffers from `state_dict` into this module and its descendants.
`modules`()	Return an iterator over all modules in the network.
`mtia`([device])	Move all model parameters and buffers to the MTIA.
`named_buffers`([prefix, recurse, ...])	Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`()	Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`([memo, prefix, remove_duplicate])	Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`([prefix, recurse, ...])	Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`parameters`([recurse])	Return an iterator over module parameters.
`register_backward_hook`(hook)	Register a backward hook on the module.
`register_buffer`(name, tensor[, persistent])	Add a buffer to the module.
`register_forward_hook`(hook, *[, prepend, ...])	Register a forward hook on the module.
`register_forward_pre_hook`(hook, *[, ...])	Register a forward pre-hook on the module.
`register_full_backward_hook`(hook[, prepend])	Register a backward hook on the module.
`register_full_backward_pre_hook`(hook[, prepend])	Register a backward pre-hook on the module.
`register_load_state_dict_post_hook`(hook)	Register a post-hook to be run after module's `load_state_dict()` is called.
`register_load_state_dict_pre_hook`(hook)	Register a pre-hook to be run before module's `load_state_dict()` is called.
`register_module`(name, module)	Alias for `add_module()`.
`register_parameter`(name, param)	Add a parameter to the module.
`register_state_dict_post_hook`(hook)	Register a post-hook for the `state_dict()` method.
`register_state_dict_pre_hook`(hook)	Register a pre-hook for the `state_dict()` method.
`requires_grad_`([requires_grad])	Change if autograd should record operations on parameters in this module.
`set_extra_state`(state)	Set extra state contained in the loaded state_dict.
`set_submodule`(target, module)	Set the submodule given by `target` if it exists, otherwise throw an error.
`share_memory`()	See `torch.Tensor.share_memory_()`.
`state_dict`(*args[, destination, prefix, ...])	Return a dictionary containing references to the whole state of the module.
`to`(*args, **kwargs)	Move and/or cast the parameters and buffers.
`to_empty`(*, device[, recurse])	Move the parameters and buffers to the specified device without copying storage.
`train`([mode])	Set the module in training mode.
`type`(dst_type)	Casts all parameters and buffers to `dst_type`.
`xpu`([device])	Move all model parameters and buffers to the XPU.
`zero_grad`([set_to_none])	Reset gradients of all model parameters.

Attributes

`T_destination`
`call_super_init`
`dump_patches`
`training`

forward(x: Tensor, parameters: Tensor) → tuple[Tensor][source]

Apply the transformation.

Parameters:

x (torch.Tensor) – Shape (batch_size, n_vectors*dimension). Contiguous elements of each row of x are interpreted as vectors (i.e., the first and second input vectors are x[:, :dimension] and x[:, dimension:2*dimension]).
parameters (torch.Tensor) – Shape (batch_size, n_vectors*dimension). The transformation parameters. These parameter vectors are automatically rescaled so that |w| < |x|.

Returns:

y (torch.Tensor) – Shape (batch_size, n_vectors*dimension). The transformed vectors.
log_det_J (torch.Tensor) – Shape (batch_size,). The logarithm of the absolute value of the Jacobian determinant dy / dx.
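The flattened layout described above can be sketched in pure Python for a single batch row. The helpers `split_vectors` and `moebius` are hypothetical and not part of tfep; the parameter vectors are assumed to already satisfy |w| < |x|, and the log-determinant is not computed here.

```python
def split_vectors(row, dimension):
    """Split one flattened row of length n_vectors*dimension into vectors."""
    if len(row) % dimension != 0:
        raise ValueError("row length must be a multiple of dimension")
    return [row[i:i + dimension] for i in range(0, len(row), dimension)]


def moebius(x, w):
    """Hypothetical helper: the Moebius formula for one vector, |w| < |x|."""
    x_norm2 = sum(xi * xi for xi in x)
    w_norm2 = sum(wi * wi for wi in w)
    diff = [xi - wi for xi, wi in zip(x, w)]
    diff_norm2 = sum(d * d for d in diff)
    scale = (x_norm2 - w_norm2) / diff_norm2
    return [scale * d - wi for d, wi in zip(diff, w)]


dimension = 3
x_row = [1.0, 2.0, 2.0, 0.0, 3.0, 4.0]     # two 3-dimensional vectors
w_row = [0.5, -1.0, 0.25, 0.0, 0.0, 0.0]   # one parameter vector per input vector

# Transform each vector with its own parameter vector, then flatten again.
y_row = []
for x, w in zip(split_vectors(x_row, dimension), split_vectors(w_row, dimension)):
    y_row.extend(moebius(x, w))

print(y_row)
```

Each input vector is paired with the parameter vector occupying the same slice, so the output row has the same length and layout as the input row.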

get_degrees_out(degrees_in: Tensor) → Tensor[source]

Returns the degrees associated to the conditioner’s output.

Parameters:

degrees_in (torch.Tensor) – Shape (n_transformed_features,). The autoregressive degrees associated to the features provided as input to the transformer.

Returns:

degrees_out (torch.Tensor) – Shape (n_parameters,). The autoregressive degrees associated to each output of the conditioner that will be fed to the transformer as parameters.

get_identity_parameters(n_features: int) → Tensor[source]

Return the value of the parameters that makes this the identity function.

This can be used to initialize the normalizing flow to perform the identity transformation.

Parameters:

n_features (int) – The dimension of the input vector passed to the transformer.

Returns:

parameters (torch.Tensor) – A tensor of shape (n_features,) representing the parameter vector to perform the identity function with a Moebius transformer.
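In the formula given in the class description, setting w = 0 yields y = x, so an all-zero parameter vector is a natural candidate for the identity parameters. The sketch below checks this for the formula only; that get_identity_parameters returns zeros is an assumption, and `moebius` is a hypothetical helper rather than tfep code.

```python
def moebius(x, w):
    """Hypothetical helper: the Moebius formula for one vector, |w| < |x|."""
    x_norm2 = sum(xi * xi for xi in x)
    w_norm2 = sum(wi * wi for wi in w)
    diff = [xi - wi for xi, wi in zip(x, w)]
    diff_norm2 = sum(d * d for d in diff)
    scale = (x_norm2 - w_norm2) / diff_norm2
    return [scale * d - wi for d, wi in zip(diff, w)]


n_features = 3
identity_w = [0.0] * n_features  # assumed identity parameters: w = 0

x = [1.0, -2.0, 0.5]
y = moebius(x, identity_w)

# With w = 0 the scale factor is |x|^2 / |x|^2 = 1, so y reproduces x.
print(x, y)
```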

inverse(y: Tensor, parameters: Tensor) → tuple[Tensor][source]

Reverse the transformation.

Parameters:

y (torch.Tensor) – Shape (batch_size, n_vectors*dimension). Contiguous elements of each row of y are interpreted as vectors (i.e., the first and second input vectors are y[:, :dimension] and y[:, dimension:2*dimension]).
parameters (torch.Tensor) – Shape (batch_size, n_vectors*dimension). The transformation parameters. These parameter vectors are automatically rescaled so that |w| < |y|.

Returns:

x (torch.Tensor) – Shape (batch_size, n_vectors*dimension). The inverse-transformed vectors.
log_det_J (torch.Tensor) – Shape (batch_size,). The logarithm of the absolute value of the Jacobian determinant dx / dy.
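For the formula in the class description, the transformation with parameter -w undoes the transformation with parameter w; because the norm is preserved, the condition |w| < |y| is equivalent to |w| < |x|. The sketch below checks the round trip numerically. It illustrates this mathematical property only: `moebius` and `moebius_inverse` are hypothetical helpers, and tfep's inverse may be computed differently.

```python
def moebius(x, w):
    """Hypothetical helper: the Moebius formula for one vector, |w| < |x|."""
    x_norm2 = sum(xi * xi for xi in x)
    w_norm2 = sum(wi * wi for wi in w)
    diff = [xi - wi for xi, wi in zip(x, w)]
    diff_norm2 = sum(d * d for d in diff)
    scale = (x_norm2 - w_norm2) / diff_norm2
    return [scale * d - wi for d, wi in zip(diff, w)]


def moebius_inverse(y, w):
    """Invert moebius(x, w) by applying the same formula with -w."""
    return moebius(y, [-wi for wi in w])


x = [1.0, 2.0, 2.0]
w = [0.5, -1.0, 0.25]  # assumed already rescaled so that |w| < |x|

y = moebius(x, w)
x_back = moebius_inverse(y, w)

# x_back matches x up to floating-point error.
print(x, x_back)
```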