The gaussian_process_regression function

Aliases

halerium.core.regression.gaussian_process_regression
gaussian_process_regression(name, operands, result_shape, operands_location=None, operands_scale=None, result_location=None, result_scale=None, return_scale_field=False, interpolation_points=100, interpolation_range=4.0, kernel_correlation_length=0.4, kernel_size=21)

Gaussian process regression.

Calculate mean (and variance) fields to be used for Gaussian process regression.

The mean (and variance) are calculated by applying a set of 1-d functions element-wise to each entry of the combined operands. The 1-d functions are defined by 1-d Gaussian random fields on a regularly spaced grid, and each function is evaluated by linear interpolation between the adjacent grid points. The total support of the 1-d functions is approximately the interval [-interpolation_range*operand_scale, interpolation_range*operand_scale]. Evaluation points outside these bounds yield the same value as an evaluation at the nearest bound.
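The interpolation and clamping behaviour described above can be illustrated with plain NumPy. This is a minimal sketch and not halerium's implementation: the helper name `evaluate_field`, the symmetric grid built with `np.linspace`, and the use of `np.interp` are assumptions made for illustration only.

```python
import numpy as np

def evaluate_field(x, field_values, interpolation_range=4.0, operand_scale=1.0):
    """Evaluate a 1-d function given by values on a regular grid (illustrative helper)."""
    bound = interpolation_range * operand_scale
    # Assumed grid: regularly spaced points spanning [-bound, bound].
    grid = np.linspace(-bound, bound, len(field_values))
    # np.interp interpolates linearly between adjacent grid points and
    # returns the boundary value for points outside the grid.
    return np.interp(x, grid, field_values)

# Toy field on 5 grid points located at -4, -2, 0, 2, 4.
field_values = np.linspace(0.0, 1.0, 5)
values = evaluate_field(np.array([-10.0, 0.0, 1.0, 10.0]), field_values)
print(values)
```

In this toy case the points at -10 and 10 lie outside the grid and receive the values of the first and last grid point, respectively.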

The mean is then calculated by summing over the element-wise function applications for each entry in operands, yielding an array of shape result_shape.

For the variances, a separate set of 1-d functions is used. Their results are also combined into an operator of shape result_shape, but the reduction function is sum(exp(…)).
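The two reductions can be sketched with NumPy as follows. This is an illustration under stated assumptions: `contributions` and `log_contributions` are hypothetical arrays of already-evaluated 1-d function values, one per pair of operand entry and result entry, and the exponential is assumed to act on those values directly, which the text does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)
input_shape, result_shape = (3,), (2,)

# Hypothetical evaluated function values with shape (*input_shape, *result_shape).
contributions = rng.normal(size=input_shape + result_shape)
log_contributions = rng.normal(size=input_shape + result_shape)

# Reduce over the operand axes so that only result_shape remains.
input_axes = tuple(range(len(input_shape)))
mean = contributions.sum(axis=input_axes)                   # plain sum
variance = np.exp(log_contributions).sum(axis=input_axes)   # sum(exp(...))

print(mean.shape, variance.shape)
```

One consequence of the second reduction is that the result is positive regardless of the sign of the underlying function values, which is what a variance requires.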

The 1-d fields are created as StaticVariables, which are placed into the sub-entities mean and log_variance, which are in turn placed into an Entity with the provided name. Two StaticVariables are created in each sub-entity: source_field and field. They are related by a convolution with the provided kernel: field = convolve(source_field, kernel). The fields are of shape (*input_op.shape, *result_shape, length), which means that prod((*input_op.shape, *result_shape)) 1-d fields are created.
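The relation between source_field and field, and the count of 1-d fields, can be sketched in NumPy. This is not halerium's code: the Gaussian kernel shape, its normalization, the grid spacing, and the `mode="same"` boundary handling are all assumptions chosen for illustration.

```python
import numpy as np

interpolation_points = 100
interpolation_range = 4.0
kernel_correlation_length = 0.4
kernel_size = 21

# Assumed grid spacing in units of the operand scale.
spacing = 2 * interpolation_range / interpolation_points
# Assumed kernel: a truncated Gaussian, normalized to unit sum.
offsets = (np.arange(kernel_size) - (kernel_size - 1) / 2) * spacing
kernel = np.exp(-0.5 * (offsets / kernel_correlation_length) ** 2)
kernel /= kernel.sum()

# An uncorrelated source field becomes a smooth, correlated field.
rng = np.random.default_rng(0)
source_field = rng.normal(size=interpolation_points)
field = np.convolve(source_field, kernel, mode="same")

# Number of 1-d fields for an operand of shape (3,) and result_shape (2,).
input_shape, result_shape = (3,), (2,)
n_fields = int(np.prod(input_shape + result_shape))
print(field.shape, n_fields)
```

Here a single operand with three entries and a result with two entries gives six 1-d fields, each of the same length as the grid.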

Parameters:
  • name (str) – The name to be given to the entity containing the function fields.

  • operands (list, tuple, set, Operator) – The operand or list of operands of the regression. Can also be a set of operands if neither operands_location nor operands_scale is given.

  • result_shape (tuple) – The shape of the result of the regression.

  • operands_location (optional) – The (list of) location(s) for scaling the operand(s), which will be subtracted from the operand(s). The default is None (no subtraction).

  • operands_scale (optional) – The (list of) scale(s) for scaling the operand(s), which will divide the operand(s) after subtracting any location parameter(s). The default is None (no division).

  • result_location (optional) – The location for unscaling the result. This will be added to the result of the regression (after multiplying by any scale parameter). The default is None (no addition).

  • result_scale (optional) – The scale for unscaling the result. The result of the regression will be multiplied by this. The default is None (no multiplication).

  • return_scale_field (bool, optional) – Whether scale fields (for variance or similar) are created and evaluated. If False, the second return element is None. The default is False.

  • interpolation_points (int) – The number of support points to be used for the function fields.

  • interpolation_range (float) – The support range of the function field w.r.t. the operand scales. The effective support range is the interval [-interpolation_range * op_scale, interpolation_range * op_scale * (length/2-1)/(length/2)].

  • kernel_correlation_length (float) – The kernel correlation length. It defines the scale of the (auto-)correlation function between the points of the function fields w.r.t. the operand scale. For example, a kernel_correlation_length of 0.4 means an effective correlation length of operand_scale*0.4.

  • kernel_size (int) – The size of the convolution kernel used to apply the correlation structure. Must be smaller than interpolation_points. Effectively, the kernel is truncated beyond a distance of (kernel_size-1)/2 grid points.
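The scaling conventions and derived quantities in the parameter list above can be worked through in a few lines of Python. The helpers `scale_operand`, `unscale_result`, and `derived_quantities` are hypothetical names introduced for this sketch and are not part of halerium. The sketch also assumes that `length` in the support-range formula equals interpolation_points, which the text does not state.

```python
def scale_operand(x, location=None, scale=None):
    """Subtract the location first, then divide by the scale (None means skip)."""
    if location is not None:
        x = x - location
    if scale is not None:
        x = x / scale
    return x

def unscale_result(y, location=None, scale=None):
    """Multiply by the scale first, then add the location (None means skip)."""
    if scale is not None:
        y = y * scale
    if location is not None:
        y = y + location
    return y

def derived_quantities(op_scale=1.0, interpolation_points=100,
                       interpolation_range=4.0,
                       kernel_correlation_length=0.4, kernel_size=21):
    length = interpolation_points  # assumption, see lead-in
    lower = -interpolation_range * op_scale
    upper = interpolation_range * op_scale * (length / 2 - 1) / (length / 2)
    return {
        "support": (lower, upper),
        "correlation_length": op_scale * kernel_correlation_length,
        "kernel_half_width": (kernel_size - 1) // 2,  # in grid points
    }

print(scale_operand(12.0, location=10.0, scale=4.0))
print(unscale_result(0.5, location=10.0, scale=4.0))
print(derived_quantities(op_scale=2.0))
```

With the default kernel_size of 21, the kernel reaches 10 grid points to either side of its centre.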

Returns:

  • location_value (Operator) – An operator of shape result_shape. The location values (i.e. mean or similar) are calculated by summing over the contributions of all entries in operands.

  • scale_value (Operator or None) – If return_scale_field is False, None is returned. Otherwise, it is an operator of shape result_shape. The scale values (i.e. variance or similar) are calculated by summing over the contributions of all entries in operands.

  • parameters (Entity) – The Entity containing the StaticVariables that define the function fields.