The RankEstimator class

Aliases

halerium.RankEstimator
halerium.core.RankEstimator
halerium.core.objectives.RankEstimator

class RankEstimator(graph, data, compiler=None, n_samples=1000, method='upsampled', name='RankEstimator', description=None, copy_graph=True)

The rank estimator class.

This class estimates the rank of each data point in a data set for a given graph. The rank of a data point is defined as the fraction of a random sample expected to have the same or lower probability than that data point, so rank values range between 0 and 1. Very unlikely events have a rank close to 0; the most likely events have a rank close to 1.
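The definition above can be illustrated outside of halerium with a small Monte Carlo sketch. It assumes a single variable with mean 0 and variance 1 and uses only NumPy. The helper names `standard_normal_pdf` and `estimate_rank` are hypothetical, not part of the halerium API, and this is not necessarily how RankEstimator computes ranks internally.

```python
import numpy as np


def standard_normal_pdf(x):
    """Density of a normal distribution with mean 0 and variance 1."""
    return np.exp(-0.5 * np.square(x)) / np.sqrt(2.0 * np.pi)


def estimate_rank(data, n_samples=1000, seed=0):
    """Fraction of random samples whose density is <= that of each data point.

    Hypothetical helper for illustration only; not a halerium function.
    """
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal(n_samples)
    sample_density = standard_normal_pdf(samples)
    data_density = standard_normal_pdf(np.asarray(data, dtype=float))
    # Compare every data point (rows) with every sample (columns).
    return np.mean(sample_density[None, :] <= data_density[:, None], axis=1)


ranks = estimate_rank([-1.0, 0.0, 1.0], n_samples=100_000)
```

With enough samples, the estimates approach roughly 0.32 for the points at -1 and 1 (the probability that a standard-normal draw lies at least one standard deviation from the mean) and 1 for the point at 0, the mode of the distribution.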

Parameters:
  • graph (halerium.core.Graph) – The graph that defines the dependencies and probabilities of the variables.

  • data (dict, halerium.core.DataLinker) – The data for which to estimate ranks. Either a dictionary with variables as keys and data arrays as values, or a DataLinker holding links to the variables in graph.

  • compiler (halerium.core.compiler.compiler_base.CompilerBase, optional) – The backend compiler to be used. The default is the TensorFlow compiler.

  • n_samples (int, optional) – The number of samples used to estimate the probabilities. The default is 1000.

  • method (str, optional) – The method with which the probability is estimated. Either “marginalized” or “upsampled”. “marginalized” marginalizes the missing values, so that the result represents a probability density only over the variables that have data. “upsampled” samples the missing values, so that the result represents a probability density over all variables. The “marginalized” method is slower and more memory intensive. The default is “upsampled”.

  • name (str, optional) – The name of the objective.

  • description (str, optional) – The description of the objective.

  • copy_graph (bool, optional) – Whether the objective should make a copy of the graph for its own use, or just keep the graph itself as an attribute. Leave this set to the default True unless you are certain that the graph won’t be altered by you or by other code.
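As a rough guide for choosing n_samples, the statistical uncertainty of a rank can be estimated with the binomial standard error. This sketch assumes that a rank is estimated as the fraction of n_samples independent samples, which is an assumption about the estimator rather than a documented guarantee. The helper `rank_standard_error` is hypothetical and not part of halerium.

```python
import math


def rank_standard_error(rank, n_samples):
    """Binomial standard error of a fraction estimated from n_samples draws.

    Hypothetical helper; assumes independent samples.
    """
    return math.sqrt(rank * (1.0 - rank) / n_samples)


# Uncertainty of a rank near 0.32 for the default and for a larger sample size.
err_default = rank_standard_error(0.32, 1000)
err_large = rank_standard_error(0.32, 100_000)
```

Under this assumption the uncertainty shrinks with the square root of n_samples, so a hundredfold increase in samples reduces the error by a factor of ten.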

Examples

A call of the instance with the graph as argument returns the ranks of all data points.

>>> from halerium.core import Graph, Variable, RankEstimator
>>> g = Graph("g")
>>> with g:
...     Variable("v", mean=0, variance=1)
>>> g_data = {g.v: [-1., 0., 1.]}
>>> rank_estimator = RankEstimator(g, data=g_data)
>>> rank_estimator(g)
array([0.32, 1, 0.32])

__call__(fetches=None)

Get rank.

Returns the rank of each element in fetches for each data point.

Parameters:

fetches (halerium.core.scope.Scopetor, dict, list or tuple, optional) – The scopetors for which to return the ranks. If no fetches are provided, the default is to return estimates for the graph itself and all its elements.

Returns:

ranks – The rank for each element in fetches.

Return type:

array, dict, list or tuple

dump_dict(value_postprocessor=None)

Dump a dict with information on the objective.

The dict returned contains the name, description and the values resulting from a call of the objective. Additional keys included are used by the GUI for appropriately displaying the results of the objective.

Parameters:

value_postprocessor (optional) – A function to apply to the values returned by the call of the objective. The default is None, in which case no post-processing is done.

Returns:

result – A dictionary containing the name, description, etc. of the objective.

Return type:

dict
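The role of value_postprocessor can be sketched with a toy stand-in. The function `dump_dict_sketch` and the key name "values" are assumptions made for illustration, and the real dump_dict may lay out its dictionary differently and includes additional keys for the GUI. The sketch only shows the idea of applying a user-supplied function to the objective's values before they are stored, here to make NumPy results JSON-serializable.

```python
import json

import numpy as np


def dump_dict_sketch(name, description, values, value_postprocessor=None):
    """Toy stand-in for dump_dict: collect metadata and optionally post-process values.

    Hypothetical helper; the key names are assumptions, not the halerium layout.
    """
    if value_postprocessor is not None:
        values = value_postprocessor(values)
    return {"name": name, "description": description, "values": values}


# Convert the NumPy array to a plain list so the dict can be serialized to JSON.
result = dump_dict_sketch(
    name="RankEstimator",
    description=None,
    values=np.array([0.32, 1.0, 0.32]),
    value_postprocessor=lambda v: v.tolist(),
)
serialized = json.dumps(result)
```

Without a post-processor, the values would stay a NumPy array, which `json.dumps` cannot serialize directly.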