.. currentmodule:: halerium

================================
High-level usage - Code Overview
================================

At a high level, using Halerium involves creating,
training, and evaluating causal structures.

Causal Structures
=================

A :class:`~CausalStructure` is a collection of dependencies between parameters,
with each parameter being identified by a string. The most convenient way to use
a causal structure is in conjunction with a pandas DataFrame, where the parameter names
of the causal structure match the column names of the DataFrame.

For example: ::

    >>> data = pandas.DataFrame(columns=["a", "b", "c", "d"])
    >>> CausalStructure([[{"a", "b"}, {"c", "d"}]])
    CausalStructure([[{'a', 'b'}, 'c'],
                     [{'a', 'b'}, 'd']])

Here the causal structure defines that c and d each depend on a and b.
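
The printed representation above shows one dependency per target, each
keeping the full feature set. The following pure-Python sketch mimics that
expansion. The helper ``expand_dependencies`` is hypothetical and is not part
of Halerium; it only illustrates the idea.

.. code-block:: python

    def expand_dependencies(dependencies):
        """Split [features, targets] pairs into one (features, target) pair per target."""
        expanded = []
        for features, targets in dependencies:
            # Treat a bare string as a single parameter name.
            if isinstance(features, str):
                features = {features}
            if isinstance(targets, str):
                targets = {targets}
            # Sort the targets so that the result is deterministic.
            for target in sorted(targets):
                expanded.append((frozenset(features), target))
        return expanded


    pairs = expand_dependencies([[{"a", "b"}, {"c", "d"}]])
    for features, target in pairs:
        print(sorted(features), "->", target)

Running the sketch prints one line for ``c`` and one for ``d``, each with the
features ``a`` and ``b``.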

Internally, a causal structure contains a :class:`~core.Dependencies` instance.
The causal structure converts its :class:`~core.Dependencies` instance to the necessary
core-objects (namely the :class:`~core.Graph`) to evaluate various
objectives.

Dependencies
------------

Internally, causal structures utilize instances of :class:`~core.Dependency`
and :class:`~core.Dependencies`. These classes manage the dependencies that make up
the causal structure and are in charge of checking that the dependency graph is
acyclic, i.e. that following a chain of dependencies cannot lead back to the
first parameter in the chain. ::

    >>> Dependency(feature="a", target="a")
    CyclicDependencyError: Cyclic dependency detected for 'a'.

    >>> Dependencies([["a", "b"],
    ...               ["b", "c"],
    ...               ["c", "a"]])
    CyclicDependencyError: Cyclic dependency detected for {'b'}.
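
The acyclicity requirement can be illustrated without Halerium. The sketch
below uses ``graphlib.TopologicalSorter`` from the Python standard library
(Python 3.9 or later) to detect the same kinds of cycles. The helper
``is_acyclic`` is hypothetical and says nothing about how Halerium performs
its own check.

.. code-block:: python

    from graphlib import CycleError, TopologicalSorter


    def is_acyclic(dependencies):
        """Return True if the (feature, target) pairs contain no cycle."""
        predecessors = {}
        for feature, target in dependencies:
            # Each target lists the features it depends on.
            predecessors.setdefault(target, set()).add(feature)
        try:
            # prepare() raises CycleError if the graph contains a cycle.
            TopologicalSorter(predecessors).prepare()
        except CycleError:
            return False
        return True


    print(is_acyclic([("a", "b"), ("b", "c")]))              # True
    print(is_acyclic([("a", "b"), ("b", "c"), ("c", "a")]))  # False
    print(is_acyclic([("a", "a")]))                          # False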

The user does not have to create dependencies explicitly. The ``dependencies`` argument to the ``__init__``
of the :class:`~CausalStructure` class is used to create the :class:`~core.Dependencies`
instance automatically. The user can, however, also create the :class:`~core.Dependencies`
instance themselves: ::

    >>> dependencies = Dependencies([[{"a", "b"}, {"c", "d"}]])
    >>> causal_structure = CausalStructure(dependencies)
    >>> causal_structure
    CausalStructure([[{'a', 'b'}, 'c'],
                     [{'a', 'b'}, 'd']])

Basic Methods
-------------

The most important methods of the :class:`~CausalStructure` class are the following:


:meth:`~CausalStructure.train`: This method trains the causal structure (or rather
its internal :class:`~core.Graph`) with a training data set. After training, the
causal structure can be used to make predictions or to evaluate objectives. ::

    >>> data = pandas.DataFrame(columns=["a", "b", "c", "d"],
    ...                         data=[[0, 0,  0, 0],
    ...                               [1, 0,  1, 2],
    ...                               [0, 1, -2, 1],
    ...                               [1, 1, -1, 3]])
    >>> causal_structure.train(data)

:meth:`~CausalStructure.predict`: This method makes a prediction using the
internal trained graph and an input data set. ::

    >>> data_in = pandas.DataFrame(columns=["a", "b"],
    ...                            data=[[ -1,  -1],
    ...                                  [0.5, 0.5]])
    >>> causal_structure.predict(data_in)
         a    b         c         d
    0 -1.0 -1.0  0.976529 -2.953433
    1  0.5  0.5 -0.476481  1.490052
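
As a plausibility check, note that the four training rows above satisfy
``c = a - 2*b`` and ``d = 2*a + b`` exactly. Evaluating these two relations at
the prediction inputs gives values close to the predicted columns. The sketch
below does this arithmetic in plain Python. It is only a sanity check and is
not how Halerium computes its predictions.

.. code-block:: python

    # Training rows (a, b, c, d) copied from the training example above.
    train_rows = [(0, 0, 0, 0), (1, 0, 1, 2), (0, 1, -2, 1), (1, 1, -1, 3)]

    # Both relations hold exactly for every training row.
    assert all(c == a - 2 * b and d == 2 * a + b for a, b, c, d in train_rows)

    # Apply the same relations to the prediction inputs (a, b).
    inputs = [(-1, -1), (0.5, 0.5)]
    expected = [(a - 2 * b, 2 * a + b) for a, b in inputs]
    print(expected)  # [(1, -3), (-0.5, 1.5)]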


:meth:`~CausalStructure.evaluate_objective`: This method evaluates an objective
class using the internal trained graph and additional arguments to the
objective class. See the `Objectives`_ section.


Advanced Methods
----------------

The :class:`~CausalStructure` class offers a number of advanced methods
that allow the user to influence how the internal graph is built or to
modify and/or utilize the graph with the :ref:`core package<core overview>`.
The most important of these methods are the following:


:meth:`~CausalStructure.build_graph`: This method converts the dependencies into
a :class:`~core.Graph` instance (see the core-package for details).
The method is called automatically when the :meth:`~CausalStructure.get_graph` or
:meth:`~CausalStructure.train` methods are called. Calling it explicitly allows the
user to modify the build arguments.


:meth:`~CausalStructure.get_graph`: This method returns the :class:`~core.Graph` instance
that was created from the dependencies. If no graph has been built yet,
:meth:`~CausalStructure.build_graph` is called first.
The user can modify the returned graph in-place using the
core-package. Alternatively, a modified graph can be used to replace
the :attr:`~CausalStructure.graph` attribute.


:meth:`~CausalStructure.get_trained_graph`: This method returns the :class:`~core.Graph` instance
that was created by the :meth:`~CausalStructure.train` method.
If no training has taken place yet, an exception is raised.
The user can modify the returned graph in-place using the
core-package. Alternatively, a modified graph can be used to replace
the :attr:`~CausalStructure.trained_graph` attribute.


:meth:`~CausalStructure.get_data_linker`: This method creates a
:class:`~core.DataLinker` instance (see the core-package documentation
for details) compatible with the internal graph from a provided data set.
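
The relationship between ``build_graph`` and ``get_graph`` described above is
a build-on-first-use pattern. The sketch below shows that pattern with a
hypothetical stand-in class. ``LazyStructure``, its ``scale`` build argument,
and the dictionary standing in for a graph are illustrative assumptions and do
not reflect Halerium's actual classes or build arguments.

.. code-block:: python

    class LazyStructure:
        """Hypothetical stand-in that builds its graph on first use."""

        def __init__(self, dependencies):
            self.dependencies = dependencies
            self.graph = None  # nothing is built until it is needed

        def build_graph(self, scale=1.0):
            # An explicit call lets the caller choose the build arguments.
            self.graph = {"dependencies": self.dependencies, "scale": scale}

        def get_graph(self):
            # Build with default arguments only if no graph exists yet.
            if self.graph is None:
                self.build_graph()
            return self.graph


    lazy = LazyStructure([("a", "c")])
    print(lazy.get_graph()["scale"])      # 1.0, built implicitly

    explicit = LazyStructure([("a", "c")])
    explicit.build_graph(scale=2.0)
    print(explicit.get_graph()["scale"])  # 2.0, the explicit build is kept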


Examples
--------

The :class:`CausalStructure`, :class:`~core.Dependency`
and :class:`~core.Dependencies` classes are further explained
in the following examples:

 .. toctree::
    :maxdepth: 1


    examples/04_causal_structure/01-causal_structure_dependency_basics
    examples/04_causal_structure/02-01-creation_and_training
    examples/04_causal_structure/02-02-prediction


Real-data applications of the :class:`CausalStructure` are shown
in the following examples:

 .. toctree::
    :maxdepth: 1

    examples/04_causal_structure/03-causal_structures_calschool


.. _highlevel objectives:

Objectives
==========

Objectives are special classes that answer specific questions.
The answer is based on the trained graph and the additional arguments
to the objective (e.g. data).
After the causal structure has been trained with the
:meth:`~CausalStructure.train` method, objectives can be evaluated by
calling the :meth:`~CausalStructure.evaluate_objective`
method. The first argument to the :meth:`~CausalStructure.evaluate_objective`
method is the objective class. The available classes are the following:


:class:`~Predictor`: The predictor answers the question of
what the values of all parameters could be, given the values
of a subset of the parameters as data. ::

    >>> causal_structure.evaluate_objective(Predictor, data=data_in, measure="mean")
         a    b         c         d
    0 -1.0 -1.0  0.976529 -2.953433
    1  0.5  0.5 -0.476481  1.490052

    >>> causal_structure.evaluate_objective(Predictor, data=data_in, measure="std")
         a    b         c          d
    0  0.0  0.0  9.443231  10.300908
    1  0.0  0.0  0.875485   0.982973

:class:`~Evaluator`: The evaluator answers the question of
how well the predictions perform on a test data set. ::

    >>> data_test = pandas.DataFrame(columns=["a", "b", "c", "d"],
    ...                              data=[[-1, -1,  1, -3],
    ...                                    [ 0, -1,  2, -1],
    ...                                    [-1,  0, -1, -2],
    ...                                    [ 2,  1,  0,  5],
    ...                                    [ 1,  2, -3,  4]])
    >>> causal_structure.evaluate_objective(Evaluator, data=data_test,
    ...                                     inputs=["a", "b"], metric="r2")
    {'a': None, 'b': None, 'c': 0.9997590212263663, 'd':  0.9998194827898427}
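
Assuming that ``metric="r2"`` refers to the usual coefficient of
determination, the score compares the squared prediction errors with the
spread of the true values around their mean. The sketch below implements that
textbook formula in plain Python. ``r2_score`` is a hypothetical helper, and
the predictions are made-up numbers rather than Halerium output.

.. code-block:: python

    def r2_score(y_true, y_pred):
        """Coefficient of determination: 1 - SS_res / SS_tot."""
        mean_true = sum(y_true) / len(y_true)
        ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
        ss_tot = sum((t - mean_true) ** 2 for t in y_true)
        return 1.0 - ss_res / ss_tot


    # Column "c" of the test data above and some made-up predictions.
    c_true = [1, 2, -1, 0, -3]
    c_pred = [1.0, 2.0, -1.0, 0.0, -2.0]

    print(r2_score(c_true, c_true))  # perfect prediction gives 1.0
    print(r2_score(c_true, c_pred))  # one value off by 1 gives about 0.93

A score close to 1 therefore means that the prediction errors are small
compared with the variation in the test data.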

:class:`~InfluenceEstimator`: The influence estimator answers the question of
how much a certain target is influenced by the other parameters. ::

    >>> causal_structure.evaluate_objective(InfluenceEstimator, target="d")
    {'a': 0.7127661538640915, 'b': 0.4127766942443396, 'c': 0.0, 'd': 1.0}

:class:`~OutlierDetector`: The outlier detector answers the question which
data points in a given data set are outliers
(i.e. are very incompatible with the trained graph). ::

    >>> data_test = pandas.DataFrame(columns=["a", "b", "c", "d"],
    ...                              data=[[1.5, 1.0, -0.5,  4.0],
    ...                                    [1.5, 1.0, -0.5, 40.0]])
    >>> causal_structure.evaluate_objective(OutlierDetector, data=data_test)
         a    b    c    d  graph
    0  1.0  0.0  0.0  0.0    0.0
    1  1.0  0.0  1.0  1.0    1.0


:class:`~RankEstimator`: The rank estimator is the continuous analogue of
the outlier detector. It answers the question of how many comparison data
points would be more likely than the data point in question. ::

    >>> causal_structure.evaluate_objective(RankEstimator, data=data_test)
          a     b     c     d  graph
    0  0.04  0.31  0.60  0.52   0.22
    1  0.04  0.31  0.00  0.00   0.00


:class:`~ProbabilityEstimator`: The probability estimator answers the
question of what the logarithmic probability density of the data
point in question is. ::

    >>> causal_structure.evaluate_objective(ProbabilityEstimator, data=data_test)
              a         b         c           d        graph
    0 -2.225791 -0.725791 -2.117248   -2.287631    -7.388386
    1 -2.225791 -0.725791 -2.117248 -997.881033 -1222.759525
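
To see why an outlying value produces a strongly negative log-density,
consider a one-dimensional normal distribution. The sketch below evaluates its
log-density at a typical and at an extreme value. The normal model and the
chosen mean and standard deviation are illustrative assumptions, not the
distribution that Halerium fits.

.. code-block:: python

    import math


    def normal_log_density(x, mean, std):
        """Logarithm of the normal probability density at x."""
        z = (x - mean) / std
        return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


    # Illustrative parameters, not taken from a trained graph.
    typical = normal_log_density(4.0, mean=4.0, std=1.0)
    extreme = normal_log_density(40.0, mean=4.0, std=1.0)

    print(typical)  # about -0.92
    print(extreme)  # about -648.92

The log-density falls quadratically with the distance from the mean, which is
why a single extreme value can dominate the total.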

To answer questions that are not covered by these objective classes,
the user will have to utilize the low-level functionalities of
the :ref:`core package<core overview>`.

Examples
--------

The objectives are used with the :class:`CausalStructure` class
in the following examples:

 .. toctree::
    :maxdepth: 1


    examples/04_causal_structure/02-03-objectives_intro
    examples/04_causal_structure/02-04-evaluation
    examples/04_causal_structure/02-05-outlier_detection
    examples/04_causal_structure/02-06-influence_estimation
    examples/04_causal_structure/02-07-rank_estimation
    examples/04_causal_structure/02-08-probability_estimation