{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Influence Estimation" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "# execute the creation & training notebook first\n", "%run \"02-01-creation_and_training.ipynb\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After [training](./02-01-creation_and_training.ipynb), we may want to know how strongly each parameter influences a given target. We can do this with the ``.estimate_influences`` method." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start by estimating the influences on the parameter '(c|a,b)'." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(a) 0.675741\n", "(b|a) 3.677125\n", "(c|a,b) 1.000000\n", "Name: influence on (c|a,b), dtype: float64" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "causal_structure.estimate_influences(target='(c|a,b)')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that '(c|a,b)' has an influence of 1 on itself, that is to say it influences itself by 100%.\n", "'(a)' influences it by about 68%. This seems fairly in line with the R2-score in the [performance evaluation section](./02-04-evaluation.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, '(b|a)' shows an influence above 100% (about 368%). How can this be understood?\n", "\n", "### Influences of correlated parameters\n", "\n", "'(b|a)' shows an influence above 100% because\n", "'(a)' and '(b|a)' are strongly anti-correlated" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th></th><th>(a)</th><th>(b|a)</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>(a)</th><td>1.000000</td><td>-0.959284</td></tr>\n", "<tr><th>(b|a)</th><td>-0.959284</td><td>1.000000</td></tr>\n", "</tbody>\n", "</table>
" ], "text/plain": [ " (a) (b|a)\n", "(a) 1.000000 -0.959284\n", "(b|a) -0.959284 1.000000" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.corr().loc[['(a)', '(b|a)'], ['(a)', '(b|a)']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "but they both have a positive effect on '(c|a,b)'." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0, 0.5, '(c|a,b)')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "test_data_a = np.linspace(4.5, 5.5, 100)\n", "test_data_b = np.linspace(-35, -10, 100)\n", "\n", "pl.figure(figsize=(10,3.5))\n", "fig = pl.subplot(1,2,1)\n", "fig.plot(test_data_a,\n", " causal_structure.predict(data={'(a)': test_data_a, '(b|a)': -22.5})['(c|a,b)'],)\n", "fig.set_title(\"(c|a,b) with fixed (b|a)\")\n", "fig.set_xlabel(\"(a)\")\n", "fig.set_ylabel(\"(c|a,b)\")\n", "fig = pl.subplot(1,2,2)\n", "fig.plot(test_data_b,\n", " causal_structure.predict(data={'(a)': 5., '(b|a)': test_data_b})['(c|a,b)'],)\n", "fig.set_title(\"(c|a,b) with fixed (a)\")\n", "fig.set_xlabel(\"(b|a)\")\n", "fig.set_ylabel(\"(c|a,b)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So under normal circumstances the effects of '(a)' and '(b|a)' on '(c|a,b)' cancel each other out to a large degree.\n", "\n", "The influence estimator tells us what is the influence on the target, if you *only* change the parameter in question.\n", "Think of this as an intervention that breaks the strong correlation between '(a)' and '(b|a)'.\n", "\n", "This is why the influence of '(b|a)' on '(c|a,b)' is above 100%." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Influences follow causal directions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we change the target of the influence estimation to '(b|a)'," ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(a) 0.919762\n", "(b|a) 1.000000\n", "(c|a,b) 0.000000\n", "Name: influence on (b|a), dtype: float64" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "causal_structure.estimate_influences(target='(b|a)')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "we see that '(c|a,b)' has an influence of zero on '(b|a)', even though you can predict '(b|a)' from '(c|a,b)' (see the \"Backwards prediction\" subsection in the [prediction section](./02-02-prediction.ipynb)).\n", "This is because ``estimate_influences`` respects causal directions: effects do not influence causes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For further details about the ``InfluenceEstimator``, see the [corresponding section](../02_objectives/02_influence_estimator.ipynb) in the core documentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the [next section](./02-07-rank_estimation.ipynb) we will look at rank estimation." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 4 }