Causal Structures - Prediction

[1]:

%%capture
# execute the creation & training notebook first
%run "02-01-creation_and_training.ipynb"

In the previous section we created and trained our causal structure. We can now make predictions. Let’s create some test data first.

[2]:

test_data_a = {"(a)": np.linspace(4.5, 5.5, 100)}
prediction = causal_structure.predict(data=test_data_a)

We passed a dictionary as data to the predict method. In the output recorded below, the result is a pandas data frame with one column per node of the causal structure.

[3]:

display(type(prediction))
display(prediction.keys())

pandas.core.frame.DataFrame

Index(['(a)', '(b|a)', '(c|a,b)'], dtype='object')

If we pass a pandas data frame we get a pandas data frame in return.

[4]:

test_data_a = pd.DataFrame(data=test_data_a)
prediction = causal_structure.predict(data=test_data_a)
prediction.head()

[4]:

	(a)	(b|a)	(c|a,b)
0	4.500000	-3.950142	45.574075
1	4.510101	-4.436720	45.443720
2	4.520202	-4.927574	45.272895
3	4.530303	-5.408997	45.130555
4	4.540404	-5.907117	44.996075

Note that we get a prediction for ‘(c|a,b)’ even though this column depends on both ‘(a)’ and ‘(b|a)’ and we only provided data for ‘(a)’. This prediction is obtained by averaging over the possible values of ‘(b|a)’ given the provided data for ‘(a)’.
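The averaging idea can be illustrated with a small Monte Carlo sketch. It does not use the causal structure from this notebook: the functions `mean_b_given_a` and `mean_c_given_ab` and the noise level `noise_std_b` are hypothetical stand-ins, and the library may well compute the average differently (for example analytically).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy models, NOT the ones learned in this notebook.
def mean_b_given_a(a):
    return 2.0 * a - 1.0

def mean_c_given_ab(a, b):
    return a + 0.5 * b ** 2

noise_std_b = 0.3  # assumed noise level of (b|a)

def predict_c_from_a(a, n_samples=20_000):
    """Average the (c|a,b) model over sampled values of (b|a)."""
    b_samples = mean_b_given_a(a) + noise_std_b * rng.standard_normal(n_samples)
    return mean_c_given_ab(a, b_samples).mean()

a_value = 1.0
mc_estimate = predict_c_from_a(a_value)

# Closed form for this toy model: E[a + 0.5 b^2] = a + 0.5 (mean_b^2 + var_b)
exact = a_value + 0.5 * (mean_b_given_a(a_value) ** 2 + noise_std_b ** 2)

# Plugging in only the mean of (b|a) ignores the spread of (b|a)
plug_in = mean_c_given_ab(a_value, mean_b_given_a(a_value))

print(mc_estimate, exact, plug_in)
```

In this toy setup the averaged prediction differs from the plug-in value because the model for ‘(c|a,b)’ is nonlinear in ‘(b|a)’, which is why the spread of ‘(b|a)’ matters and not only its mean.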

We can plot the prediction and compare it to the training data.

[5]:

pl.figure(figsize=(10,3.5))
fig = pl.subplot(1,2,1)
fig.plot(prediction["(a)"], prediction["(b|a)"], color="red")
fig.scatter(data["(a)"], data["(b|a)"], marker="x", color="k")
fig.set_xlabel("(a)")
fig.set_ylabel("(b|a)")
fig = pl.subplot(1,2,2)
fig.plot(prediction["(a)"], prediction["(c|a,b)"], color="red")
fig.scatter(data["(a)"], data["(c|a,b)"], marker="x", color="k")
fig.set_xlabel("(a)")
fig.set_ylabel("(c|a,b)")

[5]:

Text(0, 0.5, '(c|a,b)')

../../_images/examples_04_causal_structure_02-02-prediction_10_1.png

Prediction uncertainties

To see the uncertainty margin of the prediction, we ask the predict method for the standard deviation by passing return_std=True.

[6]:

prediction_mean, prediction_std = causal_structure.predict(
    data=test_data_a, return_std=True)

[7]:

pl.figure(figsize=(10,3.5))
fig = pl.subplot(1,2,1)
fig.plot(prediction_mean["(a)"], prediction_mean["(b|a)"], color="red")
fig.fill_between(prediction_mean["(a)"],
                 (prediction_mean - prediction_std)["(b|a)"],
                 (prediction_mean + prediction_std)["(b|a)"],
                 color="red", alpha=0.5)
fig.scatter(data["(a)"], data["(b|a)"], marker="x", color="k")
fig.set_xlabel("(a)")
fig.set_ylabel("(b|a)")
fig = pl.subplot(1,2,2)
fig.plot(prediction_mean["(a)"], prediction_mean["(c|a,b)"], color="red")
fig.fill_between(prediction_mean["(a)"],
                 (prediction_mean - prediction_std)["(c|a,b)"],
                 (prediction_mean + prediction_std)["(c|a,b)"],
                 color="red", alpha=0.5)
fig.scatter(data["(a)"], data["(c|a,b)"], marker="x", color="k")
fig.set_xlabel("(a)")
fig.set_ylabel("(c|a,b)")

[7]:

Text(0, 0.5, '(c|a,b)')

../../_images/examples_04_causal_structure_02-02-prediction_14_1.png

The uncertainty margin consists of two contributions: the learned variance of the quadratic regressions and the uncertainty of the regression parameters.

With the regression equation,

\(y(x) = a \cdot x + b \cdot x^2 + c + \xi\),

the learned variance describes the strength of the random noise \(\xi\), while the uncertainty of the regression parameters reflects that, with finite data, we know \(a\), \(b\), and \(c\) only to finite precision.
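The two contributions can be made concrete with a plain least-squares sketch on synthetic data. This is an assumption-laden illustration, not the method the causal structure uses internally: the true parameter values, the data range, and the helper names `features` and `predict` are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data from y = a*x + b*x^2 + c + xi (values chosen for illustration only)
a_true, b_true, c_true, noise_std = 1.5, -0.8, 2.0, 0.5
x = rng.uniform(-2.0, 2.0, size=30)
y = a_true * x + b_true * x**2 + c_true + noise_std * rng.standard_normal(x.size)

def features(x):
    """Design matrix with columns x, x^2, 1 (matching a, b, c)."""
    x = np.atleast_1d(x)
    return np.column_stack([x, x**2, np.ones_like(x)])

X = features(x)
params, *_ = np.linalg.lstsq(X, y, rcond=None)  # estimates of a, b, c
residuals = y - X @ params

# Contribution 1: learned variance of the noise xi
noise_var = residuals @ residuals / (x.size - X.shape[1])
# Contribution 2: covariance of the estimated parameters (a, b, c)
param_cov = noise_var * np.linalg.inv(X.T @ X)

def predict(x_new):
    """Return mean, total standard deviation, and the parameter-only variance."""
    Phi = features(x_new)
    mean = Phi @ params
    param_var = np.einsum("ij,jk,ik->i", Phi, param_cov, Phi)
    total_std = np.sqrt(noise_var + param_var)
    return mean, total_std, param_var

x_grid = np.array([0.0, 2.0, 4.0])
mean, total_std, param_var = predict(x_grid)
print(total_std, np.sqrt(noise_var), param_var)
```

The noise term is the same everywhere, whereas the parameter term grows when we move away from the training data (here at x = 4, outside the sampled range), so the total margin widens there.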

Backwards prediction

We can also evaluate the inverse prediction, e.g. predicting ‘(b|a)’ and ‘(a)’ from ‘(c|a,b)’. We just have to provide data for ‘(c|a,b)’ to the predict method and the causal structure will solve the inverse model.

[8]:

test_data_c = pd.DataFrame(data={'(c|a,b)': np.linspace(36, 44, 100)})

prediction_mean, prediction_std = causal_structure.predict(
    data=test_data_c, return_std=True)

prediction_mean.head()

[8]:

	(a)	(b|a)	(c|a,b)
0	5.199343	-35.290706	36.000000
1	5.195043	-35.022511	36.080808
2	5.190741	-34.832743	36.161616
3	5.186438	-34.653216	36.242424
4	5.182134	-34.413845	36.323232

[9]:

pl.figure(figsize=(10,3.5))
fig = pl.subplot(1,2,1)
fig.plot(prediction_mean["(c|a,b)"], prediction_mean["(a)"], color="red")
fig.fill_between(prediction_mean["(c|a,b)"],
                 (prediction_mean - prediction_std)["(a)"],
                 (prediction_mean + prediction_std)["(a)"],
                 color="red", alpha=0.5)
fig.scatter(data["(c|a,b)"], data["(a)"], marker="x", color="k")
fig.set_xlabel("(c|a,b)")
fig.set_ylabel("(a)")
fig = pl.subplot(1,2,2)
fig.plot(prediction_mean["(c|a,b)"], prediction_mean["(b|a)"], color="red")
fig.fill_between(prediction_mean["(c|a,b)"],
                 (prediction_mean - prediction_std)["(b|a)"],
                 (prediction_mean + prediction_std)["(b|a)"],
                 color="red", alpha=0.5)
fig.scatter(data["(c|a,b)"], data["(b|a)"], marker="x", color="k")
fig.set_xlabel("(c|a,b)")
fig.set_ylabel("(b|a)")

[9]:

Text(0, 0.5, '(b|a)')

../../_images/examples_04_causal_structure_02-02-prediction_19_1.png

The plots show that the backwards prediction tends to be slightly conservative: the predictions stay closer to the average value than the data points do.

This is common behavior in inverse modelling. It arises because a high value of ‘(c|a,b)’ can have multiple causes: a high value of ‘(a)’, a high value of ‘(b|a)’, or a high value of the random noise contribution. This effect is explained in more detail in the inheritance example in the core documentation.
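The pull towards the average can be reproduced in a minimal linear-Gaussian sketch. The model below (a standard normal cause and an effect equal to the cause plus independent noise) is an assumption chosen for illustration and is unrelated to the structure trained in this notebook.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy model (an assumption for illustration):
# cause x ~ N(0, 1), effect y = x + noise with noise ~ N(0, noise_std^2)
noise_std = 1.0
x = rng.standard_normal(50_000)
y = x + noise_std * rng.standard_normal(x.size)

# The forward slope is 1, so a naive inversion would predict x = y.
# The best backward prediction E[x | y] has slope var(x) / (var(x) + var(noise)).
theoretical_slope = 1.0 / (1.0 + noise_std**2)

# Empirical slope of the regression of x on y
empirical_slope = np.cov(x, y)[0, 1] / np.var(y, ddof=1)

y_high = 2.0
backward_prediction = empirical_slope * y_high
print(theoretical_slope, empirical_slope, backward_prediction)
```

Because part of a high effect value is attributed to noise, the predicted cause is smaller in magnitude than the observed effect, which is the same shrinkage visible in the plots above.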

Apart from predicting values, a causal structure can evaluate objectives. We will see what they are in the next section.