Reusing models#
In the past, a colleague defined a graph that connected some data x with some data y:
[2]:
import numpy as np
import halerium.core as hal
ga = hal.Graph("ga")
with ga:
    with inputs:
        hal.Entity("e1")
        with e1:
            hal.Variable("x", shape=(10,), mean=0, variance=1)
    with outputs:
        hal.Entity("e2")
        with e2:
            hal.Variable("y", shape=(7,))
    hal.regression.connect_via_regression("reg", inputs=inputs.e1.x, outputs=outputs.e2.y)
    outputs.e2.y.variance = hal.exp(hal.StaticVariable("lnv", mean=-3, variance=1.))

# use this in the online platform to show the graph
# hal.show(ga)
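Conceptually, the regression connects x and y through a linear map with Gaussian noise whose variance is exp(lnv). The following plain-numpy sketch illustrates that statistical model; it is not halerium API, and the names `W`, `b`, and `lnv` are just illustrative stand-ins for the parameters the graph learns:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((7, 10))   # regression slope: maps 10-dim x to 7-dim y
b = rng.standard_normal(7)         # regression intercept
lnv = -3.0                         # log-variance of the output noise

x = rng.standard_normal(10)
noise = rng.standard_normal(7) * np.sqrt(np.exp(lnv))
y = W @ x + b + noise              # one sample of y given x
print(y.shape)  # (7,)
```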
The colleague had training data for x and y available and trained the graph accordingly:
[3]:
# generating artificial data
np.random.seed(42)
real_xy_slope = np.random.randn(7,10)
real_xy_intercept = np.random.randn(7)
data_x = np.random.randn(74,10)
data_y = np.einsum("ij, nj -> ni", real_xy_slope, data_x) + real_xy_intercept
# normally these would of course be loaded from somewhere
posterior_model_a = hal.get_posterior_model(ga, data={ga.inputs.e1.x: data_x, ga.outputs.e2.y: data_y})
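The `einsum` pattern `"ij, nj -> ni"` above applies the slope matrix to every sample row at once; it is equivalent to an ordinary matrix product, as this small numpy check shows:

```python
import numpy as np

rng = np.random.default_rng(0)
slope = rng.standard_normal((7, 10))      # maps 10-dim x to 7-dim y
intercept = rng.standard_normal(7)
x = rng.standard_normal((4, 10))          # 4 samples

# "ij, nj -> ni": for each sample n, y_n = slope @ x_n
y_einsum = np.einsum("ij, nj -> ni", slope, x) + intercept
y_matmul = x @ slope.T + intercept        # the same as a batched matrix product

assert np.allclose(y_einsum, y_matmul)
```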
The trained model was converted back to a graph and saved as a JSON file:
[4]:
posterior_graph_a = posterior_model_a.get_posterior_graph()
posterior_graph_a.dump_file("posterior_graph.json")
This was done in the past. Now let's clear the session and start over.
[5]:
%reset -f
Starting from there#
[6]:
import halerium.core as hal
import numpy as np
Our graph connects y and z.
[7]:
gb = hal.Graph("gb")
with gb:
    with inputs:
        hal.Entity("e2")
        with e2:
            hal.Variable("y", shape=(7,), mean=0, variance=1)
    with outputs:
        hal.Entity("e3")
        with e3:
            hal.Variable("z", shape=(5,))
    hal.regression.connect_via_regression("reg", inputs=inputs.e2.y, outputs=outputs.e3.z)
    outputs.e3.z.variance = hal.exp(hal.StaticVariable("lnv", mean=-3, variance=1.))
We have training data for y and z available, but not for x. We train our graph…
[8]:
# generating artificial data
np.random.seed(137)
real_yz_slope = np.random.randn(5,7)
real_yz_intercept = np.random.randn(5)
data_y = np.random.randn(63,7)
data_z = np.einsum("ij, nj -> ni", real_yz_slope, data_y) + real_yz_intercept
# normally these would of course be loaded from somewhere
posterior_model_b = hal.get_posterior_model(gb, data={gb.inputs.e2.y: data_y, gb.outputs.e3.z: data_z})
We can again extract the posterior graph:
[9]:
posterior_graph_b = posterior_model_b.get_posterior_graph()
And load our colleague's work from the hard drive:
[10]:
posterior_graph_a = hal.Graph.from_specification(file="posterior_graph.json")
Now we can plug the two together into a big graph:
[11]:
big_graph = hal.Graph("big_graph")
with big_graph:
    posterior_graph_a.copy("ga")
    posterior_graph_b.copy("gb")
    hal.link(ga.outputs.e2, gb.inputs.e2)
With this graph, we can now predict z from x:
[12]:
# test data
test_data_x = np.random.randn(100,10)
model_predict = hal.get_generative_model(big_graph, data={big_graph.ga.inputs.e1.x: test_data_x})
predicted_y, predicted_z = model_predict.get_means([big_graph.ga.outputs.e2.y, big_graph.gb.outputs.e3.z])
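Since both regressions are linear at the mean, the composed map from x to z is itself linear. As a plain-numpy sanity check, independent of halerium, here is the ground-truth composition that a well-trained `big_graph` should approximately reproduce (recreating the generating parameters from the seeds above):

```python
import numpy as np

# recreate the ground-truth parameters used to generate the training data
np.random.seed(42)
real_xy_slope = np.random.randn(7, 10)
real_xy_intercept = np.random.randn(7)
np.random.seed(137)
real_yz_slope = np.random.randn(5, 7)
real_yz_intercept = np.random.randn(5)

# composing z = B @ y + b with y = A @ x + a gives z = (B @ A) @ x + (B @ a + b)
combined_slope = real_yz_slope @ real_xy_slope        # shape (5, 10)
combined_intercept = real_yz_slope @ real_xy_intercept + real_yz_intercept

x = np.random.randn(3, 10)
z_direct = x @ combined_slope.T + combined_intercept
z_two_step = (x @ real_xy_slope.T + real_xy_intercept) @ real_yz_slope.T + real_yz_intercept
assert np.allclose(z_direct, z_two_step)
```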