Getting Started
Overview
This getting started tutorial outlines how to set up and simulate a PEtab SciML problem using AMICI. Some familiarity with the PEtab format is assumed; see PEtab for a refresher. The environment and example PEtab files needed to run this notebook are provided in the PEtab SciML repo.
As an example, we will use the Lotka-Volterra system:
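A sketch of the equations, using the parameterisation of Rackauckas et al. 2020 (prey growth rate \(\alpha\), predator death rate \(\delta\), and interaction coefficients \(\beta\) and \(\gamma\)):

\[
\frac{\mathrm{d}\,\mathrm{prey}}{\mathrm{d}t} = \alpha \cdot \mathrm{prey} - \beta \cdot \mathrm{prey} \cdot \mathrm{predator}, \qquad
\frac{\mathrm{d}\,\mathrm{predator}}{\mathrm{d}t} = \gamma \cdot \mathrm{prey} \cdot \mathrm{predator} - \delta \cdot \mathrm{predator}
\]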
We will replace the two interaction terms (those involving \(\beta\) and \(\gamma\)) with outputs from a neural network. The prey and predator variables of the model will be used as the inputs to this network. This example follows the Universal Differential Equation (UDE) case in Rackauckas et al. 2020.
Environment
Support for PEtab SciML models is under active development. A requirements.txt file is provided with these docs; it checks out branches of the external dependencies in which support has been implemented.
Loading the PEtab problem
We will start by loading the PEtab problem using the PEtab Python library, so that we can interactively explore the files that are used to build and simulate the problem.
[ ]:
from petab.v2 import Problem
petab_problem = Problem.from_yaml("problem.yaml")
The files needed to define a PEtab SciML problem are:
model file
neural network file
mapping table
hybridization table
parameters table
network array inputs
measurements table
observables table
conditions table
YAML file
The hybridization, network YAML and network input files are new and specific to PEtab SciML problems. The other files already exist in the PEtab format but have been expanded to support PEtab SciML problems.
Model File
PEtab SciML importers accept models in various exchangeable standard formats (e.g., SBML, CellML, BioNetGen). For our example we use an SBML model because this standard is widely supported across PEtab importers. The SBML model will need modifying, because outputs from the neural network can only be assigned to parameters in the PEtab problem. This means the prey and predator species need to be removed from the expressions we want to parameterise, so that each interaction term is reduced to a single parameter that a network output can replace.
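Concretely, assuming the standard Lotka-Volterra parameterisation, each interaction term collapses to a standalone parameter that a network output is later assigned to (see the hybridization table below):

\[
\frac{\mathrm{d}\,\mathrm{prey}}{\mathrm{d}t} = \alpha \cdot \mathrm{prey} - \beta, \qquad
\frac{\mathrm{d}\,\mathrm{predator}}{\mathrm{d}t} = \gamma - \delta \cdot \mathrm{predator}
\]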
Whichever model format is used, the idea is that parameters in the model are replaced by network elements. A large number of tools are available for creating SBML model files like these, such as the GUI-based COPASI.
Neural Network File
This file defines the network architecture, and should be provided in the PEtab SciML associated network YAML format. You can generate a file like this by defining a neural network in PyTorch and exporting it to the SciML YAML format. The provided example defines a network with three Linear layers and a tanh activation function. See the page on supported layers and activation functions for a complete list.
[4]:
from petab_sciml.standard.nn_model import Input, NNModel, NNModelStandard
import torch
from torch import nn
import torch.nn.functional as F


class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = torch.nn.Linear(2, 5)
        self.layer2 = torch.nn.Linear(5, 5)
        self.layer3 = torch.nn.Linear(5, 2)

    def forward(self, net_input):
        x = self.layer1(net_input)
        x = F.tanh(x)
        x = self.layer2(x)
        x = F.tanh(x)
        x = self.layer3(x)
        return x


net1 = NeuralNetwork()
nn_model1 = NNModel.from_pytorch_module(
    module=net1, nn_model_id="net1", inputs=[Input(input_id="input0")]
)
NNModelStandard.save_data(data=nn_model1, filename="net1_from_pytorch.yaml")
Note that wherever specific network layers or parameters are referenced in mapping.tsv, they must be referred to by the layer ids used in this file.
Mapping Table
The purpose of the mapping table is to define PEtab-compatible identifiers for model entities that would otherwise not be valid in PEtab. The inputs, outputs and parameters of our neural network are all examples of model entities without valid PEtab identifiers, so we define PEtab ids for them in this file.
Take the first row for example. The valid PEtab identifier net1_input1 is mapped to the model id net1.inputs[0][0]. This model id denotes the first index of the first input to the neural network (note that zero-based indexing is used).
net1.parameters refers to all the trainable parameters of the network, i.e. the weights and biases of all layers.
[5]:
petab_problem.mapping_df
[5]:
| petabEntityId | modelEntityId |
|---|---|
| net1_input1 | net1.inputs[0][0] |
| net1_input2 | net1.inputs[0][1] |
| net1_output1 | net1.outputs[0][0] |
| net1_output2 | net1.outputs[0][1] |
| net1_ps | net1.parameters |
Hybridization Table
The hybridization table defines where our neural network inputs and outputs are used in the model. Because these hybridization elements are integral to the construction of the model, supporting tools incorporate the hybridization information when the model is defined. At this point we will therefore build the model from the PEtab problem, in order to demonstrate how the neural network inputs and outputs are inserted into the ODE system.
[ ]:
from amici.petab import import_petab_problem
from amici.jax import JAXProblem, run_simulations

# Create AMICI model for the PEtab problem
jax_model = import_petab_problem(
    petab_problem,
    model_output_dir="model",
    compile_=True,
    jax=True,
)

# Create the JAXProblem - a wrapper around the AMICI model
jax_problem = JAXProblem(jax_model, petab_problem)
[7]:
jax_problem._petab_problem.hybridization_df
[7]:
| targetId | targetValue |
|---|---|
| net1_input1 | prey |
| net1_input2 | predator |
| beta | net1_output1 |
| gamma | net1_output2 |
In the example, the inputs to the network are the prey and predator values and the outputs of the network provide \(\beta\) and \(\gamma\). These substitutions hold for all simulation conditions. Note that we refer to the inputs and outputs of the network by their PEtab identifiers, as defined in the mapping file.
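The example problem ships this table as hybridization.tsv. A minimal sketch of producing it programmatically with pandas (assuming only the two columns shown above):

```python
import pandas as pd

# Map network inputs to model species, and model parameters to network outputs
hybridization = pd.DataFrame(
    {
        "targetId": ["net1_input1", "net1_input2", "beta", "gamma"],
        "targetValue": ["prey", "predator", "net1_output1", "net1_output2"],
    }
)

# PEtab tables are tab-separated files
hybridization.to_csv("hybridization.tsv", sep="\t", index=False)
```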
Parameters Table
This file contains configuration information for all model parameters which, in our case, include the parameters of the neural network. The actual values of the network parameters are provided in a separate HDF5 array file (see more details in the next section). The parameters file specifies the scale, bounds and nominal value of each parameter. We want our neural network parameters to be freely optimised during training, so we set the bounds to [-inf, inf] and leave the nominal value blank (which translates to a NaN when loaded into the PEtab problem).
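The blank-to-NaN behaviour comes from how the TSV is parsed: pandas reads an empty cell as NaN and the strings "inf"/"-inf" as float infinities. A small self-contained sketch (the two rows are a hypothetical fragment mirroring this example's parameters table):

```python
import io
import numpy as np
import pandas as pd

# A parameters.tsv fragment: the network parameters get infinite bounds
# and a blank nominal value
tsv = (
    "parameterId\tparameterScale\tlowerBound\tupperBound\tnominalValue\testimate\n"
    "alpha\tlin\t0.0\t15.0\t1.3\t1\n"
    "net1_ps\tlin\t-inf\tinf\t\t1\n"
)
df = pd.read_csv(io.StringIO(tsv), sep="\t", index_col="parameterId")

# The blank nominal value loads as NaN, the bounds as +/- infinity
assert np.isnan(df.loc["net1_ps", "nominalValue"])
assert np.isinf(df.loc["net1_ps", "lowerBound"])
```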
[8]:
petab_problem.parameter_df
[8]:
| parameterId | parameterScale | lowerBound | upperBound | nominalValue | estimate |
|---|---|---|---|---|---|
| alpha | lin | 0.0 | 15.0 | 1.3 | 1 |
| delta | lin | 0.0 | 15.0 | 1.8 | 1 |
| net1_ps | lin | -inf | inf | NaN | 1 |
Array inputs
In this example, the values of parameters for the neural network are provided in HDF5 format. The structure inside the HDF5 file is as follows.
arrays.hdf5
├── metadata
│ └── perm
└── parameters
└── net1
├── layer1
│ ├── weight
│ └── bias
├── layer2
│ ├── weight
│ └── bias
└── layer3
├── weight
└── bias
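A layout like this could be produced with h5py. This is a sketch, not part of the example problem: the array shapes follow the 2 → 5 → 5 → 2 architecture defined earlier, the zero-valued arrays are placeholders, and the contents of the metadata group are not specified here.

```python
import h5py
import numpy as np

# Layer shapes follow the 2 -> 5 -> 5 -> 2 network defined earlier;
# PyTorch Linear weights have shape (out_features, in_features)
shapes = {"layer1": (5, 2), "layer2": (5, 5), "layer3": (2, 5)}

with h5py.File("arrays.hdf5", "w") as f:
    f.create_group("metadata")  # holds e.g. the "perm" entry
    net = f.create_group("parameters/net1")  # intermediate groups are created
    for layer, (n_out, n_in) in shapes.items():
        g = net.create_group(layer)
        g.create_dataset("weight", data=np.zeros((n_out, n_in)))
        g.create_dataset("bias", data=np.zeros(n_out))
```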
Note that the network id net1 matches the identifier in the network YAML file and the modelEntityId in mapping.tsv. Likewise, the layer names (layer1, layer2, layer3) in the HDF5 file are identifiers and must match the layer names in the network YAML and mapping files.
Measurements, Observables and Conditions Tables
These tables are unchanged from a regular PEtab problem. For completeness: the measurements table contains the measurement data, the experimental condition under which each measurement was collected, and the model entity each measurement maps to.
[9]:
petab_problem.measurement_df
[9]:
| | observableId | simulationConditionId | measurement | time |
|---|---|---|---|---|
| 0 | prey_o | cond1 | 0.173017 | 1.0 |
| 1 | prey_o | cond1 | 0.489177 | 2.0 |
| 2 | prey_o | cond1 | 1.643996 | 3.0 |
| 3 | prey_o | cond1 | 5.451963 | 4.0 |
| 4 | prey_o | cond1 | 2.977522 | 5.0 |
| 5 | prey_o | cond1 | 0.181663 | 6.0 |
| 6 | prey_o | cond1 | 0.348112 | 7.0 |
| 7 | prey_o | cond1 | 0.937919 | 8.0 |
| 8 | prey_o | cond1 | 3.113240 | 9.0 |
| 9 | prey_o | cond1 | 8.863933 | 10.0 |
| 10 | predator_o | cond1 | 0.847416 | 1.0 |
| 11 | predator_o | cond1 | 0.211135 | 2.0 |
| 12 | predator_o | cond1 | -0.025054 | 3.0 |
| 13 | predator_o | cond1 | 0.125010 | 4.0 |
| 14 | predator_o | cond1 | 6.700455 | 5.0 |
| 15 | predator_o | cond1 | 2.007158 | 6.0 |
| 16 | predator_o | cond1 | 0.420092 | 7.0 |
| 17 | predator_o | cond1 | 0.048032 | 8.0 |
| 18 | predator_o | cond1 | 0.128669 | 9.0 |
| 19 | predator_o | cond1 | 1.192784 | 10.0 |
The observables table links the model entities to the measured values.
[10]:
petab_problem.observable_df
[10]:
| observableId | observableFormula | noiseFormula | observableTransformation | noiseDistribution |
|---|---|---|---|---|
| prey_o | prey | 0.05 | lin | normal |
| predator_o | predator | 0.05 | lin | normal |
The conditions table is intended to fully describe the experimental conditions under which the measurements were collected. It can have any number of columns, but in this example there is only one experimental condition and no additional information about it.
[11]:
petab_problem.condition_df
[11]:
| conditionId |
|---|
| cond1 |
YAML File
Finally, the problem.yaml file brings everything together by describing how the problem should be constructed from the PEtab files. The extensions section contains the details of the SciML elements of our problem.
format_version: 2
parameter_file: "parameters.tsv"
problems:
  - model_files:
      lv:
        location: "lv.xml"
        language: "sbml"
    measurement_files:
      - "measurements.tsv"
    observable_files:
      - "observables.tsv"
    condition_files:
      - "conditions.tsv"
    mapping_files:
      - "mapping.tsv"
extensions:
  sciml:
    array_files:
      - "net1_ps.hdf5"
    hybridization_files:
      - "hybridization.tsv"
    neural_nets:
      net1:
        location: "net1.yaml"
        static: false
        format: "YAML"
Where the network is referenced in the mapping.tsv file, the modelEntityId must match the network id listed in this YAML, net1 in this case.
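As a quick sanity check, the extensions section can be parsed with a standard YAML library to confirm the network id. A sketch, assuming only the structure shown above:

```python
import yaml

# A fragment of the problem YAML's extensions section
yaml_text = """
extensions:
  sciml:
    neural_nets:
      net1:
        location: "net1.yaml"
        static: false
        format: "YAML"
"""

config = yaml.safe_load(yaml_text)
nets = config["extensions"]["sciml"]["neural_nets"]

# The network ids here must match the ids used in mapping.tsv
assert list(nets) == ["net1"]
assert nets["net1"]["location"] == "net1.yaml"
```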
Running Simulations
We are now ready to run a simulation using the JAX problem. AMICI's run_simulations will simulate all conditions provided; in this example there is only one. See the AMICI docs for more examples, such as training loops.
[12]:
import equinox as eqx

# Run simulations - returns the log-likelihood (llh) and simulation results (r)
llh, r = run_simulations(jax_problem)

# Run in gradient mode - gradients of the log-likelihood are returned in sllh
sllh, rgrad = eqx.filter_grad(run_simulations, has_aux=True)(jax_problem)
Next Steps
This tutorial has outlined how to set up a PEtab SciML problem in which a neural network appears in the ODE equations (creating what is sometimes referred to as a UDE). The PEtab SciML format also supports other ways of formulating a problem, examples of which can be found in the how-to guides. These include:
neural ODEs (all ODE equations are replaced by a neural network)
using networks in the observable function, which links simulated values to measurements
having a neural network upstream of the ODE, mapping high-dimensional inputs (e.g., images) to ODE model parameters
We implemented a simple network in this tutorial for demonstration purposes but the PEtab SciML format supports a range of standard neural network layers and activation functions based on the PyTorch framework. The dedicated docs page lists these layers and functions, along with the tools that support each.
For a complete description of the PEtab SciML format, visit the format specification page. For more information about the more complex problem specifications supported by the PEtab format (e.g., models with partial observability, multiple experimental/simulation conditions, events), see the PEtab documentation.