Getting Started
Overview
This getting started tutorial outlines how to set up and simulate a PEtab SciML problem using AMICI. Some familiarity with the PEtab format is assumed; see PEtab for a refresher. The environment and example PEtab files needed to run this notebook are provided in the PEtab SciML repo.
As an example, we will use the Lotka-Volterra system:
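A sketch of the equations, using the parameterisation of Rackauckas et al. 2020 (prey growth rate \(\alpha\), predator death rate \(\delta\), and interaction coefficients \(\beta\) and \(\gamma\)):

\[
\frac{\mathrm{d}\,\mathrm{prey}}{\mathrm{d}t} = \alpha \cdot \mathrm{prey} - \beta \cdot \mathrm{prey} \cdot \mathrm{predator}, \qquad
\frac{\mathrm{d}\,\mathrm{predator}}{\mathrm{d}t} = \gamma \cdot \mathrm{prey} \cdot \mathrm{predator} - \delta \cdot \mathrm{predator}
\]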
We will replace the two interaction terms (those involving \(\beta\) and \(\gamma\)) with outputs from a neural network. The prey and predator variables of the model will be used as the inputs to this network. This example follows the Universal Differential Equation (UDE) case in Rackauckas et al. 2020.
Environment
Support for PEtab SciML models is under active development. A requirements.txt file is provided with these docs; it checks out branches of the external dependencies in which support has been implemented.
Loading the PEtab problem
We will start by loading the PEtab problem using the PEtab Python library, so that we can interactively explore the files that are used to build and simulate the problem.
[ ]:
from petab.v2 import Problem
petab_problem = Problem.from_yaml("problem.yaml")
The files needed to define a PEtab SciML problem are:
model file
neural network file
mapping table
hybridization table
parameters table
network array inputs
measurements table
observables table
conditions table
YAML file
The hybridization, network YAML and network input files are new and specific to PEtab SciML problems. The other files already exist in the PEtab format but have been expanded to support PEtab SciML problems.
Model File
PEtab SciML importers accept models in various exchangeable standard formats (e.g., SBML, CellML, BioNetGen). For our example we use an SBML model because this standard is widely supported across PEtab importers. The SBML model will need modifying, because outputs from the neural network can only be assigned to parameters in the PEtab problem. This means the prey and predator species need to be removed from the expressions we want to parameterise, so that each interaction term is reduced to a single parameter that a network output can replace.
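Concretely, assuming the standard Lotka-Volterra parameterisation, each interaction term collapses to a standalone parameter that a network output is later assigned to (see the hybridization table below):

\[
\frac{\mathrm{d}\,\mathrm{prey}}{\mathrm{d}t} = \alpha \cdot \mathrm{prey} - \beta, \qquad
\frac{\mathrm{d}\,\mathrm{predator}}{\mathrm{d}t} = \gamma - \delta \cdot \mathrm{predator}
\]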
Whichever model format is used, the idea is that parameters in the model are replaced by network elements. A large number of tools are available for creating SBML model files like these, such as the GUI-based COPASI.
Neural Network File
This file defines the network architecture, and should be provided in the PEtab SciML associated network YAML format. You can generate a file like this by defining a neural network in PyTorch and exporting it to the SciML YAML format. The provided example defines a network with three Linear layers and a tanh activation function. See the page on supported layers and activation functions for a complete list.
[4]:
from petab_sciml.standard.nn_model import Input, NNModel, NNModelStandard
import torch
from torch import nn
import torch.nn.functional as F


class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = torch.nn.Linear(2, 5)
        self.layer2 = torch.nn.Linear(5, 5)
        self.layer3 = torch.nn.Linear(5, 2)

    def forward(self, net_input):
        x = self.layer1(net_input)
        x = F.tanh(x)
        x = self.layer2(x)
        x = F.tanh(x)
        x = self.layer3(x)
        return x


net1 = NeuralNetwork()
nn_model1 = NNModel.from_pytorch_module(
    module=net1, nn_model_id="net1", inputs=[Input(input_id="input0")]
)
NNModelStandard.save_data(data=nn_model1, filename="net1_from_pytorch.yaml")
Note that wherever specific network layers or parameters are referenced in mapping.tsv, they must be referred to by the layer ids used in this file.
Mapping Table
The purpose of the mapping table is to define PEtab-compatible identifiers for model entities that would otherwise not be valid in PEtab. The inputs, outputs and parameters of our neural network are all examples of model entities without valid PEtab identifiers, so we define PEtab ids for them in this file.
Take the first row for example. The valid PEtab identifier net1_input1 is mapped to the model id net1.inputs[0][0]. This model id denotes the first index of the first input to the neural network (note that zero-based indexing is used).
net1.parameters refers to all the trainable parameters of the network, i.e. the weights and biases of all layers.
[5]:
petab_problem.mapping_df
[5]:
| petabEntityId | modelEntityId |
|---|---|
| net1_input1 | net1.inputs[0][0] |
| net1_input2 | net1.inputs[0][1] |
| net1_output1 | net1.outputs[0][0] |
| net1_output2 | net1.outputs[0][1] |
| net1_ps | net1.parameters |
Hybridization Table
The hybridization table defines where our neural network inputs and outputs are used in the model. Because these hybridization elements are integral to the construction of the model, supporting tools incorporate the hybridization information when the model is defined. At this point we will therefore build the model from the PEtab problem, in order to demonstrate how the neural network inputs and outputs are inserted into the ODE system.
[ ]:
from amici.petab import import_petab_problem
from amici.jax import JAXProblem, run_simulations

# Create AMICI model for the PEtab problem
jax_model = import_petab_problem(
    petab_problem,
    model_output_dir="model",
    compile_=True,
    jax=True,
)

# Create the JAXProblem - a wrapper around the AMICI model
jax_problem = JAXProblem(jax_model, petab_problem)
[7]:
jax_problem._petab_problem.hybridization_df
[7]:
| targetId | targetValue |
|---|---|
| net1_input1 | prey |
| net1_input2 | predator |
| beta | net1_output1 |
| gamma | net1_output2 |
In the example, the inputs to the network are the prey and predator values and the outputs of the network provide \(\beta\) and \(\gamma\). These substitutions hold for all simulation conditions. Note that we refer to the inputs and outputs of the network by their PEtab identifiers, as defined in the mapping file.
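The example problem ships this table as hybridization.tsv. A minimal sketch of producing it programmatically with pandas (assuming only the two columns shown above):

```python
import pandas as pd

# Map network inputs to model species, and model parameters to network outputs
hybridization = pd.DataFrame(
    {
        "targetId": ["net1_input1", "net1_input2", "beta", "gamma"],
        "targetValue": ["prey", "predator", "net1_output1", "net1_output2"],
    }
)

# PEtab tables are tab-separated files
hybridization.to_csv("hybridization.tsv", sep="\t", index=False)
```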
Parameters Table
This file contains configuration information for all model parameters which, in our case, include the parameters of the neural network. The actual values of the network parameters are provided in a separate HDF5 array file (see more details in the next section). The parameters file specifies the scale, bounds and nominal value of each parameter. We want our neural network parameters to be freely optimised during training, so we set the bounds to [-inf, inf] and leave the nominal value blank (which translates to a NaN when loaded into the PEtab problem).
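The blank-to-NaN behaviour comes from how the TSV is parsed: pandas reads an empty cell as NaN and the strings "inf"/"-inf" as float infinities. A small self-contained sketch (the two rows are a hypothetical fragment mirroring this example's parameters table):

```python
import io
import numpy as np
import pandas as pd

# A parameters.tsv fragment: the network parameters get infinite bounds
# and a blank nominal value
tsv = (
    "parameterId\tparameterScale\tlowerBound\tupperBound\tnominalValue\testimate\n"
    "alpha\tlin\t0.0\t15.0\t1.3\t1\n"
    "net1_ps\tlin\t-inf\tinf\t\t1\n"
)
df = pd.read_csv(io.StringIO(tsv), sep="\t", index_col="parameterId")

# The blank nominal value loads as NaN, the bounds as +/- infinity
assert np.isnan(df.loc["net1_ps", "nominalValue"])
assert np.isinf(df.loc["net1_ps", "lowerBound"])
```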
[8]:
petab_problem.parameter_df
[8]:
| parameterId | parameterScale | lowerBound | upperBound | nominalValue | estimate |
|---|---|---|---|---|---|
| alpha | lin | 0.0 | 15.0 | 1.3 | 1 |
| delta | lin | 0.0 | 15.0 | 1.8 | 1 |
| net1_ps | lin | -inf | inf | NaN | 1 |
Array inputs
In this example, the values of parameters for the neural network are provided in HDF5 format. The structure inside the HDF5 file is as follows.
arrays.hdf5
├── metadata
│ └── perm
└── parameters
└── net1
├── layer1
│ ├── weight
│ └── bias
├── layer2
│ ├── weight
│ └── bias
└── layer3
├── weight
└── bias
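A layout like this could be produced with h5py. This is a sketch, not part of the example problem: the array shapes follow the 2 → 5 → 5 → 2 architecture defined earlier, the zero-valued arrays are placeholders, and the contents of the metadata group are not specified here.

```python
import h5py
import numpy as np

# Layer shapes follow the 2 -> 5 -> 5 -> 2 network defined earlier;
# PyTorch Linear weights have shape (out_features, in_features)
shapes = {"layer1": (5, 2), "layer2": (5, 5), "layer3": (2, 5)}

with h5py.File("arrays.hdf5", "w") as f:
    f.create_group("metadata")  # holds e.g. the "perm" entry
    net = f.create_group("parameters/net1")  # intermediate groups are created
    for layer, (n_out, n_in) in shapes.items():
        g = net.create_group(layer)
        g.create_dataset("weight", data=np.zeros((n_out, n_in)))
        g.create_dataset("bias", data=np.zeros(n_out))
```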
Note that the network id net1 matches the identifier in the network YAML file and the modelEntityId in mapping.tsv. Likewise, the layer names (layer1, layer2, layer3) in the HDF5 file are identifiers and must match the layer names in the network YAML and mapping files.
Measurements, Observables and Conditions Tables
These tables are unchanged from a regular PEtab problem. For completeness: the measurements table contains the measurement data, the experimental condition under which each measurement was collected, and the model entity each measurement maps to.
[9]:
petab_problem.measurement_df
[9]:
| | observableId | simulationConditionId | measurement | time |
|---|---|---|---|---|
| 0 | prey_o | cond1 | 0.173017 | 1.0 |
| 1 | prey_o | cond1 | 0.489177 | 2.0 |
| 2 | prey_o | cond1 | 1.643996 | 3.0 |
| 3 | prey_o | cond1 | 5.451963 | 4.0 |
| 4 | prey_o | cond1 | 2.977522 | 5.0 |
| 5 | prey_o | cond1 | 0.181663 | 6.0 |
| 6 | prey_o | cond1 | 0.348112 | 7.0 |
| 7 | prey_o | cond1 | 0.937919 | 8.0 |
| 8 | prey_o | cond1 | 3.113240 | 9.0 |
| 9 | prey_o | cond1 | 8.863933 | 10.0 |
| 10 | predator_o | cond1 | 0.847416 | 1.0 |
| 11 | predator_o | cond1 | 0.211135 | 2.0 |
| 12 | predator_o | cond1 | -0.025054 | 3.0 |
| 13 | predator_o | cond1 | 0.125010 | 4.0 |
| 14 | predator_o | cond1 | 6.700455 | 5.0 |
| 15 | predator_o | cond1 | 2.007158 | 6.0 |
| 16 | predator_o | cond1 | 0.420092 | 7.0 |
| 17 | predator_o | cond1 | 0.048032 | 8.0 |
| 18 | predator_o | cond1 | 0.128669 | 9.0 |
| 19 | predator_o | cond1 | 1.192784 | 10.0 |
The observables table links the model entities to the measured values.
[10]:
petab_problem.observable_df
[10]:
| observableId | observableFormula | noiseFormula | observableTransformation | noiseDistribution |
|---|---|---|---|---|
| prey_o | prey | 0.05 | lin | normal |
| predator_o | predator | 0.05 | lin | normal |
The conditions table is intended to fully describe the experimental conditions under which the measurements were collected. It can have any number of columns, but in this example there is only one experimental condition and no additional information about it.
[11]:
petab_problem.condition_df
[11]:
| conditionId |
|---|
| cond1 |
YAML File
Finally, the problem.yaml file brings everything together by describing how the problem should be constructed from the PEtab files. The extensions section contains the details of the SciML elements of our problem.
format_version: 2
parameter_file: "parameters.tsv"
problems:
  - model_files:
      lv:
        location: "lv.xml"
        language: "sbml"
    measurement_files:
      - "measurements.tsv"
    observable_files:
      - "observables.tsv"
    condition_files:
      - "conditions.tsv"
    mapping_files:
      - "mapping.tsv"
extensions:
  sciml:
    array_files:
      - "net1_ps.hdf5"
    hybridization_files:
      - "hybridization.tsv"
    neural_nets:
      net1:
        location: "net1.yaml"
        static: false
        format: "YAML"
Where the network is referenced in the mapping.tsv file, the modelEntityId must match the network id listed in this YAML, net1 in this case.
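As a quick sanity check, the extensions section can be parsed with a standard YAML library to confirm the network id. A sketch, assuming only the structure shown above:

```python
import yaml

# A fragment of the problem YAML's extensions section
yaml_text = """
extensions:
  sciml:
    neural_nets:
      net1:
        location: "net1.yaml"
        static: false
        format: "YAML"
"""

config = yaml.safe_load(yaml_text)
nets = config["extensions"]["sciml"]["neural_nets"]

# The network ids here must match the ids used in mapping.tsv
assert list(nets) == ["net1"]
assert nets["net1"]["location"] == "net1.yaml"
```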
Running Simulations
We are now ready to run a simulation using the JAX problem. AMICI's run_simulations will simulate all conditions provided; in this example there is only one. See the AMICI docs for more examples, such as training loops.
[12]:
import equinox as eqx

# Run simulations - returns the log-likelihood (llh) and simulation results (r)
llh, r = run_simulations(jax_problem)

# Run in gradient mode - gradients of the log-likelihood are returned in sllh
sllh, rgrad = eqx.filter_grad(run_simulations, has_aux=True)(jax_problem)
Next Steps
This tutorial has outlined how to set up a PEtab SciML problem in which a neural network appears in the ODE equations (creating what is sometimes referred to as a UDE). The PEtab SciML format also supports other ways of formulating a problem, examples of which can be found in the how-to guides. These include:
neural ODEs (all ODE equations are replaced by a neural network)
using networks in the observable function, which links simulated values to measurements
having a neural network upstream of the ODE, mapping high-dimensional inputs (e.g., images) to ODE model parameters
We implemented a simple network in this tutorial for demonstration purposes but the PEtab SciML format supports a range of standard neural network layers and activation functions based on the PyTorch framework. The dedicated docs page lists these layers and functions, along with the tools that support each.
For a complete description of the PEtab SciML format, visit the format specification page. For more information about the more complex problem specifications supported by the PEtab format (e.g., models with partial observability, multiple experimental/simulation conditions, events), see the PEtab documentation.