Parameters¶
Types of parameter¶
Parameters are all defined on the ‘Parameters’ sheet of the framework. Depending on the structure of the model defined in the framework, a parameter may have certain attributes or restrictions if used in a particular way. We therefore introduce three classifications to help refer to such parameters. These classifications are:
Transition parameter - A parameter is referred to as a ‘transition parameter’ if it appears in the transition matrix and is thus directly associated with flow rates.
Function parameter - A parameter is referred to as a ‘function parameter’ if a function is provided in the Framework for the parameter
Output parameter - A parameter is referred to as an ‘output parameter’ if it does not directly or indirectly contribute to transitions. A parameter contributes directly to transitions if it is an output parameter, but it can also contribute indirectly if it appears in a transition parameter’s function (or even further back, if it appears in the function of any parameter that ends up affecting a transition parameter)
Functions¶
In the framework, a parameter can have a function associated with it, which can compute the value of a parameter based on the values of other integration objects like compartments, characteristics, parameters, and links (flow rates). You can also refer to the simulation time as t and the time step as dt. Standard math operations can be performed, as well as common functions like max
, min
, exp
etc. Valid functions can be found in function_parser.py
.
Parameters are updated in the order in which they appear in the framework. This means that parameter functions can only refer to parameters that appear above them in the framework. Atomica will automatically validate this when the framework is loaded. This also means that any program overwrites that affect terms inside a parameter function will be performed prior to the function being evaluated.
As mentioned above, it is possible to include actual flow rates in parameter functions. These can be specified in two ways
You can append
:flow
to the name of any transition parameter which resolves to the sum of all transitions driven by that parameter. This means if a parameter drives multiple transitions (e.g. deaths from multiple compartments) they will also be includedYou can use the compartment syntax to specify flows between compartments e.g.
sus:vac
for flows between susceptible and vaccinated,sus:
for all flows out of susceptible, and:vac
for all flows into vaccinated. You can also usesus:vac:par_name
to select only flows between compartments driven by a particular parameter, in the case where multiple parameters contribute independently to the same transition.
Caution
All link values are automatically annualized inside the parameter function and are in units of ‘people/year’ - this prevents the values from changing relative to other parameters if the model step size is changed.
Finally, all parameters are updated prior to computing flow rates. This is because flow rates may need to be rescaled in order to prevent negative compartment sizes, and this rescaling can only take place after all parameters have been computed. Thus, flow rates cannot appear in any parameter that contributes directly or indirectly to transitions, as there would otherwise be a circular dependency.
Caution
Links can only be used inside parameter functions for output parameters
Value precedence¶
The value for a parameter is derived from (in order of precedence):
Data entered in the databook
A parameter function
The result of a program calculation
The workflow in the model is
First any data points in the parameter set (that is, the values from the databook after any parameter scenarios have been applied) are inserted into the parameter
Next, if a parameter function is present, it is evaluated, and the result of the function overwrites the parameter’s current value
Finally, program outcomes are evaluated, and used to overwrite any parameters that are being reached by active programs
Caution
If a parameter has both a databook page and a function, the simulation dynamics will reflect the function. The data in the databook is used for plotting and calibration only.
In the ParameterSet
, it is possible to set the skip_function
attribute in order to skip evaluating the parameter function, in which case the parameter value will be drawn from either the data points or from the program calculation, if present. This functionality is used internally by parameter scenarios, so that it is possible to use a parameter overwrite to override a function (e.g. to use a parameter scenario to explicitly set the force of infection). However, using the skip_function
attribute directly is a non-standard workflow and must be implemented consistently in all scripts. If a script omits setting this flag then the simulation will run without errors, but will produce different results. It is therefore recommended not to use this functionality directly, but to modify the framework to avoid the need to disable the parameter function.
Performance considerations¶
The structure of the parameters in the model can have a significant effect on performance. The root causes is that in Python, it is much faster to perform calculations with vectorized operations rather than repeated scalar operations.
A vectorized operation is one that can be performed on all time points simultaneously, whereas a scalar operation operates on only one time point at a time. Operating one time point at a time is required if two conditions are met
The parameter is required to directly or indirectly drive a transition, and
The parameter depends on a compartment or characteristic
If a parameter depends on a compartment or characteristic, then it can only be evaluated after the compartment sizes have been computed. In the case of output parameters, this can be performed as a fast vector operation after integration is complete. If that same parameter drives transitions however, then it must be evaluated during integration using slower scalar operations. Similarly, if a parameter only depends on quantities in the databook (even if it depends on other parameters, as long as those parameters only depend on the databook) then the parameter can be evaluated in a fast vector operation either before or after integration, depending on whether it is needed for transitions.
Atomica will automatically select the fastest possible computation, but be aware that the inclusion of compartments and characteristics in parameter functions can have a significant effect on performance.
Timescales¶
Transitions in the model are driven by parameters. If the parameter is specified as a probability or a number of people, then there is an implicit time period associated with the flow. This is 1 year by default. So for example,
A transition parameter in probability units is interpreted as ‘probability (per year)’
A transition parameter in number units is interpreted as ‘number of people (per year)’
A transition parameter in duration units is interpreted in years
However, working exclusively with years can be problematic for two reasons
The values entered in the databook are entered in the same units as the parameter. In some cases, it may be preferable for end users to enter values in different units. For example, for fast processes it may be clearer to enter values in days rather than years. Similarly, in some cases available data may list outcomes over different timescales e.g. probability of treatment success over a 1 month period. Although this can be inconvenient, it is possible to use formulas in the databook to work around this, or conversions in the framework. However, either option adds considerable complexity.
If models are run with very small timesteps, such as daily time steps (required for frameworks where the fastest processes are on this timescale - for example, the mosquito lifespan for malaria is on this scale) - then the rescaling of probabilities by repeated sampling can suffer from loss of numerical precision. This is particularly problematic when moving between daily and annual probabilities. This is a critical issue that can prevent implementation of a model.
To address this, it is possible to optionally specify a timescale associated with a parameter. The timescale specifies the period in years associated with a parameter. If not specified, parameters are treated as having a timescale of 1, which corresponds to annual units. If a different value is entered, the effect is to change the units of the parameter. For example, if the timescale was entered as \(1/365 = 0.00274\), then the parameter would be treated as a daily quantity. The exact units depend on the format of the quantity:
If the transition parameter has timescale \(1/365\) and format ‘duration’ then the units are ‘days’
If the transition parameter has timescale \(1/365\) and format ‘probability’ then the units are ‘probability per day’
If the transition parameter has timescale \(1/365\) and format ‘number’ then the units are ‘number per day’
If a parameter has a timescale, then the units in the databook will automatically reflect the timescale. If a timescale is present in the databook, then the framework must have a matching timescale. So for example, if the framework declares that a parameter has timescale \(1/365\) (days) then the databook must provide the value in days (i.e. users cannot change this in the databook by changing it to read ‘weeks’ without making a corresponding change to the framework).
The timescales affect simulations and plotting in two ways
During model integration, the parameter is in the units specified in the framework, but flows in the model (the values stored in
Link
objects) are always in ‘number per timestep’ units. If a parameter has a timescale, that timescale will affect the conversionWhen constructing a
PlotData
object with time aggregation or accumulation, the timescale will be taken in to account. Where applicable, plots will be labelled including the timescale (e.g. a probability parameter with a timescale of \(1/52\) would have an axis label of ‘Probability (per week)’ on plots).
Thus, Parameter objects themselves always store values in the units specified in the framework. Conversion only takes place during computation of flow rates or computation of PlotData
aggregations.
Note
Because Parameter timescales are only converted when computing flow rates, user-defined parameter functions always operate on parameters in their native units.
This means that it is up to the modeller to explicitly handle any unit conversions required - for example, if combining a ‘Number per day’ and a ‘Number per week’ in a function, it would likely be necessary to explicitly introduce a factor of 7
. This would be done in the Framework in conjunction with specification of the timescales for the relevant parameters. Users are unable to change the timescale in the databook by design because the conversion is an arbitrary operation and depends on the parameter timescale - the modeller has complete control in the Framework to define the format and units and construct the model accordingly, without having to deal with the possibility of the user entering different formats or units.
Derivatives¶
Sometimes, it is important to track quantities dynamically. For example, you may want to model a change in pollution level depending on the number of people in the model, where the derivative of pollution depends on the number of people. This cannot be achieved with normal parameters, because parameter functions explicitly supply the value for the parameter at each timestep. Compartment values are effectively governed by derivatives, but setting the values is nontrivial because of the need to have source and sink compartments.
Instead, it is possible to mark a parameter function as supplying a derivative rather than the value itself. Suppose we have a parameter function \(f(\theta,t)\) where \(\theta\) represents the state of the model (so it encompasses any quantities that could appear in the parameter function), and \(t\) is the current simulation time. If we want the parameter function to act as a derivative, enter ‘Y’ in the ‘Is derivative’ column of the databook, as shown below. This column can be added in if it is not already present.
When marking the parameter as a derivative:
The parameter value is initialized as normal from values entered in the databook
The parameter function \(f(\theta,t)\) is evaluated like normal. The parameter function can refer to any other parameters previously defined in the framework. However, unlike normal parameters, a derivative parameter can also refer to its own value. Derivative parameters are updated out of sync with the rest of the model - after the value of the derivative is computed at time ti, we can immediately assign the value of time
ti+1
, whereas with normal parameter, the parameter function needs to be executed at timeti+1
in order to determine the value. For derivative parameters, the parameter function being evaluated at timeti+1
yields the value of the derivative at timeti+1
, rather than the actual value. Therefore, it is possible to refer to the parameter’s current value inside the function, because the function is not being used to compute that value. In effect, you can simply refer to the parameter by name inside a derivative function, as illustrated in theframework_derivative_test.xlsx
example in thetests
folder.Normally, the parameter value is updated as \(P(t) = f(\theta,t)\) - that is, the function is evaluated and directly overwrites the parameter value. However, if the parameter is marked as a derivative, then the update is \(P(t+1) = P(t) + \Delta t f(\theta,t)\). That is, a single step of an Euler integration is performed.
Note
Don’t forget that the databook contains the initialization values for the parameter, and thus needs to be entered in the same units as the parameter itself.
Derivative parameters must have both a databook page (for the initialization value) and a function (for the derivative value) and an error will be raised if either is missing.
Program overwrites are applied to the derivative. So if you define a derivative parameter, don’t forget that programs do not overwrite the actual value, but they overwrite the derivative. You could set another parameter equal to the derivative parameter and target that instead, if you wish to explicitly overwrite the value of a derivative parameter - however, be cautious and inspect results carefully if you then later switch programs off.
After integration, the parameter value can be inspected and analyzed as normal. Note that this means you cannot directly read out \(f(\theta,t)\) because the parameter has integrated \(f(\theta,t)\) over time. Note also that the update step multiplies \(f(\theta,t)\) by \(\Delta t\). This means that the units of the value returned by the parameter function should be the same as the units of the Parameter, but on a ‘per year’ basis. This is because \(\Delta t\) is in units of years. So in the example above, m_prev
is a dimensionless fraction, and the parameter function has the same units as dm_prev
, which is in units of 1/years
, thus when integrated, m_prev
will be dimensionless. It is important that this multiplication by \(\Delta t\) is taken into account, because this formulation allows consistent results to be obtained even when the step size is changed.
Warning
Notice that the parameter timescale does not enter here. The parameter timescale affects the conversion from the parameter value into flow rates. The derivative is always with respect to time, in the simulation units, which are always years. Thus the function value’s units should differ from the parameter’s units by a factor of 1/years
regardless of the timescale of the parameter. If you have set different timescales for quantities appearing in the derivative parameter’s function, you may need to include conversion factors in the parameter function to ensure that the timescale of the derivative comes out correctly.
Interpolation and smoothing¶
Parameters generally contain sparse time-dependent values matching those entered in the databook. As part of running the model, these automatically get interpolated onto the simulation time points. This interpolation is linear by default (although legacy projects use spline interpolation). Generally, this means that you can look at the values in the databook and they will be linearly interpolated when running the simulation.
However, in some cases, it is desirable to incorporate additional assumptions into the input parameters - for example, that they are smoothly changing. To support this, Parameter
objects contain a smoothing method, Parameter.smooth
. This applies smoothing in-place, onto a set of specified time points, replacing the parameter’s value. The most common usage would be to simply use Parameter.smooth(proj.settings.tvec)
which will smooth the parameter onto the simulation times (assuming the step size is not subsequently modified).
By specifying a different set of times, you can apply different smoothing methods to different parts of the parameter. For example, you could smooth parameter scenario values differently to the databook values. Don’t forget that you can always modify the TimeSeries
object contained in the Parameter
as well, to perform arbitrarily specific modifications.
Maximum compartment duration¶
A common, complex situation occurs if there are durations in the model that are significantly larger than the simulation time step (such as the duration of protection of a vaccine). See the documentation on Timed Transitions for details on how to implement this in a framework.