Utility functions for working with cascades

Cascades are defined in a ProjectFramework object. This module implements functions that are useful for working with the cascades, including

• Validation

• Plotting

• Value extraction

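On the plotting side, the two key functions are

• plot_single_cascade(), which plots the cascade for a single result

• plot_multi_cascade(), which plots the cascade for multiple results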

The plotting functions take the cascade and populations as arguments. Users can specify cascades as

• The name of a cascade in the Framework

• The index of a cascade in the Framework

• A list of comps/characs in stage order, with stage names matching the comps/characs

• An ordered dict with {stage:comps/characs}, with a customized stage name. This is referred to in code as a cascade dict
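As an illustration, the four representations might look like the following in Python. The cascade name, stage names and code names used here are placeholders (partly borrowed from the sanitize_cascade() example), not values from any particular framework.

```python
from collections import OrderedDict

# 1. The name of a cascade defined in the framework (placeholder name)
cascade_by_name = "main"

# 2. The index of a cascade defined in the framework
cascade_by_index = 0

# 3. A list of comps/characs in stage order. Each stage takes the name
#    of its entry. These code names are hypothetical.
cascade_as_list = ["all_people", "ever_vac"]

# 4. A cascade dict mapping custom stage names to comps/characs
cascade_as_dict = OrderedDict(
    [
        ("Stage 1", ["sus", "vac", "inf"]),
        ("Stage 2", ["vac", "inf"]),
    ]
)
```

The first two forms refer to a predefined cascade, while the last two carry the full definition themselves.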

The first two representations map to cascades defined in the framework, while the last two representations relate to defining custom cascades on the fly. They are therefore sanitized in two stages

• sanitize_cascade_inputs() turns cascade indices into names, and cascade lists into dicts. Returning string names for predefined cascades allows the name to be used in the title of plots

• get_cascade_outputs() turns cascade names into cascade dicts
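The two stages can be sketched with a toy example. This is not Atomica's implementation: the FRAMEWORK_CASCADES dict stands in for the cascades stored in a framework, and both helper functions are hypothetical names used only for illustration.

```python
# Toy stand-in for the cascades stored in a framework (hypothetical contents)
FRAMEWORK_CASCADES = {
    "main": {"Stage 1": ["sus", "vac", "inf"], "Stage 2": ["vac", "inf"]},
}


def to_name_or_dict(cascade):
    """Stage one: indices become names, lists become dicts."""
    if cascade is None:
        cascade = 0  # None maps to the first cascade
    if isinstance(cascade, int):
        return list(FRAMEWORK_CASCADES)[cascade]
    if isinstance(cascade, (list, tuple)):
        # Stage names match the comps/characs they contain
        return {stage: [stage] for stage in cascade}
    return cascade  # already a name or a cascade dict


def to_cascade_dict(cascade):
    """Stage two: names become cascade dicts."""
    if isinstance(cascade, str):
        return FRAMEWORK_CASCADES[cascade]
    return cascade


name = to_name_or_dict(0)       # the name is kept so it can title a plot
stages = to_cascade_dict(name)  # the dict is needed to retrieve values
```

Keeping the name after the first stage is what allows a plot title to be generated for predefined cascades.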

The dictionary representation is always required when retrieving the values of cascades. There are two types of value retrieval:

• get_cascade_vals() which returns values for each cascade stage from a model result

• get_cascade_data() which attempts to compute values for each cascade stage from a ProjectData instance. This is used when plotting data points on the cascade plot. Compartments and characteristics are automatically summed as required. Data points will only be displayed if the data has values for all of the included quantities in the year being plotted.
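The rule that a data point is only shown when every included quantity has a value can be illustrated as follows. This is a minimal sketch under assumed data structures: the nested dict of values by code name and year, and the function name stage_data_value, are hypothetical and do not reflect how ProjectData stores its contents.

```python
def stage_data_value(stage_comps, data_by_comp, year):
    """Sum data over the comps in a stage, or return None if any is missing."""
    total = 0.0
    for comp in stage_comps:
        value = data_by_comp.get(comp, {}).get(year)
        if value is None:
            # One missing quantity means no data point for this stage
            return None
        total += value
    return total


# Hypothetical data values keyed by code name, then by year
data = {
    "vac": {2018: 100.0, 2019: 120.0},
    "inf": {2018: 40.0},
}

complete = stage_data_value(["vac", "inf"], data, 2018)    # both present
incomplete = stage_data_value(["vac", "inf"], data, 2019)  # "inf" missing
```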

Functions

 cascade_summary(source_data, year[, pops, …])    Print summary of cascade
 get_cascade_data(data, framework, cascade[, …])    Get data values for a cascade
 get_cascade_vals(result, cascade[, pops, year])    Get values for a cascade
 plot_cascade([results, cascade, pops, year, …])    Plot single or multiple cascade plot
 plot_multi_cascade([results, cascade, pops, …])    Plot cascade for multiple results
 plot_single_cascade([result, cascade, pops, …])    Plot cascade for a single result
 plot_single_cascade_series([result, …])    Plot stacked timeseries
 sanitize_cascade(framework, cascade[, …])    Normalize cascade inputs
 sanitize_pops(pops, pop_source, pop_type)    Sanitize input populations
 validate_cascade(framework, cascade[, …])    Check if a cascade is valid

Classes

 CascadeEnsemble(framework, cascade[, years, …]) Ensemble for cascade plots

Exceptions

 InvalidCascade Error if cascade is not valid
class atomica.cascade.CascadeEnsemble(framework, cascade, years=None, baseline_results=None, pops=None)[source]

This specialized Ensemble type is oriented to working with cascades. It has pre-defined mapping functions for retrieving cascade values and wrappers to plot cascade data.

Conceptually, the idea is that using cascades with ensembles requires doing two things

• Having a mapping function that generates PlotData instances where the outputs are cascade stages

• Having a plotting function that makes bar plots where all of the bars for the same year/result are the same color (which rules out Ensemble.plot_bars()), where the bars are grouped by output (which rules out plotting.plot_bars()), and where the plot data is stored in PlotData instances rather than in Result objects (which rules out cascade.plot_multi_cascade)

This specialized Ensemble class implements both of the above steps

• The constructor takes in the name of the cascade (or a cascade dict) and internally generates a suitable mapping function

Parameters
• framework – A ProjectFramework instance

• cascade – A cascade representation supported by sanitize_cascade(). However, if the cascade is a dict, then it will not be sanitized. This allows advanced aggregations to be used. A CascadeEnsemble can only store results for one cascade - to record multiple cascades, create further CascadeEnsemble instances as required.

• years – Optionally interpolate results onto these years, to reduce storage requirements

• baseline_results – Optionally store baseline result obtained without uncertainty

• pops – A population aggregation dict. Can evaluate to more than one aggregated population

get_vals(pop=None, years=None)[source]

This method returns arrays of cascade values and uncertainties. Unlike get_cascade_vals() this method returns uncertainties and works for multiple Results (which can be stored in a single PlotData instance).

This is implemented in CascadeEnsemble and not Ensemble for now because we make certain assumptions in CascadeEnsemble that are not valid more generally - specifically, that the outputs all correspond to a single set of cascade stages, and the

The year must match a year contained in the CascadeEnsemble - the match is made by finding the year, rather than interpolation. This is because interpolation may have occurred when the Result was initially stored as a PlotData in the CascadeEnsemble - in that case, double interpolation may occur and provide incorrect results (e.g. if the simulation is interpolated onto two years, and then interpolated again as part of getting the values). To prevent this from happening, interpolation is not performed again here
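Matching by lookup rather than interpolation can be sketched with NumPy. The function name select_years is hypothetical, and raising a ValueError for an unmatched year is an assumption made for this sketch. The source only states that the year must match.

```python
import numpy as np


def select_years(tvec, years):
    """Return the indices of `years` within `tvec`, requiring exact matches."""
    tvec = np.asarray(tvec, dtype=float)
    years = np.atleast_1d(np.asarray(years, dtype=float))
    missing = years[~np.isin(years, tvec)]
    if missing.size:
        # Assumed behaviour: refuse rather than interpolate a second time
        raise ValueError(f"Years not stored: {missing.tolist()}")
    return np.array([np.flatnonzero(tvec == y)[0] for y in years])


stored_years = [2018.0, 2020.0, 2025.0]  # hypothetical stored time points
idx = select_years(stored_years, [2020, 2025])
```

Because the values at the stored years are returned unchanged, they are never interpolated a second time.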

Parameters
• pop – Any population aggregations should have been completed when the results were loaded into the Ensemble. Thus, we only prompt for a single population name here

• years – Select subset of years from the Ensemble. Must match items in self.tvec

Return type

tuple

Returns

Tuple of (vals,uncertainty,t) where vals and uncertainty are doubly-nested dictionaries of the form vals[result_name][stage_name]=np.array with arrays the same size as t (which matches the input argument years if provided)

plot_multi_cascade(pop=None, years=None)[source]

The multi-cascade with uncertainties differs from the normal plot_multi_cascade primarily in the fact that this plot is based around PlotData instances, while plot_multi_cascade is a simplified routine that takes in results and calls get_cascade_vals. Thus, while this method assumes that the PlotData contains a properly nested cascade, this is not actually validated, which allows more flexibility in terms of defining arbitrary quantities to include on the plot (like ‘virtual’ stages that are functions of cascade stages)

Intended usage is for

• One population/population aggregation

• Multiple years OR multiple results, but not both

Thus, the legend will either show result names for a single year, or years for a single result

Population aggregation here is assumed to have been done at the time the Result was loaded into the Ensemble, so the pop argument here simply specifies which one of the already aggregated population groups should be used.

Could be generalized further once applications are clearer

exception atomica.cascade.InvalidCascade[source]

Error if cascade is not valid

This error gets thrown if a cascade failed validation - for example, because the requested stages were not correctly nested

atomica.cascade._cascade_ensemble_mapping(results, cascade_dict, years, pops)[source]
atomica.cascade.cascade_summary(source_data, year, pops=None, cascade=0)[source]

This function takes in results, either as a Result or list of Results, or as a CascadeEnsemble.

Parameters
• source_data – A Result or a CascadeEnsemble

• year (float) – A scalar year to print results in

• pops – If a Result was passed in, this can be any valid population aggregation. If a CascadeEnsemble was passed in, then this must match the name of one of the population aggregations stored in the Ensemble (i.e. it must be an item contained in CascadeEnsemble.pops)

• cascade – If a Result was passed in, this argument specifies which cascade to use. If a CascadeEnsemble was passed in, then this argument is ignored because the CascadeEnsemble already uniquely specifies the cascade

• pretty – If True, absolute values will be rounded to integers and percentages to 2 sig figs

Return type

None

Returns

atomica.cascade.get_cascade_data(data, framework, cascade, pops=None, year=None)[source]

Get data values for a cascade

This function is the counterpart to get_cascade_vals() but it returns values from data rather than values from a Result. Note that the inputs and outputs are slightly different - this function still needs the framework so that it can sanitize the requested cascade. If year is specified, the output is guaranteed to be the same size as the input year array, the same as get_cascade_vals(). However, get_cascade_vals() defaults to all time points in the simulation output, whereas this function defaults to all data years. Thus, if the year is omitted, the returned time points may differ between the two functions. To make a plot superimposing data and model output, the year should be specified explicitly to ensure that the years match up.
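The effect of passing an explicit year array can be illustrated with a small NumPy sketch. It does not call Atomica: values_on_years is a hypothetical helper, the numbers are made up, and filling years that have no value with NaN is an assumption of this sketch (the source only guarantees that the output matches the size of the year array).

```python
import numpy as np


def values_on_years(t, vals, year):
    """Return one value per requested year, with NaN where no point exists."""
    t = np.asarray(t, dtype=float)
    vals = np.asarray(vals, dtype=float)
    year = np.atleast_1d(np.asarray(year, dtype=float))
    out = np.full(year.shape, np.nan)
    for i, y in enumerate(year):
        match = np.flatnonzero(t == y)
        if match.size:
            out[i] = vals[match[0]]
    return out


year = np.array([2018.0, 2019.0, 2020.0])

# Model output is available on every simulation time point
model = values_on_years(np.arange(2015.0, 2026.0), np.linspace(100, 200, 11), year)

# Data is only available for some years
data = values_on_years([2018.0, 2020.0], [148.0, 152.0], year)
```

Both arrays now share the same year axis, so they can be drawn on the same plot without further alignment.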

NB - In general, data probably will NOT exist. Set the logging level to ‘DEBUG’ to have messages regarding this printed out.

Parameters
• data – A ProjectData instance

• framework – A ProjectFramework instance

• cascade – A cascade representation supported by sanitize_cascade()

• pops – Supported population representation. Can be ‘all’, or a pop name, or a list of pop names, or a dict with one key

• year – Optionally specify a subset of years to retrieve values for. Can be a scalar, list, or array. If None, all time points in the ProjectData instance will be used

Returns

A tuple with (cascade_vals,t) where cascade_vals is the form {stage_name:np.array} and t is a np.array with the year values

atomica.cascade.get_cascade_vals(result, cascade, pops=None, year=None)[source]

If the population list

Parameters
• result – A single Result instance

• cascade – A cascade representation supported by sanitize_cascade()

• pops – A population representation supported by sanitize_pops()

• year – Optionally specify a subset of years to retrieve values for. Can be a scalar, list, or array. If None, all time points in the result will be used

Return type

tuple

Returns

A tuple with (cascade_vals,t) where cascade_vals is the form {stage_name:np.array} and t is a np.array with the year values

atomica.cascade.plot_cascade(results=None, cascade=None, pops=None, year=None, data=None, show_table=None)[source]

Plot single or multiple cascade plot

plot_single_cascade() generates a plot where multiple results each have their own figure. A common requirement (used on the FE) is to decide between calling plot_single_cascade() or calling plot_multi_cascade() based on whether there are multiple Result instances or not.

A multi-cascade plot will be displayed if there are multiple years or if there are multiple results. Thus this function is always guaranteed to return a single figure.
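The selection rule described above can be written as a small decision function. This is a sketch of the stated rule only: choose_cascade_plot is a hypothetical name, and the way results and years are counted here is an assumption, not Atomica's code.

```python
def choose_cascade_plot(results, year):
    """Return which plot type the rule selects: 'single' or 'multi'."""
    n_results = len(results) if isinstance(results, (list, tuple)) else 1
    n_years = len(year) if hasattr(year, "__len__") else 1
    # Multiple years or multiple results both require the multi-cascade plot
    return "multi" if (n_results > 1 or n_years > 1) else "single"


# Strings stand in for Result instances in this sketch
one_result_one_year = choose_cascade_plot("baseline", 2020)
two_results = choose_cascade_plot(["baseline", "scaled up"], 2020)
two_years = choose_cascade_plot("baseline", [2020, 2025])
```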

Parameters
• results – A single Result instance, or list of instances

• cascade – A cascade specification supported by sanitize_cascade()

• pops – A population specification supported by sanitize_pops() - must correspond to a single aggregation

• year – A single year, or multiple years (can be a scalar, list, or array)

• data – A ProjectData instance

• show_table (Optional[bool]) – If True and a multi-cascade plot is generated, then the loss table will also be shown

Returns

Figure object containing the plot that was produced

atomica.cascade.plot_multi_cascade(results=None, cascade=None, pops=None, year=None, data=None, show_table=None)[source]

Plot cascade for multiple results

This is a cascade plot that handles multiple results and times. Results are grouped by stage/output, which is not possible to do with plot_bars()

Parameters
• results – A single result, or list of results. A single figure will be generated

• cascade – A cascade specification supported by sanitize_cascade()

• pops – A population specification supported by sanitize_pops() - must correspond to a single aggregation

• year – A scalar, or array of time points. Bars will be plotted for every time point

• data – A ProjectData instance (currently not used)

• show_table – If True then a table with loss values will be rendered in the figure

Returns

Figure object containing the plot

atomica.cascade.plot_single_cascade(result=None, cascade=None, pops=None, year=None, data=None, title=False)[source]

Plot cascade for a single result

This is the fancy cascade plot, which only applies to a single result at a single time

Parameters
• result – A single result, or list of results. One figure will be generated for each result

• cascade – A cascade specification supported by sanitize_cascade()

• pops – A population specification supported by sanitize_pops() - must correspond to a single aggregation

• year – A single year, can be a scalar or an iterable of length 1

• data – A ProjectData instance

• title – Optionally override the title of the plot

Returns

Figure object containing the plot, or list of figures if multiple figures were produced

atomica.cascade.plot_single_cascade_series(result=None, cascade=None, pops=None, data=None)[source]

Plot stacked timeseries

Plot a stacked timeseries of the cascade. Unlike a normal stacked plot, the shaded areas show losses so for example the overall height of the plot corresponds to the number of people in the first cascade stage. Thus instead of the cascade progressing from left to right, the cascade progresses from top to bottom. This way, the left-right axis can be used to show the change in cascade flow over time.
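The relationship between stage values and the shaded areas can be checked numerically. The sketch below uses made-up stage values and a hypothetical helper, stacked_bands. Treating the final stage as the bottom band is an assumption of this sketch.

```python
import numpy as np

# Hypothetical cascade values over three time points
stage_vals = {
    "Stage 1": np.array([100.0, 110.0, 120.0]),
    "Stage 2": np.array([60.0, 70.0, 90.0]),
    "Stage 3": np.array([30.0, 40.0, 60.0]),
}


def stacked_bands(stage_vals):
    """Convert stage values into bands: losses between stages, then the last stage."""
    names = list(stage_vals)
    bands = {}
    for current, following in zip(names[:-1], names[1:]):
        label = f"Lost between {current} and {following}"
        bands[label] = stage_vals[current] - stage_vals[following]
    bands[names[-1]] = stage_vals[names[-1]]
    return bands


bands = stacked_bands(stage_vals)
total_height = sum(bands.values())  # equals the first stage at every time
```

Since the losses telescope, the stacked height always equals the value of the first stage, which matches the description above.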

Parameters
• result – A single result, or list of results. One figure will be generated for each result

• cascade – A cascade specification supported by sanitize_cascade()

• pops – A population specification supported by sanitize_pops() - must correspond to a single aggregation

• data – A ProjectData instance

Return type

list

Returns

List of Figure objects for all figures that were generated

atomica.cascade.sanitize_cascade(framework, cascade, fallback_used=False)[source]

For convenience, users can specify cascades in one of several representations. To facilitate working with these representations on the backend, this function turns any valid representation into a dictionary mapping cascade stage names to a list of compartments/characs. It also returns the name of the cascade (if one is present) for use in plot titles.

For example, suppose the framework defines a cascade named ‘main’ with the following stages

• Stage 1 - sus,vac,inf

• Stage 2 - vac,inf

Then example usage would be:

>>> sanitize_cascade(framework,'main')[1]
{'Stage 1':['sus','vac','inf'],'Stage 2':['vac','inf']}


This function also validates the cascade, so it is not necessary to call validate_cascade() separately.

Parameters
• framework – A ProjectFramework instance

• cascade – Supported cascade representation. Could be a string cascade name, an integer specifying the index of the cascade, None (which maps to the first cascade in the framework), a list of cascade stages, or a dict defining the cascade. The first three input formats will result in the cascade name also being returned (otherwise it will be assigned None)

Return type

tuple

Returns

A tuple with (cascade_name,cascade_dict) - the cascade name is None if the cascade was specified as a list or dict

atomica.cascade.sanitize_pops(pops, pop_source, pop_type)[source]

Sanitize input populations

The input populations could be specified as

• A list or dict (with a single key) containing population code names. List inputs can also contain full names (e.g. from the FE)

• A string like ‘all’ or ‘total’

• None, which is shorthand for all populations

For cascade purposes, the specified populations must evaluate to a single aggregation. That is, a cascade plot can only be made for a single group of people at a time.
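The single-aggregation requirement can be sketched as a normalization step that always ends in a dict with one key. This is not the Atomica implementation: normalize_pops, the aggregation labels and the population code names are all hypothetical, and full-name lookup and pop_type filtering are left out.

```python
def normalize_pops(pops, available):
    """Return a single-key dict of the form {label: [population code names]}."""
    if pops is None or pops in ("all", "total"):
        return {"Entire population": list(available)}
    if isinstance(pops, str):
        return {pops: [pops]}
    if isinstance(pops, dict):
        if len(pops) != 1:
            # More than one key would describe more than one group of people
            raise ValueError("Cascade populations must form a single aggregation")
        return {label: list(members) for label, members in pops.items()}
    # A list of populations is treated as one aggregation
    return {"Selected populations": list(pops)}


available = ["0-4", "5-14", "15-64"]  # hypothetical population code names
everyone = normalize_pops(None, available)
children = normalize_pops({"Children": ["0-4", "5-14"]}, available)
```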

Parameters
• pops – The population representation to sanitize (list, dict, string)

• pop_source – Object to draw available populations from (a Result or PlotData)

• pop_type – Population type to select. All returned populations will match this type

Return type

dict

Returns

A dict with a single key that can be used by PlotData to specify populations

atomica.cascade.validate_cascade(framework, cascade, cascade_name=None, fallback_used=False)[source]

Check if a cascade is valid

A cascade is invalid if any stage does not contain a compartment that appears in a subsequent stage, i.e. if the stages are not all nested. Also, all compartments referred to must exist in the same population type; otherwise it is not possible to define a population-specific cascade, as it would intrinsically span populations.
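The nesting condition can be expressed with set operations. The sketch below is illustrative only: it assumes each stage has already been expanded to a list of compartment code names, it omits the population type check, and InvalidCascadeSketch and check_nested are hypothetical names standing in for the real exception and validator.

```python
class InvalidCascadeSketch(Exception):
    """Hypothetical stand-in for InvalidCascade."""


def check_nested(cascade_dict):
    """Raise if any stage contains a compartment missing from the stage before it."""
    stages = list(cascade_dict.items())
    for (name, comps), (next_name, next_comps) in zip(stages[:-1], stages[1:]):
        extra = set(next_comps) - set(comps)
        if extra:
            raise InvalidCascadeSketch(
                f"'{next_name}' contains {sorted(extra)} not present in '{name}'"
            )


valid = {"Stage 1": ["sus", "vac", "inf"], "Stage 2": ["vac", "inf"]}
invalid = {"Stage 1": ["vac", "inf"], "Stage 2": ["sus", "vac"]}

check_nested(valid)  # passes silently because Stage 2 is a subset of Stage 1
```

Checking adjacent stages is sufficient, because the subset relation is transitive.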

Parameters
• framework – A ProjectFramework instance

• cascade – A cascade representation supported by sanitize_cascade()

• cascade_name – Name of cascade to be printed in error messages

• fallback_used (bool) – If True, then in the event that the cascade is not valid, the error message will reflect the fact that it was not a user-defined cascade

Return type

str

Returns

The population type if the cascade is valid

Raises

InvalidCascade if the cascade is not valid