atomica.data

Implementation of Databook functionality

This module defines the ProjectData class, which serves as a Python-based representation of the Databook, as well as providing methods for reading Databooks into ProjectData instances, and saving ProjectData back to Excel files.

Classes

ProjectData(framework)

Store project data: class-equivalent of Databooks

class atomica.data.ProjectData(framework)[source]

Store project data: class-equivalent of Databooks

This class is used to load and work with data that is entered in databooks. It provides the interface for

  • Loading data

  • Modifying data (values, adding/removing populations etc.)

  • Saving modified data

  • Writing new databooks

To instantiate, the ProjectData constructor is normally not used. Instead, use the static methods

  • ProjectData.new() to create a new instance/databook given a ProjectFramework

  • ProjectData.from_spreadsheet() to load a databook

_book = None

Temporary storage for the workbook while writing a databook

_formats = None

Temporary storage for the Excel formatting while writing a databook

_pop_types = None

Store set of valid population types from framework

_read_interpops(sheet)[source]

Reads the ‘Interactions’ sheet

Return type

None

_read_pops(sheet)[source]

Reads the ‘Population Definitions’ sheet

Return type

None

_read_transfers(sheet)[source]

Reads the ‘Transfers’ sheet

Return type

None

_references = None

Temporary storage for cell references while writing a databook

_write_interpops()[source]

Writes the ‘Interactions’ sheet

Return type

None

_write_pops()[source]

Writes the ‘Population Definitions’ sheet

Return type

None

_write_tdve()[source]

Writes the TDVE tables

This method will create multiple sheets, one for each custom page specified in the Framework.

Return type

None

_write_transfers()[source]

Writes the ‘Transfers’ sheet

Return type

None

add_interaction(code_name, full_name, from_pop_type=None, to_pop_type=None)[source]

Add a new empty interaction

Normally this method would only be manually called if a framework had been updated to contain a new interaction, and the databook now required updating. Therefore, this method would generally only be used when an interaction with given code name, full name, and pop type had already been added to a framework.

Parameters
  • code_name (str) – The code name of the interaction to create

  • full_name (str) – The full name of the interaction to create

  • from_pop_type (Optional[str]) – The name of a population type, which will identify the populations on the ‘from’ side of the interaction. Default is the first population type in the framework

  • to_pop_type (Optional[str]) – The name of a population type, which will identify the populations on the ‘to’ side of the interaction. Default is the first population type in the framework

Return type

TimeDependentConnections

Returns

Newly instantiated TimeDependentConnections object (also added to ProjectData.interpops)

add_pop(code_name, full_name, pop_type=None)[source]

Add a population

This will add a population to the databook. The population type should match one of the population types in the framework

Parameters
  • code_name (str) – The code name for the new population

  • full_name (str) – The full name/label for the new population

  • pop_type (Optional[str]) – String with the population type code name

Return type

None

add_transfer(code_name, full_name, pop_type=None)[source]

Add a new empty transfer

Parameters
  • code_name (str) – The code name of the transfer to create

  • full_name (str) – The full name of the transfer to create

  • pop_type (Optional[str]) – Code name of the population type. Default is first population type in the framework

Return type

TimeDependentConnections

Returns

Newly instantiated TimeDependentConnections object (also added to ProjectData.transfers)

change_tvec(tvec)[source]

Change the databook years

This function can be used to change the time vector in all of the TDVE/TDC tables. There are two ways to change the time arrays:

  • Setting ProjectData.tvec directly will only affect newly added tables, and will keep existing tables as they are

  • Calling ProjectData.change_tvec() will modify all existing tables

Note that the TDVE/TDC tables store time/value pairs sparsely within their TimeSeries objects. Therefore, changing the time array won’t modify any of the data - it will only have an effect the next time a databook is written (so typically this method would be called as part of preparing a modified databook).
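The sparse storage described above can be illustrated with a small sketch. The SparseSeries class below is a hypothetical stand-in written for this example, not Atomica's TimeSeries, and it makes no claim about how the library treats stored points that fall outside the new time vector. It only shows that choosing a different time vector changes which columns would be written, while the stored time/value pairs stay the same.

```python
class SparseSeries:
    """Hypothetical stand-in for a sparsely stored series (not Atomica's TimeSeries)."""

    def __init__(self):
        self.data = {}  # maps time -> value; only entered points are stored

    def insert(self, t, v):
        self.data[float(t)] = float(v)

    def row_for(self, tvec):
        """Values for each column in tvec; None marks an empty cell."""
        return [self.data.get(float(t)) for t in tvec]


ts = SparseSeries()
ts.insert(2015, 100.0)
ts.insert(2018, 120.0)

old_tvec = [2015, 2016, 2017, 2018]
new_tvec = [2018, 2019, 2020]

stored_before = dict(ts.data)
row_old = ts.row_for(old_tvec)  # columns for the original years
row_new = ts.row_for(new_tvec)  # columns for the new years

# The stored pairs are untouched; only the written layout differs.
unchanged = ts.data == stored_before
```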

Parameters

tvec (array) – A float, list, or array containing time values (in years) for the databook

Return type

None

property end_year

Return the end year from the databook

The ProjectData end year is defined as the latest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation end year, if using all of the data in the databook is desired.

Return type

float

Returns

The latest year in the databook

static from_spreadsheet(spreadsheet, framework)[source]

Construct ProjectData from spreadsheet

The framework is needed because the databook does not read in or otherwise store
  • The valid units for quantities

  • Which population type is associated with TDVE tables

Parameters
  • spreadsheet – The name of a spreadsheet, or a sc.Spreadsheet

  • framework – A ProjectFramework instance

Returns

A new ProjectData instance

get_tdve_page(code_name)[source]

Given a code name for a TDVE quantity, find which page it is on

Parameters

code_name – The code name for a TDVE quantity

Return type

str

Returns

The sheet that it appears on
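A minimal sketch of this kind of lookup, assuming a plain dict shaped like the tdve_pages attribute (sheet name mapped to a list of code names). The function name find_page, the sheet names, the code names, and the KeyError on a missing name are all assumptions made for illustration, not the library's behaviour.

```python
def find_page(tdve_pages, code_name):
    """Return the sheet name whose list of code names contains code_name."""
    for page, code_names in tdve_pages.items():
        if code_name in code_names:
            return page
    # Assumed behaviour for this sketch only
    raise KeyError(f"'{code_name}' does not appear on any page")


# Illustrative sheet and quantity names
tdve_pages = {
    "Demographics": ["alive", "births"],
    "Parameters": ["transmission_rate", "recovery_rate"],
}

page = find_page(tdve_pages, "recovery_rate")
```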

get_ts(name, key=None)[source]

Extract a TimeSeries from a TDVE table or TDC table

Parameters
  • name (str) – The code name for the container storing the TimeSeries. This can be either

      • The code name of a transfer, interaction, or compartment/characteristic/parameter

      • The name of a transfer parameter instantiated in model.build, e.g. ‘age_0-4_to_5-14’. This is mainly useful when retrieving data for plotting, where variables are organized according to names like ‘age_0-4_to_5-14’

  • key – Specify the identifier for the TimeSeries

      • If name is a comp/charac/par, then key should be a pop name

      • If name is a transfer or interaction, then key should be a tuple (from_pop,to_pop)

      • If name is the name of a model transfer parameter, then key should be left as None

Returns

A TimeSeries, or None if there were no matches

Regarding the specification of the key - the same transfer could be specified as

  • name='age', key=('0-4','5-14')

  • name='age_0-4_to_5-14', key=None

where the former is typically used when working with data and calibrations, and the latter is used in Model and is therefore encountered on the Result and plotting side.
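The equivalence of the two forms can be sketched with a simplified stand-in. The nested dict and the lookup_transfer function below are hypothetical and are not how ProjectData stores or resolves transfers. They only show that both specifications can point at the same underlying series.

```python
def lookup_transfer(transfers, name, key=None):
    """Resolve either (name, key) or a combined 'name_from_to_to' string."""
    if key is not None:
        return transfers.get(name, {}).get(tuple(key))
    # No key given: compare against the combined name of each stored pair
    for code_name, table in transfers.items():
        for (from_pop, to_pop), series in table.items():
            if name == f"{code_name}_{from_pop}_to_{to_pop}":
                return series
    return None  # no match


# Stand-in storage: transfer code name -> {(from_pop, to_pop): series}
transfers = {"age": {("0-4", "5-14"): [(2016, 0.2)]}}

by_key = lookup_transfer(transfers, "age", key=("0-4", "5-14"))
by_name = lookup_transfer(transfers, "age_0-4_to_5-14")
no_match = lookup_transfer(transfers, "age_5-14_to_0-4")
```

Building the combined name from each stored pair, rather than splitting the string, avoids ambiguity when population names themselves contain underscores.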

interpops = None

This stores a list of TimeDependentConnections instances for interactions

static new(framework, tvec, pops, transfers)[source]

Make a new databook/ProjectData instance

This method should be used (instead of the standard constructor) to produce a new class instance (e.g. if creating a new databook).

Parameters
  • framework – A ProjectFramework instance

  • tvec – A scalar, list, or array of times (typically would be generated with numpy.arange())

  • pops – A number of populations, or a dict with either {name:label} or {name:{label:label,type:type}}. Type defaults to the first population type in the framework

  • transfers – A number of transfers, or a dict with either {name:label} or {name:{label:label,type:type}}. The type defaults to the first population type in the framework. Transfers can only take place between populations of the same type.

Returns

A new ProjectData instance
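The accepted forms of the pops argument can be sketched as a normalization step. The normalize_pops function is a hypothetical helper, not part of Atomica, and the placeholder names generated for an integer argument are illustrative only. The library's own defaults may differ.

```python
def normalize_pops(pops, default_type):
    """Convert an int, {name: label}, or {name: {...}} into one uniform dict."""
    if isinstance(pops, int):
        # Placeholder names are assumptions made for this sketch
        pops = {f"pop_{i}": f"Population {i}" for i in range(pops)}
    out = {}
    for name, spec in pops.items():
        if isinstance(spec, str):
            # {name: label} form: the type falls back to the default
            out[name] = {"label": spec, "type": default_type}
        else:
            # {name: {label: ..., type: ...}} form
            out[name] = {"label": spec["label"], "type": spec.get("type", default_type)}
    return out


from_count = normalize_pops(2, "default")
from_dict = normalize_pops(
    {"adults": "Adults", "env": {"label": "Environment", "type": "environment"}},
    "default",
)
```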

pops = None

This is an odict mapping code_name:{‘label’:full_name, ‘type’:pop_type}

remove_interaction(code_name)[source]

Remove an interaction

Parameters

code_name (str) – Code name of the interaction to remove

Return type

None

remove_pop(pop_name)[source]

remove_transfer(code_name)[source]

Remove a transfer

Parameters

code_name (str) – Code name of the transfer to remove

Return type

None

rename_pop(existing_code_name, new_code_name, new_full_name)[source]

Rename a population

Parameters
  • existing_code_name (str) – Existing code name of a population

  • new_code_name (str) – New code name to assign

  • new_full_name (str) – New full name/label to assign

Return type

None

rename_transfer(existing_code_name, new_code_name, new_full_name)[source]

Rename an existing transfer

Parameters
  • existing_code_name (str) – The existing code name to change

  • new_code_name (str) – The new code name

  • new_full_name (str) – The new full name

Return type

None

save(fname)[source]

Save databook to disk

This function provides a shortcut to generate a spreadsheet and immediately save it to disk.

Parameters

fname – File name to write on disk

Return type

None

property start_year

Return the start year from the databook

The ProjectData start year is defined as the earliest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation start year, if using all of the data in the databook is desired.

Return type

float

Returns

The earliest year in the databook
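A minimal sketch of how the start and end years follow from the definitions given for these two properties, assuming tables are represented as nested dicts of time/value pairs. The data_year_range function, the table names, and the data layout are hypothetical and do not reflect Atomica's internal storage.

```python
def data_year_range(tables):
    """Return (earliest, latest) time point across all tables as floats."""
    times = [
        t
        for table in tables.values()      # each table: {pop_name: {time: value}}
        for series in table.values()      # each series: {time: value}
        for t in series
    ]
    return float(min(times)), float(max(times))


# Tables may cover different years, as noted above
tables = {
    "alive": {"adults": {2015: 100.0, 2020: 120.0}},
    "births": {"adults": {2012: 5.0, 2018: 6.0}},
}

start_year, end_year = data_year_range(tables)
```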

tdve = None

This is an odict storing TimeDependentValuesEntry instances keyed by the code name of the TDVE

tdve_pages = None

This is an odict mapping worksheet name to an (ordered) list of TDVE code names appearing on that sheet

to_spreadsheet()[source]

Return content as an AtomicaSpreadsheet

Returns

An AtomicaSpreadsheet instance

transfers = None

This stores a list of TimeDependentConnections instances for transfers

tvec = None

This is the data’s tvec used when instantiating new tables. Not _guaranteed_ to be the same for every TDVE/TDC table

validate(framework)[source]

Check if the ProjectData instance can be used to run simulations

A databook can be ‘valid’ in two senses

  • The Excel file adheres to the correct syntax and it can be parsed into a ProjectData object

  • The resulting ProjectData object contains sufficient information to run a simulation

Sometimes it is desirable for ProjectData to be valid in one sense rather than the other. For example, in order to run a simulation, the ProjectData needs to contain at least one value for every TDVE table. However, the TDVE table does _not_ need to contain values if all we want to do is add another key pop. Thus, the first stage of validation is the ProjectData constructor - if that runs, then users can access methods like ‘add_pop’, ‘remove_transfer’ etc.

On the other hand, to actually run a simulation, the _contents_ of the databook need to satisfy various conditions. These tests are implemented here. The typical workflow would be that ProjectData.validate() should be used if a simulation is going to be run. In the first instance, this can be done in Project.load_databook, but the front end (FE) might want to perform this check at a different point if the databook manipulation methods, e.g. add_pop, are going to be exposed in the interface.

This function throws an informative error if there are any problems identified or otherwise returns True

Parameters

framework – A ProjectFramework instance to validate the data against

Return type

bool

Returns

True if ProjectData is valid. An error will be raised otherwise
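The second sense of validity can be sketched with one of the conditions mentioned above, namely that every TDVE table contains at least one value. The check_runnable function and the nested-dict layout are hypothetical and cover only this single condition. The real method performs its own set of checks against the framework.

```python
def check_runnable(tables):
    """Return True if every table has at least one value, else raise ValueError."""
    for code_name, table in tables.items():
        # table: {pop_name: {time: value}}; an empty series evaluates as False
        if not any(series for series in table.values()):
            raise ValueError(f"Table '{code_name}' does not contain any values")
    return True


complete = {"alive": {"adults": {2015: 100.0}, "children": {}}}
incomplete = {"alive": {"adults": {2015: 100.0}}, "births": {"adults": {}}}

ok = check_runnable(complete)

# A parsed but incomplete databook can still be edited, but it fails this check
try:
    check_runnable(incomplete)
    message = None
except ValueError as err:
    message = str(err)
```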