atomica.data.ProjectData¶

class atomica.data.ProjectData(framework)[source]¶

Bases: prettyobj

Store project data: class-equivalent of Databooks

This class is used to load and work with data that is entered in databooks. It provides the interface for

Loading data
Modifying data (values, adding/removing populations, etc.)
Saving modified data
Writing new databooks

To instantiate, the ProjectData constructor is normally not used. Instead, use the static methods

ProjectData.new() to create a new instance/databook given a ProjectFramework
ProjectData.from_spreadsheet() to load a databook

Simple initialization

Attributes

`end_year`	Return the end year from the databook
`start_year`	Return the start year from the databook
`pops`	This is an odict mapping code_name:{'label':full_name, 'type':pop_type}
`transfers`	This stores a list of `TimeDependentConnections` instances for transfers
`interpops`	This stores a list of `TimeDependentConnections` instances for interactions
`tvec`	This is the data's tvec used when instantiating new tables.
`tdve`	This is an odict storing `TimeDependentValuesEntry` instances keyed by the code name of the TDVE
`tdve_pages`	This is an odict mapping worksheet name to an (ordered) list of TDVE code names appearing on that sheet

Methods

`add_interaction`	Add a new empty interaction
`add_pop`	Add a population
`add_transfer`	Add a new empty transfer
`change_tvec`	Change the databook years
`from_spreadsheet`	Construct ProjectData from spreadsheet
`get_tdve_page`	Given a code name for a TDVE quantity, find which page it is on
`get_ts`	Extract a TimeSeries from a TDVE table or TDC table
`new`	Make a new databook/`ProjectData` instance
`remove_interaction`	Remove an interaction
`remove_pop`	Remove a population
`remove_transfer`	Remove a transfer
`rename_pop`	Rename a population
`rename_transfer`	Rename an existing transfer
`save`	Save databook to disk
`tables`	Return iterator over all TDVE and TDC tables
`to_spreadsheet`	Return content as a Sciris Spreadsheet
`to_workbook`	Return an open workbook for the databook
`validate`	Check if the ProjectData instance can be used to run simulations

add_interaction(code_name, full_name, from_pop_type=None, to_pop_type=None)[source]¶

Add a new empty interaction

Normally this method would only be manually called if a framework had been updated to contain a new interaction, and the databook now required updating. Therefore, this method would generally only be used when an interaction with given code name, full name, and pop type had already been added to a framework.

Parameters:

code_name (str) – The code name of the interaction to create
full_name (str) – The full name of the interaction to create
from_pop_type (str) – The name of a population type, which will identify the populations to be added. Default is first population type in the framework
to_pop_type (str) – The name of a population type, which will identify the populations to be added. Default is first population type in the framework

Return type:

TimeDependentConnections

Returns:

Newly instantiated TimeDependentConnections object (also added to ProjectData.interpops)

add_pop(code_name, full_name, pop_type=None)[source]¶

Add a population

This will add a population to the databook. The population type should match one of the population types in the framework.

Parameters:

code_name (str) – The code name for the new population
full_name (str) – The full name/label for the new population
pop_type (str) – String with the population type code name

Return type:

None

add_transfer(code_name, full_name, pop_type=None)[source]¶

Add a new empty transfer

Parameters:

code_name (str) – The code name of the transfer to create
full_name (str) – The full name of the transfer to create
pop_type (str) – Code name of the population type. Default is first population type in the framework

Return type:

TimeDependentConnections

Returns:

Newly instantiated TimeDependentConnections object (also added to ProjectData.transfers)

change_tvec(tvec)[source]¶

Change the databook years

This function can be used to change the time vector in all of the TDVE/TDC tables. There are two ways to change the time arrays:

Setting ProjectData.tvec directly will only affect newly added tables, and will keep existing tables as they are
Calling ProjectData.change_tvec() will modify all existing tables

Note that the TDVE/TDC tables store time/value pairs sparsely within their TimeSeries objects. Therefore, changing the time array won’t modify any of the data - it will only have an effect the next time a databook is written (so typically this method would be called as part of preparing a modified databook).

Parameters:: tvec (array) – A float, list, or array containing time values (in years) for the databook
Return type:: None
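
The separation between sparsely stored data and the time array can be illustrated with a small stand-in. The sketch below does not use Atomica: `SparseTable` and `render_row` are hypothetical names, assumed only to show why replacing the time array leaves the stored values untouched and matters only when a row is laid out for writing.

```python
class SparseTable:
    """Hypothetical stand-in for a TDVE/TDC table (not the Atomica class)."""

    def __init__(self, tvec):
        self.tvec = list(tvec)  # years used as columns when the table is written
        self.data = {}          # sparse storage: {year: value}

    def insert(self, year, value):
        self.data[year] = value

    def render_row(self):
        # Only at write time is the sparse data laid out against tvec
        return [self.data.get(t) for t in self.tvec]


table = SparseTable(tvec=[2018, 2019, 2020])
table.insert(2019, 0.5)

before = dict(table.data)
table.tvec = [2019, 2020, 2021]  # analogous to changing the databook years
row = table.render_row()
```

Here `table.data` is identical before and after the change; only the rendered row differs.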

property end_year: float¶

Return the end year from the databook

The ProjectData end year is defined as the latest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation end year, if using all of the data in the databook is desired.

Returns:: The latest year in the databook

static from_spreadsheet(spreadsheet, framework)[source]¶

Construct ProjectData from spreadsheet

The framework is needed because the databook does not read in or otherwise store

The valid units for quantities
Which population type is associated with TDVE tables

Parameters:

spreadsheet – The name of a spreadsheet, or a sc.Spreadsheet
framework – A ProjectFramework instance

Returns:

A new ProjectData instance

get_tdve_page(code_name)[source]¶

Given a code name for a TDVE quantity, find which page it is on

Parameters:: code_name – The code name for a TDVE quantity
Return type:: str
Returns:: The sheet that it appears on
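
Given the structure documented for `tdve_pages` (worksheet name mapped to an ordered list of code names), the lookup amounts to a reverse search. The following is a minimal sketch on a plain dict, not the Atomica implementation; `find_page`, the sheet names, and the choice to raise `KeyError` on a miss are all assumptions made for illustration.

```python
def find_page(tdve_pages, code_name):
    """Return the worksheet whose list of code names contains code_name."""
    for sheet_name, code_names in tdve_pages.items():
        if code_name in code_names:
            return sheet_name
    # Behaviour on a miss is an assumption; the real method may differ
    raise KeyError(f"No page contains a TDVE table named '{code_name}'")


# Hypothetical sheet names and code names
tdve_pages = {
    "Compartments": ["sus", "inf"],
    "Parameters": ["b_rate", "d_rate"],
}
page = find_page(tdve_pages, "b_rate")
```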

get_ts(name, key=None)[source]¶

Extract a TimeSeries from a TDVE table or TDC table

Parameters:

name (str) – The code name for the container storing the TimeSeries. This can be either
- The code name of a transfer, interaction, or compartment/characteristic/parameter
- The name of a transfer parameter instantiated in model.build, e.g. ‘age_0-4_to_5-14’. This is mainly useful when retrieving data for plotting, where variables are organized according to names like ‘age_0-4_to_5-14’
key – Specify the identifier for the TimeSeries
- If name is a comp/charac/par, then key should be a pop name
- If name is a transfer or interaction, then key should be a tuple (from_pop,to_pop)
- If name is the name of a model transfer parameter, then key should be left as None

Returns:

A TimeSeries, or None if there were no matches

Regarding the specification of the key - the same transfer could be specified as

name='age', key=('0-4','5-14')
name='age_0-4_to_5-14', key=None

where the former is typically used when working with data and calibrations, and the latter is used in Model and is therefore encountered on the Result and plotting side.

If retrieving values for a comp/charac/par and the databook contains an entry for ‘all’ rather than specific populations, then the ‘all’ time series will be returned regardless of the key
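
The key rules above can be mimicked on plain dictionaries. This is a sketch under stated assumptions rather than the library's code: `get_series` is a hypothetical helper, tables are modelled as `{name: {key: series}}`, and the parsing of names of the form `<transfer>_<from>_to_<to>` is inferred from the example names in the text.

```python
def get_series(tables, name, key=None):
    """Hypothetical lookup mirroring the key rules described above."""
    if name in tables:
        series = tables[name]
        if "all" in series:  # an 'all' entry is returned regardless of key
            return series["all"]
        return series.get(key)
    # Otherwise, try to read name as '<transfer>_<from_pop>_to_<to_pop>'
    for table_name, series in tables.items():
        prefix = table_name + "_"
        if key is None and name.startswith(prefix) and "_to_" in name:
            from_pop, to_pop = name[len(prefix):].split("_to_", 1)
            return series.get((from_pop, to_pop))
    return None


tables = {
    "age": {("0-4", "5-14"): [0.1, 0.2]},  # transfer keyed by (from_pop, to_pop)
    "b_rate": {"all": [0.03]},             # parameter entered for 'all'
}

by_tuple = get_series(tables, "age", ("0-4", "5-14"))
by_name = get_series(tables, "age_0-4_to_5-14")
all_pops = get_series(tables, "b_rate", "adults")
no_match = get_series(tables, "missing")
```

Both `by_tuple` and `by_name` resolve to the same stored series, and a lookup with no match yields None.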

interpops¶: This stores a list of TimeDependentConnections instances for interactions

static new(framework, tvec, pops, transfers)[source]¶

Make a new databook/ProjectData instance

This method should be used (instead of the standard constructor) to produce a new class instance (e.g. if creating a new databook).

Parameters:

framework – A ProjectFramework instance
tvec – A scalar, list, or array of times (typically would be generated with numpy.arange())
pops – A number of populations, or a dict with either {name:label} or {name:{label:label,type:type}}. Type defaults to the first population type in the framework
transfers – A number of transfers, or a dict with either {name:label} or {name:{label:label,type:type}}. The type defaults to the first population type in the framework. Transfers can only take place between populations of the same type.

Returns:

A new ProjectData instance
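
The accepted forms for `pops` and `transfers` can be thought of as shorthand for one fully specified mapping. The sketch below shows one way such shorthand could be expanded; `normalize_entries` is a hypothetical helper, and the placeholder names generated for an integer argument are an assumption for illustration, not necessarily what Atomica produces.

```python
def normalize_entries(entries, default_type):
    """Expand shorthand into {name: {'label': ..., 'type': ...}}."""
    if isinstance(entries, int):
        # Placeholder naming is assumed here purely for illustration
        entries = {f"pop_{i}": f"Population {i}" for i in range(entries)}
    normalized = {}
    for name, spec in entries.items():
        if isinstance(spec, dict):
            # Full form: type falls back to the default if omitted
            normalized[name] = {
                "label": spec["label"],
                "type": spec.get("type", default_type),
            }
        else:
            # Short form {name: label}
            normalized[name] = {"label": spec, "type": default_type}
    return normalized


short_form = normalize_entries({"adults": "Adults"}, default_type="default")
full_form = normalize_entries(
    {"env": {"label": "Environment", "type": "environment"}},
    default_type="default",
)
from_count = normalize_entries(2, default_type="default")
```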

pops¶: This is an odict mapping code_name:{‘label’:full_name, ‘type’:pop_type}

remove_interaction(code_name)[source]¶

Remove an interaction

Parameters:: code_name (str) – Code name of the interaction to remove
Return type:: None

remove_transfer(code_name)[source]¶

Remove a transfer

Parameters:: code_name (str) – Code name of the transfer to remove
Return type:: None

rename_pop(existing_code_name, new_code_name, new_full_name)[source]¶

Rename a population

Parameters:

existing_code_name (str) – Existing code name of a population
new_code_name (str) – New code name to assign
new_full_name (str) – New full name/label to assign

Return type:

None

rename_transfer(existing_code_name, new_code_name, new_full_name)[source]¶

Rename an existing transfer

Parameters:

existing_code_name (str) – The existing code name to change
new_code_name (str) – The new code name
new_full_name (str) – The new full name

Return type:

None

save(fname)[source]¶

Save databook to disk

This function provides a shortcut to generate a spreadsheet and immediately save it to disk.

Parameters:: fname – File name to write on disk
Return type:: None

property start_year: float¶

Return the start year from the databook

The ProjectData start year is defined as the earliest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation start year, if using all of the data in the databook is desired.

Returns:: The earliest year in the databook
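
Because individual tables may cover different years, the earliest and latest years are taken over the union of all time points. A minimal sketch on plain dictionaries follows; `year_range` and the table contents are hypothetical, and each table is modelled as `{year: value}`.

```python
def year_range(tables):
    """Return (earliest, latest) year across all tables."""
    # Iterating a {year: value} dict yields its years
    times = [t for series in tables.values() for t in series]
    return min(times), max(times)


# Two tables with different time coverage
tables = {
    "sus": {2015.0: 100, 2020.0: 120},
    "inf": {2017.0: 5, 2022.0: 9},
}
earliest, latest = year_range(tables)
```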

tables()[source]¶

Return iterator over all TDVE and TDC tables

Returns:: An iterator
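
One way to picture such an iterator is as a chain over the three documented containers (`tdve`, `transfers`, `interpops`). This is a sketch of the idea only; `iter_tables`, the ordering, and the string placeholders standing in for table objects are assumptions.

```python
import itertools


def iter_tables(tdve, transfers, interpops):
    """Lazily iterate over TDVE tables, then transfers, then interactions."""
    return itertools.chain(tdve.values(), transfers, interpops)


# Strings stand in for the table objects
tdve = {"sus": "tdve:sus", "inf": "tdve:inf"}
transfers = ["tdc:age"]
interpops = ["tdc:contact"]

all_tables = list(iter_tables(tdve, transfers, interpops))
```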

tdve¶: This is an odict storing TimeDependentValuesEntry instances keyed by the code name of the TDVE

tdve_pages¶: This is an odict mapping worksheet name to an (ordered) list of TDVE code names appearing on that sheet

to_spreadsheet()[source]¶

Return content as a Sciris Spreadsheet

Return type:: Spreadsheet
Returns:: A sciris.Spreadsheet instance

to_workbook()[source]¶

Return an open workbook for the databook

This allows the xlsxwriter workbook to be manipulated prior to closing the filestream, e.g. to append extra sheets. This prevents issues related to cached data values when reloading a workbook to append or modify content.

Warning - the workbook is backed by a BytesIO instance and needs to be closed. See the usage of this method in the to_spreadsheet() function.

Return type:: tuple
Returns:: A tuple (bytes, workbook) with a BytesIO instance and a corresponding open xlsxwriter workbook instance

transfers¶: This stores a list of TimeDependentConnections instances for transfers

tvec¶: This is the data’s tvec used when instantiating new tables. Not _guaranteed_ to be the same for every TDVE/TDC table

validate(framework)[source]¶

Check if the ProjectData instance can be used to run simulations

A databook can be ‘valid’ in two senses

The Excel file adheres to the correct syntax and it can be parsed into a ProjectData object
The resulting ProjectData object contains sufficient information to run a simulation

Sometimes it is desirable for ProjectData to be valid in one sense rather than the other. For example, in order to run a simulation, the ProjectData needs to contain at least one value for every TDVE table. However, the TDVE table does _not_ need to contain values if all we want to do is add another key pop. Thus, the first stage of validation is the ProjectData constructor - if that runs, then users can access methods like ‘add_pop’, ‘remove_transfer’ etc.

On the other hand, to actually run a simulation, the _contents_ of the databook need to satisfy various conditions. These tests are implemented here. The typical workflow would be that ProjectData.validate() should be used if a simulation is going to be run. In the first instance, this can be done in Project.load_databook, but the FE might want to perform this check at a different point if the databook manipulation methods, e.g. add_pop, are going to be exposed in the interface.

This function throws an informative error if there are any problems identified or otherwise returns True

Parameters:: framework – A ProjectFramework instance to validate the data against
Return type:: bool
Returns:: True if ProjectData is valid. An error will be raised otherwise
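
The two senses of validity can be sketched with plain dictionaries: a structure that can be edited freely, and a separate content check that either raises or returns True. `can_run_simulation` is a hypothetical helper that covers only the "at least one value per table" condition mentioned above; the actual checks performed by Atomica are more extensive.

```python
def can_run_simulation(tables):
    """Raise if any table has no values; return True otherwise."""
    empty = sorted(name for name, series in tables.items() if not series)
    if empty:
        raise ValueError("No data entered for: " + ", ".join(empty))
    return True


# Structurally fine, but one table has no values yet
tables = {"sus": {2020: 100.0}, "inf": {}}
tables["inf_children"] = {}  # editing the structure still works at this stage

try:
    can_run_simulation(tables)
    error_message = None
except ValueError as err:
    error_message = str(err)

# Once every table has at least one value, the content check passes
tables["inf"][2020] = 5.0
tables["inf_children"][2020] = 1.0
is_valid = can_run_simulation(tables)
```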