atomica.data.ProjectData¶
- class atomica.data.ProjectData(framework)[source]¶
Bases:
prettyobj
Store project data: class-equivalent of Databooks
This class is used to load and work with data that is entered in databooks. It provides the interface for
Loading data
Modifying data (values, adding/removing populations, etc.)
Saving modified data
Writing new databooks
To instantiate, the ProjectData constructor is normally not used. Instead, use the static methods
ProjectData.new() to create a new instance/databook given a ProjectFramework
ProjectData.from_spreadsheet() to load a databook
Simple initialization
Attributes
Return the end year from the databook
Return the start year from the databook
This is an odict mapping code_name:{'label':full_name, 'type':pop_type}
This stores a list of TimeDependentConnections instances for transfers
This stores a list of TimeDependentConnections instances for interactions
This is the data's tvec used when instantiating new tables.
This is an odict storing TimeDependentValuesEntry instances keyed by the code name of the TDVE
This is an odict mapping worksheet name to an (ordered) list of TDVE code names appearing on that sheet
Methods
Add a new empty interaction
Add a population
Add a new empty transfer
Change the databook years
Construct ProjectData from spreadsheet
Given a code name for a TDVE quantity, find which page it is on
Extract a TimeSeries from a TDVE table or TDC table
Make a new databook/ProjectData instance
Remove an interaction
remove_pop
Remove a transfer
Rename a population
Rename an existing transfer
Save databook to disk
Return iterator over all TDVE and TDC tables
Return content as a Sciris Spreadsheet
Return an open workbook for the databook
Check if the ProjectData instance can be used to run simulations
- add_interaction(code_name, full_name, from_pop_type=None, to_pop_type=None)[source]¶
Add a new empty interaction
Normally this method would only be manually called if a framework had been updated to contain a new interaction, and the databook now required updating. Therefore, this method would generally only be used when an interaction with given code name, full name, and pop type had already been added to a framework.
- Parameters:
code_name (str) – The code name of the interaction to create
full_name (str) – The full name of the interaction to create
from_pop_type (str) – The name of a population type, which will identify the populations to be added. Default is first population type in the framework
to_pop_type (str) – The name of a population type, which will identify the populations to be added. Default is first population type in the framework
- Return type:
TimeDependentConnections
- Returns:
Newly instantiated TimeDependentConnections object (also added to ProjectData.interpops)
- add_pop(code_name, full_name, pop_type=None)[source]¶
Add a population
This will add a population to the databook. The population type should match one of the population types in the framework.
- add_transfer(code_name, full_name, pop_type=None)[source]¶
Add a new empty transfer
- Return type:
TimeDependentConnections
- Returns:
Newly instantiated TimeDependentConnections object (also added to ProjectData.transfers)
- change_tvec(tvec)[source]¶
Change the databook years
This function can be used to change the time vector in all of the TDVE/TDC tables. There are two ways to change the time arrays:
Setting ProjectData.tvec directly will only affect newly added tables, and will keep existing tables as they are
Calling ProjectData.change_tvec() will modify all existing tables
Note that the TDVE/TDC tables store time/value pairs sparsely within their TimeSeries objects. Therefore, changing the time array won’t modify any of the data - it will only have an effect the next time a databook is written (so typically this method would be called as part of preparing a modified databook).
- Parameters:
tvec (array) – A float, list, or array containing time values (in years) for the databook
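The note about sparse storage can be illustrated with a small standalone sketch. It does not use Atomica: `SparseSeries` and `render_row` are hypothetical stand-ins for a TimeSeries and for the step that writes a table row, and the sketch only assumes that stored points are looked up by year when a row is laid out.

```python
class SparseSeries:
    """Hypothetical stand-in for a TimeSeries: keeps only the (year, value) pairs that were entered."""

    def __init__(self):
        self.points = {}  # year -> value, only for years that have data

    def insert(self, year, value):
        self.points[year] = value


def render_row(series, tvec):
    """Lay the stored values out against the years in tvec.

    Years without data become None, which stands for a blank cell in this sketch.
    """
    return [series.points.get(year) for year in tvec]


series = SparseSeries()
series.insert(2015, 100.0)
series.insert(2018, 120.0)

old_tvec = [2015, 2016, 2017, 2018]
new_tvec = [2014, 2015, 2016, 2017, 2018, 2019, 2020]

stored_before = dict(series.points)
old_row = render_row(series, old_tvec)  # layout using the original years
new_row = render_row(series, new_tvec)  # layout after changing the years
```

Only the layout changes between `old_row` and `new_row`; the stored points are identical before and after. What the real tables do with data points that fall outside a new time vector is not covered by this sketch.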
- property end_year: float¶
Return the end year from the databook
The ProjectData end year is defined as the latest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation end year, if using all of the data in the databook is desired.
- Returns:
The latest year in the databook
- static from_spreadsheet(spreadsheet, framework)[source]¶
Construct ProjectData from spreadsheet
The framework is needed because the databook does not read in or otherwise store:
The valid units for quantities
Which population type is associated with TDVE tables
- Parameters:
spreadsheet – The name of a spreadsheet, or a sc.Spreadsheet
framework – A ProjectFramework instance
- Returns:
A new ProjectData instance
- get_tdve_page(code_name)[source]¶
Given a code name for a TDVE quantity, find which page it is on
- Parameters:
code_name – The code name for a TDVE quantity
- Returns:
The sheet that it appears on
- get_ts(name, key=None)[source]¶
Extract a TimeSeries from a TDVE table or TDC table
- Parameters:
name (str) – The code name for the container storing the TimeSeries
- The code name of a transfer, interaction, or compartment/characteristic/parameter
- The name of a transfer parameter instantiated in model.build e.g. ‘age_0-4_to_5-14’. This is mainly useful when retrieving data for plotting, where variables are organized according to names like ‘age_0-4_to_5-14’
key – Specify the identifier for the TimeSeries
- If name is a comp/charac/par, then key should be a pop name
- If name is a transfer or interaction, then key should be a tuple (from_pop,to_pop)
- If name is the name of a model transfer parameter, then key should be left as None
- Returns:
A TimeSeries, or None if there were no matches
Regarding the specification of the key - the same transfer could be specified as
name='age', key=('0-4','5-14')
name='age_0-4_to_5-14', key=None
where the former is typically used when working with data and calibrations, and the latter is used in Model and is therefore encountered on the Result and plotting side.
If retrieving values for a comp/charac/par and the databook contains an entry for ‘all’ rather than specific populations, then the ‘all’ time series will be returned regardless of the key
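The relationship between the two ways of specifying a transfer can be sketched as follows. The helper `model_transfer_name` is hypothetical and not part of Atomica; the naming pattern it uses is inferred only from the single example above (‘age_0-4_to_5-14’), so treat it as an assumption rather than a documented rule.

```python
def model_transfer_name(name, key):
    """Build the model-side parameter name from a transfer code name and a (from_pop, to_pop) key.

    Hypothetical helper: the '<name>_<from>_to_<to>' pattern is inferred from one documented example.
    """
    from_pop, to_pop = key
    return f"{name}_{from_pop}_to_{to_pop}"


# Data/calibration side: transfer code name plus a (from_pop, to_pop) tuple
data_side = ("age", ("0-4", "5-14"))

# Model/result side: a single parameter name, with key left as None
model_side = (model_transfer_name(*data_side), None)
```

Under that assumption, `get_ts(*data_side)` and `get_ts(*model_side)` would be expected to refer to the same time series.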
- interpops¶
This stores a list of TimeDependentConnections instances for interactions
- static new(framework, tvec, pops, transfers)[source]¶
Make a new databook/ProjectData instance
This method should be used (instead of the standard constructor) to produce a new class instance (e.g. if creating a new databook).
- Parameters:
framework – A ProjectFramework instance
tvec – A scalar, list, or array of times (typically would be generated with numpy.arange())
pops – A number of populations, or a dict with either {name:label} or {name:{label:label,type:type}}. Type defaults to the first population type in the framework
transfers – A number of transfers, or a dict with either {name:label} or {name:{label:label,type:type}}. The type defaults to the first population type in the framework. Transfers can only take place between populations of the same type.
- Returns:
A new ProjectData instance
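As a rough illustration of the argument shapes described above, the following sketch builds a `tvec`, a `pops` dict, and a `transfers` dict, then passes them to `ProjectData.new()`. The population names, labels, and the helper functions `databook_args` and `make_databook` are made up for the example, and `pop_type` must be a population type that actually exists in your framework. A plain list is used for `tvec` in place of `numpy.arange()` to keep the sketch dependency-free.

```python
def databook_args(first_year=2000, last_year=2020, pop_type=None):
    """Assemble example arguments for ProjectData.new() (all names and labels are illustrative)."""
    tvec = list(range(first_year, last_year + 1))  # one entry per year, inclusive

    if pop_type is None:
        # Simple form {name: label}; the type defaults to the framework's first population type
        pops = {"0-4": "Children 0-4", "5-14": "Children 5-14"}
    else:
        # Explicit form {name: {label, type}}
        pops = {
            "0-4": {"label": "Children 0-4", "type": pop_type},
            "5-14": {"label": "Children 5-14", "type": pop_type},
        }

    transfers = {"age": "Ageing"}  # simple form {name: label}
    return tvec, pops, transfers


def make_databook(framework, **kwargs):
    """Create a new ProjectData from a ProjectFramework instance (requires Atomica to be installed)."""
    from atomica.data import ProjectData  # imported lazily so the rest of the sketch runs without Atomica

    tvec, pops, transfers = databook_args(**kwargs)
    return ProjectData.new(framework, tvec, pops, transfers)
```

Calling `make_databook(framework)` with a loaded framework should then return an instance that can be written out with `save()`.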
- pops¶
This is an odict mapping code_name:{‘label’:full_name, ‘type’:pop_type}
- rename_transfer(existing_code_name, new_code_name, new_full_name)[source]¶
Rename an existing transfer
- save(fname)[source]¶
Save databook to disk
This function provides a shortcut to generate a spreadsheet and immediately save it to disk.
- Parameters:
fname – File name to write on disk
- property start_year: float¶
Return the start year from the databook
The ProjectData start year is defined as the earliest time point in any of the TDVE/TDC tables (noting that it is possible for the TDVE tables to have different time values). This quantity should be used when changing the simulation start year, if using all of the data in the databook is desired.
- Returns:
The earliest year in the databook
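The definitions of the start and end year can be sketched without Atomica. Here each table is reduced to a plain list of the years that have data; the table names and the helper `data_year_range` are hypothetical, and returning None when no data exists is a choice made for this sketch only.

```python
def data_year_range(tables):
    """Return (earliest, latest) year across all tables, or None if no table has any data."""
    years = [year for times in tables.values() for year in times]
    if not years:
        return None
    return min(years), max(years)


# Tables may have different time values, and some may have no data yet
tables = {
    "alive": [2010, 2012, 2015],
    "births": [2008, 2015, 2019],
    "age": [],
}

start_year, end_year = data_year_range(tables)
```

The range spans every table, so a simulation run from `start_year` to `end_year` would cover all of the data entered.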
- tdve¶
This is an odict storing TimeDependentValuesEntry instances keyed by the code name of the TDVE
- tdve_pages¶
This is an odict mapping worksheet name to an (ordered) list of TDVE code names appearing on that sheet
- to_spreadsheet()[source]¶
Return content as a Sciris Spreadsheet
- Return type:
sciris.Spreadsheet
- Returns:
A sciris.Spreadsheet instance
- to_workbook()[source]¶
Return an open workbook for the databook
This allows the xlsxwriter workbook to be manipulated prior to closing the filestream e.g. to append extra sheets. This prevents issues related to cached data values when reloading a workbook to append or modify content
Warning - the workbook is backed by a BytesIO instance and needs to be closed. See the usage of this method in the to_spreadsheet() function.
- Return type:
tuple
- Returns:
A tuple (bytes, workbook) with a BytesIO instance and a corresponding open xlsxwriter workbook instance
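A possible usage pattern for appending a sheet and then closing both objects is sketched below. The function name `databook_bytes_with_sheet` is hypothetical; it relies only on the returned tuple described above and on the xlsxwriter calls `add_worksheet()` and `close()`. This is not necessarily how `to_spreadsheet()` does it internally.

```python
def databook_bytes_with_sheet(data, sheet_name):
    """Return the databook as raw bytes with one extra (empty) worksheet appended.

    `data` is expected to be a ProjectData instance, or any object whose
    to_workbook() returns a (BytesIO, open workbook) tuple.
    """
    f, workbook = data.to_workbook()
    try:
        workbook.add_worksheet(sheet_name)  # modify the workbook while it is still open
        workbook.close()  # closing the workbook writes the file content into f
        return f.getvalue()  # copy the bytes out before the stream is closed
    finally:
        f.close()  # always release the BytesIO instance
```

The returned bytes can be written to disk with an ordinary binary file write.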
- transfers¶
This stores a list of TimeDependentConnections instances for transfers
- tvec¶
This is the data’s tvec used when instantiating new tables. Not _guaranteed_ to be the same for every TDVE/TDC table
- validate(framework)[source]¶
Check if the ProjectData instance can be used to run simulations
A databook can be ‘valid’ in two senses:
The Excel file adheres to the correct syntax and it can be parsed into a ProjectData object
The resulting ProjectData object contains sufficient information to run a simulation
Sometimes it is desirable for ProjectData to be valid in one sense rather than the other. For example, in order to run a simulation, the ProjectData needs to contain at least one value for every TDVE table. However, the TDVE table does _not_ need to contain values if all we want to do is add another key pop. Thus, the first stage of validation is the ProjectData constructor - if that runs, then users can access methods like ‘add_pop’, ‘remove_transfer’ etc.
On the other hand, to actually run a simulation, the _contents_ of the databook need to satisfy various conditions. These tests are implemented here. The typical workflow would be that ProjectData.validate() should be used if a simulation is going to be run. In the first instance, this can be done in Project.load_databook, but the FE might want to perform this check at a different point if the databook manipulation methods e.g. add_pop are going to be exposed in the interface.
This function throws an informative error if there are any problems identified or otherwise returns True
- Parameters:
framework – A ProjectFramework instance to validate the data against
- Return type:
bool
- Returns:
True if ProjectData is valid. An error will be raised otherwise
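Because `validate()` either returns True or raises, a caller that wants to report problems instead of stopping might wrap it as in the sketch below. The helper `check_databook` is hypothetical, and since the exception type raised is not stated here, the sketch catches `Exception` broadly.

```python
def check_databook(data, framework):
    """Return (True, None) if the data can be used to run simulations, otherwise (False, message).

    `data` is expected to be a ProjectData instance, already parsed from a databook.
    """
    try:
        data.validate(framework)
    except Exception as error:  # exact exception type is not documented here, so catch broadly
        return False, str(error)
    return True, None
```

An interface that exposes methods such as add_pop could call this after each edit and show the message to the user, leaving the check in Project.load_databook as the default.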