atomica.excel

Miscellaneous utility functions for Excel files

This module implements utility functions for Excel functionality that is common to different kinds of spreadsheets used in Atomica (e.g. Databooks and Program Books). For example, Excel formatting, and time-varying data entry tables, are implemented here.

Functions

apply_widths(worksheet, width_dict)

cell_get_number(cell[, dtype])

Return numeric value from cell

cell_require_string(cell)

Check that a cell contains a string

read_tables(worksheet)

standard_formats(workbook)

the formats used in the spreadsheet

transfer_comments(target, comment_source)

Copy comments between spreadsheets

update_widths(width_dict, column_index, contents)

Keep track of required width for a column

validate_category(workbook, expected_category)

Check Atomica workbook type

Classes

TimeDependentConnections(code_name, …[, …])

Structure for reading/writing interactions

TimeDependentValuesEntry(name[, tvec, ts, …])

Table for time-dependent data entry

class atomica.excel.TimeDependentConnections(code_name, full_name, tvec, from_pops, to_pops, interpop_type, ts=None, from_pop_type=None, to_pop_type=None)[source]

Structure for reading/writing interactions

A TimeDependentConnections object is suitable when there are time dependent interactions between two quantities This class is used for both transfers and interactions. The content that it writes consists of

  • A connection matrix table that has Y/N selection of which interactions are present between two things

  • A set of pairwise connections specifying to, from, units, assumption, and time

Interactions can have a diagonal, whereas transfers cannot (e.g. a population can infect itself but cannot transfer to itself).

In Excel, a TimeDependentConnections maps to three tables

  1. A table to enter the code name and full name

  2. An interactions matrix with Y/N indicating whether an interaction between two populations exists

  3. A set of rows for entering time varying data for each pair of populations

Parameters
  • code_name (str) – the code name of this quantity e.g. ‘aging’

  • full_name (str) – the full name of this quantity e.g. ‘Aging’

  • tvec – time values for the time-dependent rows

  • pops – list of strings to use as the rows and columns - these are typically lists of population code names

  • type – ‘transfer’ or ‘interaction’. A transfer cannot have diagonal entries, and can have Number or Probability formats. An Interaction can have diagonal entries and only has N.A. formats

  • ts (Optional[dict]) – Optionally specify a dict containing all of the non-empty TimeSeries objects used. The format is {(from_pop, to_pop):TimeSeries}. An interaction can only be Y/N for clarity, if it is Y then a row is displayed for the TimeSeries. Actually, the Y/N can be decided in the first instance based on the provided TimeSeries i.e. if a TimeSeries is provided for an interaction, then the interaction must have been marked with Y

  • pop_type – Specify pop_type, which is used by ProjectData.add_pop() to determine which TDCs to add new populations to

_write_pop_matrix(worksheet, start_row, formats, references=None, boolean_choice=False, widths=None)[source]

Write a square matrix to Excel

This function writes the Y/N matrix

  • Transfer matrix

  • Interactions matrix

If self.enable_diagonal is False then the diagonal will be forced to be 'N.A.'. If an entry is specified for an entry on the diagonal and enable_diagonal=False, an error will be thrown

Parameters
  • worksheet – An xlsxwriter worksheet instance

  • start_row – The first row in which to write values

  • formats – Format dict for the opened workbook - typically the return value of standard_formats() when the workbook was opened

  • references (Optional[dict]) – Optionally supply dict with references, used to link population names in Excel

  • boolean_choice – If True, values will be coerced to Y/N and an Excel validation will be added

  • widths (Optional[dict]) – dict storing column widths

Returns

Tuple with (next_row, table_references, values_written). The references are used for hyperlinking to the Excel matrix

assumption_heading = None

Heading to use for assumption column

static from_tables(tables, interaction_type)[source]

Instantiate based on list of tables

This method instantiates and initializes a new TimeDependentConnections object from tables that have been read in using read_tables(). Note that the parent object such as ProjectData is responsible for extracting the tables and passing them to this function. For instance, the transfers sheet might contain more than one set of tables, so it is the calling function’s responsibility to split those tables up into the groups of three expected by this method.

Parameters
  • tables (list) – A list of tables. A table here is a list of rows, and a row is a list of cells.

  • interaction_type – A string identifying the interaction type - either ‘transfer’ or ‘interaction’

Returns

A new TimeDependentConnections instance

write(worksheet, start_row, formats, references=None, widths=None)[source]

Write to cells in a worksheet

Parameters
  • worksheet – An xlsxwriter worksheet instance

  • start_row – The first row in which to write values

  • formats – Format dict for the opened workbook - typically the return value of standard_formats() when the workbook was opened

  • references (Optional[dict]) – References dict containing cell references for strings in the current workbook

  • widths (Optional[dict]) – dict storing column widths

Return type

int

Returns

The row index for the next available row for writing in the spreadsheet

write_assumption = None

Write a column for units (if None, units will be written if any of the TimeSeries have an assumption)

write_uncertainty = None

Write a column for units (if None, units will be written if any of the TimeSeries have uncertainty)

write_units = None

Write a column for units (if None, units will be written if any of the TimeSeries have units)

class atomica.excel.TimeDependentValuesEntry(name, tvec=None, ts=None, allowed_units=None, comment=None)[source]

Table for time-dependent data entry

This class is Databooks and Program books to enter potentially time-varying data. Conceptually, it maps a set of TimeSeries object to a single name and table in the spreadsheet. For example, a Characteristic might contain a TimeSeries for each population, and the resulting TimeDependentValuesEntry (TDVE) table would have a name matching the population, and TimeSeries for each population.

The TDVE class optionally allows the specification of units, assumptions, and uncertainty, which each map to properties on the underlying TimeSeries objects. It also contains a time vector corresponding to the time values that appear or will appear in the spreadsheet.

Note that the units are stored within the TimeSeries objects, which means that they can are able to differ across rows.

Parameters
  • name – The name/title for this table

  • tvec – Specify the time values for this table. All TimeSeries in the ts dict should have corresponding time values

  • ts – Optionally specify an odict() of TimeSeries objects populating the rows. Could be populated after

  • allowed_units – Optionally specify a list of allowed units that will appear as a dropdown

  • comment – Optionally specify descriptive text that will be added as a comment to the name cell

assumption_heading = None

Heading to use for assumption column

comment = None

A comment that will be added in Excel

static from_rows(rows)[source]

Create new instance from Excel rows

Given a set of openpyxl rows, instantiate a TimeDependentValuesEntry object That is, the parent object e.g. ProjectData is responsible for finding where the TDVE table is, and reading all of the rows associated with it (skipping #ignored rows) and then passing those rows, unparsed, to this function

Headings for ‘units’, ‘uncertainty’, and ‘assumption’/’constant’ are optional and will be read in if they are present in the spreadsheet.

Parameters

rows (list) – A list of rows

Returns

A new TimeDependentValuesEntry instance

property has_data

Check whether all time series have data entered

Return type

bool

Returns

True if all of the TimeSeries objects stored in the TDVE have data

name = None

Name for th quantity printed in Excel

ts = None

dict of TimeSeries objects

tvec = None

time axis (e.g. np.arange(2000,2019)) - all TimeSeries time values must exactly match one of the values here

write(worksheet, start_row, formats, references=None, widths=None)[source]

Write to cells in a worksheet

Note that the year columns are drawn from the tvec attribute. To suppress the year columns (e.g. for the user to enter only an assumption) then set tvec to an empty array/list.

Parameters
  • worksheet – An xlsxwriter worksheet instance

  • start_row – The first row in which to write values

  • formats – Format dict for the opened workbook - typically the return value of standard_formats() when the workbook was opened

  • references (Optional[dict]) – References dict containing cell references for strings in the current workbook

  • widths (Optional[dict]) – dict storing column widths

Return type

int

Returns

The row index for the next available row for writing in the spreadsheet

write_assumption = None

Write a column for units (if None, units will be written if any of the TimeSeries have an assumption)

write_uncertainty = None

Write a column for units (if None, units will be written if any of the TimeSeries have uncertainty)

write_units = None

Write a column for units (if None, units will be written if any of the TimeSeries have units)

atomica.excel.apply_widths(worksheet, width_dict)[source]
atomica.excel.cell_get_number(cell, dtype=<class 'float'>)[source]

Return numeric value from cell

This function is to guard against accidentally having the Excel cell contain a string instead of a number. If a string has been entered, an error will be raised. The added value from this function is that if the Excel cell type is empty but the value is empty or N.A. then the value will be treated as though the cell was correctly set to a numeric type but had been left empty.

The output is cast to dtype which means that code that requires numeric input from Excel can use this input to guarantee that the resulting number is of the correct type, or None.

Parameters
  • cell – An openpyxl cell

  • dtype – If the cell is numeric, cast to this type (default is float but could be int for example)

Returns

A scalar instance of dtype (e.g. float) or None if cell is empty or being treated as empty

Raises

Exception if the cell contains a string

atomica.excel.cell_require_string(cell)[source]

Check that a cell contains a string

Parameters

cell – An openpyxl cell

Raises

Exception with informative message if the cell value is not a string

Return type

None

atomica.excel.read_tables(worksheet)[source]
atomica.excel.standard_formats(workbook)[source]

the formats used in the spreadsheet

atomica.excel.transfer_comments(target, comment_source)[source]

Copy comments between spreadsheets

This function copies comments from one spreadsheet to another. Under the hood, a new spreadsheet is created with values from the target Spreadsheet and cell-wise formatting from the comment_source Spreadsheet. If a cell exists in this spreadsheet and not in the source, it will be retained as-is. If more cells exist in the comment_source than in this spreadsheet, those cells will be dropped. If a sheet exists in the comment_source and not in the current workbook, it will be added

Parameters
  • target (Spreadsheet) – The target spreadsheet to write comments into

  • comment_source (Spreadsheet) – The source spreadsheet containing comments

Return type

None

atomica.excel.update_widths(width_dict, column_index, contents)[source]

Keep track of required width for a column

width_dict is a dict that is keyed by column index e.g. 0,1,2 and the value is the length of the longest contents seen for that column

Parameters
  • width_dict (dict) – Storage dictionary

  • column_index (int) – Index of the column value has been inserted in

  • contents (str) – Content, length of which is used to set width

Return type

None

atomica.excel.validate_category(workbook, expected_category)[source]

Check Atomica workbook type

This function makes sure that a workbook has a particular category property stored within it, and displays an appropriate error message if not. If the category isn’t present or doesn’t start with ‘atomica’, just ignore it for robustness (instead, a parsing error will likely be raised)

Parameters
  • workbook – An openpyxl workbook

  • category – The expected string category

Raises

Exception if the workbook category is not valid

Return type

None