Files-Based Model API

The FilesCoreModel class defines a common implementation for bespoke models whose performance measures can be read from one or more files on disk after a model run. Many of the abstract methods defined in AbstractCoreModel must still be overridden, but a standard load_measures implementation is defined here, along with default implementations of get_experiment_archive_path and post_process.

class emat.model.FilesCoreModel(configuration: Union[str, Mapping], scope: Union[emat.scope.scope.Scope, str], safe: bool = True, db: Optional[emat.database.database.Database] = None, name: str = 'FilesCoreModel', local_directory: Optional[pathlib.Path] = None)[source]

Bases: emat.model.core_model.AbstractCoreModel

Set up connections and paths to a file-reading core model.

Parameters:
  • configuration – The configuration for this core model. This can be passed as a dict, or as a str which gives the filename of a YAML file that will be loaded.
  • scope – The exploration scope, as a Scope object or as a str which gives the filename of a YAML file that will be loaded.
  • safe – Load the configuration YAML file in ‘safe’ mode. This can be disabled if the configuration requires custom Python types or is otherwise not compatible with safe mode. Loading configuration files with safe mode off is not secure and should not be done with files from untrusted sources.
  • db – An optional Database to store experiments and results.
  • name – A name for this model, given as an alphanumeric string. The name is required by ema_workbench operations. If not given, “FilesCoreModel” is used.
  • local_directory – An explicit local directory to use, overriding any directory set in the config file. If not given either here or in the config file, then Python’s cwd is used.
archive_path

The directory where archived models are stored.

Type:Path
get_parser(idx)[source]

Access a FileParser, used to extract performance measures.

Parameters:idx (int) – The position of the parser to get.
Returns:FileParser
property local_directory

The current local working directory for this model.

Type:Path
model_path

The directory of the ‘live’ model instance, relative to the local_directory.

Type:Path
rel_output_path

The path to ‘live’ model outputs, relative to model_path.

Type:Path
property resolved_archive_path

The archive path to use.

If archive_path is set to an absolute path, then that path is returned; otherwise, the archive_path is joined onto the local_directory.

Returns:str
property resolved_model_path

The model path to use.

If model_path is set to an absolute path, then that path is returned; otherwise, the model_path is joined onto the local_directory.

Returns:str
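
The rule shared by resolved_archive_path and resolved_model_path can be sketched in a few lines. This is only an illustration of the described behavior, not the library's code: the helper name resolve_against and the example directories are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_against(local_directory, path):
    """Return `path` unchanged if it is absolute, else join it onto `local_directory`.

    Hypothetical helper that mirrors the rule described above; it is not
    part of emat. POSIX-style paths are used so the result is the same
    on every platform.
    """
    path = PurePosixPath(path)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(local_directory) / path)


# A relative path is joined onto the local directory.
relative_result = resolve_against("/work/run1", "archive")
# An absolute path is returned as given.
absolute_result = resolve_against("/work/run1", "/data/archive")
print(relative_result)
print(absolute_result)
```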
FilesCoreModel.add_parser(parser)[source]

Add a FileParser to extract performance measures.

Parameters:parser (FileParser) – The parser to add.
FilesCoreModel.load_measures(measure_names: Optional[List[str]] = None, *, rel_output_path=None, abs_output_path=None)[source]

Import selected measures from the core model.

This method is the place to put code that reaches into the files in the core model’s run results and extracts performance measures. It should not do any post-processing of results (i.e. it should read from but not write to the model outputs directory).

Imports measures from the active scenario.

Parameters:
  • measure_names (Collection[str]) – Collection of measures to be loaded.
  • rel_output_path (str, optional) – Path to model output locations, relative to the model_path directory (when a subclass is a type that has a model path). If neither rel_output_path nor abs_output_path is given, the default is equivalent to setting rel_output_path to ‘Outputs’.
  • abs_output_path (str, optional) – Path to model output locations, given as an absolute directory. If neither rel_output_path nor abs_output_path is given, the default is equivalent to setting rel_output_path to ‘Outputs’.
Returns:

dict of measure names and values from the active scenario

Raises:

KeyError – If load_measures is not available for a specified measure
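
How add_parser, get_parser, and load_measures fit together can be sketched with stand-in classes. Everything here is hypothetical (StubParser, TinyFilesModel, the measure names, and the single output_path argument); the sketch assumes that each parser reports its measure_names and that a requested measure no parser provides leads to a KeyError, as described above. It is not the emat implementation.

```python
class StubParser:
    """Hypothetical stand-in for a FileParser that returns fixed values."""

    def __init__(self, values):
        self._values = dict(values)

    @property
    def measure_names(self):
        return list(self._values)

    def read(self, from_dir):
        # A real parser would open a file located under `from_dir` here.
        return dict(self._values)


class TinyFilesModel:
    """Hypothetical model that keeps an ordered list of parsers."""

    def __init__(self):
        self._parsers = []

    def add_parser(self, parser):
        self._parsers.append(parser)

    def get_parser(self, idx):
        return self._parsers[idx]

    def load_measures(self, measure_names=None, output_path="Outputs"):
        results = {}
        for parser in self._parsers:
            wanted = measure_names is None or any(
                name in measure_names for name in parser.measure_names
            )
            if not wanted:
                continue  # skip files that hold no requested measure
            results.update(parser.read(output_path))
        if measure_names is None:
            return results
        missing = [n for n in measure_names if n not in results]
        if missing:
            raise KeyError(f"no parser provides: {missing}")
        # Return only the measures that were asked for.
        return {n: results[n] for n in measure_names}


model = TinyFilesModel()
model.add_parser(StubParser({"travel_time": 31.5, "cost": 4.25}))
model.add_parser(StubParser({"emissions": 12.0}))
selected = model.load_measures(["cost", "emissions"])
print(selected)
```

Note that the sketch only reads values and never writes to the output directory, in line with the expectation stated for load_measures.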

FilesCoreModel.load_archived_measures(experiment_id, measure_names=None)[source]

Load performance measures from an archived model run.

Parameters:
  • experiment_id (int) – The id for the experiment to load.
  • measure_names (Collection, optional) – A subset of performance measure names to load. If not provided, all measures will be loaded.

Parsing Files

The FilesCoreModel.add_parser() method accepts FileParser objects, which can be used to read performance measures from individual files. For an illustration of how to use parsers, see the source code for the GBNRTCModel.

class emat.model.core_files.parsers.FileParser(filename)[source]

A tool to parse performance measure(s) from an arbitrary file format.

This is an abstract base class, which defines the basic API for file parsing objects. Most users will want to use TableParser for reading performance measures from any kind of file that contains a table of data (including one-column, one-row, and one-value tables).

Parameters:filename (str) – The name of the file in which the measure(s) are stored. The filename is a relative path to the file, and will be evaluated relative to the from_dir argument in the read method.
abstract property measure_names

The measure names contained in this FileParser.

Type:List
abstract read(from_dir)[source]

Read the performance measures.

Parameters:from_dir (Path-like) – The base directory from which to read the data.
Returns:The measures read from this file.
Return type:Dict
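
The two abstract members above are all a custom parser needs to supply. The following sketch uses a stand-in base class (SketchFileParser) rather than importing emat, so that it runs on its own. The subclass SingleValueParser, the file name, and the measure name are hypothetical.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SketchFileParser(ABC):
    """Hypothetical stand-in with the same shape as the API described above."""

    def __init__(self, filename):
        self.filename = filename

    @property
    @abstractmethod
    def measure_names(self):
        """List of measure names this parser provides."""

    @abstractmethod
    def read(self, from_dir):
        """Return a dict of measures read from `from_dir / filename`."""


class SingleValueParser(SketchFileParser):
    """Reads one number from a text file into one named measure."""

    def __init__(self, filename, measure_name):
        super().__init__(filename)
        self._measure_name = measure_name

    @property
    def measure_names(self):
        return [self._measure_name]

    def read(self, from_dir):
        # The filename is evaluated relative to the from_dir argument.
        text = (Path(from_dir) / self.filename).read_text()
        return {self._measure_name: float(text.strip())}


with tempfile.TemporaryDirectory() as run_dir:
    (Path(run_dir) / "total_cost.txt").write_text("1234.5\n")
    parser = SingleValueParser("total_cost.txt", "total_cost")
    measures = parser.read(run_dir)

print(measures)
```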
class emat.model.core_files.parsers.TableParser(filename, measure_getters, reader_method=<function read_csv>, handle_errors='raise', **kwargs)[source]

Bases: emat.model.core_files.parsers.FileParser

A tool to parse performance measures from an arbitrary table format.

This object provides a way to systematically extract values from an output file that has a well defined name and format. This is exactly what we would expect for a files-based core model, which when run (and post-processed, if applicable) will generate one or more named and regularly formatted output files.

Parameters:
  • filename (str) – The name of the file in which the tabular data is stored. The filename is a relative path to the file, and will be evaluated relative to the from_dir argument in the read method. Generally the from_dir will be a directory containing a set of model output files from a single model run, and this filename will be just the name of the file, unless the core model run constructs a sub-directory hierarchy within the output directory (this is unusual).
  • measure_getters (Mapping[str, Getter]) – A mapping that relates scalar performance measure values to Getters that extract values from the tabular data.
  • reader_method (Callable, default pandas.read_csv) – A function that accepts one positional argument (the filename to be read) and optionally some keyword arguments, and returns a pandas.DataFrame.
  • handle_errors (str, default 'raise') – How to handle errors when reading a table, one of {‘raise’, ‘nan’}
  • **kwargs (Mapping, optional) – A set of fixed keyword arguments that will be passed to reader_method each time it is called.
property measure_names

The measure names contained in this TableParser.

Type:List
raw(from_dir)[source]

Read the raw tabular data.

This method will read the raw file, using the reader_method defined for this TableParser and any designated keyword arguments for that reader, but it will not run any of the measure_getters that convert the table into individual performance measures. This method is exposed primarily so that users can conveniently test TableParser objects during development.

Parameters:from_dir (Path-like) – The base directory from which to read the data.
Returns:pandas.DataFrame
read(from_dir)[source]

Read the performance measures.

Parameters:from_dir (Path-like) – The base directory from which to read the data.
Returns:The measures read from this file.
Return type:Dict
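
The division of labor between reader_method, the fixed keyword arguments, measure_getters, raw, and read can be sketched without pandas. In this illustration the reader returns a list of row dicts instead of a pandas.DataFrame, and the class SketchTableParser, the file names, and the measure names are all hypothetical. Whether ‘nan’ handling applies per file or per measure in the real class is not stated above; the sketch assumes every measure from a failed file becomes NaN.

```python
import csv
import math
import tempfile
from pathlib import Path


def read_rows(path, **kwargs):
    """Stand-in reader: returns a list of row dicts instead of a DataFrame."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f, **kwargs))


class SketchTableParser:
    """Hypothetical illustration of the reader / getters / error flow."""

    def __init__(self, filename, measure_getters, reader_method=read_rows,
                 handle_errors="raise", **kwargs):
        self.filename = filename
        self.measure_getters = measure_getters
        self.reader_method = reader_method
        self.handle_errors = handle_errors
        self.reader_kwargs = kwargs  # fixed keyword arguments for the reader

    @property
    def measure_names(self):
        return list(self.measure_getters)

    def raw(self, from_dir):
        # Read the file but apply no getters.
        return self.reader_method(Path(from_dir) / self.filename,
                                  **self.reader_kwargs)

    def read(self, from_dir):
        try:
            table = self.raw(from_dir)
            return {name: get(table)
                    for name, get in self.measure_getters.items()}
        except Exception:
            if self.handle_errors == "nan":
                # Assumption: all measures from this file become NaN.
                return {name: math.nan for name in self.measure_getters}
            raise


getters = {
    "peak_share": lambda rows: float(rows[0]["share"]),
    "offpeak_share": lambda rows: float(rows[1]["share"]),
}
with tempfile.TemporaryDirectory() as run_dir:
    (Path(run_dir) / "summary.csv").write_text(
        "period;share\npeak;0.25\noffpeak;0.5\n")
    # delimiter=";" is passed through to the reader on every call.
    parser = SketchTableParser("summary.csv", getters, delimiter=";")
    table_measures = parser.read(run_dir)
    # The file is absent, so 'nan' handling yields NaN for each measure.
    lenient = SketchTableParser("missing.csv", getters, handle_errors="nan")
    nan_measures = lenient.read(run_dir)

print(table_measures)
```

Calling raw during development shows exactly what the getters will receive, which makes it easier to tell a reader problem from a getter problem.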
class emat.model.core_files.parsers.MappingParser(filename, measure_getters, reader_method=None, handle_errors='raise', **kwargs)[source]

Bases: emat.model.core_files.parsers.TableParser

A tool to parse performance measures from an arbitrary mapping format.

This object provides a way to systematically extract values from an output file that has a well defined name and format that defines some kind of mapping (i.e. like a Python dict). This is exactly what we would expect for a files-based core model, which when run (and post-processed, if applicable) will generate one or more named and regularly formatted output files.

Parameters:
  • filename (str) – The name of the file in which the mapping data is stored. The filename is a relative path to the file, and will be evaluated relative to the from_dir argument in the read method. Generally the from_dir will be a directory containing a set of model output files from a single model run, and this filename will be just the name of the file, unless the core model run constructs a sub-directory hierarchy within the output directory (this is unusual).
  • measure_getters (Mapping[str, Getter]) – A mapping that relates scalar performance measure values to Getters that extract values from the mapping data.
  • reader_method (Callable, optional) – A function that accepts one positional argument (the filename to be read) and optionally some keyword arguments, and returns a Python mapping (i.e. a dict, or something that acts like a dict).
  • handle_errors (str, default 'raise') – How to handle errors when reading a file, one of {‘raise’, ‘nan’}
  • **kwargs (Mapping, optional) – A set of fixed keyword arguments that will be passed to reader_method each time it is called.
property measure_names

The measure names contained in this MappingParser.

Type:List
raw(from_dir)[source]

Read the raw mapping data.

This method will read the raw file, using the reader_method defined for this MappingParser and any designated keyword arguments for that reader, but it will not run any of the measure_getters that convert the mapping into individual performance measures. This method is exposed primarily so that users can conveniently test MappingParser objects during development.

Parameters:from_dir (Path-like) – The base directory from which to read the data.
Returns:Mapping
read(from_dir)[source]

Read the performance measures.

Parameters:from_dir (Path-like) – The base directory from which to read the data.
Returns:The measures read from this file.
Return type:Dict
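
A reader that returns a mapping can be as simple as a thin wrapper around a JSON loader. The sketch below shows the two steps that raw and read correspond to, using plain functions rather than the emat class. The function read_json, the file name, the nested keys, and the lambda getters are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def read_json(path, **kwargs):
    """Reader that returns a dict, as a mapping-style reader_method would."""
    with open(path) as f:
        return json.load(f, **kwargs)


# Getters here are plain callables that take the mapping as sole argument.
mapping_getters = {
    "total_vmt": lambda m: m["vmt"]["total"],
    "transit_trips": lambda m: m["trips"]["transit"],
}

with tempfile.TemporaryDirectory() as run_dir:
    out_file = Path(run_dir) / "results.json"
    out_file.write_text(json.dumps(
        {"vmt": {"total": 1500.0}, "trips": {"transit": 42}}))
    # Step analogous to raw(): load the mapping, apply no getters.
    raw_mapping = read_json(out_file)
    # Step analogous to read(): apply each getter to the mapping.
    mapping_measures = {name: get(raw_mapping)
                        for name, get in mapping_getters.items()}

print(mapping_measures)
```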
class emat.model.core_files.parsers.Getter[source]

A tool to get defined value(s) from a pandas.DataFrame.

Use a getter by calling it with the DataFrame as the sole argument.
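
The calling convention can be illustrated with a small callable object. This is a sketch only: CellGetter is a hypothetical class, not one of the getters shipped with emat, and a nested dict stands in for the DataFrame so the example has no dependencies.

```python
class CellGetter:
    """Hypothetical getter that picks one cell by row label and column name."""

    def __init__(self, row, column):
        self.row = row
        self.column = column

    def __call__(self, table):
        # With a pandas.DataFrame this lookup would typically be
        # table.loc[self.row, self.column].
        return table[self.row][self.column]


# Nested dict standing in for a table indexed by row label.
table = {"peak": {"share": 0.25}, "offpeak": {"share": 0.5}}
get_peak_share = CellGetter("peak", "share")
# The getter is used by calling it with the table as the sole argument.
peak_share = get_peak_share(table)
print(peak_share)
```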