Files-Based Model API
The FilesCoreModel class defines a common implementation system for bespoke models that generate performance measure attributes that can be read from one or more files on disk after a model run. Many of the abstract methods defined in the AbstractCoreModel remain to be overloaded, but a standard load_measures implementation is defined here, along with a default implementation of get_experiment_archive_path and post_process.
-
class
emat.model.
FilesCoreModel
(configuration: Union[str, Mapping], scope: Union[emat.scope.scope.Scope, str], safe: bool = True, db: Optional[emat.database.database.Database] = None, name: str = 'FilesCoreModel', local_directory: Optional[pathlib.Path] = None)
Bases: emat.model.core_model.AbstractCoreModel
Set up connections and paths to a file-reading core model.
Parameters: - configuration – The configuration for this core model. This can be passed as a dict, or as a str which gives the filename of a YAML file that will be loaded.
- scope – The exploration scope, as a Scope object or as a str which gives the filename of a YAML file that will be loaded.
- safe – Load the configuration YAML file in ‘safe’ mode. This can be disabled if the configuration requires custom Python types or is otherwise not compatible with safe mode. Loading configuration files with safe mode off is not secure and should not be done with files from untrusted sources.
- db – An optional Database to store experiments and results.
- name – A name for this model, given as an alphanumeric string. The name is required by ema_workbench operations. If not given, “FilesCoreModel” is used.
- local_directory – An explicit local directory to use, overriding any directory set in the config file. If not given either here or in the config file, then Python’s cwd is used.
-
archive_path
The directory where archived models are stored.
Type: Path
-
get_parser
(idx)
Access a FileParser, used to extract performance measures.
Parameters: idx (int) – The position of the parser to get.
Returns: FileParser
-
property
local_directory
The current local working directory for this model.
Type: Path
-
model_path
The directory of the ‘live’ model instance, relative to the local_directory.
Type: Path
-
rel_output_path
The path to ‘live’ model outputs, relative to model_path.
Type: Path
-
property
resolved_archive_path
The archive path to use.
If archive_path is set to an absolute path, then that path is returned, otherwise the archive_path is joined onto the local_directory.
Returns: str
-
property
resolved_model_path
The model path to use.
If model_path is set to an absolute path, then that path is returned, otherwise the model_path is joined onto the local_directory.
Returns: str
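The resolution rule shared by resolved_archive_path and resolved_model_path can be sketched with pathlib alone. This is a minimal illustration of the documented rule, not emat's own implementation: the helper name resolve_against and the example directories are hypothetical, and PurePosixPath is used only so the result does not depend on the operating system.

```python
from pathlib import PurePosixPath


def resolve_against(local_directory, path):
    """Return `path` unchanged if absolute, else join it onto `local_directory`.

    Hypothetical helper that mirrors the documented rule; not part of emat.
    """
    path = PurePosixPath(path)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(local_directory) / path)


# A relative model_path is joined onto the local directory ...
relative_result = resolve_against("/work/run1", "model")
# ... while an absolute archive_path is returned as given.
absolute_result = resolve_against("/work/run1", "/data/archive")
```

Returning a str here follows the "Returns: str" note on both properties.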
-
FilesCoreModel.
add_parser
(parser)
Add a FileParser to extract performance measures.
Parameters: parser (FileParser) – The parser to add.
-
FilesCoreModel.
load_measures
(measure_names: Optional[List[str]] = None, *, rel_output_path=None, abs_output_path=None)
Import selected measures from the core model.
This method is the place to put code that reaches into the files in the core model’s run results and extracts performance measures. It should not do any post-processing of results (i.e. it should read from but not write to the model outputs directory). Measures are imported from the active scenario.
Parameters: - measure_names (Collection[str]) – Collection of measures to be loaded.
- rel_output_path (str, optional) – Path to model output locations, relative to the model_path directory (when a subclass is a type that has a model path). If neither rel_output_path nor abs_output_path is given, the default value is equivalent to setting rel_output_path to ‘Outputs’.
- abs_output_path (str, optional) – Path to model output locations, given as an absolute directory. If neither rel_output_path nor abs_output_path is given, the default value is equivalent to setting rel_output_path to ‘Outputs’.
Returns: dict of measure name and values from active scenario
Raises: KeyError – If load_measures is not available for specified measure
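The flow described for load_measures (choose an output directory, read measures from it, optionally keep a subset) might look roughly like the sketch below. Everything here is a stand-in written for illustration: resolve_output_path and collect_measures are hypothetical names, the readers are plain callables rather than FileParser objects, and treating "both paths given" as an error is an assumption, not documented behaviour.

```python
from pathlib import PurePosixPath


def resolve_output_path(model_path, rel_output_path=None, abs_output_path=None):
    """Pick the directory to read outputs from (hypothetical helper, not emat code)."""
    if rel_output_path is not None and abs_output_path is not None:
        # Assumption: giving both is treated as an error in this sketch.
        raise ValueError("give rel_output_path or abs_output_path, not both")
    if abs_output_path is not None:
        return PurePosixPath(abs_output_path)
    # Documented default: equivalent to rel_output_path='Outputs'.
    return PurePosixPath(model_path) / (rel_output_path or "Outputs")


def collect_measures(readers, output_path, measure_names=None):
    """Merge the measures produced by each reader, optionally keeping a subset.

    `readers` are plain callables that take a directory and return a dict;
    they stand in for FileParser objects in this sketch.
    """
    found = {}
    for reader in readers:
        found.update(reader(output_path))
    if measure_names is None:
        return found
    # A requested measure that no reader produced raises KeyError.
    return {name: found[name] for name in measure_names}


default_dir = resolve_output_path("/work/model")
explicit_dir = resolve_output_path("/work/model", abs_output_path="/tmp/out")
measures = collect_measures(
    [lambda d: {"cost": 1.5, "time": 20.0}, lambda d: {"emissions": 3.0}],
    default_dir,
    measure_names=["cost", "emissions"],
)
```

Note that nothing in the sketch writes to the output directory, in keeping with the read-only expectation stated above.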
-
FilesCoreModel.
load_archived_measures
(experiment_id, measure_names=None)
Load performance measures from an archived model run.
Parameters: - experiment_id (int) – The id for the experiment to load.
- measure_names (Collection, optional) – A subset of performance measure names to load. If not provided, all measures will be loaded.
Parsing Files
The FilesCoreModel.add_parser() method accepts FileParser objects, which can be used to read performance measures from individual files. For an illustration of how to use parsers, see the source code for the GBNRTCModel.
-
class
emat.model.core_files.parsers.
FileParser
(filename)
A tool to parse performance measure(s) from an arbitrary file format.
This is an abstract base class, which defines the basic API for file parsing objects. Most users will want to use TableParser for reading performance measures from any kind of file that contains a table of data (including one-column, one-row, and one-value tables).
Parameters: filename (str) – The name of the file in which the measure(s) are stored. The filename is a relative path to the file, and will be evaluated relative to the from_dir argument in the read method.
-
abstract property
measure_names
The measure names contained in this FileParser.
Type: List
-
class
emat.model.core_files.parsers.
TableParser
(filename, measure_getters, reader_method=<function read_csv>, handle_errors='raise', **kwargs)
Bases: emat.model.core_files.parsers.FileParser
A tool to parse performance measures from an arbitrary table format.
This object provides a way to systematically extract values from an output file that has a well defined name and format. This is exactly what we would expect for a files-based core model, which when run (and post-processed, if applicable) will generate one or more named and regularly formatted output files.
Parameters: - filename (str) – The name of the file in which the tabular data is stored. The filename is a relative path to the file, and will be evaluated relative to the from_dir argument in the read method. Generally the from_dir will be a directory containing a set of model output files from a single model run, and this filename will be just the name of the file, unless the core model run constructs a sub-directory hierarchy within the output directory (this is unusual).
- measure_getters (Mapping[str, Getter]) – A mapping that relates scalar performance measure values to Getters that extract values from the tabular data.
- reader_method (Callable, default pandas.read_csv) – A function that accepts one positional argument (the filename to be read) and optionally some keyword arguments, and returns a pandas.DataFrame.
- handle_errors (str, default 'raise') – How to handle errors when reading a table, one of {‘raise’, ‘nan’}
- **kwargs (Mapping, optional) – A set of fixed keyword arguments that will be passed to reader_method each time it is called.
-
property
measure_names
The measure names contained in this TableParser.
Type: List
-
raw
(from_dir)
Read the raw tabular data.
This method will read the raw file, using the reader_method defined for this TableParser and any designated keyword arguments for that reader, but it will not actually run any of the measure_getters that convert the table into individual performance measures. This method is exposed primarily so that users can conveniently test TableParser objects during development.
Parameters: from_dir (Path-like) – The base directory from which to read the data. Returns: pandas.DataFrame
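To make the roles of filename, measure_getters, reader_method, handle_errors and raw concrete, here is a small stand-in written with the standard library only. It is a sketch of the idea, not emat's TableParser: the class MiniTableParser, the reader read_rows, the file summary.csv and its columns are all hypothetical, the getters are plain callables rather than emat Getter objects, and the reading of 'nan' as "return NaN for every measure" is an assumption.

```python
import csv
import math
import tempfile
from pathlib import Path


class MiniTableParser:
    """Stand-in that mimics the TableParser idea; it is not emat's class."""

    def __init__(self, filename, measure_getters, reader_method, handle_errors="raise"):
        self.filename = filename
        self.measure_getters = measure_getters
        self.reader_method = reader_method
        self.handle_errors = handle_errors

    @property
    def measure_names(self):
        return list(self.measure_getters)

    def raw(self, from_dir):
        # Read the table only; no getters are applied here.
        return self.reader_method(Path(from_dir) / self.filename)

    def read(self, from_dir):
        try:
            table = self.raw(from_dir)
            return {name: getter(table) for name, getter in self.measure_getters.items()}
        except Exception:
            # Assumption: 'nan' means every measure is reported as NaN.
            if self.handle_errors == "nan":
                return {name: math.nan for name in self.measure_names}
            raise


def read_rows(path):
    """Hypothetical reader: load a CSV file as a list of row dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


parser = MiniTableParser(
    "summary.csv",
    {
        "total_cost": lambda rows: sum(float(r["cost"]) for r in rows),
        "first_mode": lambda rows: rows[0]["mode"],
    },
    reader_method=read_rows,
)

with tempfile.TemporaryDirectory() as out_dir:
    Path(out_dir, "summary.csv").write_text("mode,cost\ncar,2.5\nbus,1.0\n")
    raw_rows = parser.raw(out_dir)      # the untouched table
    measures = parser.read(out_dir)     # getters applied to the table

# With handle_errors='nan', a missing file yields NaN instead of an exception.
lenient = MiniTableParser("absent.csv", {"total_cost": len}, read_rows, handle_errors="nan")
missing = lenient.read("no_such_directory")
```

Calling raw first, as done here, is a convenient way to check that the reader and its keyword arguments load the file as expected before any getters are written.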
-
class
emat.model.core_files.parsers.
MappingParser
(filename, measure_getters, reader_method=None, handle_errors='raise', **kwargs)
Bases: emat.model.core_files.parsers.TableParser
A tool to parse performance measures from an arbitrary mapping format.
This object provides a way to systematically extract values from an output file that has a well defined name and format that defines some kind of mapping (i.e. like a Python dict). This is exactly what we would expect for a files-based core model, which when run (and post-processed, if applicable) will generate one or more named and regularly formatted output files.
Parameters: - filename (str) – The name of the file in which the mapping data is stored. The filename is a relative path to the file, and will be evaluated relative to the from_dir argument in the read method. Generally the from_dir will be a directory containing a set of model output files from a single model run, and this filename will be just the name of the file, unless the core model run constructs a sub-directory hierarchy within the output directory (this is unusual).
- measure_getters (Mapping[str, Getter]) – A mapping that relates scalar performance measure values to Getters that extract values from the mapping data.
- reader_method (Callable, default None) – A function that accepts one positional argument (the filename to be read) and optionally some keyword arguments, and returns a Python mapping (i.e. a dict, or something that acts like a dict).
- handle_errors (str, default 'raise') – How to handle errors when reading a file, one of {‘raise’, ‘nan’}
- **kwargs (Mapping, optional) – A set of fixed keyword arguments that will be passed to reader_method each time it is called.
-
property
measure_names
The measure names contained in this MappingParser.
Type: List
-
raw
(from_dir)
Read the raw mapping data.
This method will read the raw file, using the reader_method defined for this MappingParser and any designated keyword arguments for that reader, but it will not actually run any of the measure_getters that convert the mapping into individual performance measures. This method is exposed primarily so that users can conveniently test MappingParser objects during development.
Parameters: from_dir (Path-like) – The base directory from which to read the data. Returns: Mapping
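The same pattern applied to mapping data can be sketched as follows. The example assumes a JSON output file and uses only the standard library; read_json, results.json and the measure names are hypothetical, and the getters are plain callables standing in for emat Getter objects. It shows the kind of reader_method the parameter description asks for (one positional filename in, a dict-like object out), not how MappingParser is implemented.

```python
import json
import tempfile
from pathlib import Path


def read_json(path):
    """Hypothetical reader_method: one positional filename in, a dict out."""
    with open(path) as f:
        return json.load(f)


# Each getter takes the whole mapping and returns one scalar measure.
measure_getters = {
    "vmt": lambda m: m["network"]["vmt"],
    "transit_trips": lambda m: m["transit"]["trips"],
}

with tempfile.TemporaryDirectory() as out_dir:
    # Stand-in for a file written by a model run.
    Path(out_dir, "results.json").write_text(
        json.dumps({"network": {"vmt": 1200.5}, "transit": {"trips": 340}})
    )
    # Equivalent in spirit to raw(from_dir): the mapping before any getters run.
    raw_mapping = read_json(Path(out_dir) / "results.json")

measures = {name: getter(raw_mapping) for name, getter in measure_getters.items()}
```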