Writing and Using a Bespoke Model Interface

[1]:
import emat
import os
import pandas as pd
import numpy as np
import gzip
import asyncio
from emat.util.show_dir import show_dir, show_file_contents

This notebook illustrates TMIP-EMAT’s various modes of operation. It shows how to use TMIP-EMAT and the demo interface to run the command line version of the Road Test model. A similar approach can be developed to run any transportation model that can be run from the command line, including proprietary modeling tools that are typically run from a graphical user interface (GUI) but that also provide command line access.

In this example notebook, we will activate some logging features. The same logging utility is written directly into EMAT and the core_files_demo.py module. This will give us a view of what’s happening inside the code as it runs.

[2]:
import logging
from emat.util.loggers import log_to_stderr
log = log_to_stderr(logging.INFO)

Connecting to the Model

The interface for this model is located in the core_files_demo.py module, which we will import into this notebook. This file is extensively documented in comments, and is a great starting point for new users who want to write an interface for a new bespoke travel demand model.

[3]:
import core_files_demo

Within this module, you will find a definition for the RoadTestFileModel class.

We initialize an instance of the model interface object. If you look at the module code, you’ll note the __init__ function does a number of things, including creating a temporary directory to work in, copying the needed files into this temporary directory, loading the scope, and creating a SQLite database to work within. For your implementation, you might or might not do any of these steps. In particular, you’ll probably want to use a database that is not in a temporary location, so that the results will be available after this notebook is closed.

[4]:
fx = core_files_demo.RoadTestFileModel()
[00:03.84] MainProcess/WARNING: changing cwd to /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc
[00:03.88] MainProcess/INFO: running script emat_db_init.sql
[00:03.89] MainProcess/INFO: running script meta_model.sql
[00:03.90] MainProcess/INFO: found no experiments with missing run_id's
[00:03.90] MainProcess/INFO: running script emat_db_init_views.sql

Once we have loaded the RoadTestFileModel class, we have a number of files available in the “master_directory” that was created as that temporary directory:

[5]:
show_dir(fx.master_directory.name)
tmpygcky5rc/
├── road-test-colleague.sqlitedb
├── road-test-demo.db
├── road-test-files/
│   ├── demo-inputs-l.yml
│   └── demo-inputs-x.yml.template
├── road-test-model-config.yml
└── road-test-scope.yml

Understanding Directories

The TMIP-EMAT interface design for files-based bespoke models uses pointers for several directories to control the operation of the model.

  • local_directory
    This is the working directory for this instance of TMIP-EMAT, not that for the core model itself. Typically it can be Python’s usual current working directory, accessible via os.getcwd(). In this directory typically you’ll have a TMIP-EMAT model configuration yaml file, a scope definition yaml file, and a sub-directory containing the files needed to run the core model itself.
  • model_path
    The relative path from the local_directory to the directory where the core model files are located. When the core model itself is actually run, this should be the “current working directory” for that run. The model_path must be given in the model config yaml file.
  • rel_output_path
    The relative path from the model_path to the directory where the core model output files are located. The default value of this path is “./Outputs” but this can be overridden by setting rel_output_path in the model config yaml file. If the outputs are commingled with other input files in the core model directory, this can be set to “.” (just a dot).
  • archive_path
    The path where model archive directories can be found. This path must be given in the model config yaml file. It can be given as an absolute path, or a relative path. If it is a relative path, it should be relative to the local_directory.

These directories, especially the ones other than the local_directory, are defined in a model configuration yaml file. This makes it easy to change the directory pointers when moving TMIP-EMAT between different machines that may have different file system structures.
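
As a rough illustration of how these pointers relate to one another, the sketch below resolves them into absolute paths. The resolve_paths helper and the example config values are hypothetical and not part of TMIP-EMAT; only the key names (model_path, rel_output_path, archive_path) and the “./Outputs” default come from the description above.

```python
import os

def resolve_paths(local_directory, config):
    """Resolve the directory pointers described above into absolute paths.

    Hypothetical helper for illustration; not part of TMIP-EMAT.
    """
    local_directory = os.path.abspath(local_directory)
    # model_path is relative to local_directory and must be given.
    model_dir = os.path.normpath(os.path.join(local_directory, config["model_path"]))
    # rel_output_path is relative to model_path and defaults to "./Outputs".
    output_dir = os.path.normpath(
        os.path.join(model_dir, config.get("rel_output_path", "./Outputs"))
    )
    # archive_path may be absolute; os.path.join then ignores local_directory.
    archive_dir = os.path.normpath(os.path.join(local_directory, config["archive_path"]))
    return {"model": model_dir, "output": output_dir, "archive": archive_dir}

# Example values chosen to mirror the demo directory layout.
example_config = {"model_path": "road-test-files", "archive_path": "archive"}
paths = resolve_paths(os.getcwd(), example_config)
for name, path in paths.items():
    print(f"{name:8s}{path}")
```

Because the config holds only relative pointers, moving to a different machine means editing the yaml file rather than the interface code.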

Single Run Operation for Development and Debugging

Before we take on the task of running this model in exploratory mode, we’ll want to make sure that our interface code is working correctly. To check each of the components of the interface (setup, run, post-process, load-measures, and archive), we can run each individually in sequence, and inspect the results to make sure they are correct.

setup

This method is the place where the core model set up takes place, including creating or modifying files as necessary to prepare for a core model run. When running experiments, this method is called once for each core model experiment, where each experiment is defined by a set of particular values for both the exogenous uncertainties and the policy levers. These values are passed to the experiment only here, and not in the run method itself. This facilitates debugging, as the setup method can be used without the run method, as we do here. This allows us to manually inspect the prepared files and ensure they are correct before actually running a potentially expensive model.

Each input exogenous uncertainty or policy lever can potentially be used to manipulate multiple different aspects of the underlying core model. For example, a policy lever that includes a number of discrete future network “build” options might trigger the replacement of multiple related network definition files. Or, a single uncertainty relating to the cost of fuel might scale both a parameter linked to the modeled per-mile cost of operating an automobile and the modeled total cost of fuel used by transit services.

For this demo model, running the core model itself in files mode requires two configuration files to be available, one for levers and another for uncertainties. These two files are provided in the demo in two ways: as a runnable base file (for the levers) and as a template file (for the uncertainties).

The levers file is a ready-to-use file (for this demo, in YAML format, although your model may use a different file format for input files). It has default values pre-coded into the file, and to modify this file for use by EMAT the setup method needs to parse and edit this file to swap out the default values for new ones in each experiment. This can be done using regular expressions (as in this demo), or any other method you like to edit the file appropriately. The advantage of this approach is that the base file is ready to use with the core model as-is, facilitating the use of this file outside the EMAT context.

[6]:
show_file_contents(fx.master_directory.name, 'road-test-files', 'demo-inputs-l.yml')
---
# This file defines lever values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
expand_capacity: 10
amortization_period: 30
interest_rate_lock: False
debt_type: GO Bond
lane_width: 10
mandatory_unused_lever: 42
...
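
The regular-expression approach can be sketched roughly as follows. This is a minimal illustration, not the code in core_files_demo.py: the replace_yaml_value helper is hypothetical, and it assumes each lever sits on its own top-level “key: value” line, as in the file shown above.

```python
import re

def replace_yaml_value(text, key, new_value):
    """Swap the value on a top-level `key: value` line, leaving other lines untouched."""
    # Anchor on the full key plus colon so 'interest_rate' cannot match
    # 'interest_rate_lock'.
    pattern = re.compile(rf"^({re.escape(key)}:[ \t]*).*$", flags=re.MULTILINE)
    new_text, n = pattern.subn(lambda m: m.group(1) + str(new_value), text)
    if n != 1:
        raise ValueError(f"expected exactly one line for {key!r}, found {n}")
    return new_text

# A shortened copy of the levers file, for illustration only.
base = """---
expand_capacity: 10
amortization_period: 30
interest_rate_lock: False
debt_type: GO Bond
...
"""

edited = base
for key, value in {"expand_capacity": 75, "debt_type": "Paygo"}.items():
    edited = replace_yaml_value(edited, key, value)
print(edited)
```

Levers that are not passed in (here amortization_period and interest_rate_lock) simply keep the defaults already coded in the base file.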

By contrast, the uncertainties file is in a template format. The values of the parameters that will be manipulated by EMAT for each experiment are not given as default values. Instead, each value to be set is indicated in the file by a unique token that is easy to search for and replace, and that is definitely not something that would otherwise appear in the file. This approach makes the text-substitution code used in this module much simpler and less prone to bugs. But there is a small downside: the template file is unusable outside the EMAT context, so every unique token must be replaced in every experiment.

[7]:
show_file_contents(fx.master_directory.name, 'road-test-files', 'demo-inputs-x.yml.template')
---
# This file defines uncertainty values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
alpha: __EMAT_PROVIDES_VALUE__ALPHA__
beta: __EMAT_PROVIDES_VALUE__BETA__
input_flow: __EMAT_PROVIDES_VALUE__INPUT_FLOW__
value_of_time: __EMAT_PROVIDES_VALUE__VALUE_OF_TIME__
labor_unit_cost_expansion: __EMAT_PROVIDES_VALUE__LABOR_UNIT_COST_EXPANSION__
materials_unit_cost_expansion: __EMAT_PROVIDES_VALUE__MATERIALS_UNIT_COST_EXPANSION__
interest_rate: __EMAT_PROVIDES_VALUE__INTEREST_RATE__
yield_curve: __EMAT_PROVIDES_VALUE__YIELD_CURVE__
...
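
Token substitution of this kind might look like the sketch below. The fill_template helper is hypothetical rather than the demo module’s own code, and the 60/40 split of unit_cost_expansion into labor and materials costs is an assumption made for illustration. The final check raises an error if any token is left unreplaced, since such a file would be unusable by the core model.

```python
import re

TOKEN_PREFIX = "__EMAT_PROVIDES_VALUE__"

def fill_template(template_text, values):
    """Replace each token with its value and fail loudly if any token remains."""
    text = template_text
    for name, value in values.items():
        text = text.replace(f"{TOKEN_PREFIX}{name.upper()}__", str(value))
    leftover = re.findall(rf"{TOKEN_PREFIX}\w+", text)
    if leftover:
        raise ValueError(f"unreplaced tokens: {leftover}")
    return text

# A shortened copy of the template file, for illustration only.
template = """alpha: __EMAT_PROVIDES_VALUE__ALPHA__
labor_unit_cost_expansion: __EMAT_PROVIDES_VALUE__LABOR_UNIT_COST_EXPANSION__
materials_unit_cost_expansion: __EMAT_PROVIDES_VALUE__MATERIALS_UNIT_COST_EXPANSION__
"""

params = {"alpha": 0.1234, "unit_cost_expansion": 100}
# One scoped uncertainty can drive several core model inputs.
# The 0.6 / 0.4 shares below are assumed, not taken from the demo code.
values = {
    "alpha": params["alpha"],
    "labor_unit_cost_expansion": params["unit_cost_expansion"] * 0.6,
    "materials_unit_cost_expansion": params["unit_cost_expansion"] * 0.4,
}
filled = fill_template(template, values)
print(filled)
```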

Regardless of which file management system you use, the setup method is the place to make edits to these input files and write them into your working directory. To do so, the setup method takes one argument: a dictionary containing key-value pairs that assign a particular value to each input (exogenous uncertainty or policy lever) that is defined in the model scope. The keys must match exactly with the names of the parameters given in the scope.

If you have written your setup method to call the super-class setup, you will find that if you give keys as input that are not defined in the scope, you’ll get a KeyError.

[8]:
bad_params = {
    'name_not_in_scope': 'is_a_problem',
}

try:
    fx.setup(bad_params)
except KeyError as error:
    log.error(repr(error))
[00:03.94] MainProcess/ERROR: SETUP ERROR: 'name_not_in_scope' not found in scope parameters
[00:03.94] MainProcess/ERROR: KeyError("'name_not_in_scope' not found in scope parameters")
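
A check of this kind could be written along the following lines. This is only a sketch of the idea, with a hypothetical check_parameter_names function and an illustrative subset of scope names; the actual superclass implementation may differ.

```python
def check_parameter_names(params, scope_names):
    """Raise KeyError if any key in `params` is not a parameter named in the scope."""
    unknown = [key for key in params if key not in scope_names]
    if unknown:
        raise KeyError(f"{unknown[0]!r} not found in scope parameters")

# Illustrative subset of the parameter names defined in the scope.
scope_names = {"expand_capacity", "alpha", "beta"}

check_parameter_names({"alpha": 0.1}, scope_names)  # passes silently

caught = None
try:
    check_parameter_names({"name_not_in_scope": "is_a_problem"}, scope_names)
except KeyError as error:
    caught = error
    print(repr(error))
```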

On the other hand, your custom model may or may not allow you to leave out some parameters. It is up to you to decide how to handle missing values, either by setting them at their default values or raising an error. In normal operation, parameters typically won’t be left out from the design of experiments, so it is not usually important to monitor this carefully.

In our example module’s setup, all of the uncertainty values must be given, because the template file would be unusable otherwise. But the policy levers can be omitted, and if so they are left at their default values in the original file. Note that the default values in that file are not strictly consistent with the default values in the scope file, and TMIP-EMAT does nothing on its own to address this discrepancy.

[9]:
params = {
    'expand_capacity': 75,
    'amortization_period': 25,
    'debt_type': "Paygo",
    'alpha': 0.1234,
    'beta': 4.0,
    'input_flow': 100,
    'value_of_time': 0.075,
    'unit_cost_expansion': 100,
    'interest_rate': 0.035,
    'yield_curve': 0.01,
} # interest_rate_lock is missing, that's ok

fx.setup(params)
[00:03.97] MainProcess/INFO: RoadTestFileModel SETUP RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122
[00:03.97] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 1 RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122

After running setup successfully, we will have overwritten the “demo-inputs-l.yml” file with new values, and written a new “demo-inputs-x.yml” file into the model working directory with those values.

[10]:
show_dir(fx.local_directory)
tmpygcky5rc/
├── _emat_experiment_id_.yml
├── _emat_parameters_.yml
├── archive/
│   └── scp_EMAT Road Test/
│       └── exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122/
│           └── _emat_start_.log
├── road-test-colleague.sqlitedb
├── road-test-demo.db
├── road-test-files/
│   ├── demo-inputs-l.yml
│   ├── demo-inputs-x.yml
│   └── demo-inputs-x.yml.template
├── road-test-model-config.yml
└── road-test-scope.yml
[11]:
show_file_contents(fx.local_directory, 'road-test-files', 'demo-inputs-l.yml')
---
# This file defines lever values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
expand_capacity: 75
amortization_period: 25
interest_rate_lock: False
debt_type: Paygo
lane_width: 10
mandatory_unused_lever: 42
...
[12]:
show_file_contents(fx.local_directory, 'road-test-files', 'demo-inputs-x.yml')
---
# This file defines uncertainty values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
alpha: 0.1234
beta: 4.0
input_flow: 100
value_of_time: 0.075
labor_unit_cost_expansion: 60.0
materials_unit_cost_expansion: 40.0
interest_rate: 0.035
yield_curve: 0.01
...

run

The run method is the place where the core model run takes place. Note that this method takes no arguments; all the input exogenous uncertainties and policy levers are delivered to the core model in the setup method, which will be executed prior to calling this method. This facilitates debugging, as the setup method can be used without the run method as we did above, allowing us to manually inspect the prepared files and ensure they are correct before actually running a potentially expensive model.

[13]:
fx.run()
[00:06.49] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122

The RoadTestFileModel class includes a custom last_run_logs method, which displays both the “stdout” and “stderr” logs generated by the model executable during the most recent call to the run method. We can use this method for debugging purposes, to identify why the core model crashes (if it does crash). In this first test it did not crash, and the logs look good.

[14]:
fx.last_run_logs()
=== STDOUT ===
[2021-01-31 18:59:50,158] emat.RoadTest.INFO: running emat-road-test-demo
[2021-01-31 18:59:50,164] emat.RoadTest.INFO: emat-road-test-demo completed without errors

=== END OF LOG ===
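
For a command-line model, the heart of a run method is typically a subprocess call. The sketch below shows one way to launch a command, capture its stdout and stderr, and detect a non-zero exit status. The run_command helper is hypothetical, and a trivial Python one-liner stands in for the real model executable so that the example runs anywhere.

```python
import subprocess
import sys

def run_command(cmd, cwd=None):
    """Run a command, capturing stdout and stderr.

    With check=True, a non-zero exit status raises subprocess.CalledProcessError.
    """
    return subprocess.run(cmd, cwd=cwd, capture_output=True, text=True, check=True)

# A stand-in command; a real interface would launch the core model
# executable here, with cwd set to the model directory.
ok = run_command([sys.executable, "-c", "print('model completed')"])
print(ok.stdout.strip())

failed_status = None
try:
    run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
except subprocess.CalledProcessError as err:
    # err.stdout and err.stderr hold whatever the process wrote before failing.
    failed_status = err.returncode
    print("exit status", failed_status)
```

Keeping the captured streams around after each call is what makes a helper like last_run_logs possible.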

post-process

There is an (optional) post_process step that is separate from the run step.

Post-processing differs from the main model run in two important ways:

  • It can be run to efficiently generate a subset of performance measures.
  • It can be run using archived results from the main core model run.

Both features are designed to support workflows where new performance measures are added to the exploratory scope after the main model run(s) are completed. By allowing the post_process method to be run only for a subset of measures, we can avoid replicating possibly expensive post-processing steps when we have already completed them, or when they are not needed for a particular application.

For example, consider an exploratory modeling activity where the scope at the time of the initial model run experiments was focused on highway measures: transit usage was not explored extensively, and no network assignment was done for transit trips when the experiments were initially run. By creating a post-process step to run the transit network assignment, we can apply that step to existing archived results, have it run automatically for future model experiments where transit usage is under study, and continue to omit it for future model experiments where we do not need it.

An optional measure_names argument allows the post-processor to identify which measures need additional computational effort to generate, and to skip excluded measures that are not currently of interest, or which have already been computed and do not need to be computed again.

The post processing is isolated from the main model run to allow it to be run later using archived model results. When executed directly after a core model run, it will operate on the results of the model stored in the local working directory. However, it can also be used with an optional output_path argument, which can be pointed at a model archive directory instead of the local working directory.

A consequence of this (and an intentional limitation) is that the post_process method should only use files from the set of files that are or will be archived from the core model run, and not attempt to use other non-persistent temporary or intermediate files that will not be archived.
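
One way to organize a post-processor along these lines is sketched below. All names here (summarize_transit, POST_PROCESS_STEPS, the trips.txt file) are hypothetical and chosen only for illustration. The point is that each step is tied to the measure it produces, reads only files found under output_path, and is skipped when its measure is not requested.

```python
import os
import tempfile

def summarize_transit(out_dir):
    """Hypothetical expensive step: derive a summary file from archived output."""
    with open(os.path.join(out_dir, "trips.txt")) as f:
        total = sum(int(line) for line in f if line.strip())
    with open(os.path.join(out_dir, "transit_summary.txt"), "w") as f:
        f.write(str(total))

# Map each measure to the post-processing step that produces it.
POST_PROCESS_STEPS = {"transit_trips": summarize_transit}

def post_process(measure_names=None, output_path="."):
    """Run only the steps needed for `measure_names` (all steps if None)."""
    completed = []
    for measure, step in POST_PROCESS_STEPS.items():
        if measure_names is not None and measure not in measure_names:
            continue  # skip measures that were not requested
        step(output_path)
        completed.append(measure)
    return completed

# A temporary directory stands in for a model archive directory.
with tempfile.TemporaryDirectory() as archive_dir:
    with open(os.path.join(archive_dir, "trips.txt"), "w") as f:
        f.write("120\n80\n")
    skipped = post_process(measure_names=["net_benefits"], output_path=archive_dir)
    ran = post_process(measure_names=["transit_trips"], output_path=archive_dir)
    with open(os.path.join(archive_dir, "transit_summary.txt")) as f:
        summary = f.read()

print(skipped, ran, summary)
```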

[15]:
fx.post_process()
[00:06.52] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122

At this point, the model’s output performance measures should be available in one or more output files that can be read in the next step. For this example, the results are written to two separate files: ‘output_1.csv.gz’ and ‘output.yaml’.

[16]:
show_file_contents(fx.local_directory, 'road-test-files', "Outputs", "output.yaml")
build_travel_time: 60.78943107038733
no_build_travel_time: 67.404
time_savings: 6.614568929612666

Note that in this example, some of the values in the output_1.csv.gz file are intentionally manipulated in a contrived manner, so that there is some work for the post-processor to do.

[17]:
show_file_contents(fx.local_directory, 'road-test-files', "Outputs", "output_1.csv.gz")
,value_of_time_savings,present_cost_expansion,cost_of_capacity_expansion,net_benefits
exp,1.0508604102769246,,1.3651577800056909,
plain,49.60926697209499,7500.0,311.2700117047018,-261.6607447326066

load-measures

The load_measures method is the place to actually reach into files in the core model’s run results and extract performance measures, returning a dictionary of key-value pairs for the various performance measures. It takes an optional list giving a subset of performance measures to load, and like the post_process method also can be pointed at an archive location instead of loading measures from the local working directory (which is the default). The load_measures method should not do any post-processing of results (i.e. it should read from but not write to the model outputs directory).

[18]:
fx.load_measures()
[18]:
{'value_of_time_savings': 49.60926697209499,
 'present_cost_expansion': 7500.0,
 'cost_of_capacity_expansion': 311.2700117047018,
 'net_benefits': -261.6607447326066,
 'build_travel_time': 60.78943107038733,
 'no_build_travel_time': 67.404,
 'time_savings': 6.614568929612666}
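
To illustrate the read-only nature of this step, the sketch below extracts measures from a gzipped CSV laid out like output_1.csv.gz, using only the standard library. The load_csv_measures function and the sample values are hypothetical; the demo itself does not use hand-written code like this.

```python
import csv
import gzip
import os
import tempfile

def load_csv_measures(path, row_label="plain", measure_names=None):
    """Read one labeled row from a gzipped CSV and return {measure: float}."""
    with gzip.open(path, "rt", newline="") as f:
        reader = csv.reader(f)
        header = next(reader)[1:]  # the first column holds the row labels
        for row in reader:
            if row[0] == row_label:
                values = dict(zip(header, row[1:]))
                break
        else:
            raise KeyError(f"row {row_label!r} not found in {path}")
    # Keep only the requested measures (or all of them if none were named).
    return {
        name: float(value)
        for name, value in values.items()
        if measure_names is None or name in measure_names
    }

# Write a small sample file with made-up values, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "output_1.csv.gz")
    with gzip.open(sample_path, "wt", newline="") as f:
        f.write(",value_of_time_savings,net_benefits\n")
        f.write("plain,49.6,-261.7\n")
    measures = load_csv_measures(sample_path, measure_names=["net_benefits"])

print(measures)
```

The function only opens the file for reading, in keeping with the rule that loading measures should never modify the model outputs directory.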

You may note that the implementation of RoadTestFileModel in the core_files_demo module does not actually include a load_measures method itself, but instead inherits this method from the FilesCoreModel superclass. The instructions on how to actually find the relevant performance measures for this file are instead loaded into table parsers, which are defined in the RoadTestFileModel.__init__ constructor. There are details and illustrations of how to write and use parsers in the file parsing examples page of the TMIP-EMAT documentation.

archive

The archive method copies the relevant model output files to an archive location for longer term storage. The particular archive location is based on the experiment id for a particular experiment, and can be customized if desired by overloading the get_experiment_archive_path method. This customization is not done in this demo, so the default location is used.

[19]:
fx.get_experiment_archive_path(parameters=params)
[19]:
'/var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122'

Actually running the archive method should copy any relevant output files from the model_path of the current active model into a subdirectory of archive_path.

[20]:
fx.archive(params)
[00:06.55] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122
[21]:
show_dir(fx.local_directory)
tmpygcky5rc/
├── _emat_experiment_id_.yml
├── _emat_parameters_.yml
├── archive/
│   └── scp_EMAT Road Test/
│       └── exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122/
│           ├── _emat_start_.log
│           ├── demo-inputs-l.yml
│           ├── demo-inputs-x.yml
│           ├── demo-inputs-x.yml.template
│           ├── emat-road-test.log
│           ├── output.csv.gz
│           ├── output.yaml
│           └── Outputs/
│               ├── output.yaml
│               └── output_1.csv.gz
├── road-test-colleague.sqlitedb
├── road-test-demo.db
├── road-test-files/
│   ├── demo-inputs-l.yml
│   ├── demo-inputs-x.yml
│   ├── demo-inputs-x.yml.template
│   ├── emat-road-test.log
│   ├── output.csv.gz
│   ├── output.yaml
│   └── Outputs/
│       ├── output.yaml
│       └── output_1.csv.gz
├── road-test-model-config.yml
└── road-test-scope.yml

It is permissible, but not required, to simply copy the entire contents of the model_path directory to the archive directory, as is done in this example. However, if the current active model directory has a lot of boilerplate files that don’t change with the inputs, or if it becomes full of intermediate or temporary files that definitely will never be used to compute performance measures, it can be advisable to selectively copy only relevant files. In that case, those files and whatever related sub-directory tree structure exists in the current active model should be replicated within the experiment’s archive directory.
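
Selective copying can be sketched with shutil.copytree and an ignore filter. The archive_outputs helper and the “*.tmp” pattern are assumptions for illustration only; the demo itself copies everything.

```python
import os
import shutil
import tempfile

def archive_outputs(model_dir, archive_dir, skip_patterns=("*.tmp",)):
    """Copy the model directory tree to the archive, skipping unneeded files."""
    shutil.copytree(
        model_dir,
        archive_dir,
        ignore=shutil.ignore_patterns(*skip_patterns),
        dirs_exist_ok=True,  # requires Python 3.8 or later
    )

with tempfile.TemporaryDirectory() as root:
    # Build a tiny stand-in for the model directory.
    model_dir = os.path.join(root, "road-test-files")
    os.makedirs(os.path.join(model_dir, "Outputs"))
    for rel in (os.path.join("Outputs", "output.yaml"), "scratch.tmp"):
        with open(os.path.join(model_dir, rel), "w") as f:
            f.write("placeholder\n")

    archive_dir = os.path.join(root, "archive", "exp_001")
    archive_outputs(model_dir, archive_dir)

    # List what was archived, relative to the archive directory.
    archived = sorted(
        os.path.relpath(os.path.join(dirpath, name), archive_dir)
        for dirpath, _, names in os.walk(archive_dir)
        for name in names
    )

print(archived)
```

The sub-directory structure is preserved in the copy, so files can later be located under an archive path the same way as under the live model directory.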

Normal Operation for Running Multiple Experiments

For this demo, we’ll create a design of experiments with only 8 experiments. The design_experiments method of the RoadTestFileModel object is not defined in the custom core_files_demo module written for this model, but rather is a generic function provided by the TMIP-EMAT main library. Real applications will typically use a larger number of experiments, but this small number is sufficient to demonstrate the operation of the tools.

[22]:
design1 = fx.design_experiments(design_name='lhs_1', n_samples=8)
design1
[22]:
               alpha  amortization_period      beta  debt_type  expand_capacity  input_flow  interest_rate  interest_rate_lock  unit_cost_expansion  value_of_time  yield_curve  free_flow_time  initial_capacity
experiment
2           0.134750                   36  4.642370   Rev Bond        24.901018          85       0.039219               False           143.519760       0.017005     0.003280              60               100
3           0.115907                   50  5.242315   Rev Bond        11.985022         114       0.025997                True           107.836739       0.145378     0.009659              60               100
4           0.178456                   30  3.510139      Paygo        72.399552         121       0.029066                True           127.838010       0.067821    -0.000839              60               100
5           0.110023                   44  4.887030      Paygo        28.565637         127       0.034673                True           125.799829       0.083593     0.002714              60               100
6           0.161977                   24  3.865644    GO Bond        82.811459         135       0.028634                True           118.989447       0.054968     0.006745              60               100
7           0.173449                   21  4.094118      Paygo        43.476172         142       0.033847               False           133.480778       0.184904     0.017674              60               100
8           0.141973                   19  5.331978    GO Bond        52.445940          98       0.031519               False            99.317350       0.115228     0.014752              60               100
9           0.193762                   40  4.453382   Rev Bond        92.278968          95       0.037907               False           103.543415       0.092517     0.014360              60               100

The run_experiments command will automatically run the model once for each experiment in the named design. The demo command line version of the road test model is (intentionally) a little bit slow, so it will take a few seconds to conduct these eight model experiment runs.

[23]:
fx.run_experiments(design_name='lhs_1')
[00:06.66] MainProcess/INFO: performing 8 scenarios/policies * 1 model(s) = 8 experiments
[00:06.67] MainProcess/INFO: performing experiments sequentially
[00:06.67] MainProcess/INFO: RoadTestFileModel SETUP RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:06.68] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 2 RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.14] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.15] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.16] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_002_c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.17] MainProcess/INFO: RAN EXPERIMENT IN 2.50 SECONDS
[00:09.17] MainProcess/INFO: 1 cases completed
[00:09.18] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:09.18] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 3 RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.56] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.57] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.58] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_003_cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.59] MainProcess/INFO: RAN EXPERIMENT IN 2.42 SECONDS
[00:11.59] MainProcess/INFO: 2 cases completed
[00:11.60] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:11.60] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 4 RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.00] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.01] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.02] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_004_cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.03] MainProcess/INFO: RAN EXPERIMENT IN 2.44 SECONDS
[00:14.03] MainProcess/INFO: 3 cases completed
[00:14.04] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:14.04] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 5 RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.46] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.47] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.48] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_005_cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.48] MainProcess/INFO: RAN EXPERIMENT IN 2.45 SECONDS
[00:16.48] MainProcess/INFO: 4 cases completed
[00:16.49] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:16.50] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 6 RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:18.99] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:19.00] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:19.01] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_006_cf3fd362-6428-11eb-8c2b-acde48001122
[00:19.02] MainProcess/INFO: RAN EXPERIMENT IN 2.53 SECONDS
[00:19.02] MainProcess/INFO: 5 cases completed
[00:19.02] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:19.03] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 7 RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:21.37] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:21.39] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:21.40] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_007_d0c23324-6428-11eb-8c2b-acde48001122
[00:21.40] MainProcess/INFO: RAN EXPERIMENT IN 2.38 SECONDS
[00:21.40] MainProcess/INFO: 6 cases completed
[00:21.41] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:21.42] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 8 RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:23.88] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:23.89] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:23.91] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_008_d22e7272-6428-11eb-8c2b-acde48001122
[00:23.91] MainProcess/INFO: RAN EXPERIMENT IN 2.51 SECONDS
[00:23.91] MainProcess/INFO: 7 cases completed
[00:23.92] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d3ad349e-6428-11eb-8c2b-acde48001122
[00:23.93] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 9 RUNID-d3ad349e-6428-11eb-8c2b-acde48001122
[00:26.40] MainProcess/ERROR: ERROR in run_core_model run 9: Command '['emat-road-test-demo', '--uncs', 'demo-inputs-x.yml', '--levers', 'demo-inputs-l.yml']' returned non-zero exit status 247.
[00:26.40] MainProcess/ERROR: run_core_model ABORT 9
[00:26.40] MainProcess/INFO: RAN EXPERIMENT IN 2.49 SECONDS
[00:26.41] MainProcess/INFO: 8 cases completed
[00:26.41] MainProcess/INFO: experiments finished
[23]:
               alpha      beta  input_flow  value_of_time  unit_cost_expansion  interest_rate  yield_curve  expand_capacity  amortization_period  debt_type  interest_rate_lock  free_flow_time  initial_capacity  no_build_travel_time  build_travel_time  time_savings  value_of_time_savings  net_benefits  cost_of_capacity_expansion  present_cost_expansion
experiment
2           0.134750  4.642370          85       0.017005           143.519760       0.039219     0.003280        24.901018                   36   Rev Bond               False              60               100             63.802031          61.354318      2.447713               3.537924   -175.134011                  178.671935             3573.788162
3           0.115907  5.242315         114       0.145378           107.836739       0.025997     0.009659        11.985022                   50   Rev Bond                True              60               100             73.822149          67.635961      6.186188             102.524606     42.361931                   60.162675             1292.425695
4           0.178456  3.510139         121       0.067821           127.838010       0.029066    -0.000839        72.399552                   30      Paygo                True              60               100             80.905959          63.090263     17.815696             146.201784   -183.229667                  329.431451             9255.414627
5           0.110023  4.887030         127       0.083593           125.799829       0.034673     0.002714        28.565637                   44      Paygo                True              60               100             81.228935          66.217725     15.011210             159.363387     61.567385                   97.796001             3593.552248
6           0.161977  3.865644         135       0.054968           118.989447       0.028634     0.006745        82.811459                   24    GO Bond                True              60               100             91.004725          63.010343     27.994382             207.737074   -374.545620                  582.282695             9853.689681
7           0.173449  4.094118         142       0.184904           133.480778       0.033847     0.017674        43.476172                   21      Paygo               False              60               100            103.733064          69.975507     33.757557             886.351136    604.765873                  281.585263             5803.233337
8           0.141973  5.331978          98       0.115228            99.317350       0.031519     0.014752        52.445940                   19    GO Bond               False              60               100             67.648456          60.807614      6.840842              77.249238   -282.042005                  359.291243             5208.791736
9           0.193762  4.453382          95       0.092517           103.543415       0.037907     0.014360        92.278968                   40   Rev Bond               False              60               100                   NaN                NaN           NaN                    NaN           NaN                         NaN                     NaN

Re-running Failed Experiments

If you pay attention to the logged output, you might notice that one of the experiments (the last one) failed. We can see NaN values in the outputs.

[24]:
results = fx.db.read_experiment_all(fx.scope, 'lhs_1')
results
[24]:
            free_flow_time  initial_capacity     alpha      beta  input_flow  value_of_time  unit_cost_expansion  interest_rate  yield_curve  expand_capacity  amortization_period  debt_type  interest_rate_lock  no_build_travel_time  build_travel_time  time_savings  value_of_time_savings  net_benefits  cost_of_capacity_expansion  present_cost_expansion
experiment
2                       60               100  0.134750  4.642370          85       0.017005           143.519760       0.039219     0.003280        24.901018                   36   Rev Bond               False             63.802031          61.354318      2.447713               3.537924   -175.134011                  178.671935             3573.788162
3                       60               100  0.115907  5.242315         114       0.145378           107.836739       0.025997     0.009659        11.985022                   50   Rev Bond                True             73.822149          67.635961      6.186188             102.524606     42.361931                   60.162675             1292.425695
4                       60               100  0.178456  3.510139         121       0.067821           127.838010       0.029066    -0.000839        72.399552                   30      Paygo                True             80.905959          63.090263     17.815696             146.201784   -183.229667                  329.431451             9255.414627
5                       60               100  0.110023  4.887030         127       0.083593           125.799829       0.034673     0.002714        28.565637                   44      Paygo                True             81.228935          66.217725     15.011210             159.363387     61.567385                   97.796001             3593.552248
6                       60               100  0.161977  3.865644         135       0.054968           118.989447       0.028634     0.006745        82.811459                   24    GO Bond                True             91.004725          63.010343     27.994382             207.737074   -374.545620                  582.282695             9853.689681
7                       60               100  0.173449  4.094118         142       0.184904           133.480778       0.033847     0.017674        43.476172                   21      Paygo               False            103.733064          69.975507     33.757557             886.351136    604.765873                  281.585263             5803.233337
8                       60               100  0.141973  5.331978          98       0.115228            99.317350       0.031519     0.014752        52.445940                   19    GO Bond               False             67.648456          60.807614      6.840842              77.249238   -282.042005                  359.291243             5208.791736
9                       60               100  0.193762  4.453382          95       0.092517           103.543415       0.037907     0.014360        92.278968                   40   Rev Bond               False                   NaN                NaN           NaN                    NaN           NaN                         NaN                     NaN

We can collect the IDs of the failed experiments programmatically. To collect every experiment that is missing any performance measure output, we can do this:

[25]:
fails = results.isna().any(axis=1)
failed_experiment_ids = fails.index[fails]
failed_experiment_ids
[25]:
Int64Index([9], dtype='int64', name='experiment')

When an error (raised as a subprocess.CalledProcessError) occurs during the execution of a FilesCoreModel, the output from stdout and stderr is written to log files in the archive location, in place of the usual model outputs.
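As a rough illustration of this mechanism (not the actual FilesCoreModel code), the sketch below runs a command with the standard library, and on a CalledProcessError writes the captured streams to log files in an archive directory. The function name run_and_archive, the stand-in failing command, and the error.stderr.log file name are assumptions made for this example; only error.stdout.log appears in this notebook.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_archive(cmd, archive_dir):
    """Run cmd; if it fails, write its captured output to log files.

    Hypothetical helper for illustration only, not part of TMIP-EMAT.
    Returns True on success and False on failure.
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    try:
        # check=True raises CalledProcessError on a non-zero exit code
        subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as err:
        # Archive the diagnostic streams instead of model outputs
        (archive_dir / "error.stdout.log").write_text(err.stdout or "")
        (archive_dir / "error.stderr.log").write_text(err.stderr or "")
        return False
    return True


# Stand-in "core model": prints a message, then exits with an error code
failing_cmd = [sys.executable, "-c", "print('running model'); raise SystemExit(1)"]
archive = tempfile.mkdtemp()
ok = run_and_archive(failing_cmd, archive)
stdout_log = (Path(archive) / "error.stdout.log").read_text()
```

After this runs, ok is False and stdout_log holds whatever the failed command printed before it exited.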

We can see the log output by reading in the log file, like this:

[26]:
error_log = os.path.join(
    fx.get_experiment_archive_path(9),
    'error.stdout.log'
)
with open(error_log, 'r') as stdout:
    error_log_content = stdout.read()

print(error_log_content)
[2021-01-31 19:00:10,065] emat.RoadTest.INFO: running emat-road-test-demo
[2021-01-31 19:00:10,068] emat.RoadTest.ERROR: Random crash, ha ha!

Here we see the log file is explicitly taunting us about randomly crashing the model run. That’s fine: we wanted to crash the execution randomly to show what to do in this event, because it happens sometimes. Maybe a disk filled up, or there is an intermittent license problem that causes a failure once in a while. If that’s the case and we can fix it just by re-running, awesome!

We can load just the failed experiments to try them again.

[27]:
failed_experiments = fx.read_experiment_parameters(experiment_ids=failed_experiment_ids)
failed_experiments
[27]:
free_flow_time initial_capacity alpha beta input_flow value_of_time unit_cost_expansion interest_rate yield_curve expand_capacity amortization_period debt_type interest_rate_lock
experiment
9 60 100 0.193762 4.453382 95 0.092517 103.543415 0.037907 0.01436 92.278968 40 Rev Bond False

Normally, there is a “short circuit” process that prevents re-running a core model experiment; instead, the performance measure results are simply loaded from the database, which is typically much faster than actually running the core model. But if the performance measures stored in the database are junk, we do not want to trigger the short circuit; we want to actually run the full core model again. To do so, we can disable the short circuit by passing allow_short_circuit=False, and re-run the failed experiment. If it failed because of a transient error, e.g. a disk space problem that has since been fixed, then perhaps we can simply re-run the model and it will work.

[28]:
fx.run_experiments(failed_experiments, allow_short_circuit=False)
[00:26.53] MainProcess/INFO: performing 1 scenarios/policies * 1 model(s) = 1 experiments
[00:26.54] MainProcess/INFO: performing experiments sequentially
[00:26.54] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:26.55] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 9 RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:28.93] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:28.94] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:28.95] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_009_d53da118-6428-11eb-8c2b-acde48001122
[00:28.96] MainProcess/INFO: RAN EXPERIMENT IN 2.42 SECONDS
[00:28.96] MainProcess/INFO: 1 cases completed
[00:28.96] MainProcess/INFO: experiments finished
[28]:
alpha beta input_flow value_of_time unit_cost_expansion interest_rate yield_curve expand_capacity amortization_period debt_type interest_rate_lock free_flow_time initial_capacity no_build_travel_time build_travel_time time_savings value_of_time_savings net_benefits cost_of_capacity_expansion present_cost_expansion
experiment
9 0.193762 4.453382 95 0.092517 103.543415 0.037907 0.01436 92.278968 40 Rev Bond False 60 100 69.251572 60.503221 8.748351 76.890662 -385.524839 462.4155 9554.87952

Much better! Now we can see we have a more complete set of outputs, without the NaN’s. Hooray!

[29]:
results = fx.db.read_experiment_all(scope_name=fx.scope.name, design_name='lhs_1')
results
[29]:
free_flow_time initial_capacity alpha beta input_flow value_of_time unit_cost_expansion interest_rate yield_curve expand_capacity amortization_period debt_type interest_rate_lock no_build_travel_time build_travel_time time_savings value_of_time_savings net_benefits cost_of_capacity_expansion present_cost_expansion
experiment
2 60 100 0.134750 4.642370 85 0.017005 143.519760 0.039219 0.003280 24.901018 36 Rev Bond False 63.802031 61.354318 2.447713 3.537924 -175.134011 178.671935 3573.788162
3 60 100 0.115907 5.242315 114 0.145378 107.836739 0.025997 0.009659 11.985022 50 Rev Bond True 73.822149 67.635961 6.186188 102.524606 42.361931 60.162675 1292.425695
4 60 100 0.178456 3.510139 121 0.067821 127.838010 0.029066 -0.000839 72.399552 30 Paygo True 80.905959 63.090263 17.815696 146.201784 -183.229667 329.431451 9255.414627
5 60 100 0.110023 4.887030 127 0.083593 125.799829 0.034673 0.002714 28.565637 44 Paygo True 81.228935 66.217725 15.011210 159.363387 61.567385 97.796001 3593.552248
6 60 100 0.161977 3.865644 135 0.054968 118.989447 0.028634 0.006745 82.811459 24 GO Bond True 91.004725 63.010343 27.994382 207.737074 -374.545620 582.282695 9853.689681
7 60 100 0.173449 4.094118 142 0.184904 133.480778 0.033847 0.017674 43.476172 21 Paygo False 103.733064 69.975507 33.757557 886.351136 604.765873 281.585263 5803.233337
8 60 100 0.141973 5.331978 98 0.115228 99.317350 0.031519 0.014752 52.445940 19 GO Bond False 67.648456 60.807614 6.840842 77.249238 -282.042005 359.291243 5208.791736
9 60 100 0.193762 4.453382 95 0.092517 103.543415 0.037907 0.014360 92.278968 40 Rev Bond False 69.251572 60.503221 8.748351 76.890662 -385.524839 462.415500 9554.879520
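
The short circuit behavior used in this section can be pictured as a cache lookup that is skipped on request. The sketch below is a conceptual stand-in, not TMIP-EMAT's implementation: the stored dictionary plays the role of the database, core_model is a hypothetical placeholder for an expensive model run, and the numeric values are made up for illustration.

```python
import math

# Stand-in for the database: experiment 9 holds a junk (NaN) result
stored = {9: {"net_benefits": math.nan}}
calls = []  # records each time the (pretend) core model actually runs


def core_model(experiment_id):
    """Hypothetical placeholder for an expensive core model run."""
    calls.append(experiment_id)
    return {"net_benefits": -385.5}  # made-up value for illustration


def run_experiment(experiment_id, allow_short_circuit=True):
    """Return stored results when allowed, otherwise run the model again."""
    if allow_short_circuit and experiment_id in stored:
        return stored[experiment_id]  # short circuit: no model run
    result = core_model(experiment_id)
    stored[experiment_id] = result  # overwrite any junk result
    return result


cached = run_experiment(9)  # returns the junk NaN from storage
fresh = run_experiment(9, allow_short_circuit=False)  # forces a real run
```

With the short circuit enabled, the junk result comes straight back and the model never runs. Disabling it forces one real run, and the stored value is replaced.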

Multiprocessing for Running Multiple Experiments

The examples above are all single-process demonstrations of using TMIP-EMAT to run core model experiments. If your core model itself is multi-threaded or otherwise is designed to make full use of your multi-core CPU, or if a single core model run will otherwise max out some computational resource (e.g. RAM, disk space) then single process operation should be sufficient.

If, on the other hand, your core model is such that you can run multiple independent instances of the model side-by-side on the same machine, then you could benefit from a multiprocessing approach. This can be accomplished by splitting a design of experiments over several processes that you start manually, or by using an automatic multiprocessing library such as dask.distributed.

Running a Subset of Experiments Manually

Suppose, for example, you wanted to distribute the workload of running experiments over several processes, or even over several computers. If each process has file system access to the same TMIP-EMAT database of experiments, we can orchestrate these experiments in parallel by manually splitting up the processes.

To begin with, we’ll have one process create a complete design of experiments, and save it to the database (which happens automatically here).

[30]:
design2 = fx.design_experiments(design_name='lhs_2', n_samples=8, random_seed=42)

Then, we can set up a copy of the same model in a different process, even on a different machine, as long as we point back to the same original database file. This implies the different process has access to the file system where the original file is stored. It is valuable to read and write to the same database file, not just a copy of the file, as this will obviate the need to sync the experimental data manually afterwards. In this demo, we’ll just create a new directory to work in, but we’ll point to the database in the original directory. Instead of allowing our model to implicitly create a new database file in the new directory, we’ll instantiate a SQLiteDB object pointing to the original database.

[31]:
database_filename = fx.db.database_path
db2 = emat.SQLiteDB(database_filename)
[00:29.10] MainProcess/INFO: running script emat_db_init.sql
[00:29.10] MainProcess/INFO: running script meta_model.sql
[00:29.11] MainProcess/INFO: found no experiments with missing run_id's
[00:29.11] MainProcess/INFO: running script emat_db_init_views.sql

Now, db2 is an emat.SQLiteDB object, which wraps a new connection to the original database. Then, we’ll pass that db2 explicitly to the new RoadTestFileModel constructor, which will create a complete copy of our model (other than the database) in a new directory.

[32]:
fx2 = core_files_demo.RoadTestFileModel(db=db2)
[00:29.12] MainProcess/WARNING: changing cwd to /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon
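
The benefit of sharing one database file, rather than working from a copy, can be seen with the standard library's sqlite3 module alone. This is only a sketch of the underlying idea: it does not use emat.SQLiteDB, and the results table and its columns are hypothetical, not the TMIP-EMAT schema.

```python
import os
import sqlite3
import tempfile

# One database file, standing in for the original model's database
db_path = os.path.join(tempfile.mkdtemp(), "shared.sqlite")

# First connection (think: the original model instance) creates a table
conn_a = sqlite3.connect(db_path)
conn_a.execute(
    "CREATE TABLE results (experiment INTEGER PRIMARY KEY, net_benefits REAL)"
)
conn_a.commit()

# Second connection (think: the second model instance) writes a result
conn_b = sqlite3.connect(db_path)
conn_b.execute("INSERT INTO results VALUES (?, ?)", (10, -17.502784))
conn_b.commit()

# The first connection sees the committed row without any manual syncing
rows = conn_a.execute("SELECT experiment, net_benefits FROM results").fetchall()

conn_a.close()
conn_b.close()
```

Because both connections point at the same file, a row committed through one is immediately visible through the other.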

To run a particular slice of a design of experiments, we need to load the experimental design first, and then pass that slice to the run_experiments function, instead of just giving the design_name.

[33]:
design2 = fx.read_experiment_parameters('lhs_2')

For splitting the work across a number of similarly capable processes or machines, the double-colon slice is convenient. If, for example, you are splitting the work over 4 computers, you can run each with slices 0::4, 1::4, 2::4, and 3::4. This slices in a skip-step manner, so the slice below will run every 4th experiment from the design, starting with experiment index 0 (i.e. the first one).

[34]:
fx2.run_experiments(design2.iloc[0::4])
[00:29.19] MainProcess/INFO: performing 2 scenarios/policies * 1 model(s) = 2 experiments
[00:29.21] MainProcess/INFO: performing experiments sequentially
[00:29.21] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:29.22] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 10 RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.69] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.70] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.72] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/archive/scp_EMAT Road Test/exp_010_d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.72] MainProcess/INFO: RAN EXPERIMENT IN 2.51 SECONDS
[00:31.72] MainProcess/INFO: 1 cases completed
[00:31.73] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:31.74] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 14 RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:34.21] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:34.22] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:34.23] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/archive/scp_EMAT Road Test/exp_014_d85582c6-6428-11eb-8c2b-acde48001122
[00:34.24] MainProcess/INFO: RAN EXPERIMENT IN 2.51 SECONDS
[00:34.24] MainProcess/INFO: 2 cases completed
[00:34.24] MainProcess/INFO: experiments finished
[34]:
alpha beta input_flow value_of_time unit_cost_expansion interest_rate yield_curve expand_capacity amortization_period debt_type interest_rate_lock free_flow_time initial_capacity no_build_travel_time build_travel_time time_savings value_of_time_savings net_benefits cost_of_capacity_expansion present_cost_expansion
experiment
10 0.114450 3.648104 100 0.034746 120.196432 0.028321 0.001514 5.621927 34 Paygo True 60 100 66.867014 65.624845 1.242169 4.316063 -17.502784 21.818848 675.735529
14 0.132514 4.016263 133 0.107612 131.922290 0.035993 0.015811 54.081760 22 Paygo False 60 100 84.993873 64.403269 20.590604 294.700221 -37.110353 331.810575 7134.589600
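
To check that the skip-step slices divide a design cleanly, the same slicing can be tried on a plain list. The sketch assumes the 8 experiments of this design carry the IDs 10 through 17, which is consistent with the IDs 10 and 14 shown above but is not confirmed elsewhere in this notebook.

```python
# Assumed experiment IDs for the 8-sample design (10 through 17)
experiment_ids = list(range(10, 18))
n_machines = 4

# Machine k takes every 4th experiment, starting at position k
shares = [experiment_ids[k::n_machines] for k in range(n_machines)]

# Every experiment should land in exactly one share
covered = sorted(i for share in shares for i in share)
```

The first share is [10, 14], matching the two experiments run above, and the four shares together cover each experiment exactly once.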

Because we have linked the second model instance back to the same database, after these experiments have finished we can access the results from the original fx instance.

[35]:
fx.read_experiment_measures('lhs_2')
[35]:
no_build_travel_time build_travel_time time_savings value_of_time_savings net_benefits cost_of_capacity_expansion present_cost_expansion
experiment run
10 d6d5244c-6428-11eb-8c2b-acde48001122 66.867014 65.624845 1.242169 4.316063 -17.502784 21.818848 675.735529
14 d85582c6-6428-11eb-8c2b-acde48001122 84.993873 64.403269 20.590604 294.700221 -37.110353 331.810575 7134.589600

It is important to note that for this manual multiprocessing technique to work, where different processes run the model simultaneously, each process must be in a separate Python instance (e.g. in separate Jupyter notebooks, not in the same notebook as shown here).
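
One possible way to get separate Python instances without opening several notebooks is to launch each slice in its own interpreter with the standard library's subprocess module. The sketch below is an assumption-laden illustration: the inline WORKER_CODE is a stand-in that only prints its share of a list of experiment IDs (assumed here to be 10 through 17). In real use, each worker would instead connect to the shared database, build its own model instance, and call run_experiments on its slice of the design.

```python
import subprocess
import sys

N_PROCESSES = 4
EXPERIMENT_IDS = list(range(10, 18))  # assumed IDs, for illustration only

# Stand-in worker: receives its index k, the process count n, and the IDs,
# then prints the slice it would be responsible for running.
WORKER_CODE = (
    "import sys\n"
    "k, n = int(sys.argv[1]), int(sys.argv[2])\n"
    "ids = [int(i) for i in sys.argv[3].split(',')]\n"
    "print(ids[k::n])\n"
)

id_arg = ",".join(str(i) for i in EXPERIMENT_IDS)

# Start every worker first, so they run side-by-side in separate interpreters
procs = [
    subprocess.Popen(
        [sys.executable, "-c", WORKER_CODE, str(k), str(N_PROCESSES), id_arg],
        stdout=subprocess.PIPE,
        text=True,
    )
    for k in range(N_PROCESSES)
]

# Then wait for each one and collect what it printed
outputs = [p.communicate()[0].strip() for p in procs]
```

Starting all of the processes before waiting on any of them is what lets the slices run at the same time.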

Automatic Multiprocessing for Running Multiple Experiments

The examples above are all essentially single-process demonstrations of using TMIP-EMAT to run core model experiments, either by running all in one single process, or by having a user manually instantiate a number of single processes. If your core model itself is multi-threaded or otherwise is designed to make full use of your multi-core CPU, or if a single core model run will otherwise max out some computational resource (e.g. RAM, disk space) then single process operation should be sufficient.

If, on the other hand, your model is such that you can run multiple independent instances of the model side-by-side on the same machine, but you don’t want to manage the processes manually, then you could benefit from a multiprocessing approach that uses the dask.distributed library. To demonstrate this, we’ll create yet another small design of experiments to run.

[36]:
design3 = fx.design_experiments(design_name='lhs_3', n_samples=30, random_seed=3)
design3
[36]:
alpha amortization_period beta debt_type expand_capacity input_flow interest_rate interest_rate_lock unit_cost_expansion value_of_time yield_curve free_flow_time initial_capacity
experiment
18 0.151341 48 3.676012 Rev Bond 34.248792 139 0.037682 False 128.627672 0.004817 0.012079 60 100
19 0.118509 41 5.309555 Paygo 76.473048 89 0.035629 True 125.994370 0.091859 0.005987 60 100
20 0.174355 18 4.727608 Rev Bond 94.255893 131 0.030304 False 139.257007 0.076668 0.016896 60 100
21 0.188622 26 3.703274 GO Bond 96.919294 125 0.036478 False 121.126672 0.110190 0.016048 60 100
22 0.160916 33 4.447292 GO Bond 64.247841 149 0.034941 False 104.498216 0.040406 0.010364 60 100
23 0.148182 24 5.015381 Paygo 83.286755 128 0.027313 False 144.602977 0.061343 0.014137 60 100
24 0.145859 43 3.943408 Rev Bond 61.408748 117 0.032835 True 107.081150 0.150078 0.004032 60 100
25 0.109392 32 5.465265 Paygo 29.100532 98 0.038449 False 142.411065 0.057756 0.001003 60 100
26 0.154192 37 4.918319 GO Bond 55.383965 145 0.038841 True 130.162175 0.119689 0.001579 60 100
27 0.106310 41 4.276815 GO Bond 87.269134 106 0.036746 True 113.206445 0.093281 0.019623 60 100
28 0.159061 47 3.630430 GO Bond 52.479431 102 0.029875 False 135.066324 0.075284 0.015455 60 100
29 0.190069 45 5.262368 Paygo 67.884939 101 0.026620 True 118.147867 0.125009 0.013234 60 100
30 0.123809 28 4.504676 Rev Bond 39.255873 124 0.037005 True 121.924833 0.029906 0.017495 60 100
31 0.141084 43 4.570515 GO Bond 42.644409 85 0.028042 False 101.824130 0.214848 0.005303 60 100
32 0.116030 50 4.772191 Paygo 78.781012 120 0.025253 False 119.915395 0.024683 0.011550 60 100
33 0.177997 22 4.097314 GO Bond 92.535470 138 0.039246 True 108.782051 0.142603 0.004617 60 100
34 0.121416 28 5.136051 Rev Bond 48.793510 112 0.029319 False 115.523335 0.113898 0.000314 60 100
35 0.168046 47 3.822497 Paygo 12.183780 135 0.030787 False 127.210823 0.131430 0.006618 60 100
36 0.132624 30 3.978002 Paygo 70.001596 105 0.025537 False 98.871279 0.065724 0.007461 60 100
37 0.139045 25 4.317356 Rev Bond 85.104877 148 0.035198 True 134.932609 0.045659 -0.002330 60 100
38 0.196744 20 4.412799 Paygo 3.928463 91 0.026193 False 124.260667 0.082519 0.003482 60 100
39 0.164386 18 3.559705 Rev Bond 2.563779 114 0.027669 True 105.599016 0.069089 0.018237 60 100
40 0.171876 37 5.189379 Rev Bond 43.416751 134 0.033763 True 110.240305 0.102707 0.009666 60 100
41 0.102494 36 4.637645 GO Bond 20.219566 90 0.028766 False 141.028735 0.036429 0.013430 60 100
42 0.128699 39 4.224275 Rev Bond 31.087547 82 0.033269 True 132.576526 0.135245 0.009204 60 100
43 0.194011 17 4.119391 Rev Bond 25.279767 83 0.031359 True 137.502268 0.053824 0.008654 60 100
44 0.112179 35 5.082739 Paygo 57.001926 119 0.039705 True 101.590247 0.087633 -0.001666 60 100
45 0.185542 21 4.843869 Paygo 18.339744 143 0.034300 True 96.190623 0.169604 0.002675 60 100
46 0.182904 31 3.879664 GO Bond 9.336654 96 0.032018 False 113.617685 0.100206 -0.000695 60 100
47 0.136037 15 5.422761 GO Bond 13.567899 108 0.031707 True 97.768721 0.160056 0.018624 60 100

The demo module is set up to facilitate distributed multiprocessing. During the setup step, the code detects if it is being run in a distributed “worker” environment instead of in a normal Python environment. If the “worker” environment is detected, then a copy of the entire files-based model is made into the worker’s local workspace, and the model is run there instead of in the master workspace. This allows each worker to edit the files independently and simultaneously, without disturbing other parallel workers.
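
The copy-to-local-workspace idea can be sketched with the standard library alone. This is not the code in core_files_demo.py: the prepare_workspace function is hypothetical, the in_worker flag is passed in by hand rather than detected, and the config.txt file and its contents are placeholders.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_workspace(master_dir, in_worker):
    """Return the directory in which a model run should edit files.

    Hypothetical helper: outside a worker, the master workspace is used
    directly; inside a worker, the whole tree is copied to a private
    temporary directory first.
    """
    master_dir = Path(master_dir)
    if not in_worker:
        return master_dir
    local_root = Path(tempfile.mkdtemp(prefix="worker-"))
    local_copy = local_root / master_dir.name
    shutil.copytree(master_dir, local_copy)
    return local_copy


# Build a tiny stand-in for the files-based model
master = Path(tempfile.mkdtemp()) / "road-test-files"
master.mkdir()
(master / "config.txt").write_text("placeholder = 1\n")

# A "worker" gets its own copy and can edit it freely
worker_dir = prepare_workspace(master, in_worker=True)
(worker_dir / "config.txt").write_text("placeholder = 2\n")

master_text = (master / "config.txt").read_text()
```

The worker's edit lands in its private copy, so master_text still holds the original contents and other workers are undisturbed.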

With this small modification, we are ready to run this demo model in parallel subprocesses. To do so, we will use the async_experiments method. Leveraging the asyncio Python interface will allow us to run multiple core models in parallel in the background, and monitor the progress interactively from within a Jupyter notebook.

[37]:
background = fx.async_experiments(
    design=design3,
    max_n_workers=2,
    stagger_start=5,
    batch_size=1,
)
[00:34.36] MainProcess/INFO: asynchronous_experiments(max_n_workers=2)
[38]:
background.progress() # Initially everything is pending
[38]:
'30 runs: 30 pending'
[39]:
await asyncio.sleep(15)
background.progress()
[00:34.37] MainProcess/INFO: AsyncExperimentalDesign.run start
[00:34.37] MainProcess/INFO: initializing default DistributedEvaluator.client
[00:34.37] MainProcess/INFO:   max_n_workers=2, actual n_workers=2
[00:34.37] MainProcess/INFO:   n_workers=2
[00:35.31] MainProcess/INFO: completed initializing default DistributedEvaluator.client
[00:37.49] MainProcess/INFO: AsyncExperimentalDesign.run dispatching experiments
[00:37.50] MainProcess/INFO: performing 30 scenarios/policies * 1 model(s) = 30 experiments
[00:37.52] MainProcess/INFO: experiments in asynchronous evaluator
[39]:
'30 runs: 2 done, 27 pending, 1 queued'

After 15 seconds, only 3 runs have left the “pending” state (2 are done and 1 is queued), as the stagger_start setting allows only one run to start every 5 seconds. Un-started runs remain in the “pending” status. We can see the status of every run by calling the status method.

[40]:
background.status()
[40]:
experiment
18       done
19       done
20     queued
21    pending
22    pending
23    pending
24    pending
25    pending
26    pending
27    pending
28    pending
29    pending
30    pending
31    pending
32    pending
33    pending
34    pending
35    pending
36    pending
37    pending
38    pending
39    pending
40    pending
41    pending
42    pending
43    pending
44    pending
45    pending
46    pending
47    pending
dtype: object

For fast running models, you may find that stagger_start is not needed. Large and slow models, especially ones that begin with a massive file-copy operation on a hard disk, may benefit from staggering the task start times, so the first run has a clear shot at finishing disk actions promptly and moving to data processing as soon as possible.

If we are dissatisfied with this pace for this small and fast model, the stagger_start can be changed dynamically while runs are going. If we set it to zero, all remaining pending runs will be queued immediately (or, within a second or so). Queued runs will begin as soon as a worker process is available to handle them.

[41]:
background.stagger_start = 0
await asyncio.sleep(1)
background.status()
[00:49.53] MainProcess/INFO: AsyncExperimentalDesign.run dispatching task complete
[00:50.09] MainProcess/INFO: 3 cases completed
[41]:
experiment
18      done
19      done
20      done
21    queued
22    queued
23    queued
24    queued
25    queued
26    queued
27    queued
28    queued
29    queued
30    queued
31    queued
32    queued
33    queued
34    queued
35    queued
36    queued
37    queued
38    queued
39    queued
40    queued
41    queued
42    queued
43    queued
44    queued
45    queued
46    queued
47    queued
dtype: object
[42]:
await asyncio.sleep(15)
background.progress()
[00:55.05] MainProcess/INFO: 6 cases completed
[00:58.24] MainProcess/INFO: 9 cases completed
[01:02.97] MainProcess/INFO: 12 cases completed
[42]:
'30 runs: 13 done, 17 queued'

After 15 more seconds, only about 10 more runs are complete, as each run takes around 3 seconds, and we only have two workers processing the runs.
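
The interplay between a staggered start and a limited number of workers can be mimicked with plain asyncio. The sketch below is a toy model of the pattern, not the async_experiments implementation: run_all and its parameters are hypothetical, a semaphore stands in for the pool of workers, and asyncio.sleep stands in for the core model. It uses asyncio.run so that it works as a script; inside a Jupyter notebook, where an event loop is already running, you would write await run_all(...) instead.

```python
import asyncio


async def run_all(n_runs, n_workers, stagger_start, run_time):
    """Dispatch runs one at a time, with at most n_workers active at once."""
    status = {i: "pending" for i in range(n_runs)}
    workers = asyncio.Semaphore(n_workers)  # stand-in for the worker pool

    async def one_run(i):
        async with workers:  # a queued run waits here for a free worker
            await asyncio.sleep(run_time)  # stand-in for the core model
        status[i] = "done"

    tasks = []
    for i in range(n_runs):
        status[i] = "queued"  # the run leaves the pending state
        tasks.append(asyncio.create_task(one_run(i)))
        await asyncio.sleep(stagger_start)  # delay before the next dispatch

    await asyncio.gather(*tasks)  # block until every run has finished
    return status


final_status = asyncio.run(
    run_all(n_runs=6, n_workers=2, stagger_start=0.01, run_time=0.02)
)
```

In this toy version, lowering stagger_start moves runs from pending to queued sooner, but throughput is still capped by n_workers, mirroring what is seen above.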

The current_results method allows us to view the run results that are available currently. You might notice that runs are not necessarily dispatched in the order they appear in the table, but we’ll get to all of them eventually.

[43]:
background.current_results().head(15)
[43]:
free_flow_time initial_capacity alpha beta input_flow value_of_time unit_cost_expansion interest_rate yield_curve expand_capacity amortization_period debt_type interest_rate_lock no_build_travel_time build_travel_time time_savings value_of_time_savings net_benefits cost_of_capacity_expansion present_cost_expansion
experiment
18 60 100 0.151341 3.676012 139 0.004817 128.627672 0.037682 0.012079 34.248792 48 Rev Bond False 90.467198 70.318877 20.148321 13.491094 -192.374479 205.865574 4405.342341
19 60 100 0.118509 5.309555 89 0.091859 125.994370 0.035629 0.005987 76.473048 41 Paygo True 63.829895 60.187687 3.642208 29.776845 -243.694139 273.470984 9635.173464
20 60 100 0.174355 4.727608 131 0.076668 139.257007 0.030304 0.016896 94.255893 18 Rev Bond False 97.497064 61.624340 35.872724 360.289988 -600.998696 961.288684 13125.793539
21 60 100 0.188622 3.703274 125 0.110190 121.126672 0.036478 0.016048 96.919294 26 GO Bond False NaN NaN NaN NaN NaN NaN NaN
22 60 100 0.160916 4.447292 149 0.040406 104.498216 0.034941 0.010364 64.247841 33 GO Bond False NaN NaN NaN NaN NaN NaN NaN
23 60 100 0.148182 5.015381 128 0.061343 144.602977 0.027313 0.014137 83.286755 24 Paygo False NaN NaN NaN NaN NaN NaN NaN
24 60 100 0.145859 3.943408 117 0.150078 107.081150 0.032835 0.004032 61.408748 43 Rev Bond True NaN NaN NaN NaN NaN NaN NaN
25 60 100 0.109392 5.465265 98 0.057756 142.411065 0.038449 0.001003 29.100532 32 Paygo False NaN NaN NaN NaN NaN NaN NaN
26 60 100 0.154192 4.918319 145 0.119689 130.162175 0.038841 0.001579 55.383965 37 GO Bond True NaN NaN NaN NaN NaN NaN NaN
27 60 100 0.106310 4.276815 106 0.093281 113.206445 0.036746 0.019623 87.269134 41 GO Bond True NaN NaN NaN NaN NaN NaN NaN
28 60 100 0.159061 3.630430 102 0.075284 135.066324 0.029875 0.015455 52.479431 47 GO Bond False NaN NaN NaN NaN NaN NaN NaN
29 60 100 0.190069 5.262368 101 0.125009 118.147867 0.026620 0.013234 67.884939 45 Paygo True NaN NaN NaN NaN NaN NaN NaN
30 60 100 0.123809 4.504676 124 0.029906 121.924833 0.037005 0.017495 39.255873 28 Rev Bond True NaN NaN NaN NaN NaN NaN NaN
31 60 100 0.141084 4.570515 85 0.214848 101.824130 0.028042 0.005303 42.644409 43 GO Bond False NaN NaN NaN NaN NaN NaN NaN
32 60 100 0.116030 4.772191 120 0.024683 119.915395 0.025253 0.011550 78.781012 50 Paygo False NaN NaN NaN NaN NaN NaN NaN

If we want to simply block the Python interpreter until the runs are done, we can do so by awaiting the final_results.

[44]:
await background.final_results()
[01:06.21] MainProcess/INFO: 15 cases completed
[01:11.06] MainProcess/INFO: 18 cases completed
[01:14.23] MainProcess/INFO: 21 cases completed
[01:19.02] MainProcess/INFO: 24 cases completed
[01:22.12] MainProcess/INFO: 27 cases completed
[01:26.65] MainProcess/INFO: 30 cases completed
[44]:
free_flow_time initial_capacity alpha beta input_flow value_of_time unit_cost_expansion interest_rate yield_curve expand_capacity amortization_period debt_type interest_rate_lock no_build_travel_time build_travel_time time_savings value_of_time_savings net_benefits cost_of_capacity_expansion present_cost_expansion
experiment
18 60 100 0.151341 3.676012 139 0.004817 128.627672 0.037682 0.012079 34.248792 48 Rev Bond False 90.467198 70.318877 20.148321 13.491094 -192.374479 205.865574 4405.342341
19 60 100 0.118509 5.309555 89 0.091859 125.994370 0.035629 0.005987 76.473048 41 Paygo True 63.829895 60.187687 3.642208 29.776845 -243.694139 273.470984 9635.173464
20 60 100 0.174355 4.727608 131 0.076668 139.257007 0.030304 0.016896 94.255893 18 Rev Bond False 97.497064 61.624340 35.872724 360.289988 -600.998696 961.288684 13125.793539
21 60 100 0.188622 3.703274 125 0.110190 121.126672 0.036478 0.016048 96.919294 26 GO Bond False 85.859973 62.102800 23.757173 327.226070 -334.329873 661.555942 11739.511583
22 60 100 0.160916 4.447292 149 0.040406 104.498216 0.034941 0.010364 64.247841 33 GO Bond False 116.880236 66.259960 50.620276 304.758181 -30.577074 335.335254 6713.784698
23 60 100 0.148182 5.015381 128 0.061343 144.602977 0.027313 0.014137 83.286755 24 Paygo False 90.665184 61.468734 29.196450 229.247199 -288.854311 518.101510 12043.512651
24 60 100 0.145859 3.943408 117 0.150078 107.081150 0.032835 0.004032 61.408748 43 Rev Bond True 76.254321 62.460523 13.793797 242.207489 -70.587693 312.795183 6575.719357
25 60 100 0.109392 5.465265 98 0.057756 142.411065 0.038449 0.001003 29.100532 32 Paygo False 65.877391 61.455236 4.422155 25.029593 -115.116663 140.146256 4144.237736
26 60 100 0.154192 4.918319 145 0.119689 130.162175 0.038841 0.001579 55.383965 37 GO Bond True 117.527065 66.583783 50.943282 884.112291 539.879198 344.233094 7208.897323
27 60 100 0.106310 4.276815 106 0.093281 113.206445 0.036746 0.019623 87.269134 41 GO Bond True 68.183795 60.559329 7.624466 75.389314 -381.495063 456.884376 9879.428427
28 60 100 0.159061 3.630430 102 0.075284 135.066324 0.029875 0.015455 52.479431 47 GO Bond False 70.255064 62.217189 8.037875 61.722934 -256.677328 318.400262 7088.203858
29 60 100 0.190069 5.262368 101 0.125009 118.147867 0.026620 0.013234 67.884939 45 Paygo True 72.017180 60.786517 11.230663 141.797161 -73.766382 215.563543 8020.460747
30 60 100 0.123809 4.504676 124 0.029906 121.924833 0.037005 0.017495 39.255873 28 Rev Bond True 79.576635 64.404583 15.172052 56.263770 -210.531699 266.795469 4786.265822
31 60 100 0.141084 4.570515 85 0.214848 101.824130 0.028042 0.005303 42.644409 43 GO Bond False 64.027525 60.794355 3.233170 59.044512 -139.377225 198.421737 4342.229844
32 60 100 0.116030 4.772191 120 0.024683 119.915395 0.025253 0.011550 78.781012 50 Paygo False 76.618381 61.038635 15.579747 46.147516 -194.872685 241.020201 9447.056124
33 60 100 0.177997 4.097314 138 0.142603 108.782051 0.039246 0.004617 92.535470 22 GO Bond True 99.966081 62.728737 37.237343 732.799774 104.331664 628.468110 10066.198296
34 60 100 0.121416 5.136051 112 0.113898 115.523335 0.029319 0.000314 48.793510 28 Rev Bond False 73.038039 61.693605 11.344433 144.715681 -169.489549 314.205230 5636.788951
35 60 100 0.168046 3.822497 135 0.131430 127.210823 0.030787 0.006618 12.183780 47 Paygo False 91.752596 80.460756 11.291840 200.351591 159.635389 40.716201 1549.908642
36 60 100 0.132624 3.978002 105 0.065724 98.871279 0.025537 0.007461 70.001596 30 Paygo False 69.661928 61.170365 8.491563 58.600606 -187.746392 246.346998 6921.147330
37 60 100 0.139045 4.317356 148 0.045659 134.932609 0.035198 -0.002330 85.104877 25 Rev Bond True 105.330218 63.175774 42.154444 284.861228 -395.777779 680.639007 11483.423089
38 60 100 0.196744 4.412799 91 0.082519 124.260667 0.026193 0.003482 3.928463 20 Paygo False 67.785922 66.568468 1.217455 9.142197 -15.633244 24.775440 488.153405
39 60 100 0.164386 3.559705 114 0.069089 105.599016 0.027669 0.018237 2.563779 18 Rev Bond True 75.724624 74.369604 1.355019 10.672342 -9.155192 19.827534 270.732536
40 60 100 0.171876 5.189379 134 0.102707 110.240305 0.033763 0.009666 43.416751 37 Rev Bond True 107.093529 67.249450 39.844079 548.362942 311.279196 237.083746 4786.275850
41 60 100 0.102494 4.637645 90 0.036429 141.028735 0.028766 0.013430 20.219566 36 GO Bond False 63.772612 61.606001 2.166611 7.103389 -130.417635 137.521024 2851.539859
42 60 100 0.128699 4.224275 82 0.135245 132.576526 0.033269 0.009204 31.087547 39 Rev Bond True 63.339271 61.064241 2.275030 25.230374 -175.639567 200.869942 4121.478933
43 60 100 0.194011 4.119391 83 0.053824 137.502268 0.031359 0.008654 25.279767 17 Rev Bond True 65.402926 62.135104 3.267821 14.598542 -250.630628 265.229169 3476.025296
44 60 100 0.112179 5.082739 119 0.087633 101.590247 0.039705 -0.001666 57.001926 35 Paygo True 76.294739 61.645559 14.649179 152.766854 -30.252812 183.019666 5790.839697
45 60 100 0.185542 4.843869 143 0.169604 96.190623 0.034300 0.002675 18.339744 21 Paygo True 122.953743 87.847511 35.106232 851.444399 765.845948 85.598451 1764.111444
46 60 100 0.182904 3.879664 96 0.100206 113.617685 0.032018 -0.000695 9.336654 31 GO Bond False 69.366811 66.625117 2.741694 26.374431 -28.133075 54.507506 1060.809021
47 60 100 0.136037 5.422761 108 0.160056 97.768721 0.031707 0.018624 13.567899 15 GO Bond True 72.389640 66.214709 6.174931 106.740422 -2.377699 109.118121 1326.516148