Writing and Using a Bespoke Model Interface¶

[1]:

import emat
import os
import pandas as pd
import numpy as np
import gzip
import asyncio
from emat.util.show_dir import show_dir, show_file_contents

This notebook illustrates TMIP-EMAT’s various modes of operation. It shows how to use TMIP-EMAT and the demo interface to run the command line version of the Road Test model. A similar approach can be developed for any transportation model that can be run from the command line, including proprietary modeling tools that are typically run from a graphical user interface (GUI) but that also provide command line access.

In this example notebook, we will activate some logging features. The same logging utility is written directly into both EMAT and the core_files_demo.py module. This will give us a view of what’s happening inside the code as it runs.

[2]:

import logging
from emat.util.loggers import log_to_stderr
log = log_to_stderr(logging.INFO)

Connecting to the Model¶

The interface for this model is located in the core_files_demo.py module, which we will import into this notebook. This file is extensively documented in comments, and is a great starting point for new users who want to write an interface for a new bespoke travel demand model.

[3]:

import core_files_demo

Within this module, you will find a definition for the RoadTestFileModel class.

We initialize an instance of the model interface object. If you look at the module code, you’ll note the __init__ function does a number of things, including creating a temporary directory to work in, copying the needed files into this temporary directory, loading the scope, and creating a SQLite database to work within. For your implementation, you might or might not do any of these steps. In particular, you’ll probably want to use a database that is not in a temporary location, so that the results will be available after this notebook is closed.

[4]:

fx = core_files_demo.RoadTestFileModel()

[00:03.84] MainProcess/WARNING: changing cwd to /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc
[00:03.88] MainProcess/INFO: running script emat_db_init.sql
[00:03.89] MainProcess/INFO: running script meta_model.sql
[00:03.90] MainProcess/INFO: found no experiments with missing run_id's
[00:03.90] MainProcess/INFO: running script emat_db_init_views.sql

Once we have instantiated the RoadTestFileModel class, a number of files are available in the “master_directory”, which is the temporary directory created during initialization:

[5]:

show_dir(fx.master_directory.name)

tmpygcky5rc/
├── road-test-colleague.sqlitedb
├── road-test-demo.db
├── road-test-files/
│   ├── demo-inputs-l.yml
│   └── demo-inputs-x.yml.template
├── road-test-model-config.yml
└── road-test-scope.yml

Understanding Directories¶

The TMIP-EMAT interface design for files-based bespoke models uses pointers for several directories to control the operation of the model.

local_directory

This is the working directory for this instance of TMIP-EMAT, not the one for the core model itself. Typically it can be Python’s usual current working directory, accessible via os.getcwd(). This directory typically contains a TMIP-EMAT model configuration yaml file, a scope definition yaml file, and a sub-directory containing the files needed to run the core model itself.

model_path

The relative path from the local_directory to the directory where the core model files are located. When the core model itself is actually run, this directory should be the “current working directory” for that run. The model_path must be given in the model config yaml file.

rel_output_path

The relative path from the model_path to the directory where the core model output files are located. The default value of this path is “./Outputs” but this can be overridden by setting rel_output_path in the model config yaml file. If the outputs are commingled with other input files in the core model directory, this can be set to “.” (just a dot).

archive_path

The path where model archive directories can be found. This path must be given in the model config yaml file. It can be given as an absolute path or a relative path. If it is a relative path, it should be relative to the local_directory.

These directories, especially the ones other than the local_directory, are defined in a model configuration yaml file. This makes it easy to change the directory pointers when moving TMIP-EMAT between different machines that may have different file system structures.
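The relationships between these pointers can be sketched in a few lines of standard-library Python. This is only an illustration of the rules described above, not TMIP-EMAT's own code: the function name resolve_directories and the plain dict standing in for the parsed model config yaml file are assumptions made for this example.

```python
import os


def resolve_directories(local_directory, config):
    """Resolve the directory pointers described above into absolute paths.

    `config` is a plain dict standing in for the parsed model config yaml
    file (a hypothetical stand-in, not a TMIP-EMAT object).
    """
    local_directory = os.path.abspath(local_directory)
    # model_path is required and is relative to local_directory.
    model_dir = os.path.normpath(
        os.path.join(local_directory, config["model_path"])
    )
    # rel_output_path is optional, relative to model_path, default "./Outputs".
    output_dir = os.path.normpath(
        os.path.join(model_dir, config.get("rel_output_path", "./Outputs"))
    )
    # archive_path is required and may be absolute or relative to
    # local_directory; os.path.join keeps an absolute second argument as-is.
    archive_dir = os.path.normpath(
        os.path.join(local_directory, config["archive_path"])
    )
    return {"model": model_dir, "output": output_dir, "archive": archive_dir}


paths = resolve_directories(
    ".", {"model_path": "road-test-files", "archive_path": "archive"}
)
```

Keeping all of the joins in one place like this is one way to make a move between machines a matter of editing the config file only.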

Single Run Operation for Development and Debugging¶

Before we take on the task of running this model in exploratory mode, we’ll want to make sure that our interface code is working correctly. To check each of the components of the interface (setup, run, post-process, load-measures, and archive), we can run each individually in sequence, and inspect the results to make sure they are correct.
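As a rough mental model of how these components fit together, the sequence can be pictured as the loop body below. This is a simplified sketch and not TMIP-EMAT's actual implementation: StubModel and run_one are hypothetical names, and the stub methods only imitate the shape of a real interface.

```python
import math


class StubModel:
    """Hypothetical stand-in for a files-based model interface."""

    def setup(self, params):
        # A real interface would write input files here.
        self.params = dict(params)

    def run(self):
        # A real interface would launch the core model here.
        if self.params["input_flow"] < 0:
            raise RuntimeError("core model crashed")
        self.raw = self.params["input_flow"] * 2

    def post_process(self):
        self.processed = self.raw + 1

    def load_measures(self):
        return {"demo_measure": float(self.processed)}

    def archive(self, params):
        pass  # a real interface would copy output files here


def run_one(model, params, measure_names):
    """Call each interface step in order; return NaN measures on failure."""
    try:
        model.setup(params)
        model.run()
        model.post_process()
        measures = model.load_measures()
        model.archive(params)
    except Exception:
        return {name: math.nan for name in measure_names}
    return measures


good = run_one(StubModel(), {"input_flow": 100}, ["demo_measure"])
bad = run_one(StubModel(), {"input_flow": -1}, ["demo_measure"])
```

Calling the steps one at a time, as done in the rest of this section, lets us inspect the state between each pair of calls.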

setup¶

This method is the place where the core model set up takes place, including creating or modifying files as necessary to prepare for a core model run. When running experiments, this method is called once for each core model experiment, where each experiment is defined by a set of particular values for both the exogenous uncertainties and the policy levers. These values are passed to the experiment only here, and not in the run method itself. This facilitates debugging, as the setup method can be used without the run method, as we do here. This allows us to manually inspect the prepared files and ensure they are correct before actually running a potentially expensive model.

Each input exogenous uncertainty or policy lever can potentially be used to manipulate multiple different aspects of the underlying core model. For example, a policy lever that includes a number of discrete future network “build” options might trigger the replacement of multiple related network definition files. Or, a single uncertainty relating to the cost of fuel might scale both a parameter linked to the modeled per-mile cost of operating an automobile and the modeled total cost of fuel used by transit services.

For this demo model, running the core model itself in files mode requires two configuration files to be available, one for levers and another for uncertainties. These two files are provided in the demo in two ways: as a runnable base file (for the levers) and as a template file (for the uncertainties).

The levers file is a ready-to-use file (for this demo, in YAML format, although your model may use a different file format for input files). It has default values pre-coded into the file, and to modify this file for use by EMAT the setup method needs to parse and edit this file to swap out the default values for new ones in each experiment. This can be done using regular expressions (as in this demo), or any other method you like to edit the file appropriately. The advantage of this approach is that the base file is ready to use with the core model as-is, facilitating the use of this file outside the EMAT context.

[6]:

show_file_contents(fx.master_directory.name, 'road-test-files', 'demo-inputs-l.yml')

---
# This file defines lever values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
expand_capacity: 10
amortization_period: 30
interest_rate_lock: False
debt_type: GO Bond
lane_width: 10
mandatory_unused_lever: 42
...
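A minimal sketch of the regular-expression approach to editing such a file is shown here. It is not the code from core_files_demo.py; the helper name set_yaml_value is an assumption for illustration, and the sketch only handles simple top-level "key: value" lines like those in this file.

```python
import re


def set_yaml_value(text, key, value):
    """Replace the value on the single top-level `key: value` line."""
    pattern = re.compile(
        rf"^({re.escape(key)}:[ \t]*).*$", flags=re.MULTILINE
    )
    # A function replacement avoids surprises if `value` contains backslashes.
    new_text, n = pattern.subn(lambda m: m.group(1) + str(value), text)
    if n != 1:
        raise ValueError(f"expected exactly one line for {key!r}, found {n}")
    return new_text


base = """---
expand_capacity: 10
amortization_period: 30
interest_rate_lock: False
debt_type: GO Bond
...
"""

edited = base
for key, value in {"expand_capacity": 75, "debt_type": "Paygo"}.items():
    edited = set_yaml_value(edited, key, value)
```

Levers that are not passed in (here amortization_period and interest_rate_lock) simply keep the defaults already written in the base file.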

By contrast, the uncertainties file is in a template format. The values of the parameters that EMAT will manipulate for each experiment are not given as default values. Instead, each value to be set is marked in the file by a unique token that is easy to search for and replace, and that will definitely not appear anywhere else in the file. This approach makes the text-substitution code used in this module much simpler and less prone to bugs. The downside is small: the template file is unusable outside the EMAT context, and every unique token must be replaced in every experiment.

[7]:

show_file_contents(fx.master_directory.name, 'road-test-files', 'demo-inputs-x.yml.template')

---
# This file defines uncertainty values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
alpha: __EMAT_PROVIDES_VALUE__ALPHA__
beta: __EMAT_PROVIDES_VALUE__BETA__
input_flow: __EMAT_PROVIDES_VALUE__INPUT_FLOW__
value_of_time: __EMAT_PROVIDES_VALUE__VALUE_OF_TIME__
labor_unit_cost_expansion: __EMAT_PROVIDES_VALUE__LABOR_UNIT_COST_EXPANSION__
materials_unit_cost_expansion: __EMAT_PROVIDES_VALUE__MATERIALS_UNIT_COST_EXPANSION__
interest_rate: __EMAT_PROVIDES_VALUE__INTEREST_RATE__
yield_curve: __EMAT_PROVIDES_VALUE__YIELD_CURVE__
...
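Token substitution of this kind can be sketched as follows. The token format matches the template shown above, but the function fill_template and its leftover-token check are assumptions for illustration rather than the demo module's actual code.

```python
import re

TOKEN_PREFIX = "__EMAT_PROVIDES_VALUE__"


def fill_template(template, values):
    """Substitute each token and fail if any token is left unreplaced."""
    text = template
    for name, value in values.items():
        token = f"{TOKEN_PREFIX}{name.upper()}__"
        text = text.replace(token, str(value))
    # Any surviving token would make the written file unusable.
    leftover = re.findall(re.escape(TOKEN_PREFIX) + r"[A-Z0-9_]+", text)
    if leftover:
        raise ValueError(f"unreplaced tokens: {leftover}")
    return text


template = (
    "alpha: __EMAT_PROVIDES_VALUE__ALPHA__\n"
    "beta: __EMAT_PROVIDES_VALUE__BETA__\n"
)
filled = fill_template(template, {"alpha": 0.1234, "beta": 4.0})
```

Checking for leftover tokens before writing the file turns a silent bad input into an immediate error.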

Regardless of which file management system you use, the setup method is the place to make edits to these input files and write them into your working directory. To do so, the setup method takes one argument: a dictionary containing key-value pairs that assign a particular value to each input (exogenous uncertainty or policy lever) that is defined in the model scope. The keys must match exactly with the names of the parameters given in the scope.

If you have written your setup method to call the super-class setup, you will find that if you give keys as input that are not defined in the scope, you’ll get a KeyError.

[8]:

bad_params = {
    'name_not_in_scope': 'is_a_problem',
}

try:
    fx.setup(bad_params)
except KeyError as error:
    log.error(repr(error))

[00:03.94] MainProcess/ERROR: SETUP ERROR: 'name_not_in_scope' not found in scope parameters
[00:03.94] MainProcess/ERROR: KeyError("'name_not_in_scope' not found in scope parameters")

On the other hand, your custom model may or may not allow you to leave out some parameters. It is up to you to decide how to handle missing values, either by setting them at their default values or raising an error. In normal operation, parameters typically won’t be left out from the design of experiments, so it is not usually important to monitor this carefully.

In our example module’s setup, all of the uncertainty values must be given, because the template file would be unusable otherwise. But the policy levers can be omitted, and if so they are left at their default values in the original file. Note that the default values in that file are not strictly consistent with the default values in the scope file, and TMIP-EMAT does nothing on its own to address this discrepancy.
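One possible policy for unknown and missing parameters, mirroring the behavior described above, is sketched here. The function prepare_params and its arguments are hypothetical; your own setup method may handle these cases differently.

```python
def prepare_params(params, lever_defaults, uncertainty_names):
    """Validate names, require all uncertainties, fill in lever defaults."""
    known = set(lever_defaults) | set(uncertainty_names)
    for name in params:
        if name not in known:
            raise KeyError(f"'{name}' not found in scope parameters")
    missing = [n for n in uncertainty_names if n not in params]
    if missing:
        # A template file cannot be completed without these values.
        raise ValueError(f"missing uncertainty values: {missing}")
    # Levers that were not given keep their default values.
    return {**lever_defaults, **params}


lever_defaults = {"expand_capacity": 10, "interest_rate_lock": False}
uncertainty_names = ["alpha", "beta"]
full = prepare_params(
    {"alpha": 0.1, "beta": 4.0, "expand_capacity": 75},
    lever_defaults,
    uncertainty_names,
)
```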

[9]:

params = {
    'expand_capacity': 75,
    'amortization_period': 25,
    'debt_type': "Paygo",
    'alpha': 0.1234,
    'beta': 4.0,
    'input_flow': 100,
    'value_of_time': 0.075,
    'unit_cost_expansion': 100,
    'interest_rate': 0.035,
    'yield_curve': 0.01,
} # interest_rate_lock is missing, that's ok

fx.setup(params)

[00:03.97] MainProcess/INFO: RoadTestFileModel SETUP RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122
[00:03.97] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 1 RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122

After running setup successfully, we will have overwritten the “demo-inputs-l.yml” file with new values, and written a new “demo-inputs-x.yml” file into the model working directory with those values.

[10]:

show_dir(fx.local_directory)

tmpygcky5rc/
├── _emat_experiment_id_.yml
├── _emat_parameters_.yml
├── archive/
│   └── scp_EMAT Road Test/
│       └── exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122/
│           └── _emat_start_.log
├── road-test-colleague.sqlitedb
├── road-test-demo.db
├── road-test-files/
│   ├── demo-inputs-l.yml
│   ├── demo-inputs-x.yml
│   └── demo-inputs-x.yml.template
├── road-test-model-config.yml
└── road-test-scope.yml

[11]:

show_file_contents(fx.local_directory, 'road-test-files', 'demo-inputs-l.yml')

---
# This file defines lever values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
expand_capacity: 75
amortization_period: 25
interest_rate_lock: False
debt_type: Paygo
lane_width: 10
mandatory_unused_lever: 42
...

[12]:

show_file_contents(fx.local_directory, 'road-test-files', 'demo-inputs-x.yml')

---
# This file defines uncertainty values for the files-based
# Road Test example.  It is intentionally a complex way
# to implement this Python-based model, designed to
# demonstrate how to use a files-based model called
# from the command line.
alpha: 0.1234
beta: 4.0
input_flow: 100
value_of_time: 0.075
labor_unit_cost_expansion: 60.0
materials_unit_cost_expansion: 40.0
interest_rate: 0.035
yield_curve: 0.01
...
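Note that the single unit_cost_expansion input of 100 appears in the written file as two values, 60.0 for labor and 40.0 for materials. This is an instance of one scope parameter driving several core model inputs. A fixed 60/40 split is consistent with the output above, but the actual rule lives in core_files_demo.py and is not shown here, so the sketch below (including the name split_unit_cost and the labor_share default) is an assumption.

```python
def split_unit_cost(unit_cost_expansion, labor_share=0.6):
    """Split one scope parameter into two core model inputs.

    The 60/40 split is an assumed rule, chosen only because it
    reproduces the values seen in the file above.
    """
    labor = unit_cost_expansion * labor_share
    materials = unit_cost_expansion - labor
    return {
        "labor_unit_cost_expansion": labor,
        "materials_unit_cost_expansion": materials,
    }


parts = split_unit_cost(100)
```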

run¶

The run method is the place where the core model run takes place. Note that this method takes no arguments; all the input exogenous uncertainties and policy levers are delivered to the core model in the setup method, which will be executed prior to calling this method. This facilitates debugging, as the setup method can be used without the run method as we did above, allowing us to manually inspect the prepared files and ensure they are correct before actually running a potentially expensive model.

[13]:

fx.run()

[00:06.49] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122

The RoadTestFileModel class includes a custom last_run_logs method, which displays both the “stdout” and “stderr” logs generated by the model executable during the most recent call to the run method. We can use this method for debugging purposes, to identify why the core model crashes (if it does crash). In this first test it did not crash, and the logs look good.

[14]:

fx.last_run_logs()

=== STDOUT ===
[2021-01-31 18:59:50,158] emat.RoadTest.INFO: running emat-road-test-demo
[2021-01-31 18:59:50,164] emat.RoadTest.INFO: emat-road-test-demo completed without errors

=== END OF LOG ===
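For an interface you write yourself, capturing both streams from a command-line model can be done with the standard subprocess module. The sketch below is a generic illustration, not the RoadTestFileModel implementation; run_command is a hypothetical helper, and a short Python one-liner stands in for the model executable.

```python
import subprocess
import sys


def run_command(cmd, cwd=None):
    """Run a command-line model, returning (ok, stdout, stderr)."""
    try:
        done = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True, check=True
        )
    except subprocess.CalledProcessError as err:
        # A non-zero exit status lands here; the captured streams
        # are still available for writing to a log.
        return False, err.stdout, err.stderr
    return True, done.stdout, done.stderr


# Stand-ins for a model run that succeeds and one that crashes.
ok, out, err = run_command([sys.executable, "-c", "print('model finished')"])
ok2, out2, err2 = run_command(
    [sys.executable, "-c", "import sys; sys.exit('model crashed')"]
)
```

Keeping the captured text around, rather than letting it scroll past in a console, is what makes a method like last_run_logs possible.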

post-process¶

There is an (optional) post_process step that is separate from the run step.

Post-processing differs from the main model run in two important ways:

It can be run to efficiently generate a subset of performance measures.
It can be run on archived results from a main core model run.

Both features are designed to support workflows where new performance measures are added to the exploratory scope after the main model run(s) are completed. By allowing the post_process method to be run only for a subset of measures, we can avoid replicating possibly expensive post-processing steps when we have already completed them, or when they are not needed for a particular application.

For example, consider an exploratory modeling activity where the scope at the time of the initial experiments was focused on highway measures. Transit usage was not explored extensively, and no network assignment was done for transit trips when the experiments were initially run. By creating a post-process step to run the transit network assignment, we can apply that step to existing archived results and have it run automatically for future model experiments where transit usage is under study, while continuing to omit it for future model experiments where we do not need it.

An optional measure_names argument allows the post-processor to identify which measures need additional computational effort to generate, and to skip excluded measures that are not currently of interest, or which have already been computed and do not need to be computed again.

The post-processing step is isolated from the main model run so that it can be run later using archived model results. When executed directly after a core model run, it will operate on the results of the model stored in the local working directory. However, it can also be used with an optional output_path argument, which can be pointed at a model archive directory instead of the local working directory.

A consequence of this (and an intentional limitation) is that the post_process method should only use files from the set of files that are or will be archived from the core model run, and not attempt to use other non-persistent temporary or intermediate files that will not be archived.
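The two features can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the post_process method of the demo module: the file names raw_times.csv and time_savings.csv are hypothetical, and the function here takes output_path as a required argument for simplicity.

```python
import csv
import os
import tempfile


def post_process(output_path, measure_names=None):
    """Derive time_savings from raw results, but only when requested.

    Returns True if the step ran and False if it was skipped.
    """
    if measure_names is not None and "time_savings" not in measure_names:
        return False  # nothing requested that needs this step
    # Read only files that live in (or were archived from) output_path.
    with open(os.path.join(output_path, "raw_times.csv"), newline="") as f:
        row = next(csv.DictReader(f))
    savings = float(row["no_build_travel_time"]) - float(row["build_travel_time"])
    with open(os.path.join(output_path, "time_savings.csv"), "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["time_savings"])
        writer.writerow([savings])
    return True


# The temporary directory stands in for either the working output
# directory or an archive directory; the function does not care which.
with tempfile.TemporaryDirectory() as archive_dir:
    with open(os.path.join(archive_dir, "raw_times.csv"), "w", newline="") as f:
        f.write("no_build_travel_time,build_travel_time\n67.404,60.789\n")
    skipped = post_process(archive_dir, measure_names=["net_benefits"])
    ran = post_process(archive_dir, measure_names=["time_savings"])
    with open(os.path.join(archive_dir, "time_savings.csv")) as f:
        result_text = f.read()
```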

[15]:

fx.post_process()

[00:06.52] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-c7c8bb08-6428-11eb-8c2b-acde48001122

At this point, the model’s output performance measures should be available in one or more output files that can be read in the next step. For this example, the results are written to two separate files: ‘output_1.csv.gz’ and ‘output.yaml’.

[16]:

show_file_contents(fx.local_directory, 'road-test-files', "Outputs", "output.yaml")

build_travel_time: 60.78943107038733
no_build_travel_time: 67.404
time_savings: 6.614568929612666

Note that in this example, some of the values in the output_1.csv.gz file are intentionally manipulated in a contrived manner, so that there is some work for the post-processor to do.

[17]:

show_file_contents(fx.local_directory, 'road-test-files', "Outputs", "output_1.csv.gz")

,value_of_time_savings,present_cost_expansion,cost_of_capacity_expansion,net_benefits
exp,1.0508604102769246,,1.3651577800056909,
plain,49.60926697209499,7500.0,311.2700117047018,-261.6607447326066

load-measures¶

The load_measures method is the place to actually reach into files in the core model’s run results and extract performance measures, returning a dictionary of key-value pairs for the various performance measures. It takes an optional list giving a subset of performance measures to load, and like the post_process method also can be pointed at an archive location instead of loading measures from the local working directory (which is the default). The load_measures method should not do any post-processing of results (i.e. it should read from but not write to the model outputs directory).

[18]:

fx.load_measures()

[18]:

{'value_of_time_savings': 49.60926697209499,
 'present_cost_expansion': 7500.0,
 'cost_of_capacity_expansion': 311.2700117047018,
 'net_benefits': -261.6607447326066,
 'build_travel_time': 60.78943107038733,
 'no_build_travel_time': 67.404,
 'time_savings': 6.614568929612666}
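To make the idea concrete, here is a standalone sketch of reading one labelled row from a gzipped CSV file into a dictionary, with an optional subset of measures. It uses only the standard library and is not the mechanism TMIP-EMAT itself uses; the function name load_measures_from_csv is hypothetical, and in-memory bytes stand in for the contents of a file on disk.

```python
import csv
import gzip
import io


def load_measures_from_csv(raw_bytes, row_label, measure_names=None):
    """Read one labelled row of a gzipped CSV into a dict of floats."""
    text = gzip.decompress(raw_bytes).decode("utf-8")
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    for row in reader:
        if row[0] == row_label:
            found = dict(zip(header[1:], row[1:]))
            break
    else:
        raise KeyError(row_label)
    # Load everything by default, or only the requested subset.
    wanted = measure_names if measure_names is not None else list(found)
    return {name: float(found[name]) for name in wanted}


# Abbreviated stand-in for a file laid out like output_1.csv.gz.
demo = gzip.compress(
    b",value_of_time_savings,present_cost_expansion\n"
    b"exp,1.05,\n"
    b"plain,49.6,7500.0\n"
)
measures = load_measures_from_csv(demo, "plain")
subset = load_measures_from_csv(demo, "plain", ["present_cost_expansion"])
```

The function only reads its input and writes nothing, in keeping with the rule that loading measures should not modify the outputs directory.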

You may note that the implementation of RoadTestFileModel in the core_files_demo module does not actually include a load_measures method itself, but instead inherits this method from the FilesCoreModel superclass. The instructions on how to actually find the relevant performance measures for this file are instead loaded into table parsers, which are defined in the RoadTestFileModel.__init__ constructor. There are details and illustrations of how to write and use parsers in the file parsing examples page of the TMIP-EMAT documentation.

archive¶

The archive method copies the relevant model output files to an archive location for longer term storage. The particular archive location is based on the experiment id for a particular experiment, and can be customized if desired by overriding the get_experiment_archive_path method. This customization is not done in this demo, so the default location is used.

[19]:

fx.get_experiment_archive_path(parameters=params)

[19]:

'/var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122'

Actually running the archive method should copy any relevant output files from the model_path of the current active model into a subdirectory of archive_path.

[20]:

fx.archive(params)

[00:06.55] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122

[21]:

show_dir(fx.local_directory)

tmpygcky5rc/
├── _emat_experiment_id_.yml
├── _emat_parameters_.yml
├── archive/
│   └── scp_EMAT Road Test/
│       └── exp_001_c7c8bb08-6428-11eb-8c2b-acde48001122/
│           ├── _emat_start_.log
│           ├── demo-inputs-l.yml
│           ├── demo-inputs-x.yml
│           ├── demo-inputs-x.yml.template
│           ├── emat-road-test.log
│           ├── output.csv.gz
│           ├── output.yaml
│           └── Outputs/
│               ├── output.yaml
│               └── output_1.csv.gz
├── road-test-colleague.sqlitedb
├── road-test-demo.db
├── road-test-files/
│   ├── demo-inputs-l.yml
│   ├── demo-inputs-x.yml
│   ├── demo-inputs-x.yml.template
│   ├── emat-road-test.log
│   ├── output.csv.gz
│   ├── output.yaml
│   └── Outputs/
│       ├── output.yaml
│       └── output_1.csv.gz
├── road-test-model-config.yml
└── road-test-scope.yml

It is permissible, but not required, to simply copy the entire contents of the model_path directory into the experiment’s archive directory, as is done in this example. However, if the current active model directory has a lot of boilerplate files that don’t change with the inputs, or if it becomes full of intermediate or temporary files that definitely will never be used to compute performance measures, it can be advisable to selectively copy only relevant files. In that case, those files and whatever related sub-directory tree structure exists in the current active model should be replicated within the experiment’s archive directory.
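A selective copy that preserves the sub-directory structure might look like the sketch below. The helper archive_selected and the glob patterns are assumptions for illustration; a real archive method would choose patterns that match the files its post_process and load_measures steps actually read.

```python
import fnmatch
import os
import shutil
import tempfile


def archive_selected(model_dir, archive_dir, patterns):
    """Copy files matching any glob pattern, keeping relative sub-directories."""
    copied = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            source = os.path.join(root, name)
            # Compare against a forward-slash relative path on every platform.
            rel = os.path.relpath(source, model_dir).replace(os.sep, "/")
            if any(fnmatch.fnmatch(rel, p) for p in patterns):
                target = os.path.join(archive_dir, *rel.split("/"))
                os.makedirs(os.path.dirname(target), exist_ok=True)
                shutil.copy2(source, target)
                copied.append(rel)
    return sorted(copied)


with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, "road-test-files")
    os.makedirs(os.path.join(model_dir, "Outputs"))
    for rel in ["demo-inputs-l.yml", "scratch.tmp", "Outputs/output.yaml"]:
        with open(os.path.join(model_dir, *rel.split("/")), "w") as f:
            f.write("demo\n")
    copied = archive_selected(
        model_dir, os.path.join(tmp, "archive"), ["*.yml", "Outputs/*"]
    )
    scratch_archived = os.path.exists(os.path.join(tmp, "archive", "scratch.tmp"))
```

Here the scratch file is left behind while the input file and everything under Outputs are copied with their relative paths intact.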

Normal Operation for Running Multiple Experiments¶

For this demo, we’ll create a design of experiments with only 8 experiments. The design_experiments method of the RoadTestFileModel object is not defined in the custom core_files_demo module written for this model, but rather is a generic function provided by the TMIP-EMAT main library. Real applications will typically use a larger number of experiments, but this small number is sufficient to demonstrate the operation of the tools.

[22]:

design1 = fx.design_experiments(design_name='lhs_1', n_samples=8)
design1

[22]:

	alpha	amortization_period	beta	debt_type	expand_capacity	input_flow	interest_rate	interest_rate_lock	unit_cost_expansion	value_of_time	yield_curve	free_flow_time	initial_capacity
experiment
2	0.134750	36	4.642370	Rev Bond	24.901018	85	0.039219	False	143.519760	0.017005	0.003280	60	100
3	0.115907	50	5.242315	Rev Bond	11.985022	114	0.025997	True	107.836739	0.145378	0.009659	60	100
4	0.178456	30	3.510139	Paygo	72.399552	121	0.029066	True	127.838010	0.067821	-0.000839	60	100
5	0.110023	44	4.887030	Paygo	28.565637	127	0.034673	True	125.799829	0.083593	0.002714	60	100
6	0.161977	24	3.865644	GO Bond	82.811459	135	0.028634	True	118.989447	0.054968	0.006745	60	100
7	0.173449	21	4.094118	Paygo	43.476172	142	0.033847	False	133.480778	0.184904	0.017674	60	100
8	0.141973	19	5.331978	GO Bond	52.445940	98	0.031519	False	99.317350	0.115228	0.014752	60	100
9	0.193762	40	4.453382	Rev Bond	92.278968	95	0.037907	False	103.543415	0.092517	0.014360	60	100

The run_experiments command will automatically run the model once for each experiment in the named design. The demo command line version of the road test model is (intentionally) a little bit slow, so it will take a few seconds to conduct these eight model experiment runs.

[23]:

fx.run_experiments(design_name='lhs_1')

[00:06.66] MainProcess/INFO: performing 8 scenarios/policies * 1 model(s) = 8 experiments
[00:06.67] MainProcess/INFO: performing experiments sequentially
[00:06.67] MainProcess/INFO: RoadTestFileModel SETUP RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:06.68] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 2 RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.14] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.15] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.16] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_002_c9660fe2-6428-11eb-8c2b-acde48001122
[00:09.17] MainProcess/INFO: RAN EXPERIMENT IN 2.50 SECONDS
[00:09.17] MainProcess/INFO: 1 cases completed
[00:09.18] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:09.18] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 3 RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.56] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.57] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.58] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_003_cae3c1f2-6428-11eb-8c2b-acde48001122
[00:11.59] MainProcess/INFO: RAN EXPERIMENT IN 2.42 SECONDS
[00:11.59] MainProcess/INFO: 2 cases completed
[00:11.60] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:11.60] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 4 RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.00] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.01] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.02] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_004_cc552f1c-6428-11eb-8c2b-acde48001122
[00:14.03] MainProcess/INFO: RAN EXPERIMENT IN 2.44 SECONDS
[00:14.03] MainProcess/INFO: 3 cases completed
[00:14.04] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:14.04] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 5 RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.46] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.47] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.48] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_005_cdc944c8-6428-11eb-8c2b-acde48001122
[00:16.48] MainProcess/INFO: RAN EXPERIMENT IN 2.45 SECONDS
[00:16.48] MainProcess/INFO: 4 cases completed
[00:16.49] MainProcess/INFO: RoadTestFileModel SETUP RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:16.50] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 6 RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:18.99] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:19.00] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-cf3fd362-6428-11eb-8c2b-acde48001122
[00:19.01] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_006_cf3fd362-6428-11eb-8c2b-acde48001122
[00:19.02] MainProcess/INFO: RAN EXPERIMENT IN 2.53 SECONDS
[00:19.02] MainProcess/INFO: 5 cases completed
[00:19.02] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:19.03] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 7 RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:21.37] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:21.39] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d0c23324-6428-11eb-8c2b-acde48001122
[00:21.40] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_007_d0c23324-6428-11eb-8c2b-acde48001122
[00:21.40] MainProcess/INFO: RAN EXPERIMENT IN 2.38 SECONDS
[00:21.40] MainProcess/INFO: 6 cases completed
[00:21.41] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:21.42] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 8 RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:23.88] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:23.89] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d22e7272-6428-11eb-8c2b-acde48001122
[00:23.91] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_008_d22e7272-6428-11eb-8c2b-acde48001122
[00:23.91] MainProcess/INFO: RAN EXPERIMENT IN 2.51 SECONDS
[00:23.91] MainProcess/INFO: 7 cases completed
[00:23.92] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d3ad349e-6428-11eb-8c2b-acde48001122
[00:23.93] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 9 RUNID-d3ad349e-6428-11eb-8c2b-acde48001122
[00:26.40] MainProcess/ERROR: ERROR in run_core_model run 9: Command '['emat-road-test-demo', '--uncs', 'demo-inputs-x.yml', '--levers', 'demo-inputs-l.yml']' returned non-zero exit status 247.
[00:26.40] MainProcess/ERROR: run_core_model ABORT 9
[00:26.40] MainProcess/INFO: RAN EXPERIMENT IN 2.49 SECONDS
[00:26.41] MainProcess/INFO: 8 cases completed
[00:26.41] MainProcess/INFO: experiments finished

[23]:

	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock	free_flow_time	initial_capacity	no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment
2	0.134750	4.642370	85	0.017005	143.519760	0.039219	0.003280	24.901018	36	Rev Bond	False	60	100	63.802031	61.354318	2.447713	3.537924	-175.134011	178.671935	3573.788162
3	0.115907	5.242315	114	0.145378	107.836739	0.025997	0.009659	11.985022	50	Rev Bond	True	60	100	73.822149	67.635961	6.186188	102.524606	42.361931	60.162675	1292.425695
4	0.178456	3.510139	121	0.067821	127.838010	0.029066	-0.000839	72.399552	30	Paygo	True	60	100	80.905959	63.090263	17.815696	146.201784	-183.229667	329.431451	9255.414627
5	0.110023	4.887030	127	0.083593	125.799829	0.034673	0.002714	28.565637	44	Paygo	True	60	100	81.228935	66.217725	15.011210	159.363387	61.567385	97.796001	3593.552248
6	0.161977	3.865644	135	0.054968	118.989447	0.028634	0.006745	82.811459	24	GO Bond	True	60	100	91.004725	63.010343	27.994382	207.737074	-374.545620	582.282695	9853.689681
7	0.173449	4.094118	142	0.184904	133.480778	0.033847	0.017674	43.476172	21	Paygo	False	60	100	103.733064	69.975507	33.757557	886.351136	604.765873	281.585263	5803.233337
8	0.141973	5.331978	98	0.115228	99.317350	0.031519	0.014752	52.445940	19	GO Bond	False	60	100	67.648456	60.807614	6.840842	77.249238	-282.042005	359.291243	5208.791736
9	0.193762	4.453382	95	0.092517	103.543415	0.037907	0.014360	92.278968	40	Rev Bond	False	60	100	NaN	NaN	NaN	NaN	NaN	NaN	NaN

Re-running Failed Experiments¶

If you pay attention to the logged output, you might notice that one of the experiments (the last one) failed. We can see NaN values in the outputs.

[24]:

results = fx.db.read_experiment_all(fx.scope, 'lhs_1')
results

[24]:

	free_flow_time	initial_capacity	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock	no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment
2	60	100	0.134750	4.642370	85	0.017005	143.519760	0.039219	0.003280	24.901018	36	Rev Bond	False	63.802031	61.354318	2.447713	3.537924	-175.134011	178.671935	3573.788162
3	60	100	0.115907	5.242315	114	0.145378	107.836739	0.025997	0.009659	11.985022	50	Rev Bond	True	73.822149	67.635961	6.186188	102.524606	42.361931	60.162675	1292.425695
4	60	100	0.178456	3.510139	121	0.067821	127.838010	0.029066	-0.000839	72.399552	30	Paygo	True	80.905959	63.090263	17.815696	146.201784	-183.229667	329.431451	9255.414627
5	60	100	0.110023	4.887030	127	0.083593	125.799829	0.034673	0.002714	28.565637	44	Paygo	True	81.228935	66.217725	15.011210	159.363387	61.567385	97.796001	3593.552248
6	60	100	0.161977	3.865644	135	0.054968	118.989447	0.028634	0.006745	82.811459	24	GO Bond	True	91.004725	63.010343	27.994382	207.737074	-374.545620	582.282695	9853.689681
7	60	100	0.173449	4.094118	142	0.184904	133.480778	0.033847	0.017674	43.476172	21	Paygo	False	103.733064	69.975507	33.757557	886.351136	604.765873	281.585263	5803.233337
8	60	100	0.141973	5.331978	98	0.115228	99.317350	0.031519	0.014752	52.445940	19	GO Bond	False	67.648456	60.807614	6.840842	77.249238	-282.042005	359.291243	5208.791736
9	60	100	0.193762	4.453382	95	0.092517	103.543415	0.037907	0.014360	92.278968	40	Rev Bond	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN

We can collect the IDs of the failed experiments programmatically. To collect all the experiments that are missing any performance measure output, we can do this:

[25]:

fails = results.isna().any(axis=1)
failed_experiment_ids = fails.index[fails]
failed_experiment_ids

[25]:

Int64Index([9], dtype='int64', name='experiment')
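The check above flags a row if any column is missing, inputs included. As a minimal sketch on a toy table (the column names and values here are illustrative stand-ins, not the real results), the same idea can be restricted to the performance measure columns only, so that a missing input value is not mistaken for a failed run:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the results table: one input column, one measure column.
toy = pd.DataFrame(
    {
        "input_flow": [85, 114, 95],
        "net_benefits": [-175.1, 42.4, np.nan],
    },
    index=pd.Index([2, 3, 9], name="experiment"),
)

# Hypothetical list of measure columns; in practice this would come from the scope.
measure_columns = ["net_benefits"]

# A run is treated as failed if any of its measure columns is NaN.
fails = toy[measure_columns].isna().any(axis=1)
failed_ids = list(fails.index[fails])
print(failed_ids)
```

On this toy table only experiment 9 is flagged, mirroring the real output above.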

When there is an error (thrown as a subprocess.CalledProcessError) during the execution of a FilesCoreModel, the output from stdout and stderr is written to log files in the archive location, instead of the legitimate model outputs being written there.

We can see the log output by reading in the log file, like this:

[26]:

error_log = os.path.join(
    fx.get_experiment_archive_path(9),
    'error.stdout.log'
)
with open(error_log, 'r') as stdout:
    error_log_content = stdout.read()

print(error_log_content)

[2021-01-31 19:00:10,065] emat.RoadTest.INFO: running emat-road-test-demo
[2021-01-31 19:00:10,068] emat.RoadTest.ERROR: Random crash, ha ha!

Here we see the log file is explicitly taunting us about randomly crashing the model run. That’s fine: we wanted to crash the execution randomly to show what to do in this event, because it happens sometimes. Maybe a disk filled up, or there is an intermittent license problem that causes a failure once in a while. If that’s the case and we can fix it just by re-running, awesome!
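Since both stdout and stderr are captured, it can be handy to read every log file in an archive folder at once. The helper below is a small sketch using only the standard library; `read_logs` is a hypothetical name, not part of TMIP-EMAT, and it globs for `*.log` rather than assuming any file name other than the `error.stdout.log` shown above.

```python
import glob
import os
import tempfile


def read_logs(directory):
    """Return {file name: text} for every *.log file directly inside `directory`."""
    logs = {}
    # glob.escape guards against special characters in the directory name.
    pattern = os.path.join(glob.escape(directory), "*.log")
    for path in sorted(glob.glob(pattern)):
        with open(path, "r") as f:
            logs[os.path.basename(path)] = f.read()
    return logs


# Demonstration on a throwaway directory containing one fake log file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "error.stdout.log"), "w") as f:
    f.write("Random crash, ha ha!\n")

logs = read_logs(demo_dir)
for name, text in logs.items():
    print(f"--- {name} ---")
    print(text)
```

With the model interface from this notebook, the same helper could be pointed at `fx.get_experiment_archive_path(9)`.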

We can load just the failed experiments to try them again.

[27]:

failed_experiments = fx.read_experiment_parameters(experiment_ids=failed_experiment_ids)
failed_experiments

[27]:

	free_flow_time	initial_capacity	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock
experiment
9	60	100	0.193762	4.453382	95	0.092517	103.543415	0.037907	0.01436	92.278968	40	Rev Bond	False

Normally, there is a “short circuit” process that prevents re-running a core model experiment. Instead, the performance measure results are simply loaded from the database, which is typically much faster than actually running the core model. But if the performance measures stored in the database are junk, we do not want to trigger the short circuit system; we want to actually run the full core model again. To do so, we can disable the short circuit by setting allow_short_circuit=False and re-run the failed experiment. If it failed because of a transient error, e.g. a disk space problem that’s been fixed, then perhaps we can simply re-run the model and it will work.

[28]:

fx.run_experiments(failed_experiments, allow_short_circuit=False)

[00:26.53] MainProcess/INFO: performing 1 scenarios/policies * 1 model(s) = 1 experiments
[00:26.54] MainProcess/INFO: performing experiments sequentially
[00:26.54] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:26.55] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 9 RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:28.93] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:28.94] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d53da118-6428-11eb-8c2b-acde48001122
[00:28.95] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmpygcky5rc/archive/scp_EMAT Road Test/exp_009_d53da118-6428-11eb-8c2b-acde48001122
[00:28.96] MainProcess/INFO: RAN EXPERIMENT IN 2.42 SECONDS
[00:28.96] MainProcess/INFO: 1 cases completed
[00:28.96] MainProcess/INFO: experiments finished

[28]:

	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock	free_flow_time	initial_capacity	no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment
9	0.193762	4.453382	95	0.092517	103.543415	0.037907	0.01436	92.278968	40	Rev Bond	False	60	100	69.251572	60.503221	8.748351	76.890662	-385.524839	462.4155	9554.87952

Much better! Now we can see we have a more complete set of outputs, without the NaN’s. Hooray!

[29]:

results = fx.db.read_experiment_all(scope_name=fx.scope.name, design_name='lhs_1')
results

[29]:

	free_flow_time	initial_capacity	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock	no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment
2	60	100	0.134750	4.642370	85	0.017005	143.519760	0.039219	0.003280	24.901018	36	Rev Bond	False	63.802031	61.354318	2.447713	3.537924	-175.134011	178.671935	3573.788162
3	60	100	0.115907	5.242315	114	0.145378	107.836739	0.025997	0.009659	11.985022	50	Rev Bond	True	73.822149	67.635961	6.186188	102.524606	42.361931	60.162675	1292.425695
4	60	100	0.178456	3.510139	121	0.067821	127.838010	0.029066	-0.000839	72.399552	30	Paygo	True	80.905959	63.090263	17.815696	146.201784	-183.229667	329.431451	9255.414627
5	60	100	0.110023	4.887030	127	0.083593	125.799829	0.034673	0.002714	28.565637	44	Paygo	True	81.228935	66.217725	15.011210	159.363387	61.567385	97.796001	3593.552248
6	60	100	0.161977	3.865644	135	0.054968	118.989447	0.028634	0.006745	82.811459	24	GO Bond	True	91.004725	63.010343	27.994382	207.737074	-374.545620	582.282695	9853.689681
7	60	100	0.173449	4.094118	142	0.184904	133.480778	0.033847	0.017674	43.476172	21	Paygo	False	103.733064	69.975507	33.757557	886.351136	604.765873	281.585263	5803.233337
8	60	100	0.141973	5.331978	98	0.115228	99.317350	0.031519	0.014752	52.445940	19	GO Bond	False	67.648456	60.807614	6.840842	77.249238	-282.042005	359.291243	5208.791736
9	60	100	0.193762	4.453382	95	0.092517	103.543415	0.037907	0.014360	92.278968	40	Rev Bond	False	69.251572	60.503221	8.748351	76.890662	-385.524839	462.415500	9554.879520
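If transient failures are common, the re-run step can be wrapped in a bounded retry loop so that a persistent failure does not loop forever. The sketch below is generic and does not call TMIP-EMAT: `rerun_until_complete` and `run_once` are hypothetical names, and `flaky_runner` is a stand-in that fails once and then succeeds.

```python
def rerun_until_complete(run_once, failed_ids, max_attempts=3):
    """Call `run_once` on the failed ids until none remain or attempts run out.

    `run_once` is a hypothetical callable: it takes a collection of experiment
    ids, re-runs them, and returns the ids that failed again.
    """
    remaining = set(failed_ids)
    attempts = 0
    while remaining and attempts < max_attempts:
        remaining = set(run_once(remaining))
        attempts += 1
    return remaining, attempts


# Stand-in runner for illustration: experiment 9 fails once, then succeeds.
calls = []


def flaky_runner(ids):
    calls.append(sorted(ids))
    return {9} if len(calls) == 1 else set()


still_failing, attempts_used = rerun_until_complete(flaky_runner, [9])
print(still_failing, attempts_used)
```

In practice, `run_once` would wrap a call like `fx.run_experiments(..., allow_short_circuit=False)` followed by the NaN check shown earlier. Anything still listed in `still_failing` after the last attempt probably needs a look at its log files rather than another retry.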

Multiprocessing for Running Multiple Experiments¶

The examples above are all single-process demonstrations of using TMIP-EMAT to run core model experiments. If your core model itself is multi-threaded or otherwise is designed to make full use of your multi-core CPU, or if a single core model run will otherwise max out some computational resource (e.g. RAM, disk space) then single process operation should be sufficient.

If, on the other hand, your core model is such that you can run multiple independent instances of the model side-by-side on the same machine, then you could benefit from a multiprocessing approach. This can be accomplished by splitting a design of experiments over several processes that you start manually, or by using an automatic multiprocessing library such as dask.distributed.

Running a Subset of Experiments Manually¶

Suppose, for example, you wanted to distribute the workload of running experiments over several processes, or even over several computers. If each process has file system access to the same TMIP-EMAT database of experiments, we can orchestrate these experiments in parallel by manually splitting up the processes.

To begin with, we’ll have one process create a complete design of experiments, and save it to the database (which happens automatically here).

[30]:

design2 = fx.design_experiments(design_name='lhs_2', n_samples=8, random_seed=42)

Then, we can set up a copy of the same model in a different process, even on a different machine, as long as we point back to the same original database file. This implies the different process has access to the file system where the original file is stored. It is valuable to read and write to the same database file, not just a copy of the file, as this will obviate the need to sync the experimental data manually afterwards. In this demo, we’ll just create a new directory to work in, but we’ll point to the database in the original directory. Instead of allowing our model to implicitly create a new database file in the new directory, we’ll instantiate a SQLiteDB object pointing to the original database.

[31]:

database_filename = fx.db.database_path
db2 = emat.SQLiteDB(database_filename)

[00:29.10] MainProcess/INFO: running script emat_db_init.sql
[00:29.10] MainProcess/INFO: running script meta_model.sql
[00:29.11] MainProcess/INFO: found no experiments with missing run_id's
[00:29.11] MainProcess/INFO: running script emat_db_init_views.sql

Now, db2 is an emat.SQLiteDB object, which wraps a new connection to the original database. Next, we’ll pass that db2 explicitly to the new RoadTestFileModel constructor, which will create a complete copy of our model (other than the database) in a new directory.

[32]:

fx2 = core_files_demo.RoadTestFileModel(db=db2)

[00:29.12] MainProcess/WARNING: changing cwd to /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon
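The value of sharing one database file, rather than working on a copy, can be illustrated with the standard library’s sqlite3 module. This is only a sketch of the underlying idea: the `results` table and its columns are hypothetical and are not TMIP-EMAT’s actual schema.

```python
import os
import sqlite3
import tempfile

# A throwaway database file standing in for the shared experiments database.
db_path = os.path.join(tempfile.mkdtemp(), "shared.sqlite")

# First connection: plays the role of the process that writes results.
writer = sqlite3.connect(db_path)
writer.execute(
    "CREATE TABLE results (experiment INTEGER PRIMARY KEY, net_benefits REAL)"
)
writer.execute("INSERT INTO results VALUES (?, ?)", (10, -17.5))
writer.commit()  # committed rows become visible to other connections

# Second connection to the same file: plays the role of another process.
reader = sqlite3.connect(db_path)
rows = reader.execute("SELECT experiment, net_benefits FROM results").fetchall()
print(rows)

writer.close()
reader.close()
```

Because both connections open the same file, the second one sees the committed row without any manual synchronization step. Had it opened a copy of the file instead, the results would need to be merged by hand afterwards.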

To run a particular slice of a design of experiments, we need to load the experimental design first, and then pass that slice to the run_experiments function, instead of just giving the design_name.

[33]:

design2 = fx.read_experiment_parameters('lhs_2')

For splitting the work across a number of similarly capable processes or machines, the double-colon slice is convenient. If, for example, you are splitting the work over 4 computers, you can run each with slices 0::4, 1::4, 2::4, and 3::4. This slices in a skip-step manner, so the slice below will run every 4th experiment from the design, starting with experiment index 0 (i.e. the first one).

[34]:

fx2.run_experiments(design2.iloc[0::4])

[00:29.19] MainProcess/INFO: performing 2 scenarios/policies * 1 model(s) = 2 experiments
[00:29.21] MainProcess/INFO: performing experiments sequentially
[00:29.21] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:29.22] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 10 RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.69] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.70] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.72] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/archive/scp_EMAT Road Test/exp_010_d6d5244c-6428-11eb-8c2b-acde48001122
[00:31.72] MainProcess/INFO: RAN EXPERIMENT IN 2.51 SECONDS
[00:31.72] MainProcess/INFO: 1 cases completed
[00:31.73] MainProcess/INFO: RoadTestFileModel SETUP RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:31.74] MainProcess/INFO: RoadTestFileModel SETUP complete experiment_id 14 RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:34.21] MainProcess/INFO: RoadTestFileModel RUN complete RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:34.22] MainProcess/INFO: RoadTestFileModel POST-PROCESS complete RUNID-d85582c6-6428-11eb-8c2b-acde48001122
[00:34.23] MainProcess/INFO: RoadTestFileModel ARCHIVE
 from: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/road-test-files
   to: /var/folders/js/bk_dt9015j79_f6bxnc44dsr0000gp/T/tmph_ksfgon/archive/scp_EMAT Road Test/exp_014_d85582c6-6428-11eb-8c2b-acde48001122
[00:34.24] MainProcess/INFO: RAN EXPERIMENT IN 2.51 SECONDS
[00:34.24] MainProcess/INFO: 2 cases completed
[00:34.24] MainProcess/INFO: experiments finished

[34]:

	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock	free_flow_time	initial_capacity	no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment
10	0.114450	3.648104	100	0.034746	120.196432	0.028321	0.001514	5.621927	34	Paygo	True	60	100	66.867014	65.624845	1.242169	4.316063	-17.502784	21.818848	675.735529
14	0.132514	4.016263	133	0.107612	131.922290	0.035993	0.015811	54.081760	22	Paygo	False	60	100	84.993873	64.403269	20.590604	294.700221	-37.110353	331.810575	7134.589600

Because we have linked the second model instance back to the same database, after these experiments have finished we can access the results from the original fx instance.

[35]:

fx.read_experiment_measures('lhs_2')

[35]:

		no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment	run
10	d6d5244c-6428-11eb-8c2b-acde48001122	66.867014	65.624845	1.242169	4.316063	-17.502784	21.818848	675.735529
14	d85582c6-6428-11eb-8c2b-acde48001122	84.993873	64.403269	20.590604	294.700221	-37.110353	331.810575	7134.589600

It is important to note that for this manual multiprocessing technique to work, where different processes run the model simultaneously, each process must be in a separate Python instance (e.g. in separate Jupyter notebooks, not in the same notebook as shown here).
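When planning such a split, it can be reassuring to confirm that the skip-step slices cover every experiment exactly once. The sketch below uses a toy design with eight rows indexed 10 through 17, matching the experiment IDs assigned to lhs_2 in this notebook; the column `x` is just a placeholder for the real parameters.

```python
import pandas as pd

# Toy design: 8 experiments with ids 10..17 and one placeholder column.
design = pd.DataFrame(
    {"x": range(8)},
    index=pd.Index(range(10, 18), name="experiment"),
)

n_machines = 4
# Machine k takes rows k, k + 4, k + 8, ... by position.
parts = [design.iloc[k::n_machines] for k in range(n_machines)]
ids_per_machine = [list(part.index) for part in parts]
print(ids_per_machine)

# Every experiment appears in exactly one slice.
all_ids = sorted(i for ids in ids_per_machine for i in ids)
assert all_ids == list(design.index)
```

The first slice holds experiments 10 and 14, which are the two experiments that the 0::4 run above actually executed.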

Automatic Multiprocessing for Running Multiple Experiments¶

The examples above are all essentially single-process demonstrations of using TMIP-EMAT to run core model experiments, either by running all in one single process, or by having a user manually instantiate a number of single processes. If your core model itself is multi-threaded or otherwise is designed to make full use of your multi-core CPU, or if a single core model run will otherwise max out some computational resource (e.g. RAM, disk space) then single process operation should be sufficient.

If, on the other hand, your model is such that you can run multiple independent instances of the model side-by-side on the same machine, but you don’t want to manage the processes manually, then you could benefit from a multiprocessing approach that uses the dask.distributed library. To demonstrate this, we’ll create yet another small design of experiments to run.

[36]:

design3 = fx.design_experiments(design_name='lhs_3', n_samples=30, random_seed=3)
design3

[36]:

	alpha	amortization_period	beta	debt_type	expand_capacity	input_flow	interest_rate	interest_rate_lock	unit_cost_expansion	value_of_time	yield_curve	free_flow_time	initial_capacity
experiment
18	0.151341	48	3.676012	Rev Bond	34.248792	139	0.037682	False	128.627672	0.004817	0.012079	60	100
19	0.118509	41	5.309555	Paygo	76.473048	89	0.035629	True	125.994370	0.091859	0.005987	60	100
20	0.174355	18	4.727608	Rev Bond	94.255893	131	0.030304	False	139.257007	0.076668	0.016896	60	100
21	0.188622	26	3.703274	GO Bond	96.919294	125	0.036478	False	121.126672	0.110190	0.016048	60	100
22	0.160916	33	4.447292	GO Bond	64.247841	149	0.034941	False	104.498216	0.040406	0.010364	60	100
23	0.148182	24	5.015381	Paygo	83.286755	128	0.027313	False	144.602977	0.061343	0.014137	60	100
24	0.145859	43	3.943408	Rev Bond	61.408748	117	0.032835	True	107.081150	0.150078	0.004032	60	100
25	0.109392	32	5.465265	Paygo	29.100532	98	0.038449	False	142.411065	0.057756	0.001003	60	100
26	0.154192	37	4.918319	GO Bond	55.383965	145	0.038841	True	130.162175	0.119689	0.001579	60	100
27	0.106310	41	4.276815	GO Bond	87.269134	106	0.036746	True	113.206445	0.093281	0.019623	60	100
28	0.159061	47	3.630430	GO Bond	52.479431	102	0.029875	False	135.066324	0.075284	0.015455	60	100
29	0.190069	45	5.262368	Paygo	67.884939	101	0.026620	True	118.147867	0.125009	0.013234	60	100
30	0.123809	28	4.504676	Rev Bond	39.255873	124	0.037005	True	121.924833	0.029906	0.017495	60	100
31	0.141084	43	4.570515	GO Bond	42.644409	85	0.028042	False	101.824130	0.214848	0.005303	60	100
32	0.116030	50	4.772191	Paygo	78.781012	120	0.025253	False	119.915395	0.024683	0.011550	60	100
33	0.177997	22	4.097314	GO Bond	92.535470	138	0.039246	True	108.782051	0.142603	0.004617	60	100
34	0.121416	28	5.136051	Rev Bond	48.793510	112	0.029319	False	115.523335	0.113898	0.000314	60	100
35	0.168046	47	3.822497	Paygo	12.183780	135	0.030787	False	127.210823	0.131430	0.006618	60	100
36	0.132624	30	3.978002	Paygo	70.001596	105	0.025537	False	98.871279	0.065724	0.007461	60	100
37	0.139045	25	4.317356	Rev Bond	85.104877	148	0.035198	True	134.932609	0.045659	-0.002330	60	100
38	0.196744	20	4.412799	Paygo	3.928463	91	0.026193	False	124.260667	0.082519	0.003482	60	100
39	0.164386	18	3.559705	Rev Bond	2.563779	114	0.027669	True	105.599016	0.069089	0.018237	60	100
40	0.171876	37	5.189379	Rev Bond	43.416751	134	0.033763	True	110.240305	0.102707	0.009666	60	100
41	0.102494	36	4.637645	GO Bond	20.219566	90	0.028766	False	141.028735	0.036429	0.013430	60	100
42	0.128699	39	4.224275	Rev Bond	31.087547	82	0.033269	True	132.576526	0.135245	0.009204	60	100
43	0.194011	17	4.119391	Rev Bond	25.279767	83	0.031359	True	137.502268	0.053824	0.008654	60	100
44	0.112179	35	5.082739	Paygo	57.001926	119	0.039705	True	101.590247	0.087633	-0.001666	60	100
45	0.185542	21	4.843869	Paygo	18.339744	143	0.034300	True	96.190623	0.169604	0.002675	60	100
46	0.182904	31	3.879664	GO Bond	9.336654	96	0.032018	False	113.617685	0.100206	-0.000695	60	100
47	0.136037	15	5.422761	GO Bond	13.567899	108	0.031707	True	97.768721	0.160056	0.018624	60	100

The demo module is set up to facilitate distributed multiprocessing. During the setup step, the code detects if it is being run in a distributed “worker” environment instead of in a normal Python environment. If the “worker” environment is detected, then a copy of the entire files-based model is made into the worker’s local workspace, and the model is run there instead of in the master workspace. This allows each worker to edit the files independently and simultaneously, without disturbing other parallel workers.

With this small modification, we are ready to run this demo model in parallel subprocesses. To do so, we will use the async_experiments method. Leveraging the asyncio Python interface will allow us to run multiple core models in parallel in the background, and monitor the progress interactively from within a Jupyter notebook.

[37]:

background = fx.async_experiments(
    design=design3,
    max_n_workers=2,
    stagger_start=5,
    batch_size=1,
)

[00:34.36] MainProcess/INFO: asynchronous_experiments(max_n_workers=2)

[38]:

background.progress() # Initially everything is pending

[38]:

'30 runs: 30 pending'

[39]:

await asyncio.sleep(15)
background.progress()

[00:34.37] MainProcess/INFO: AsyncExperimentalDesign.run start
[00:34.37] MainProcess/INFO: initializing default DistributedEvaluator.client
[00:34.37] MainProcess/INFO:   max_n_workers=2, actual n_workers=2
[00:34.37] MainProcess/INFO:   n_workers=2
[00:35.31] MainProcess/INFO: completed initializing default DistributedEvaluator.client
[00:37.49] MainProcess/INFO: AsyncExperimentalDesign.run dispatching experiments
[00:37.50] MainProcess/INFO: performing 30 scenarios/policies * 1 model(s) = 30 experiments
[00:37.52] MainProcess/INFO: experiments in asynchronous evaluator

[39]:

'30 runs: 2 done, 27 pending, 1 queued'

After 15 seconds, only 3 runs have been started (two are done and one is queued), as the stagger_start allows only one run to start every 5 seconds. Un-started runs remain in the “pending” status. We can see the status of each run using the status method.

[40]:

background.status()

[40]:

experiment
18       done
19       done
20     queued
21    pending
22    pending
23    pending
24    pending
25    pending
26    pending
27    pending
28    pending
29    pending
30    pending
31    pending
32    pending
33    pending
34    pending
35    pending
36    pending
37    pending
38    pending
39    pending
40    pending
41    pending
42    pending
43    pending
44    pending
45    pending
46    pending
47    pending
dtype: object
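The pending, queued, and done progression can be imitated with a small asyncio sketch. This is not how TMIP-EMAT implements it internally; `staggered_dispatch` is a hypothetical function, a semaphore stands in for the worker pool, and `asyncio.sleep` stands in for the core model run.

```python
import asyncio


async def staggered_dispatch(n_runs, n_workers, stagger, run_time):
    """Release one run every `stagger` seconds; at most `n_workers` execute at once."""
    status = {i: "pending" for i in range(n_runs)}
    slots = asyncio.Semaphore(n_workers)  # stands in for the worker pool

    async def one_run(i):
        async with slots:                  # wait here until a worker is free
            await asyncio.sleep(run_time)  # placeholder for the core model run
        status[i] = "done"

    tasks = []
    for i in range(n_runs):
        status[i] = "queued"               # released to the pool
        tasks.append(asyncio.create_task(one_run(i)))
        await asyncio.sleep(stagger)       # hold back the remaining runs
    await asyncio.gather(*tasks)
    return status


# In a plain script; inside a notebook, use `await staggered_dispatch(...)` instead.
final_status = asyncio.run(
    staggered_dispatch(n_runs=6, n_workers=2, stagger=0.01, run_time=0.02)
)
print(final_status)
```

Runs move from pending to queued at a rate set by the stagger interval, and from queued to done at a rate set by the number of workers and the run time, which are the two separate limits seen in the status table above.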

For fast-running models, you may find that stagger_start is not needed. Large and slow models, especially ones that begin with a massive file-copy operation on a hard disk, may benefit from staggering the task start times, so the first run has a clear shot at finishing disk actions promptly and moving to data processing as soon as possible.

If we are dissatisfied with this pace for this small and fast model, the stagger_start can be changed dynamically while runs are going. If we set it to zero, all remaining pending runs will be queued immediately (or, within a second or so). Queued runs will begin as soon as a worker process is available to handle them.

[41]:

background.stagger_start = 0
await asyncio.sleep(1)
background.status()

[00:49.53] MainProcess/INFO: AsyncExperimentalDesign.run dispatching task complete
[00:50.09] MainProcess/INFO: 3 cases completed

[41]:

experiment
18      done
19      done
20      done
21    queued
22    queued
23    queued
24    queued
25    queued
26    queued
27    queued
28    queued
29    queued
30    queued
31    queued
32    queued
33    queued
34    queued
35    queued
36    queued
37    queued
38    queued
39    queued
40    queued
41    queued
42    queued
43    queued
44    queued
45    queued
46    queued
47    queued
dtype: object

[42]:

await asyncio.sleep(15)
background.progress()

[00:55.05] MainProcess/INFO: 6 cases completed
[00:58.24] MainProcess/INFO: 9 cases completed
[01:02.97] MainProcess/INFO: 12 cases completed

[42]:

'30 runs: 13 done, 17 queued'

After 15 more seconds, only about 10 more runs are complete, as each run takes around 3 seconds, and we only have two workers processing the runs.
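That figure follows from simple arithmetic, sketched below as a rough estimate that ignores dispatch and archiving overhead. The function name `runs_completed` is hypothetical.

```python
def runs_completed(elapsed_seconds, seconds_per_run, n_workers):
    """Rough count of runs finished when each worker processes runs back to back."""
    runs_per_worker = int(elapsed_seconds // seconds_per_run)
    return n_workers * runs_per_worker


# Two workers, about 3 seconds per run, 15 seconds of waiting.
estimate = runs_completed(elapsed_seconds=15, seconds_per_run=3, n_workers=2)
print(estimate)  # 10
```

The progress report moved from 3 done to 13 done over this interval, which agrees with the estimate of 10 additional runs.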

The current_results method allows us to view the run results that are available currently. You might notice that runs are not necessarily dispatched in the order they appear in the table, but we’ll get to all of them eventually.

[43]:

background.current_results().head(15)

[43]:

	free_flow_time	initial_capacity	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock	no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment
18	60	100	0.151341	3.676012	139	0.004817	128.627672	0.037682	0.012079	34.248792	48	Rev Bond	False	90.467198	70.318877	20.148321	13.491094	-192.374479	205.865574	4405.342341
19	60	100	0.118509	5.309555	89	0.091859	125.994370	0.035629	0.005987	76.473048	41	Paygo	True	63.829895	60.187687	3.642208	29.776845	-243.694139	273.470984	9635.173464
20	60	100	0.174355	4.727608	131	0.076668	139.257007	0.030304	0.016896	94.255893	18	Rev Bond	False	97.497064	61.624340	35.872724	360.289988	-600.998696	961.288684	13125.793539
21	60	100	0.188622	3.703274	125	0.110190	121.126672	0.036478	0.016048	96.919294	26	GO Bond	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN
22	60	100	0.160916	4.447292	149	0.040406	104.498216	0.034941	0.010364	64.247841	33	GO Bond	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN
23	60	100	0.148182	5.015381	128	0.061343	144.602977	0.027313	0.014137	83.286755	24	Paygo	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN
24	60	100	0.145859	3.943408	117	0.150078	107.081150	0.032835	0.004032	61.408748	43	Rev Bond	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN
25	60	100	0.109392	5.465265	98	0.057756	142.411065	0.038449	0.001003	29.100532	32	Paygo	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN
26	60	100	0.154192	4.918319	145	0.119689	130.162175	0.038841	0.001579	55.383965	37	GO Bond	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN
27	60	100	0.106310	4.276815	106	0.093281	113.206445	0.036746	0.019623	87.269134	41	GO Bond	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN
28	60	100	0.159061	3.630430	102	0.075284	135.066324	0.029875	0.015455	52.479431	47	GO Bond	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN
29	60	100	0.190069	5.262368	101	0.125009	118.147867	0.026620	0.013234	67.884939	45	Paygo	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN
30	60	100	0.123809	4.504676	124	0.029906	121.924833	0.037005	0.017495	39.255873	28	Rev Bond	True	NaN	NaN	NaN	NaN	NaN	NaN	NaN
31	60	100	0.141084	4.570515	85	0.214848	101.824130	0.028042	0.005303	42.644409	43	GO Bond	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN
32	60	100	0.116030	4.772191	120	0.024683	119.915395	0.025253	0.011550	78.781012	50	Paygo	False	NaN	NaN	NaN	NaN	NaN	NaN	NaN

If we want to simply block the Python interpreter until the runs are done, we can do so by awaiting the final_results.

[44]:

await background.final_results()

[01:06.21] MainProcess/INFO: 15 cases completed
[01:11.06] MainProcess/INFO: 18 cases completed
[01:14.23] MainProcess/INFO: 21 cases completed
[01:19.02] MainProcess/INFO: 24 cases completed
[01:22.12] MainProcess/INFO: 27 cases completed
[01:26.65] MainProcess/INFO: 30 cases completed

[44]:

	free_flow_time	initial_capacity	alpha	beta	input_flow	value_of_time	unit_cost_expansion	interest_rate	yield_curve	expand_capacity	amortization_period	debt_type	interest_rate_lock	no_build_travel_time	build_travel_time	time_savings	value_of_time_savings	net_benefits	cost_of_capacity_expansion	present_cost_expansion
experiment
18	60	100	0.151341	3.676012	139	0.004817	128.627672	0.037682	0.012079	34.248792	48	Rev Bond	False	90.467198	70.318877	20.148321	13.491094	-192.374479	205.865574	4405.342341
19	60	100	0.118509	5.309555	89	0.091859	125.994370	0.035629	0.005987	76.473048	41	Paygo	True	63.829895	60.187687	3.642208	29.776845	-243.694139	273.470984	9635.173464
20	60	100	0.174355	4.727608	131	0.076668	139.257007	0.030304	0.016896	94.255893	18	Rev Bond	False	97.497064	61.624340	35.872724	360.289988	-600.998696	961.288684	13125.793539
21	60	100	0.188622	3.703274	125	0.110190	121.126672	0.036478	0.016048	96.919294	26	GO Bond	False	85.859973	62.102800	23.757173	327.226070	-334.329873	661.555942	11739.511583
22	60	100	0.160916	4.447292	149	0.040406	104.498216	0.034941	0.010364	64.247841	33	GO Bond	False	116.880236	66.259960	50.620276	304.758181	-30.577074	335.335254	6713.784698
23	60	100	0.148182	5.015381	128	0.061343	144.602977	0.027313	0.014137	83.286755	24	Paygo	False	90.665184	61.468734	29.196450	229.247199	-288.854311	518.101510	12043.512651
24	60	100	0.145859	3.943408	117	0.150078	107.081150	0.032835	0.004032	61.408748	43	Rev Bond	True	76.254321	62.460523	13.793797	242.207489	-70.587693	312.795183	6575.719357
25	60	100	0.109392	5.465265	98	0.057756	142.411065	0.038449	0.001003	29.100532	32	Paygo	False	65.877391	61.455236	4.422155	25.029593	-115.116663	140.146256	4144.237736
26	60	100	0.154192	4.918319	145	0.119689	130.162175	0.038841	0.001579	55.383965	37	GO Bond	True	117.527065	66.583783	50.943282	884.112291	539.879198	344.233094	7208.897323
27	60	100	0.106310	4.276815	106	0.093281	113.206445	0.036746	0.019623	87.269134	41	GO Bond	True	68.183795	60.559329	7.624466	75.389314	-381.495063	456.884376	9879.428427
28	60	100	0.159061	3.630430	102	0.075284	135.066324	0.029875	0.015455	52.479431	47	GO Bond	False	70.255064	62.217189	8.037875	61.722934	-256.677328	318.400262	7088.203858
29	60	100	0.190069	5.262368	101	0.125009	118.147867	0.026620	0.013234	67.884939	45	Paygo	True	72.017180	60.786517	11.230663	141.797161	-73.766382	215.563543	8020.460747
30	60	100	0.123809	4.504676	124	0.029906	121.924833	0.037005	0.017495	39.255873	28	Rev Bond	True	79.576635	64.404583	15.172052	56.263770	-210.531699	266.795469	4786.265822
31	60	100	0.141084	4.570515	85	0.214848	101.824130	0.028042	0.005303	42.644409	43	GO Bond	False	64.027525	60.794355	3.233170	59.044512	-139.377225	198.421737	4342.229844
32	60	100	0.116030	4.772191	120	0.024683	119.915395	0.025253	0.011550	78.781012	50	Paygo	False	76.618381	61.038635	15.579747	46.147516	-194.872685	241.020201	9447.056124
33	60	100	0.177997	4.097314	138	0.142603	108.782051	0.039246	0.004617	92.535470	22	GO Bond	True	99.966081	62.728737	37.237343	732.799774	104.331664	628.468110	10066.198296
34	60	100	0.121416	5.136051	112	0.113898	115.523335	0.029319	0.000314	48.793510	28	Rev Bond	False	73.038039	61.693605	11.344433	144.715681	-169.489549	314.205230	5636.788951
35	60	100	0.168046	3.822497	135	0.131430	127.210823	0.030787	0.006618	12.183780	47	Paygo	False	91.752596	80.460756	11.291840	200.351591	159.635389	40.716201	1549.908642
36	60	100	0.132624	3.978002	105	0.065724	98.871279	0.025537	0.007461	70.001596	30	Paygo	False	69.661928	61.170365	8.491563	58.600606	-187.746392	246.346998	6921.147330
37	60	100	0.139045	4.317356	148	0.045659	134.932609	0.035198	-0.002330	85.104877	25	Rev Bond	True	105.330218	63.175774	42.154444	284.861228	-395.777779	680.639007	11483.423089
38	60	100	0.196744	4.412799	91	0.082519	124.260667	0.026193	0.003482	3.928463	20	Paygo	False	67.785922	66.568468	1.217455	9.142197	-15.633244	24.775440	488.153405
39	60	100	0.164386	3.559705	114	0.069089	105.599016	0.027669	0.018237	2.563779	18	Rev Bond	True	75.724624	74.369604	1.355019	10.672342	-9.155192	19.827534	270.732536
40	60	100	0.171876	5.189379	134	0.102707	110.240305	0.033763	0.009666	43.416751	37	Rev Bond	True	107.093529	67.249450	39.844079	548.362942	311.279196	237.083746	4786.275850
41	60	100	0.102494	4.637645	90	0.036429	141.028735	0.028766	0.013430	20.219566	36	GO Bond	False	63.772612	61.606001	2.166611	7.103389	-130.417635	137.521024	2851.539859
42	60	100	0.128699	4.224275	82	0.135245	132.576526	0.033269	0.009204	31.087547	39	Rev Bond	True	63.339271	61.064241	2.275030	25.230374	-175.639567	200.869942	4121.478933
43	60	100	0.194011	4.119391	83	0.053824	137.502268	0.031359	0.008654	25.279767	17	Rev Bond	True	65.402926	62.135104	3.267821	14.598542	-250.630628	265.229169	3476.025296
44	60	100	0.112179	5.082739	119	0.087633	101.590247	0.039705	-0.001666	57.001926	35	Paygo	True	76.294739	61.645559	14.649179	152.766854	-30.252812	183.019666	5790.839697
45	60	100	0.185542	4.843869	143	0.169604	96.190623	0.034300	0.002675	18.339744	21	Paygo	True	122.953743	87.847511	35.106232	851.444399	765.845948	85.598451	1764.111444
46	60	100	0.182904	3.879664	96	0.100206	113.617685	0.032018	-0.000695	9.336654	31	GO Bond	False	69.366811	66.625117	2.741694	26.374431	-28.133075	54.507506	1060.809021
47	60	100	0.136037	5.422761	108	0.160056	97.768721	0.031707	0.018624	13.567899	15	GO Bond	True	72.389640	66.214709	6.174931	106.740422	-2.377699	109.118121	1326.516148