GBNRTC Example Model
[1]:
import emat
emat.require_version('0.5.1')
emat 0.5.2, plotly 4.14.3
[2]:
import pandas as pd
The model scope is defined in a YAML file. For this GBNRTC example, the scope file is named gbnrtc_scope.yaml.
[3]:
scope = emat.Scope('gbnrtc_scope.yaml')
[4]:
db = emat.SQLiteDB()
[5]:
scope.store_scope(db)
The basic operation of the GBNRTC model can be controlled by EMAT through a custom-developed class, which defines the input and output “hooks” that are consistent with the defined scope file. The GBNRTCModel class is able to call TransCAD, set up the input parameters (exogenous uncertainties, policy levers, and constants defined in the scope), execute the model, and retrieve the performance measure results.
[6]:
from emat.model import GBNRTCModel
[7]:
g = GBNRTCModel(
    configuration='gbnrtc_model_config.yaml',
    scope=scope,
    db=db,
)
g
[7]:
<emat.model.core_files.gbnrtc_model.GBNRTCModel at 0x7ff6e0c3f3d0>
The GBNRTC model takes a couple of hours for each run and runs in TransCAD, a proprietary software package that is not included with the EMAT distribution. However, for demonstration purposes, the definition and results of a particular set of experiments are included in the file buffalo.csv. We can use the write_experiment_all method to pre-load these results into the database.
[8]:
lhs = pd.read_csv('buffalo.csv')
[9]:
lhs.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 60 entries, 0 to 59
Data columns (total 92 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 expmntID 60 non-null int64
1 Land Use - CBD Focus 60 non-null float64
2 Freeway Capacity 60 non-null float64
3 Auto IVTT Sensitivity 60 non-null float64
4 Shared Mobility 60 non-null float64
5 Kensington Decommissioning 60 non-null float64
6 LRT Extension 60 non-null float64
7 Region-wide VMT 60 non-null float64
8 Interstate + Expressway + Ramp/Connector VMT 60 non-null float64
9 Major and Minor Arterials VMT 60 non-null float64
10 Total Auto VMT 60 non-null float64
11 Total Truck VMT 60 non-null float64
12 Daily Auto VMT - Interstate 60 non-null float64
13 Daily Truck VMT - Interstate 60 non-null float64
14 Daily Total VMT - Interstate 60 non-null float64
15 Daily Auto VMT - Expressway 60 non-null float64
16 Daily Truck VMT - Expressway 60 non-null float64
17 Daily Total VMT - Expressway 60 non-null float64
18 Daily Auto VMT - Major Arterial 60 non-null float64
19 Daily Truck VMT - Major Arterial 60 non-null float64
20 Daily Total VMT - Major Arterial 60 non-null float64
21 Daily Auto VMT - Minor Arterial 60 non-null float64
22 Daily Truck VMT - Minor Arterial 60 non-null float64
23 Daily Total VMT - Minor Arterial 60 non-null float64
24 Daily Auto VMT - Collector 60 non-null float64
25 Daily Truck VMT - Collector 60 non-null float64
26 Daily Total VMT - Collector 60 non-null float64
27 Daily Auto VMT - Freeway On-Ramp 60 non-null float64
28 Daily Truck VMT - Freeway On-Ramp 60 non-null float64
29 Daily Total VMT - Freeway On-Ramp 60 non-null float64
30 Daily Auto VMT - Freeway Off-Ramp 60 non-null float64
31 Daily Truck VMT - Freeway Off-Ramp 60 non-null float64
32 Daily Total VMT - Freeway Off-Ramp 60 non-null float64
33 Daily Auto VMT - Freeway to Freeway Connector 60 non-null float64
34 Daily Truck VMT - Freeway to Freeway Connector 60 non-null float64
35 Daily Total VMT - Freeway to Freeway Connector 60 non-null float64
36 Daily Auto VMT - Toll Booth 60 non-null float64
37 Daily Truck VMT - Toll Booth 60 non-null float64
38 Daily Total VMT - Toll Booth 60 non-null float64
39 Daily Auto VMT - Border Control 60 non-null float64
40 Daily Truck VMT - Border Control 60 non-null float64
41 Daily Total VMT - Border Control 60 non-null float64
42 Daily Auto VMT - Canadian Interstate 60 non-null float64
43 Daily Truck VMT - Canadian Interstate 60 non-null float64
44 Daily Total VMT - Canadian Interstate 60 non-null float64
45 Daily Auto VMT - Canadian Expressway 60 non-null float64
46 Daily Truck VMT - Canadian Expressway 60 non-null float64
47 Daily Total VMT - Canadian Expressway 60 non-null float64
48 Daily Auto VMT - Canadian Major Arterial 60 non-null float64
49 Daily Truck VMT - Canadian Major Arterial 60 non-null float64
50 Daily Total VMT - Canadian Major Arterial 60 non-null float64
51 AM Trip Time (minutes) 60 non-null float64
52 AM Trip Length (miles) 60 non-null float64
53 PM Trip Time (minutes) 60 non-null float64
54 PM Trip Length (miles) 60 non-null float64
55 Total Transit Boardings 60 non-null float64
56 Total LRT Boardings 60 non-null float64
57 Peak Transit Share 60 non-null float64
58 Peak NonMotorized Share 60 non-null float64
59 Off-Peak Transit Share 60 non-null float64
60 Off-Peak NonMotorized Share 60 non-null float64
61 Daily Transit Share 60 non-null float64
62 Daily NonMotorized Share 60 non-null float64
63 Downtown to Airport Travel Time 60 non-null float64
64 Households within 30 min of CBD 60 non-null float64
65 Number of Home-based work tours taking <= 45 minutes via transit 60 non-null float64
66 Corridor Kensington Daily VMT 60 non-null float64
67 Corridor Kensington Daily VHT 60 non-null float64
68 Corridor Kensington_OB PM VMT 60 non-null float64
69 Corridor Kensington_OB PM VHT 60 non-null float64
70 Corridor Kensington_IB AM VMT 60 non-null float64
71 Corridor Kensington_IB AM VHT 60 non-null float64
72 Corridor 190 Daily VMT 60 non-null float64
73 Corridor 190 Daily VHT 60 non-null float64
74 Corridor 190_OB Daily VMT 60 non-null float64
75 Corridor 190_OB Daily VHT 60 non-null float64
76 Corridor 190_IB Daily VMT 60 non-null float64
77 Corridor 190_IB Daily VHT 60 non-null float64
78 Corridor 33_west Daily VMT 60 non-null float64
79 Corridor 33_west Daily VHT 60 non-null float64
80 Corridor I90_south Daily VMT 60 non-null float64
81 Corridor I90_south Daily VHT 60 non-null float64
82 OD Volume District 1 to 1 60 non-null float64
83 OD Volume District 1 to 2 60 non-null float64
84 OD Volume District 1 to 3 60 non-null float64
85 OD Volume District 1 to 4 60 non-null float64
86 OD Volume District 1 to 5 60 non-null float64
87 OD Volume District 1 to 6 60 non-null float64
88 OD Volume District 1 to 7 60 non-null float64
89 OD Volume District 1 to 8 60 non-null float64
90 OD Volume District 1 to 9 60 non-null float64
91 OD Volume District 1 to 10 60 non-null float64
dtypes: float64(91), int64(1)
memory usage: 43.2 KB
[10]:
db.write_experiment_all(
    'GBNRTC',
    'lhs',
    emat.SOURCE_IS_CORE_MODEL,
    lhs,
)
We can check that the pre-loaded data includes the results of the experiments by checking the number of rows in the read_experiments DataFrame, both in total and when loading only pending experiments (those without stored performance measures):
[11]:
len(g.read_experiments('lhs'))
[11]:
60
[12]:
len(g.read_experiments('lhs', only_pending=True))
[12]:
0
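To make the idea of a "pending" experiment concrete, here is a minimal plain-Python sketch. It does not use the EMAT API: the record layout, the made-up numbers, and the is_pending helper are all hypothetical and exist only for illustration. The assumption is that an experiment counts as pending when any of its performance measures has no stored value.

```python
# Hypothetical records: each experiment has inputs plus (possibly missing) measures.
# The numbers are invented for illustration and are not GBNRTC results.
MEASURES = ["Region-wide VMT", "Total Transit Boardings"]

experiments = [
    {"Freeway Capacity": 1.0, "Region-wide VMT": 2.1e7, "Total Transit Boardings": 9.0e4},
    {"Freeway Capacity": 1.2, "Region-wide VMT": None, "Total Transit Boardings": None},
    {"Freeway Capacity": 1.4, "Region-wide VMT": 2.3e7, "Total Transit Boardings": None},
]


def is_pending(record, measures=MEASURES):
    """Return True if any performance measure has no stored value."""
    return any(record.get(m) is None for m in measures)


# Keep only the experiments that still need a model run.
pending = [r for r in experiments if is_pending(r)]
print(len(experiments), len(pending))
```

In the pre-loaded data above, every experiment already has all of its measures, which is why the pending count is zero.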
The example data contains a wide variety of output performance measures, as TransCAD models can output a great deal of data.
[13]:
g.scope.get_measure_names()
[13]:
['Region-wide VMT',
'Interstate + Expressway + Ramp/Connector VMT',
'Major and Minor Arterials VMT',
'Total Auto VMT',
'Total Truck VMT',
'AM Trip Time (minutes)',
'AM Trip Length (miles)',
'PM Trip Time (minutes)',
'PM Trip Length (miles)',
'Total Transit Boardings',
'Total LRT Boardings',
'Downtown to Airport Travel Time',
'Households within 30 min of CBD',
'Number of Home-based work tours taking <= 45 minutes via transit',
'Kensington Daily VMT',
'Kensington Daily VHT',
'Kensington_OB PM VMT',
'Kensington_OB PM VHT',
'Kensington_IB AM VMT',
'Kensington_IB AM VHT',
'Corridor 190 Daily VMT',
'Corridor 190 Daily VHT',
'Corridor 190_OB Daily VMT',
'Corridor 190_OB Daily VHT',
'Corridor 190_IB Daily VMT',
'Corridor 190_IB Daily VHT',
'Corridor 33_west Daily VMT',
'Corridor 33_west Daily VHT',
'Corridor I90_south Daily VMT',
'Corridor I90_south Daily VHT',
'OD Volume District 1 to 1',
'OD Volume District 1 to 2',
'OD Volume District 1 to 3',
'OD Volume District 1 to 4',
'OD Volume District 1 to 5',
'OD Volume District 1 to 6',
'OD Volume District 1 to 7',
'OD Volume District 1 to 8',
'OD Volume District 1 to 9',
'OD Volume District 1 to 10',
'Peak Transit Share',
'Peak NonMotorized Share',
'Off-Peak Transit Share',
'Off-Peak NonMotorized Share',
'Daily Transit Share',
'Daily NonMotorized Share']
The high-level scope definition is designed to capture all of this data for later analysis, but in this demonstration we will evaluate only a few of these performance measures. Creating a meta-model for each performance measure is relatively inexpensive computationally, but not free: each one can take a few seconds to create, and that effort is wasted on measures that are not of interest in this analysis.
Creating a meta-model for analysis of an existing model with a completed design of experiments can be done using the create_metamodel_from_design method. To create a meta-model on a more limited scope, we can use the include_measures argument to list out a subset of measures that will be included in this meta-model.
[14]:
mm = g.create_metamodel_from_design(
    'lhs',
    include_measures=[
        'Region-wide VMT',
        'AM Trip Time (minutes)',
        'Downtown to Airport Travel Time',
        'Total Transit Boardings',
        'Peak Transit Share',
        'Peak NonMotorized Share',
        'Kensington Daily VMT',
        'Corridor 190 Daily VMT',
        'Corridor 33_west Daily VMT',
        'Corridor I90_south Daily VMT',
    ],
    suppress_converge_warnings=True,
)
mm
[14]:
<emat.PythonCoreModel "MetaModel1", metamodel_id=1 with 4 uncertainties, 2 levers, 9 measures>
You might notice that the meta-model is no longer a GBNRTCModel but is instead a PythonCoreModel. This is because, at its heart, the meta-model is a Python function that wraps the Gaussian process regression that has been fit to the available experimental data. Also, although the scope still has 46 measures, only 9 are active in the actual meta-model (ten were requested, but 'Kensington Daily VMT' does not appear among the fitted measures in the outputs below):
[15]:
mm.function
[15]:
<emat.MetaModel 6 inputs -> 9 active and 46 total outputs>
[16]:
callable(mm.function)
[16]:
True
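The idea of a meta-model as "a Python function that wraps a regression" can be sketched in a few lines. The class below is a hypothetical stand-in, not EMAT's MetaModel: the TinyMetaModel name, its constructor arguments, and the coefficients are assumptions made up for illustration. It shows a callable object that keeps the full list of measures but returns predictions only for the "active" ones that have a fitted regression.

```python
class TinyMetaModel:
    """Hypothetical stand-in: a callable that maps inputs to predicted measures."""

    def __init__(self, coefficients, all_measures):
        # coefficients: {measure: (intercept, {input_name: slope})}
        self.coefficients = coefficients
        self.all_measures = list(all_measures)

    @property
    def active_measures(self):
        # Only measures with a fitted regression are "active".
        return [m for m in self.all_measures if m in self.coefficients]

    def __call__(self, **inputs):
        # Evaluate the linear predictor of each active measure.
        out = {}
        for measure in self.active_measures:
            intercept, slopes = self.coefficients[measure]
            out[measure] = intercept + sum(
                slope * inputs[name] for name, slope in slopes.items()
            )
        return out


# Two measures in total, but only one has (made-up) fitted coefficients.
mm_sketch = TinyMetaModel(
    coefficients={"Travel Time": (20.0, {"Freeway Capacity": -5.0})},
    all_measures=["Travel Time", "Transit Boardings"],
)
print(callable(mm_sketch))
print(mm_sketch(**{"Freeway Capacity": 1.0}))
```

Because the object is an ordinary callable, it can be handed to anything that expects a plain Python function.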
To access this regression directly, we can use the regression attribute of the MetaModel.
[17]:
mm.function.regression
[17]:
BoostedRegressor(estimators=[('lr', LinearRegression()),
('gpr',
MultiOutputRegressor(estimator=AnisotropicGaussianProcessRegressor()))])
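The printed estimator lists a linear regression followed by a Gaussian process regression. One common way to combine two such estimators, and the one assumed in this sketch, is to fit the second stage to the residuals left by the first. The code below illustrates that idea only; it is not EMAT's implementation. To stay dependency-free, a Gaussian-kernel smoother (Nadaraya-Watson) stands in for the Gaussian process, and the data are synthetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def kernel_smoother(xs, rs, bandwidth=0.1):
    """Return a function that smooths the residuals rs with Gaussian weights."""
    def predict(x):
        ws = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        return sum(w * r for w, r in zip(ws, rs)) / sum(ws)
    return predict


# Synthetic data: a linear trend plus a smooth nonlinear wiggle.
xs = [i / 20 for i in range(21)]
ys = [2.0 + 3.0 * x + 0.3 * math.sin(6 * x) for x in xs]

# Stage 1: capture the linear trend.
a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Stage 2: model whatever the linear stage left unexplained.
correction = kernel_smoother(xs, residuals)


def linear_only(x):
    return a + b * x


def boosted(x):
    return linear_only(x) + correction(x)


def sse(predict):
    """Sum of squared errors on the training points."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


print(sse(linear_only), sse(boosted))
```

The second stage only has to explain the nonlinear remainder, so the combined predictor fits the training points more closely than the linear stage alone.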
[18]:
mm.function.regression.lr.r2
[18]:
Region-wide VMT 0.996544
AM Trip Time (minutes) 0.975504
Total Transit Boardings 0.979225
Downtown to Airport Travel Time 0.958725
Corridor 190 Daily VMT 0.987302
Corridor 33_west Daily VMT 0.985156
Corridor I90_south Daily VMT 0.990920
Peak Transit Share 0.968484
Peak NonMotorized Share 0.972930
dtype: float64
[19]:
mm.function.regression.lr.coefficients_summary()
[19]:
| | | Coefficient | StdError | t-Statistic | p |
|---|---|---|---|---|---|
| Region-wide VMT | Land Use - CBD Focus | 0.716566 | 0.005804 | 123.451376 | 0.000000e+00 |
| | Freeway Capacity | 0.016095 | 0.002926 | 5.500103 | 1.064930e-06 |
| | Auto IVTT Sensitivity | -0.087768 | 0.011870 | -7.394311 | 9.431564e-10 |
| | Shared Mobility | 0.056346 | 0.003111 | 18.112700 | 0.000000e+00 |
| | Kensington Decommissioning | -0.002179 | 0.001219 | -1.786796 | 7.958416e-02 |
| ... | ... | ... | ... | ... | ... |
| Peak NonMotorized Share | Auto IVTT Sensitivity | 0.004077 | 0.002633 | 1.548439 | 1.273580e-01 |
| | Shared Mobility | -0.022950 | 0.000690 | -33.259215 | 0.000000e+00 |
| | Kensington Decommissioning | -0.000207 | 0.000270 | -0.764370 | 4.479741e-01 |
| | LRT Extension | 0.000080 | 0.000267 | 0.298330 | 7.665962e-01 |
| | _Intercept_ | 0.038900 | 0.002846 | 13.666207 | 0.000000e+00 |
63 rows × 4 columns
We can also generate cross-validation scores for the MetaModel to verify that the meta-model is performing well.
[20]:
mm.function.cross_val_scores()
[20]:
| | Cross Validation Score |
|---|---|
| Region-wide VMT | 0.9969 |
| AM Trip Time (minutes) | 0.9796 |
| Total Transit Boardings | 0.9949 |
| Downtown to Airport Travel Time | 0.9699 |
| Corridor 190 Daily VMT | 0.9863 |
| Corridor 33_west Daily VMT | 0.9856 |
| Corridor I90_south Daily VMT | 0.9914 |
| Peak Transit Share | 0.9905 |
| Peak NonMotorized Share | 0.9930 |
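For readers unfamiliar with cross-validation, the sketch below shows the general technique in plain Python. It assumes the scores are out-of-sample R² values averaged over k folds, which is the conventional choice but is an assumption here rather than a statement about EMAT's internals. The fit_line, r2_score, and cross_val_r2 helpers and the synthetic data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r2_score(ys, preds):
    """Coefficient of determination of preds against observed ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def cross_val_r2(xs, ys, k=5):
    """Average out-of-fold R^2; fold i holds out indices i, i+k, i+2k, ..."""
    n = len(xs)
    scores = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train = [(xs[i], ys[i]) for i in range(n) if i not in held_out]
        test = [(xs[i], ys[i]) for i in range(n) if i in held_out]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        preds = [a + b * x for x, _ in test]
        scores.append(r2_score([y for _, y in test], preds))
    return sum(scores) / k


# Synthetic data: 60 points from a noisy straight line.
rng = random.Random(0)
xs = [i / 59 for i in range(60)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.05) for x in xs]

score = cross_val_r2(xs, ys)
print(round(score, 3))
```

A score near 1 means the model predicts held-out experiments almost as well as the ones it was fit on, which is the property the table above is checking for each measure.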
To use the meta-model for exploratory analysis, we can design and run a large number of experiments.
[21]:
design = mm.design_experiments(n_samples=10000, sampler='lhs')
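The 'lhs' sampler name refers to Latin hypercube sampling. As a rough illustration of the technique (not of EMAT's sampler), the plain-Python sketch below divides each input's range into n_samples equal-width strata, draws one point per stratum, and shuffles each dimension independently. The latin_hypercube function and the input ranges are hypothetical; the ranges are not taken from gbnrtc_scope.yaml.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points with one point per equal-width stratum in each dimension."""
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One uniform draw inside each of the n_samples strata of [0, 1) ...
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ... then shuffle so strata are paired randomly across dimensions.
        rng.shuffle(unit)
        columns[name] = [low + u * (high - low) for u in unit]
    # Assemble one dict per experiment.
    return [
        {name: columns[name][i] for name in bounds}
        for i in range(n_samples)
    ]


# Hypothetical ranges chosen for illustration only.
bounds = {"Freeway Capacity": (1.0, 2.0), "Shared Mobility": (0.0, 1.0)}
design = latin_hypercube(10, bounds)
print(design[0])
```

Compared with purely random draws, this stratification guarantees that every slice of each input's range is represented, which helps when the sample has to cover a multi-dimensional space evenly.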
The meta-model evaluates quickly, in contrast to the couple of hours needed for each run of the core model.
[22]:
result = mm.run_experiments(design)
If we inspect the results, we see that among the performance measures, only the active measures have non-null computed values:
[23]:
result.info()
<class 'emat.experiment.experimental_design.ExperimentalDesign'>
Int64Index: 10000 entries, 61 to 10060
Data columns (total 15 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Auto IVTT Sensitivity 10000 non-null float64
1 Freeway Capacity 10000 non-null float64
2 Kensington Decommissioning 10000 non-null bool
3 LRT Extension 10000 non-null bool
4 Land Use - CBD Focus 10000 non-null float64
5 Shared Mobility 10000 non-null float64
6 Region-wide VMT 10000 non-null float64
7 AM Trip Time (minutes) 10000 non-null float64
8 Total Transit Boardings 10000 non-null float64
9 Downtown to Airport Travel Time 10000 non-null float64
10 Corridor 190 Daily VMT 10000 non-null float64
11 Corridor 33_west Daily VMT 10000 non-null float64
12 Corridor I90_south Daily VMT 10000 non-null float64
13 Peak Transit Share 10000 non-null float64
14 Peak NonMotorized Share 10000 non-null float64
dtypes: bool(2), float64(13)
memory usage: 1.1 MB
The results of these meta-model experiments can be used for visualization and other exploratory modeling applications.
[24]:
from emat.viz import scatter_graphs
scatter_graphs('Downtown to Airport Travel Time', result, scope=mm.scope, render='png')