[1]:

import emat
emat.versions()

emat 0.5.2, plotly 4.14.3

Scatter Plot Matrix

Once a series of experiments has been conducted for a core model, the analyst should review the results to confirm that the model is behaving as expected. TMIP-EMAT provides visualization tools for reviewing experimental results graphically, which is generally a convenient way to inspect this kind of data.

To demonstrate these tools, we will use the Road Test example model included in TMIP-EMAT. We can quickly construct and run a design of experiments to exercise this model and populate some results to visualize:

[2]:

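import emat.examples

# Load the Road Test example: its scope, a database, and the core model.
scope, db, model = emat.examples.road_test()
# Build a design of experiments, then run the core model on it.
design = model.design_experiments()
results = model.run_experiments(design)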

Given this set of experimental results, we can display a scatter plot matrix to see the results. This is a collection of two-dimensional plots, each showing a contrast between two factors, typically an input parameter (i.e. an uncertainty or a policy lever) and an output performance measure, although it is also possible to plot inputs against inputs or outputs against outputs.

The display_experiments function in the emat.analysis sub-package can automatically create a scatter plot matrix that crosses every parameter with every measure, given only the scope and the results. By default, plots that display levers are shown in blue, and plots that display uncertainties are shown in red.

[3]:

from emat.analysis import display_experiments
display_experiments(scope, results)

[Figure: scatter plot matrix with one row of plots per performance measure: No Build Time, Build Time, Time Savings, Value Time Save, Net Benefits, Cost of Expand, Present Cost.]

This function can also display only a chosen subset of parameters or measures, using the rows and columns arguments. The colors follow the same scheme as the default full display, with two additions: if a plot contrasts an uncertainty with a lever, the variable on the X axis determines the color, and a plot is green if it shows only measures. Because parameters and measures are all required to have unique names within a scope, it is not necessary to identify which is which; display_experiments works it out automatically.

[4]:

display_experiments(
    scope, results,
    rows=['input_flow', 'expand_capacity', 'build_travel_time'],
    columns=['net_benefits', 'time_savings', 'no_build_travel_time', 'yield_curve', 'expand_capacity'],
)

[Figure: scatter plot matrix with one row of plots each for Input Flow, Expand Amount, and Build Time.]
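The coloring rule described for display_experiments can be restated as a small function. This is only an illustrative sketch: panel_color, ROLE_COLORS, and the roles dictionary are hypothetical names, the role assignments are assumptions made for the example, and none of this is TMIP-EMAT's actual implementation.

```python
# Illustrative only: a restatement of the color rule in the text,
# not TMIP-EMAT's actual implementation.
ROLE_COLORS = {"lever": "blue", "uncertainty": "red"}


def panel_color(x_name, y_name, roles):
    """Pick a color for one panel given each factor's role in the scope.

    `roles` maps a factor name to 'lever', 'uncertainty', or 'measure'.
    """
    x_role, y_role = roles[x_name], roles[y_name]
    if x_role == "measure" and y_role == "measure":
        return "green"              # only measures are shown
    if x_role != "measure":
        return ROLE_COLORS[x_role]  # the X-axis parameter decides
    return ROLE_COLORS[y_role]      # X is a measure, so Y's role decides


# Hypothetical role assignments for a few Road Test names (assumed here,
# not read from the scope).
roles = {
    "input_flow": "uncertainty",
    "expand_capacity": "lever",
    "net_benefits": "measure",
    "time_savings": "measure",
}

lever_vs_measure = panel_color("expand_capacity", "net_benefits", roles)
uncertainty_on_x = panel_color("input_flow", "expand_capacity", roles)
measures_only = panel_color("net_benefits", "time_savings", roles)
```

Because factor names are unique within a scope, a single name-to-role lookup like this is enough to decide the color without the caller saying which names are parameters and which are measures.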

Reviewing these results can be instructive, not only for exploratory analysis but also for validation of the results from the core model. An analyst can quickly see the direction, magnitude, and shape of various parametric relationships in the model, and easily detect any parameters that are giving unexpected results. For example, in many transportation modeling applications we would expect that most parameters will induce a monotonic response in most performance measures. Observing non-monotonic relationships where we don’t expect them is a red flag for the analyst to closely review model outputs, and perhaps the underlying model coding as well, to identify and correct errors.
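A visual check for monotonic relationships can be backed up with a rough numeric one. The sketch below is an assumption-laden helper, not part of TMIP-EMAT: binned_means and looks_monotonic are hypothetical names, and the approach (averaging the measure within equal-count bins of the parameter, then checking that the bin averages move in one direction) is just one simple way to smooth out scatter caused by the other parameters varying at the same time.

```python
from statistics import mean


def binned_means(xs, ys, n_bins=5):
    """Average y within equal-count bins of x, ordered by increasing x."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) / n_bins
    means = []
    for i in range(n_bins):
        chunk = pairs[round(i * size):round((i + 1) * size)]
        if chunk:  # skip empty bins when there are few points
            means.append(mean(y for _, y in chunk))
    return means


def looks_monotonic(xs, ys, n_bins=5):
    """Return True if the binned means never change direction."""
    m = binned_means(xs, ys, n_bins)
    diffs = [b - a for a, b in zip(m, m[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


# Synthetic data, used only to exercise the helper.
xs = list(range(100))
rising = [2 * x + 1 for x in xs]       # monotonic response
valley = [(x - 50) ** 2 for x in xs]   # falls, then rises
```

With experimental results held in a DataFrame, a call such as looks_monotonic(results['input_flow'], results['net_benefits']) would apply the same check to one parameter and one measure. A False result is only a prompt to look at the corresponding plot more closely, since a coarse binning can both hide and exaggerate features.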

Contrasting Sets of Experiments

A similar set of visualizations can be created to contrast two sets of experiments derived from the same (or substantially similar) scopes. This is particularly valuable for evaluating the performance of meta-models that are derived from core models, as we can generate scatter plot matrices that show experiments from both the core model and the meta-model.

To demonstrate this capability, we’ll first create a meta-model from the Road Test core model, then apply that meta-model to a design of 5,000 experiments to create a set of meta-model results to visualize.

[5]:

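# Fit a meta-model to the core model results from the 'lhs' design.
mm = model.create_metamodel_from_design('lhs', suppress_converge_warnings=True)
# Design 5,000 new experiments and evaluate them with the meta-model.
mm_design = mm.design_experiments(n_samples=5000)
mm_results = mm.run_experiments(mm_design)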

The contrast_experiments function in the emat.analysis sub-package can automatically create a scatter plot matrix, using an interface very similar to that of display_experiments. The primary difference is that contrast_experiments takes two sets of experiments as arguments instead of one. The resulting plots are also not colored according to the role of each factor in the scope; instead, colors are used to distinguish the two datasets.

[6]:

from emat.analysis import contrast_experiments
contrast_experiments(scope, mm_results, results)

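[Figure: scatter plot matrix contrasting the two sets of experiments, with one row of plots per performance measure: No Build Time, Build Time, Time Savings, Value Time Save, Net Benefits, Cost of Expand, Present Cost.]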

Scatter Plot Matrix API

emat.analysis.display_experiments(scope, experiment_results=None, db=None, render='png', rows='measures', columns='infer', mass=1000, use_gl=True, return_figures=False)

Render a visualization of experimental results.

This function will display the outputs in a Jupyter notebook, but by default does not return any values.

Parameters:

scope (emat.Scope) – The scope to use in identifying parameters and performance measures.
experiment_results (pandas.DataFrame or str) – The complete results from a set of experiments, including parameter inputs and performance measure outputs. Give a string to name a design in the database.
db (emat.Database, optional) – When experiment_results is given as a string, the experiments are loaded from this database using the scope name as well as the given string as the design name.
render (str or dict or None, default ‘png’) – If given, the graph[s] will be rendered to a static image using plotly.io.to_image. For default settings, pass ‘png’, or give a dictionary that specifies keyword arguments to that function. If no rendering is done (by setting render to None), the raw Plotly figures are returned – this may result in a very large number of JavaScript figures and may slow down your browser.
rows ({‘measures’, ‘levers’, ‘uncertainties’} or Collection, default ‘measures’) – Give a named group to generate a row of figures for each item in that group, or give a collection of individual names to generate a row of figures for each named item.
columns ({‘infer’, ‘measures’, ‘levers’, ‘uncertainties’} or Collection, default ‘infer’) – Give a named group to generate a column of plots for each item in that group, or give a collection of individual names to generate a column of plots for each named item. The default ‘infer’ value will select all parameters when the row is a measure, and all measures otherwise.
mass (int or emat.viz.ScatterMass, default 1000) – The target number of rendered points in each figure. Setting to a number less than the number of experiments will make each scatter point partially transparent, which will help visually convey relative density when there are a very large number of points.
return_figures (bool, default False) – Set this to True to return the FigureWidgets instead of simply displaying them.
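The way rows and columns interact, including the ‘infer’ default, can be sketched in a few lines. This is a toy model of the documented behavior, not TMIP-EMAT's code: resolve, infer_columns, plan_matrix, and the groups dictionary are hypothetical, the role assignments are assumptions made for the example, and listing levers before uncertainties is an arbitrary choice.

```python
def resolve(spec, groups):
    """Expand a named group into its members, or pass a collection through."""
    if isinstance(spec, str):
        return list(groups[spec])
    return list(spec)


def infer_columns(row_name, groups):
    """Documented 'infer' rule: all parameters for a measure row,
    all measures for any other row."""
    if row_name in groups["measures"]:
        return list(groups["levers"]) + list(groups["uncertainties"])
    return list(groups["measures"])


def plan_matrix(groups, rows="measures", columns="infer"):
    """Map each row name to the column names plotted against it."""
    plan = {}
    for row in resolve(rows, groups):
        if columns == "infer":
            plan[row] = infer_columns(row, groups)
        else:
            plan[row] = resolve(columns, groups)
    return plan


# Hypothetical grouping of a few Road Test names (assumed for illustration).
groups = {
    "levers": ["expand_capacity"],
    "uncertainties": ["input_flow", "yield_curve"],
    "measures": ["net_benefits", "time_savings"],
}

default_plan = plan_matrix(groups)
custom_plan = plan_matrix(groups, rows=["input_flow"])
```

Under these assumptions the default call yields one row per measure, each crossed with every parameter, while a row that names a parameter is crossed with every measure.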

emat.analysis.contrast_experiments(scope, experiments_1, experiments_2, db=None, render='png', rows='measures', columns='infer', mass=1000, colors=None, use_gl=True, return_figures=False)

Render a visualization of two sets of experimental results.

This function will display the outputs in a Jupyter notebook, but by default does not return any values.

Parameters:

scope (emat.Scope) – The scope to use in identifying parameters and performance measures.
experiments_1 (str or pandas.DataFrame) – The complete results from a set of experiments, including parameter inputs and performance measure outputs. Give a string to name a design in the database instead of passing results explicitly as a DataFrame.
experiments_2 (str or pandas.DataFrame) – The complete results from a set of experiments, including parameter inputs and performance measure outputs. Give a string to name a design in the database instead of passing results explicitly as a DataFrame.
db (emat.Database, optional) – When either of the experiments arguments is given as a string, the experiments are loaded from this database using the scope name as well as the given string as the design name.
render (str or dict or None, default ‘png’) – If given, the graph[s] will be rendered to a static image using plotly.io.to_image. For default settings, pass ‘png’, or give a dictionary that specifies keyword arguments to that function. If no rendering is done (by setting render to None), the raw Plotly figures are returned – this may result in a very large number of JavaScript figures and may slow down your browser.
rows (str or Collection, default ‘measures’) – Give a named group {‘measures’, ‘levers’, ‘uncertainties’} to generate a row of figures for each item in that group, or give a collection of individual names to generate a row of figures for each named item.
columns (str or Collection, default ‘infer’) – Give a named group {‘infer’, ‘measures’, ‘levers’, ‘uncertainties’} to generate a column of plots for each item in that group, or give a collection of individual names to generate a column of plots for each named item. The default ‘infer’ value will select all parameters when the row is a measure, and all measures otherwise.
mass (int or emat.viz.ScatterMass, default 1000) – The target number of rendered points in each figure. Setting to a number less than the number of experiments will make each scatter point partially transparent, which will help visually convey relative density when there are a very large number of points.
colors (2-tuple, optional) – A pair of colors for the experiments.
return_figures (bool, default False) – Set this to True to return the figures instead of simply displaying them within a Jupyter notebook.
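The effect of mass on transparency can be illustrated with a simple rule of thumb. The formula below is an assumption chosen to match the description (full opacity when there are no more points than mass, increasing transparency beyond that); it is not taken from the TMIP-EMAT source, and opacity_for_mass is a hypothetical helper that handles only the integer form of mass.

```python
def opacity_for_mass(n_points, mass=1000):
    """Assumed rule: keep the total visual weight near `mass` opaque points.

    Returns 1.0 when n_points <= mass, and mass / n_points otherwise.
    """
    if n_points <= 0:
        raise ValueError("n_points must be positive")
    return min(1.0, mass / n_points)


small_design = opacity_for_mass(100)     # fewer points than mass: opaque
large_design = opacity_for_mass(5000)    # many points: mostly transparent
```

Under this assumed rule, a set of 5,000 meta-model experiments drawn with the default mass would be rendered far more transparently than a small core model design, which is what lets dense regions stand out in a contrast plot.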

emat.viz.scatter_graphs(column, data, scope, db=None, contrast='infer', marker_opacity=None, mass=1000, render=None, use_gl=True)

Generate a row of scatter plots comparing one column against others.

Parameters:

column (str) – The name of the principal parameter or measure to analyze.
data (pandas.DataFrame or str) – The experimental results to plot. Can be given as a DataFrame or as the name of a design (in which case the results are loaded from the provided db).
scope (Scope, optional) – The exploratory scope.
db (Database, optional) – The database containing the results. This is ignored unless data is a string.
contrast (str or list) – The contrast columns to plot the principal parameter or measure against. Can be given as a list of columns that appear in the data, or one of {‘uncertainties’, ‘levers’, ‘parameters’, ‘measures’, ‘infer’}. If set to ‘infer’, the contrast will be ‘measures’ if column is a parameter, and ‘parameters’ if column is a measure.
marker_opacity (float, optional) – The opacity to use for markers. If the number of markers is large, the figure may appear as a solid blob; by setting opacity to less than 1.0, the figure can more readily show relative density in various regions. If not specified, marker_opacity is set based on mass instead.
mass (int or emat.viz.ScatterMass, default 1000) – The target number of rendered points in each figure. Setting to a number less than the number of experiments will make each scatter point partially transparent, which will help visually convey relative density when there are a very large number of points.
render (str or dict, optional) – If given, the graph[s] will be rendered to a static image using plotly.io.to_image. For default settings, pass ‘png’, or give a dictionary that specifies keyword arguments to that function. See emat.util.rendering.render_plotly for more details.
use_gl (bool, default True) – Use Plotly’s Scattergl instead of Scatter, which may provide some performance benefit for large data sets.

Returns:	FigureWidget or xmle.Elem
Raises:	ValueError – If contrast is ‘infer’ but column is neither a parameter nor a measure.

emat.viz.scatter_graphs_2(column, datas, scope, db=None, contrast='infer', render=None, colors=None, use_gl=True, mass=1000)

Generate a row of scatter plots comparing multiple datasets.

This function is similar to scatter_graphs, but accepts multiple data sets and plots them using different colors.

Parameters:

column (str) – The name of the principal parameter or measure to analyze.
datas (Collection[pandas.DataFrame or str]) – The experimental results to plot. Each can be given as a DataFrame or as the name of a design (in which case the results are loaded from the provided Database db).
scope (Scope, optional) – The exploratory scope.
db (Database, optional) – The database containing the results. Ignored unless an entry in datas is a string.
contrast (str or list) – The contrast columns to plot the principal parameter or measure against. Can be given as a list of columns that appear in the data, or one of {‘uncertainties’, ‘levers’, ‘parameters’, ‘measures’, ‘infer’}. If set to ‘infer’, the contrast will be ‘measures’ if column is a parameter, and ‘parameters’ if column is a measure.
render (str or dict, optional) – If given, the graph[s] will be rendered to a static image using plotly.io.to_image. For default settings, pass ‘png’, or give a dictionary that specifies keyword arguments to that function. See emat.util.rendering.render_plotly for more details.
mass (int or emat.viz.ScatterMass, default 1000) – The target number of rendered points in each figure. Setting to a number less than the number of experiments will make each scatter point partially transparent, which will help visually convey relative density when there are a very large number of points.

Returns:	plotly.FigureWidget or xmle.Elem – The latter is returned if a render argument is used.
Raises:	ValueError – If contrast is ‘infer’ but column is neither a parameter nor a measure.