{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import emat\n", "import numpy\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emat.util.distributions import pert, triangle, uniform, get_bounds" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "This page reviews some common continuous distributions used for exploratory and risk analysis.\n", "EMAT can also use any named continuous distribution from the :any:`scipy.stats` module." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Uniform Distribution\n", "\n", "The uniform distribution is defined by a probability density function that is a rectangle.\n", "It has two parameters (minimum, maximum). It is a simple \n", "distribution that is easy to understand and explain, and it is often the\n", "implied default distribution for exploratory analysis." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = \"\"\"---\n", "scope:\n", "    name: demonstration\n", "inputs:\n", "    uncertain_variable_name:\n", "        ptype: uncertainty\n", "        desc: Slightly More Verbose Description\n", "        default: 4\n", "        min: 1\n", "        max: 4\n", "        dist: uniform\n", "        dtype: float\n", "outputs:\n", "    performance_measure_name:\n", "        kind: maximize\n", "...\n", "\"\"\"\n", "s = emat.Scope('t.yaml', scope_def=y)\n", "bounds = (0,5)\n", "x = numpy.linspace(*bounds)\n", "y = s['uncertain_variable_name'].dist.pdf(x)\n", "_=plt.plot(x,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also valid to include the `min` and `max` values under the `dist` key, instead of \n", "as top level keys for the parameter definition." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = \"\"\"---\n", "scope:\n", "    name: demonstration\n", "inputs:\n", "    uncertain_variable_name:\n", "        ptype: uncertainty\n", "        desc: Slightly More Verbose Description\n", "        default: 4\n", "        dist:\n", "            name: uniform\n", "            min: 1\n", "            max: 4\n", "        dtype: float\n", "outputs:\n", "    performance_measure_name:\n", "        kind: maximize\n", "...\n", "\"\"\"\n", "s = emat.Scope('t.yaml', scope_def=y)\n", "bounds = (0,5)\n", "x = numpy.linspace(*bounds)\n", "y = s['uncertain_variable_name'].dist.pdf(x)\n", "_=plt.plot(x,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Triangle Distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The triangle distribution is defined by a probability density function that is a triangle.\n", "It has three parameters (minimum, peak, maximum). It is a simple \n", "distribution that is easy to understand and explain, and unlike the uniform distribution,\n", "it allows more likelihood to be directed towards some particular value." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = numpy.linspace(0,5)\n", "plt.plot(x, triangle(lower_bound=0, upper_bound=5, peak=0.0).pdf(x), label='Peak=0.0')\n", "plt.plot(x, triangle(lower_bound=0, upper_bound=5, peak=0.5).pdf(x), label='Peak=0.5')\n", "plt.plot(x, triangle(lower_bound=0, upper_bound=5, peak=1.0).pdf(x), label='Peak=1.0')\n", "plt.plot(x, triangle(lower_bound=0, upper_bound=5, peak=2.5).pdf(x), label='Peak=2.5')\n", "_=plt.legend()" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. 
autofunction:: emat.util.distributions.triangle" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = \"\"\"---\n", "scope:\n", "    name: demonstration\n", "inputs:\n", "    uncertain_variable_name:\n", "        ptype: uncertainty\n", "        desc: Slightly More Verbose Description\n", "        default: 4\n", "        min: 0\n", "        max: 5\n", "        dist:\n", "            name: triangle\n", "            peak: 4\n", "outputs:\n", "    performance_measure_name:\n", "        kind: maximize\n", "...\n", "\"\"\"\n", "s = emat.Scope('t.yaml', scope_def=y)\n", "bounds = get_bounds(s['uncertain_variable_name'])\n", "x = numpy.linspace(*bounds)\n", "y = s['uncertain_variable_name'].dist.pdf(x)\n", "_=plt.plot(x,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also valid to include the `min` and `max` values under the `dist` key, instead of \n", "as top level keys for the parameter definition." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = \"\"\"---\n", "scope:\n", "    name: demonstration\n", "inputs:\n", "    uncertain_variable_name:\n", "        ptype: uncertainty\n", "        desc: Slightly More Verbose Description\n", "        default: 4\n", "        dist:\n", "            name: triangle\n", "            min: 0\n", "            peak: 4\n", "            max: 5\n", "outputs:\n", "    performance_measure_name:\n", "        kind: maximize\n", "...\n", "\"\"\"\n", "s = emat.Scope('t.yaml', scope_def=y)\n", "bounds = get_bounds(s['uncertain_variable_name'])\n", "x = numpy.linspace(*bounds)\n", "y = s['uncertain_variable_name'].dist.pdf(x)\n", "_=plt.plot(x,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PERT Distribution\n", "\n", "The PERT distribution (\"PERT\" is an acronym for \"program evaluation and review technique\")\n", "is a generally bell-shaped curve that, unlike the normal distribution, has finite minimum and\n", "maximum values. It can be parameterized similarly to the triangular distribution, using\n", "three parameters (minimum, peak, maximum). 
This allows a skew to be introduced by setting \n", "the peak value to something other than the midpoint between the minimum and maximum values." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(x, pert(lower_bound=0, upper_bound=5, peak=0.0).pdf(x), label='Peak=0.0')\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, peak=0.5).pdf(x), label='Peak=0.5')\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, peak=1.0).pdf(x), label='Peak=1.0')\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, peak=2.5).pdf(x), label='Peak=2.5')\n", "_=plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The relative peakiness (i.e., kurtosis) of the distribution can be controlled \n", "using the gamma parameter. The default value of gamma for a PERT distribution is 4.0,\n", "but other positive numbers can be used as well: higher values give a distribution\n", "that more strongly favors outcomes near the peak, while smaller values give less pronounced\n", "weight to values near the peak and relatively more weight to the tails. In the limit,\n", "setting gamma to zero results in a uniform distribution." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(x, pert(lower_bound=0, upper_bound=5, gamma=1).pdf(x), label='gamma=1')\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, gamma=2).pdf(x), label='gamma=2')\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, gamma=3).pdf(x), label='gamma=3')\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, gamma=4).pdf(x), label='gamma=4', lw=3.0)\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, gamma=5).pdf(x), label='gamma=5')\n", "plt.plot(x, pert(lower_bound=0, upper_bound=5, gamma=10).pdf(x), label='gamma=10')\n", "_=plt.legend()" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. 
autofunction:: emat.util.distributions.pert" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The PERT distribution can be indicated in a YAML scope file using the name \"pert\",\n", "with optional values for the other named arguments outlined in the function docstring\n", "shown above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = \"\"\"---\n", "scope:\n", "    name: demonstration\n", "inputs:\n", "    uncertain_variable_name:\n", "        ptype: uncertainty\n", "        desc: Slightly More Verbose Description\n", "        default: 1.0\n", "        min: 0\n", "        max: 5\n", "        dist:\n", "            name: pert\n", "            peak: 4\n", "            gamma: 3\n", "outputs:\n", "    performance_measure_name:\n", "        kind: maximize\n", "...\n", "\"\"\"\n", "s = emat.Scope('t.yaml', scope_def=y)\n", "bounds = get_bounds(s['uncertain_variable_name'])\n", "x = numpy.linspace(*bounds)\n", "y = s['uncertain_variable_name'].dist.pdf(x)\n", "_=plt.plot(x,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also valid to include the `min` and `max` values under the `dist` key, instead of \n", "as top level keys for the parameter definition." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = \"\"\"---\n", "scope:\n", "    name: demonstration\n", "inputs:\n", "    uncertain_variable_name:\n", "        ptype: uncertainty\n", "        desc: Slightly More Verbose Description\n", "        default: 1.0\n", "        dist:\n", "            name: pert\n", "            min: 0\n", "            max: 5\n", "            peak: 4\n", "            gamma: 3\n", "outputs:\n", "    performance_measure_name:\n", "        kind: maximize\n", "...\n", "\"\"\"\n", "s = emat.Scope('t.yaml', scope_def=y)\n", "bounds = get_bounds(s['uncertain_variable_name'])\n", "x = numpy.linspace(*bounds)\n", "y = s['uncertain_variable_name'].dist.pdf(x)\n", "_=plt.plot(x,y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other Distributions" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "It is possible to use any other continuous distribution provided in the :any:`scipy.stats` module.\n", "As a demonstration, below we define a trapezoidal distribution for an uncertainty. Instead of \n", "using the more intuitively named keys shown above, it is necessary to fall back to the standard\n", ":any:`scipy.stats` names for each of the distribution parameters, and they must all be defined within\n", "the `dist` key, which may be less intuitive than the suggested distributions above. For example,\n", "note in the example below that the upper bound of the distribution is implicitly set to 7 based \n", "on the parameters (loc + scale = 2 + 5), and that upper bound is not explicitly identified in the YAML file." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "y = \"\"\"---\n", "scope:\n", "    name: demonstration\n", "inputs:\n", "    uncertain_variable_name:\n", "        ptype: uncertainty\n", "        desc: Slightly More Verbose Description\n", "        default: 1.0\n", "        dist:\n", "            name: trapz\n", "            c: 0.2\n", "            d: 0.5\n", "            loc: 2\n", "            scale: 5\n", "outputs:\n", "    performance_measure_name:\n", "        kind: maximize\n", "...\n", "\"\"\"\n", "s = emat.Scope('t.yaml', scope_def=y)\n", "bounds = get_bounds(s['uncertain_variable_name'])\n", "x = numpy.linspace(*bounds)\n", "y = s['uncertain_variable_name'].dist.pdf(x)\n", "_=plt.plot(x,y)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }