{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import emat\n",
    "import yaml\n",
    "import json"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MappingParser Example\n",
    "\n",
    "In this notebook, we will illustrate the use of a MappingParser with \n",
    "a few simple examples.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from emat.model.core_files.parsers import (\n",
    "    MappingParser,\n",
    "    key\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Parsing a YAML File\n",
    "\n",
    "First, let's consider a `MappingParser` for extracting values from a \n",
    "simple YAML file of traffic counts by time period.  We'll begin \n",
    "by writing such a table as a temporary file to be processed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sample_file_yaml = \"\"\"\n",
    "---\n",
    "LinkID: 123\n",
    "LinkName: Yellow Brick Rd.\n",
    "Toll: 0.30\n",
    "Count_AM: 3498\n",
    "Count_MD: 2340\n",
    "Count_PM: 3821\n",
    "Count_EV: 1820\n",
    "...\n",
    "\"\"\"\n",
    "\n",
    "with open('/tmp/emat_sample_file.yml', 'wt') as f:\n",
    "    f.write(sample_file_yaml)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we wanted to read this YAML file one time, we could easily\n",
    "do so using `yaml.safe_load`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open('/tmp/emat_sample_file.yml', 'rt') as fi: \n",
    "    mapping = yaml.safe_load(fi)\n",
    "    \n",
    "mapping"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is then simple to manually extract individual values by label,\n",
    "or by position, or we could extract a row total to get a daily \n",
    "total count for a link, or take the mean of a column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "{\n",
    "    'AM': mapping['Count_AM'],  # one key\n",
    "    'PM': mapping['Count_PM'],  \n",
    "    'OffPeak': mapping['Count_MD'] + mapping['Count_EV'],  # adding together keys\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `MappingParser` object makes it easy to combine these instructions\n",
    "to extract the same values from the same file in any model run."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "parser = MappingParser(\n",
    "    'emat_sample_file.yml',\n",
    "    {\n",
    "        'AM': key['Count_AM'],  # one key\n",
    "        'PM': key['Count_PM'],  \n",
    "        'OffPeak': key['Count_MD'] + key['Count_EV'],  # adding together keys\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now execute all these instructions by using the `read` method\n",
    "of the parser."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "parser.read(from_dir='/tmp')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the `MappingParser` has some advantages over just writing a custom\n",
    "function for each table to be processed.  The most important is that\n",
    "we do not need to actually parse anything to access the names of the \n",
    "keys available in the parser's output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "parser.measure_names"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Parsing a JSON File\n",
    "\n",
    "The default format for a `MappingParser` input file is YAML,\n",
    "which conveniently can also be used to read performace measures \n",
    "from a JSON file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "with open('/tmp/emat_sample_file.json', 'wt') as f:\n",
    "    json.dump(mapping, f)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "parser = MappingParser(\n",
    "    'emat_sample_file.json',\n",
    "    {\n",
    "        'AM': key['Count_AM'],  # one key\n",
    "        'PM': key['Count_PM'],  \n",
    "        'OffPeak': key['Count_MD'] + key['Count_EV'],  # adding together keys\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "parser.read(from_dir='/tmp')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Parsing other File Formats\n",
    "\n",
    "The `MappingParser` can also be used for other file types that can be read\n",
    "into a simple Python mapping.  For example, consider a mapping encoded as \n",
    "a `msgpack`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import msgpack\n",
    "\n",
    "with open('/tmp/emat_sample_file.msgpk', 'wb') as f:\n",
    "    msgpack.dump(mapping, f)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To parse this file, we'll need to write a small reader function that\n",
    "takes a filename and returns the raw mapping."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def msgpack_load(filename):\n",
    "    with open(filename, 'rb') as fi:\n",
    "        return msgpack.load(fi)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we provide that reader function in the `reader_method` \n",
    "argument when constucting the MappingParser."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "parser = MappingParser(\n",
    "    'emat_sample_file.msgpk',\n",
    "    {\n",
    "        'AM': key['Count_AM'],  # one key\n",
    "        'PM': key['Count_PM'],  \n",
    "        'OffPeak': key['Count_MD'] + key['Count_EV'],  # adding together keys\n",
    "    },\n",
    "    reader_method=msgpack_load\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "parser.read(from_dir='/tmp')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}