{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Modeling with interval averages\n\nTransposing interval-averaged irradiance data\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "This example shows how failing to account for the difference between\ninstantaneous and interval-averaged time series data can introduce\nerror in the modeling process. An instantaneous time series\nrepresents discrete measurements taken at each timestamp, while\nan interval-averaged time series represents the average value across\neach data interval.  For example, the value of an interval-averaged\nhourly time series at 11:00 represents the average value between\n11:00 (inclusive) and 12:00 (exclusive), assuming the series is left-labeled.\nFor a right-labeled time series it would be the average value\nbetween 10:00 (exclusive) and 11:00 (inclusive).  Sometimes timestamps\nare center-labeled, in which case it would be the\naverage value between 10:30 and 11:30.\nInterval-averaged time series are common in\nfield data, where the datalogger averages high-frequency measurements\ninto low-frequency averages for archiving purposes.\n\nIt is important to account for this difference when using\ninterval-averaged weather data for modeling.  This example\nfocuses on calculating solar position appropriately for\nirradiance transposition, but this concept is relevant for\nother steps in the modeling process as well.\n\nThis example calculates a POA irradiance timeseries at 1-second\nresolution as a \"ground truth\" value.  Then it performs the\ntransposition again at lower resolution using interval-averaged\nirradiance components, once using a half-interval shift and\nonce just using the unmodified timestamps. The difference\naffects the solar position calculation: for example, assuming\nwe have average irradiance for the interval 11:00 to 12:00,\nand it came from a left-labeled time series, naively using\nthe unmodified timestamp will calculate solar position for 11:00,\nmeaning the calculated solar position is used to represent\ntimes as far as an hour away.  A better option would be to\ncalculate the solar position at 11:30 to reduce the maximum\ntiming error to only half an hour.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import pvlib\nimport pandas as pd\nimport matplotlib.pyplot as plt"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "First, we'll define a helper function that we can re-use several\ntimes in the following code:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "def transpose(irradiance, timeshift):\n    \"\"\"\n    Transpose irradiance components to plane-of-array, incorporating\n    a timeshift in the solar position calculation.\n\n    Parameters\n    ----------\n        irradiance: DataFrame\n            Has columns dni, ghi, dhi\n        timeshift: float\n            Number of minutes to shift for solar position calculation\n    Outputs:\n        Series of POA irradiance\n    \"\"\"\n    idx = irradiance.index\n    # calculate solar position for shifted timestamps:\n    idx = idx + pd.Timedelta(timeshift, unit='min')\n    solpos = location.get_solarposition(idx)\n    # but still report the values with the original timestamps:\n    solpos.index = irradiance.index\n\n    poa_components = pvlib.irradiance.get_total_irradiance(\n        surface_tilt=20,\n        surface_azimuth=180,\n        solar_zenith=solpos['apparent_zenith'],\n        solar_azimuth=solpos['azimuth'],\n        dni=irradiance['dni'],\n        ghi=irradiance['ghi'],\n        dhi=irradiance['dhi'],\n        model='isotropic',\n    )\n    return poa_components['poa_global']"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now, calculate the \"ground truth\" irradiance data.  We'll simulate\nclear-sky irradiance components at 1-second intervals and calculate\nthe corresponding POA irradiance.  At such a short timescale, the\ndifference between instantaneous and interval-averaged irradiance\nis negligible.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# baseline: all calculations done at 1-second scale\nlocation = pvlib.location.Location(40, -80, tz='Etc/GMT+5')\ntimes = pd.date_range('2019-06-01 05:00', '2019-06-01 19:00',\n                      freq='1s', tz='Etc/GMT+5')\nsolpos = location.get_solarposition(times)\nclearsky = location.get_clearsky(times, solar_position=solpos)\npoa_1s = transpose(clearsky, timeshift=0)  # no shift needed for 1s data"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now, we will aggregate the 1-second values into interval averages.\nTo see how the averaging interval affects results, we'll loop over\na few common data intervals and accumulate the results.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "fig, ax = plt.subplots(figsize=(5, 3))\n\nresults = []\n\nfor timescale_minutes in [1, 5, 10, 15, 30, 60]:\n\n    timescale_str = f'{timescale_minutes}min'\n    # get the \"true\" interval average of poa as the baseline for comparison\n    poa_avg = poa_1s.resample(timescale_str).mean()\n    # get interval averages of irradiance components to use for transposition\n    clearsky_avg = clearsky.resample(timescale_str).mean()\n\n    # low-res interval averages of 1-second data, with NO shift\n    poa_avg_noshift = transpose(clearsky_avg, timeshift=0)\n\n    # low-res interval averages of 1-second data, with half-interval shift\n    poa_avg_halfshift = transpose(clearsky_avg, timeshift=timescale_minutes/2)\n\n    df = pd.DataFrame({\n        'ground truth': poa_avg,\n        'modeled, half shift': poa_avg_halfshift,\n        'modeled, no shift': poa_avg_noshift,\n    })\n    error = df.subtract(df['ground truth'], axis=0)\n    # add another trace to the error plot\n    error['modeled, no shift'].plot(ax=ax, label=timescale_str)\n    # calculate error statistics and save for later\n    stats = error.abs().mean()  # average absolute error across daylight hours\n    stats['timescale_minutes'] = timescale_minutes\n    results.append(stats)\n\nax.legend(ncol=2)\nax.set_ylabel('Transposition Error [W/m$^2$]')\nfig.tight_layout()\n\ndf_results = pd.DataFrame(results).set_index('timescale_minutes')\nprint(df_results)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The errors shown above are the average absolute difference in $W/m^2$.\nIn this example, using the timestamps unadjusted creates an error that\nincreases with increasing interval length, up to a ~40% error\nat hourly resolution.  In contrast, incorporating a half-interval shift\nso that solar position is calculated in the middle of the interval\ninstead of the edge reduces the error by one or two orders of magnitude:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "fig, ax = plt.subplots(figsize=(5, 3))\ndf_results[['modeled, no shift', 'modeled, half shift']].plot.bar(rot=0, ax=ax)\nax.set_ylabel('Mean Absolute Error [W/m$^2$]')\nax.set_xlabel('Transposition Timescale [minutes]')\nfig.tight_layout()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We can also plot the underlying time series results of the last\niteration (hourly in this case).  The modeled irradiance using\nno shift is effectively time-lagged compared with ground truth.\nIn contrast, the half-shift model is nearly identical to the ground\ntruth irradiance.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "fig, ax = plt.subplots(figsize=(5, 3))\nax = df.plot(ax=ax, style=[None, ':', None], lw=3)\nax.set_ylabel('Irradiance [W/m$^2$]')\nfig.tight_layout()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.9"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}