Climate Adaptation Digital Twin: winter temperature in Germany and Heating Degree Days in Darmstadt

This notebook provides guidance on how to access and use the https://cacheb.dcms.destine.eu/d1-climate-dt/ScenarioMIP-SSP3-7.0-IFS-NEMO-0001-high-sfc-v0.zarr dataset. Access to this dataset is restricted to authorized users only, via the Data Cache Management service.

To access the Data Cache Management service, you need to instruct your tools (e.g. Xarray, Zarr...) to use the appropriate access token. This can be done by running the code snippets below.

First, make sure you have an account on the Destination Earth platform. Then run the following cell, entering your Destination Earth credentials (username and password) when prompted:

In [1]:
%%capture cap
%run ../cacheb/cacheb-authentication.py
In [2]:
from pathlib import Path

# Append the credentials printed by the authentication script to ~/.netrc
with open(Path.home() / ".netrc", "a") as fp:
    fp.write(cap.stdout)

⚠ NOTE: the generated password is valid for a limited period of time, and needs to be regenerated and reconfigured periodically by running the cells above.
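
As a quick sanity check after running the cells above, the standard-library `netrc` module can confirm that the file contains an entry for the service. This is only a sketch: the helper name `has_netrc_entry` is hypothetical, and it assumes that the authentication script writes a regular `machine ... login ... password ...` entry for the host `cacheb.dcms.destine.eu` (the host of the dataset URL).

```python
import netrc
from pathlib import Path


def has_netrc_entry(host, netrc_path=None):
    """Return True if the netrc file holds credentials for `host`."""
    path = Path(netrc_path) if netrc_path is not None else Path.home() / ".netrc"
    if not path.exists():
        return False
    try:
        entry = netrc.netrc(str(path)).authenticators(host)
    except netrc.NetrcParseError:
        # A malformed file is treated as "no usable entry".
        return False
    return entry is not None


# Assumed host name, taken from the dataset URL used in this notebook.
print(has_netrc_entry("cacheb.dcms.destine.eu"))
```

If this prints `False`, or if requests start failing after some time, re-running the authentication cells is the first thing to try.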

Goal of this tutorial

The first goal of this tutorial is to plot the average winter 2 metre temperature in Germany for the years 2020-2039.

The second goal of this tutorial is to calculate the Heating Degree Days in Darmstadt in the same years.

What you will learn:

  • how to access the dataset
  • how to select and reduce the data
  • how to plot the results

Working with EDH data

Datasets on EDH are typically very large and remotely hosted. Typical use implies a selection of the data followed by one or more reduction steps, performed in a local or distributed Dask environment.

The structure of a workflow that uses EDH data typically looks like this:

  • data access
  • data selection
  • (optional) data reduction
  • data download
  • further operations and visualization

Xarray and Dask work together following a lazy principle. This means that when you access and manipulate a Zarr store, the data is not immediately downloaded and loaded into memory. Instead, Dask constructs a task graph that represents the operations to be performed. A smart user will first reduce the amount of data that needs to be downloaded and then explicitly call compute() on it. Once the compute() operation is complete, the data is loaded into memory and available for fast subsequent processing.
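
The lazy principle can be observed on a small synthetic array, without touching the remote dataset. The following is a minimal sketch that assumes Dask is installed alongside Xarray; the names `data`, `lazy_mean` and `result` are illustrative only.

```python
import numpy as np
import xarray as xr

# Small synthetic stand-in for a remote Zarr variable (illustrative only).
data = xr.DataArray(
    np.arange(24, dtype="float32").reshape(4, 6),
    dims=("time", "x"),
).chunk({"time": 2})  # backed by a Dask array from here on

# Lazy: this only extends the task graph, nothing is computed yet.
lazy_mean = data.mean("time")
is_lazy = lazy_mean.chunks is not None

# Explicit trigger: the graph is executed and the result is held in memory.
result = lazy_mean.compute()

print(is_lazy, result.chunks is None)
```

The `chunks` attribute is `None` for arrays that are not backed by Dask, which makes it a convenient way to tell whether a result is still lazy.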

1. Data access

In [3]:
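import xarray as xr

url = "https://cacheb.dcms.destine.eu/d1-climate-dt/ScenarioMIP-SSP3-7.0-IFS-NEMO-0001-high-sfc-v0.zarr"

ds = xr.open_dataset(
    url,
    chunks={},  # open lazily with Dask, using the chunking of the Zarr store
    engine="zarr",
    # trust_env=True lets the HTTP client read the credentials stored in ~/.netrc
    storage_options={"client_kwargs": {"trust_env": True}},
)
ds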
Out[3]:
<xarray.Dataset> Size: 188TB
Dimensions:    (time: 175320, latitude: 4096, longitude: 8193)
Coordinates:
  * latitude   (latitude) float64 33kB -90.0 -89.96 -89.91 ... 89.91 89.96 90.0
  * longitude  (longitude) float64 66kB -180.0 -180.0 -179.9 ... 180.0 180.0
    step       timedelta64[ns] 8B ...
    surface    float64 8B ...
  * time       (time) datetime64[ns] 1MB 2020-01-01 ... 2039-12-31T23:00:00
Data variables:
    d2m        (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
    sd         (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
    ssr        (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
    str        (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
    t2m        (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
    tprate     (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
    u10        (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
    v10        (time, latitude, longitude) float32 24TB dask.array<chunksize=(48, 512, 512), meta=np.ndarray>
Attributes:
    Conventions:             CF-1.7
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_edition:            2
    GRIB_subCentre:          1003
    history:                 2024-06-06T16:50 GRIB to CDM+CF via cfgrib-0.9.1...
    institution:             European Centre for Medium-Range Weather Forecasts
Variable details (summary of the expanded dataset view):
    name    long_name                                  units                  step type
    d2m     2 metre dewpoint temperature               K                      instant
    sd      Snow depth                                 m of water equivalent  instant
    ssr     Surface net short-wave (solar) radiation   J m**-2                accum
    str     Surface net long-wave (thermal) radiation  J m**-2                accum
    t2m     2 metre temperature                        K                      instant
    tprate  Total precipitation rate                   kg m**-2 s**-1         instant
    u10     10 metre U wind component                  m s**-1                instant
    v10     10 metre V wind component                  m s**-1                instant
Each variable: shape (175320, 4096, 8193), chunk shape (48, 512, 512), float32, 21.40 TiB in total, 48.00 MiB per chunk, 496808 chunks.
Common GRIB attributes: GRIB_dataType fc, GRIB_gridType healpix, GRIB_numberOfPoints 12582912, GRIB_missingValue 3.4028234663852886e+38.
history (full): 2024-06-06T16:50 GRIB to CDM+CF via cfgrib-0.9.12.0/ecCodes-2.35.0 with {"source": ".xarray-ecmwf-cache/10baaccbb0bb6b0157fbe574dc7d566e.grib", "filter_by_keys": {}, "encode_cf": ["parameter", "time", "geography", "vertical"]}

⚠ At this point, no data has been downloaded yet, nor loaded in memory.
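
The sizes reported in the output above follow directly from the array shape, the chunk shape and the data type, so they can be checked with plain arithmetic before anything is downloaded. The sketch below uses only the numbers printed by xarray; the variable names are illustrative.

```python
import math

shape = (175320, 4096, 8193)   # (time, latitude, longitude), from the repr
chunks = (48, 512, 512)        # chunk shape, from the repr
itemsize = 4                   # float32 uses 4 bytes per value
n_variables = 8                # d2m, sd, ssr, str, t2m, tprate, u10, v10

array_bytes = math.prod(shape) * itemsize
chunk_bytes = math.prod(chunks) * itemsize
# Chunks per axis are rounded up: a partially filled chunk is still a chunk.
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

print(f"per variable: {array_bytes / 2**40:.2f} TiB ({array_bytes / 1e12:.1f} TB)")
print(f"per chunk:    {chunk_bytes / 2**20:.2f} MiB")
print(f"chunks:       {n_chunks}")
print(f"all variables: {n_variables * array_bytes / 1e12:.0f} TB")
```

The results agree with the figures shown by xarray: about 24 TB per variable, 48 MiB per chunk and 188 TB for the whole dataset.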

2. Data selection

We first select the 2 metre temperature and convert it from kelvin to degrees Celsius.

In [4]:
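# Keep variable attributes through arithmetic operations
xr.set_options(keep_attrs=True)

# Convert from kelvin to degrees Celsius and update the units attribute
t2m = ds.t2m - 273.15
t2m.attrs["units"] = "°C"
t2m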
Out[4]:
<xarray.DataArray 't2m' (time: 175320, latitude: 4096, longitude: 8193)> Size: 24TB
dask.array<sub, shape=(175320, 4096, 8193), dtype=float32, chunksize=(48, 512, 512), chunktype=numpy.ndarray>
Coordinates:
  * latitude   (latitude) float64 33kB -90.0 -89.96 -89.91 ... 89.91 89.96 90.0
  * longitude  (longitude) float64 66kB -180.0 -180.0 -179.9 ... 180.0 180.0
    step       timedelta64[ns] 8B ...
    surface    float64 8B ...
  * time       (time) datetime64[ns] 1MB 2020-01-01 ... 2039-12-31T23:00:00
Attributes: (12/19)
    GRIB_NV:                         0
    GRIB_cfName:                     air_temperature
    GRIB_cfVarName:                  t2m
    GRIB_dataType:                   fc
    GRIB_gridDefinitionDescription:  150
    GRIB_gridType:                   healpix
    ...                              ...
    GRIB_typeOfLevel:                heightAboveGround
    GRIB_units:                      K
    last_restart_dim_updated:        175320
    long_name:                       2 metre temperature
    standard_name:                   air_temperature
    units:                           °C

As xarray shows, the t2m DataArray is about 24 TB in size. We will now try to narrow down the selection as much as possible.

We select only a bounding box around Germany and the meteorological winter months (December, January and February).
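
Both kinds of selection can be tried on a tiny synthetic grid first. The sketch below is illustrative only (the array `toy` and its 1-degree grid are made up). It shows that label-based slices rely on ascending coordinates, as in this dataset, and that the winter months can be picked with a boolean mask; here the mask is passed through `.sel(time=...)`, one of several equivalent ways to apply it.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Tiny synthetic grid (illustrative values only): 1-degree spacing, daily data.
time = pd.date_range("2020-01-01", "2020-12-31", freq="D")
lat = np.arange(40.0, 61.0)   # ascending, like the dataset's latitude
lon = np.arange(0.0, 21.0)
toy = xr.DataArray(
    np.zeros((time.size, lat.size, lon.size), dtype="float32"),
    coords={"time": time, "latitude": lat, "longitude": lon},
    dims=("time", "latitude", "longitude"),
)

# Label-based slices include both end points when they exist on the grid.
box = toy.sel(latitude=slice(46, 56), longitude=slice(5, 16))

# Boolean mask along time: keep December, January and February only.
winter = box.sel(time=box.time.dt.month.isin([12, 1, 2]))

print(dict(box.sizes), winter.sizes["time"])
```

The cell below applies the same two steps to the real data.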

In [5]:
germany = {'latitude': slice(46, 56), 'longitude': slice(5, 16)}
t2m_germany = t2m.sel(**germany)
t2m_germany_winter = t2m_germany[t2m_germany.time.dt.month.isin([12, 1, 2])]
t2m_germany_winter
Out[5]:
<xarray.DataArray 't2m' (time: 43320, latitude: 228, longitude: 251)> Size: 10GB
dask.array<getitem, shape=(43320, 228, 251), dtype=float32, chunksize=(47, 228, 251), chunktype=numpy.ndarray>
Coordinates:
  * latitude   (latitude) float64 2kB 46.0 46.04 46.09 ... 55.89 55.93 55.98
  * longitude  (longitude) float64 2kB 5.01 5.054 5.098 ... 15.91 15.95 16.0
    step       timedelta64[ns] 8B ...
    surface    float64 8B ...
  * time       (time) datetime64[ns] 347kB 2020-01-01 ... 2039-12-31T23:00:00
Attributes: (12/19)
    GRIB_NV:                         0
    GRIB_cfName:                     air_temperature
    GRIB_cfVarName:                  t2m
    GRIB_dataType:                   fc
    GRIB_gridDefinitionDescription:  150
    GRIB_gridType:                   healpix
    ...                              ...
    GRIB_typeOfLevel:                heightAboveGround
    GRIB_units:                      K
    last_restart_dim_updated:        175320
    long_name:                       2 metre temperature
    standard_name:                   air_temperature
    units:                           °C

Notice that the in-memory size of the array has dropped to about 10 GB (9.24 GiB). However, because the DataArray is chunked, xarray must download every chunk that contains any part of the selected data.
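
To see why whole chunks matter, the spatial part of the selection can be mapped onto the 512 x 512 chunk grid. The sketch below is a rough check, not part of the original workflow: it assumes both axes are regular (spacing inferred from the coordinate values printed above), that chunk boundaries start at index 0, and it uses the rounded first and last coordinates shown in the output. The helper `chunk_span` is hypothetical.

```python
def chunk_span(first_index, last_index, chunk_size):
    """Number of chunks touched by the inclusive index range along one axis."""
    return last_index // chunk_size - first_index // chunk_size + 1


# Grid spacing inferred from the printed coordinates (assumption: regular axes).
lat_step = 180 / 4095
lon_step = 360 / 8192

# First and last selected coordinates, rounded as printed by xarray.
lat_first, lat_last = 46.0, 55.98
lon_first, lon_last = 5.01, 16.0

lat_idx = (round((lat_first + 90) / lat_step), round((lat_last + 90) / lat_step))
lon_idx = (round((lon_first + 180) / lon_step), round((lon_last + 180) / lon_step))

lat_chunks = chunk_span(*lat_idx, 512)
lon_chunks = chunk_span(*lon_idx, 512)

# Fraction of each downloaded spatial tile that is actually used (228 x 251 points).
used_fraction = (228 * 251) / (lat_chunks * 512 * lon_chunks * 512)

print(lat_idx, lon_idx, lat_chunks, lon_chunks, round(used_fraction, 3))
```

Under these assumptions the bounding box falls inside a single spatial chunk, of which only about a fifth of the points are kept, so the download is considerably larger than the selected array.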

To estimate the size of the download, we can use the costing.py module. The estimate must be made before any reduction operation is applied.

In [6]:
import costing

costing.estimate_download_size(t2m, t2m_germany_winter)
estimated_needed_chunks: 922
estimated_memory_size: 46.406 GB
estimated_download_size: 4.641 GB
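
The reported figures can be cross-checked by hand. The sketch below is an inference from the printed numbers only, since the internals of costing.py are not shown here: the memory estimate is consistent with the number of needed chunks multiplied by the size of one full (48, 512, 512) float32 chunk, and the download estimate is about one tenth of that.

```python
n_chunks = 922                      # estimated_needed_chunks, from the output above
chunk_bytes = 48 * 512 * 512 * 4    # one (48, 512, 512) float32 chunk

memory_gb = n_chunks * chunk_bytes / 1e9
ratio = 4.641 / 46.406              # reported download size / reported memory size

print(f"{memory_gb:.3f} GB in memory, download/memory ratio {ratio:.2f}")
```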

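3. Data reduction

We average the 2 metre temperature over quarters that start on 1 December (QS-DEC), so that December, January and February fall into the same bin. Because only winter months were selected, the other quarters contain no data. Note that the first bin (labelled 2019-12-01) holds only January and February 2020, and the last one (2039-12-01) only December 2039. This is also a lazy operation.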
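
The behaviour of the QS-DEC resampling can be checked on a small synthetic series before it is applied to the real data. This is a sketch with made-up values: the names `series`, `quarterly` and `winter_means` are illustrative, and dropping the empty bins with `dropna` is one possible way to keep only the winter quarters.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series of ones covering two years (illustrative only).
time = pd.date_range("2020-01-01", "2021-12-31", freq="D")
series = xr.DataArray(np.ones(time.size), coords={"time": time}, dims="time")

# Keep the winter months, then average over quarters that start on 1 December.
winter = series.sel(time=series.time.dt.month.isin([12, 1, 2]))
quarterly = winter.resample(time="QS-DEC").mean()

# Bins that received no data hold NaN; dropping them leaves one value per winter.
winter_means = quarterly.dropna("time")

print(winter_means.time.dt.strftime("%Y-%m-%d").values)
```

Each remaining time label is the 1 December on which the corresponding winter bin starts. The cell below applies the same resampling to the German winter selection.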

In [7]:
t2m_germany_winter_mean = t2m_germany_winter.resample(time='QS-DEC').mean(dim="time")
t2m_germany_winter_mean
Out[7]:
<xarray.DataArray 't2m' (time: 81, latitude: 228, longitude: 251)> Size: 19MB
dask.array<transpose, shape=(81, 228, 251), dtype=float32, chunksize=(1, 228, 251), chunktype=numpy.ndarray>
Coordinates:
  * latitude   (latitude) float64 2kB 46.0 46.04 46.09 ... 55.89 55.93 55.98
  * longitude  (longitude) float64 2kB 5.01 5.054 5.098 ... 15.91 15.95 16.0
    step       timedelta64[ns] 8B ...
    surface    float64 8B ...
  * time       (time) datetime64[ns] 648B 2019-12-01 2020-03-01 ... 2039-12-01
Attributes: (12/19)
    GRIB_NV:                         0
    GRIB_cfName:                     air_temperature
    GRIB_cfVarName:                  t2m
    GRIB_dataType:                   fc
    GRIB_gridDefinitionDescription:  150
    GRIB_gridType:                   healpix
    ...                              ...
    GRIB_typeOfLevel:                heightAboveGround
    GRIB_units:                      K
    last_restart_dim_updated:        175320
    long_name:                       2 metre temperature
    standard_name:                   air_temperature
    units:                           °C

4. Data download¶

This is the phase where we explicitly trigger the download of the data. Remember to assign the return value of compute() to a new variable, so that the downloaded data is kept in memory and can be reused.

In [8]:
%%time

t2m_germany_winter_mean_computed = t2m_germany_winter_mean.compute() 
CPU times: user 3min 53s, sys: 1min 21s, total: 5min 14s
Wall time: 1min 48s
In [9]:
t2m_germany_winter_mean_computed = t2m_germany_winter_mean_computed.dropna("time")
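
The reason for assigning the result of compute() to a new variable can be seen on a small synthetic Dask array. The sketch below is not part of the original workflow and uses made-up data; it only assumes that dask and numpy are installed. The lazy object stays lazy after compute() is called, and only the returned value holds the data in memory.

```python
import dask.array as da
import numpy as np

# A lazy reduction: nothing is evaluated yet, Dask only records a task graph.
lazy = da.ones((4, 1000, 1000), chunks=(1, 1000, 1000), dtype="float32").mean(axis=0)

# compute() runs the graph and returns an in-memory NumPy array.
in_memory = lazy.compute()

# The original object is still lazy: calling lazy.compute() again would
# repeat the whole computation (and, for remote data, the download).
print(type(lazy).__name__, type(in_memory).__name__, in_memory.shape)
```

Any later step should therefore work on the computed variable, not on the lazy one.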

5. Visualization¶

We will now create and display an animation of the average winter 2 metre temperature in Germany for the years 2020-2039.

In [10]:
import pandas as pd
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from matplotlib.animation import FuncAnimation
from IPython.display import HTML

fig, ax = plt.subplots(subplot_kw={'projection': ccrs.PlateCarree()})
ax.add_feature(cfeature.COASTLINE, linewidth=0.5)
ax.add_feature(cfeature.BORDERS, linestyle=':', linewidth=0.5)
ax.add_feature(cfeature.OCEAN, facecolor='lightblue', zorder=2)
ax.gridlines(draw_labels=True, zorder=3, color="white", alpha=0.5)

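# draw the first frame once, together with the colorbar; it uses the same fixed
# colour range as the animation frames, so the colorbar is valid for every timestep
t2m_germany_winter_mean_computed.isel(time=0).plot(ax=ax, transform=ccrs.PlateCarree(), cmap='RdBu_r', vmin=-25, vmax=25, add_colorbar=True, cbar_kwargs={'orientation': 'vertical', 'shrink':0.9,'pad': 0.15})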

# function to update the plot for each frame (each timestep)
def update(frame):    
    data = t2m_germany_winter_mean_computed.isel(time=frame)
    plot = data.plot(
        ax=ax, 
        transform=ccrs.PlateCarree(), 
        cmap='RdBu_r', 
        vmin=-25, 
        vmax=25, 
        add_colorbar=False
    )

    ax.set_title(f"Time: {pd.Timestamp(data['time'].values).strftime('%Y-%m-%d')}") 
    return plot

anim = FuncAnimation(fig, update, frames=len(t2m_germany_winter_mean_computed['time']), repeat=True) # Create the animation
plt.close() # close the static plot to avoid duplicate display
HTML(anim.to_jshtml()) # display the animation in the notebook
Out[10]:
[Animation: average winter 2 metre temperature in Germany, 2020-2039]

Heating Degree Days (HDD) in Darmstadt¶

Let us now investigate the Heating Degree Days in Darmstadt. We start again from the 2 metre temperature.

In [11]:
t2m
Out[11]:
<xarray.DataArray 't2m' (time: 175320, latitude: 4096, longitude: 8193)> Size: 24TB
dask.array<sub, shape=(175320, 4096, 8193), dtype=float32, chunksize=(48, 512, 512), chunktype=numpy.ndarray>
Coordinates:
  * latitude   (latitude) float64 33kB -90.0 -89.96 -89.91 ... 89.91 89.96 90.0
  * longitude  (longitude) float64 66kB -180.0 -180.0 -179.9 ... 180.0 180.0
    step       timedelta64[ns] 8B ...
    surface    float64 8B ...
  * time       (time) datetime64[ns] 1MB 2020-01-01 ... 2039-12-31T23:00:00
Attributes: (12/19)
    GRIB_NV:                         0
    GRIB_cfName:                     air_temperature
    GRIB_cfVarName:                  t2m
    GRIB_dataType:                   fc
    GRIB_gridDefinitionDescription:  150
    GRIB_gridType:                   healpix
    ...                              ...
    GRIB_typeOfLevel:                heightAboveGround
    GRIB_units:                      K
    last_restart_dim_updated:        175320
    long_name:                       2 metre temperature
    standard_name:                   air_temperature
    units:                           °C

We narrow down the selection to the grid point closest to Darmstadt.

In [12]:
darmstadt = {"latitude": 49.88, "longitude": 8.65}
base_temperature = 15 #[°C]

t2m_darmstadt = t2m.sel(darmstadt, method="nearest")
t2m_darmstadt
Out[12]:
<xarray.DataArray 't2m' (time: 175320)> Size: 701kB
dask.array<getitem, shape=(175320,), dtype=float32, chunksize=(48,), chunktype=numpy.ndarray>
Coordinates:
    latitude   float64 8B 49.87
    longitude  float64 8B 8.657
    step       timedelta64[ns] 8B ...
    surface    float64 8B ...
  * time       (time) datetime64[ns] 1MB 2020-01-01 ... 2039-12-31T23:00:00
Attributes: (12/19)
    GRIB_NV:                         0
    GRIB_cfName:                     air_temperature
    GRIB_cfVarName:                  t2m
    GRIB_dataType:                   fc
    GRIB_gridDefinitionDescription:  150
    GRIB_gridType:                   healpix
    ...                              ...
    GRIB_typeOfLevel:                heightAboveGround
    GRIB_units:                      K
    last_restart_dim_updated:        175320
    long_name:                       2 metre temperature
    standard_name:                   air_temperature
    units:                           °C
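
To see what method="nearest" does here, the following minimal sketch reproduces the lookup in plain Python. It assumes, based on the coordinate summaries above, that latitude is a regular grid of 4096 points from -90 to 90 and longitude a regular grid of 8193 points from -180 to 180. The helper nearest_on_regular_grid is hypothetical and is not part of Xarray.

```python
def nearest_on_regular_grid(value, start, stop, n_points):
    """Return (index, coordinate) of the regular-grid point closest to `value`."""
    step = (stop - start) / (n_points - 1)
    index = round((value - start) / step)
    index = max(0, min(n_points - 1, index))  # stay inside the grid
    return index, start + index * step

# Darmstadt, as in the cell above
lat_index, lat = nearest_on_regular_grid(49.88, -90.0, 90.0, 4096)
lon_index, lon = nearest_on_regular_grid(8.65, -180.0, 180.0, 8193)

print(lat_index, round(lat, 5))
print(lon_index, round(lon, 5))
```

The resulting coordinates agree with the latitude 49.87 and longitude 8.657 reported for t2m_darmstadt.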

We estimate the cost of the download with the costing.py module.

In [13]:
costing.estimate_download_size(t2m, t2m_darmstadt)
estimated_needed_chunks: 3653
estimated_memory_size: 183.862 GB
estimated_download_size: 18.386 GB
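
The figures above are consistent with the following back-of-the-envelope arithmetic. This is only a sketch of how such an estimate can be derived, not the actual implementation in costing.py. The compression ratio of 10 is an assumption inferred from the ratio between the two reported sizes.

```python
import math

n_time = 175320               # hourly time steps in t2m
chunk_shape = (48, 512, 512)  # (time, latitude, longitude) chunking of t2m
bytes_per_value = 4           # float32

# A single point falls into one spatial chunk, so only the time axis
# determines how many chunks have to be fetched.
needed_chunks = math.ceil(n_time / chunk_shape[0])

chunk_bytes = math.prod(chunk_shape) * bytes_per_value
memory_gb = needed_chunks * chunk_bytes / 1e9

assumed_compression_ratio = 10  # assumption, not read from the store
download_gb = memory_gb / assumed_compression_ratio

print(needed_chunks, round(memory_gb, 3), round(download_gb, 3))
```

Although the selected time series is only about 700 kB, every 48 MiB chunk that contains the Darmstadt grid point has to be fetched in full, which is why the estimate is so much larger than the final result.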

We compute the HDD in Darmstadt with a very simplified formula:

In [14]:
t2m_darmstadt_daily_mean = t2m_darmstadt.resample(time='1D').mean(dim='time')
diff = (base_temperature - t2m_darmstadt_daily_mean)
hdd = diff.where(diff > 0).groupby("time.year").sum()
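
The cell above applies the rule HDD(year) = sum over days of max(0, base_temperature - daily mean temperature). A minimal pure-Python sketch of the same rule follows; the daily mean temperatures are made up for illustration and are not taken from the dataset.

```python
base_temperature = 15.0  # [°C], same threshold as in the notebook

# hypothetical daily mean temperatures in °C, grouped by year
daily_means = {
    2020: [2.0, 10.0, 15.0, 21.0],
    2021: [-5.0, 14.5, 18.0],
}

# days at or above the base temperature contribute nothing
hdd_by_year = {
    year: sum(max(0.0, base_temperature - t) for t in temps)
    for year, temps in daily_means.items()
}

print(hdd_by_year)
```

In the Xarray version, diff.where(diff > 0) plays the role of max(0, ...): values that are not positive become NaN and are skipped by sum().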

We explicitly trigger the download of the data. As before, assign the return value of compute() to a new variable, so that the data is kept in memory.

In [15]:
%%time

hdd_computed = hdd.compute()
CPU times: user 8min 48s, sys: 4min 53s, total: 13min 41s
Wall time: 3min 43s

We can finally visualize the HDD in Darmstadt.

In [16]:
plt.style.use("seaborn-v0_8-darkgrid")

fig, ax = plt.subplots()

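# plot the computed result: passing the lazy `hdd` would trigger the whole computation again
plt.bar(hdd_computed.year, hdd_computed.values, color='#ff0000', alpha=0.7)
plt.xlabel('year')
plt.ylabel('HDD [°C·day]')
plt.grid(axis='y', alpha=0.75)
plt.title('Heating Degree Days in Darmstadt')
plt.xticks(hdd_computed.year[::2]);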
[Figure: bar chart of annual Heating Degree Days in Darmstadt, 2020-2039]