Getting started

Earth Data Hub offers an innovative and super-efficient way to access data. Here is what you need to know to start working.

Datasets are published as Zarr stores. Every user has a monthly download quota, so downloads must be authenticated.

We will show how to obtain and use your authentication credentials later. Let's start by connecting to a test dataset that doesn't need authentication.

Open your first dataset

The easiest way to get started with Earth Data Hub is to use Python and Xarray on our public test dataset.

Make sure Python is set up, then install the basic tools by running:

pip install xarray zarr dask aiohttp

Now you are ready to open our public test dataset. Start Python and run:

import xarray as xr

xr.open_dataset(
    "https://data.earthdatahub.destine.eu/public/test-dataset-v0.zarr",
    chunks={},
    engine="zarr",
)

This will display the dataset as an xarray.Dataset.

If you use a different set of tools, we suggest getting them to work with the public test dataset at "https://data.earthdatahub.destine.eu/public/test-dataset-v0.zarr" before attempting to set up authentication.

Set up your credentials

To access the datasets in Earth Data Hub you need to obtain a personal access token and instruct your tools to use it when downloading the data.

How to obtain the personal access token

To obtain a personal access token you first need to register on the DestinE platform. Then go to the Earth Data Hub account settings, where you can find your default personal access tokens or create others.

Adding the token to the URL

In the following, we will access the same small test dataset as before, but this time from a URL that requires authentication: https://data.earthdatahub.destine.eu/private/test-dataset-v0.zarr.

The easiest way is to pass the personal access token as a password in the Zarr store URL, for example:

import xarray as xr

xr.open_dataset(
    "https://edh:<your personal access token>@data.earthdatahub.destine.eu/private/test-dataset-v0.zarr",
    chunks={},
    engine="zarr",
)
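
To keep the token out of source files, the authenticated URL can also be assembled at run time. The following is a minimal sketch, not part of the Earth Data Hub tooling: the helper name add_token and the EDH_TOKEN environment variable are assumptions made for illustration, while the edh user name comes from the example above. The token is percent-encoded as a precaution, in case it ever contains characters that are special in URLs.

```python
import os
from urllib.parse import quote, urlsplit

PRIVATE_URL = "https://data.earthdatahub.destine.eu/private/test-dataset-v0.zarr"


def add_token(url, token, user="edh"):
    """Return `url` with `user:token@` inserted in front of the host name."""
    parts = urlsplit(url)
    # Percent-encode the token so that characters such as "/" or "@" stay valid.
    netloc = f"{user}:{quote(token, safe='')}@{parts.hostname}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return parts._replace(netloc=netloc).geturl()


# EDH_TOKEN is a hypothetical variable name; the placeholder only keeps the
# sketch runnable when the variable is not set.
token = os.environ.get("EDH_TOKEN", "<your personal access token>")
authenticated_url = add_token(PRIVATE_URL, token)
```

The resulting authenticated_url can then be passed to xr.open_dataset in place of the hard-coded string.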

Configuring the token in the .netrc file

If you plan to use the system regularly, a more convenient way to set up the access token is to configure the .netrc file as follows:

machine data.earthdatahub.destine.eu
     password <your personal access token>
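
The entry can be added with any text editor. As an alternative, here is a hedged sketch that appends it from Python and reads it back with the standard-library netrc module. The function names are hypothetical, and the demonstration writes to a temporary file rather than to your real ~/.netrc. Restricting the file to owner-only permissions is a common precaution for files that hold passwords, not a requirement stated by Earth Data Hub.

```python
import netrc
import os
import tempfile

EDH_HOST = "data.earthdatahub.destine.eu"


def append_netrc_entry(path, machine, token):
    """Append a machine/password entry to `path`, creating it if needed."""
    # The leading newline protects against a file that lacks a final newline.
    entry = f"\nmachine {machine}\n    password {token}\n"
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    with os.fdopen(fd, "a") as handle:
        handle.write(entry)
    # Make sure an already existing file is readable by the owner only.
    os.chmod(path, 0o600)


def token_for(path, machine):
    """Return the password stored for `machine` in `path`, or None."""
    auth = netrc.netrc(path).authenticators(machine)
    return auth[2] if auth else None


# Demonstration against a throwaway file; "example-token" is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    append_netrc_entry(demo_path, EDH_HOST, "example-token")
    demo_token = token_for(demo_path, EDH_HOST)
    demo_missing = token_for(demo_path, "example.org")
```

To apply it for real, you would call append_netrc_entry with the path of your own .netrc file and your personal access token.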

Once this is set up, you can use the URL above directly, and likewise the other URLs that you find in the catalogue:

import xarray as xr

xr.open_dataset(
    "https://data.earthdatahub.destine.eu/private/test-dataset-v0.zarr",
    storage_options={"client_kwargs": {"trust_env": True}},
    chunks={},
    engine="zarr",
)

Keep in mind that some tools do not use the .netrc file by default but can be instructed to do so. For example, Xarray / Zarr need the storage_options={"client_kwargs": {"trust_env": True}} option.
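
If authentication still fails, it can help to check which .netrc file is being looked up and whether it contains an entry for the Earth Data Hub host. The sketch below is a diagnostic aid under a stated assumption: that the HTTP client honours the NETRC environment variable first and otherwise falls back to .netrc (or _netrc on Windows) in your home directory. Other tools may look elsewhere. The function names are hypothetical.

```python
import netrc
import os
import platform
import tempfile
from pathlib import Path


def netrc_path(env=os.environ):
    """Return the netrc location under the lookup convention assumed above."""
    if env.get("NETRC"):
        return Path(env["NETRC"])
    name = "_netrc" if platform.system() == "Windows" else ".netrc"
    return Path.home() / name


def has_entry(host, env=os.environ):
    """Return True if the netrc file exists, parses, and lists `host`."""
    path = netrc_path(env)
    if not path.is_file():
        return False
    try:
        return netrc.netrc(str(path)).authenticators(host) is not None
    except netrc.NetrcParseError:
        return False


# Demonstration with a throwaway file and a private environment mapping,
# so neither your real .netrc nor os.environ is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "netrc"
    demo_file.write_text(
        "machine data.earthdatahub.destine.eu\n    password example-token\n"
    )
    demo_env = {"NETRC": str(demo_file)}
    found = has_entry("data.earthdatahub.destine.eu", env=demo_env)
    missing = has_entry("example.org", env=demo_env)
```

Calling has_entry("data.earthdatahub.destine.eu") without the env argument checks your actual configuration.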

How to use the catalogue

Datasets are offered as homogeneous Zarr stores. Even though a dataset is identified by a single URL, its size may be very large, easily hundreds of terabytes.

You will find the code snippet to access the data with Xarray on the dataset page. It will work out of the box once you have set up the .netrc file as described above.

Be careful: do not try to download an entire dataset. Chances are that you will exceed your quota well before you come even close. Instead, we show how to access and work with large datasets in our tutorials.
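
One way to follow this advice is to open the store lazily, narrow it down to a small selection, and check its size before loading anything. The sketch below illustrates the idea and does not replace the tutorials. The helper names and the budget value are hypothetical, and the quota size itself is not stated here. Note that nbytes reports the uncompressed in-memory size, and that data is fetched in whole chunks, so the amount actually transferred may differ.

```python
def gib(nbytes):
    """Convert a byte count to gibibytes."""
    return nbytes / 2**30


def fits_budget(obj, budget_gib):
    """Return True if `obj.nbytes` is within `budget_gib`.

    `obj` can be any xarray Dataset or DataArray; only `nbytes` is used.
    """
    return gib(obj.nbytes) <= budget_gib


def open_small_selection(url, n=1):
    """Lazily open a Zarr store and keep the first `n` items per dimension."""
    # Imported here so that the helpers above can be used without xarray.
    import xarray as xr

    # With chunks={}, only metadata and coordinate indexes are read here;
    # the data variables stay lazy until they are computed or loaded.
    ds = xr.open_dataset(url, chunks={}, engine="zarr")
    return ds.isel({dim: slice(0, n) for dim in ds.sizes})
```

For example, selection = open_small_selection("https://data.earthdatahub.destine.eu/public/test-dataset-v0.zarr") returns a lazy selection, and selection.load() would only be called when fits_budget(selection, budget_gib=0.1) returns True.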