Getting started with the Analytics Engine (AE)

This notebook covers:

  1. selecting data to work with
  2. retrieving a dataset from the catalog
  3. a simple plot to preview the data
  4. how to export that data

To execute a ‘cell’ of this notebook, place the cursor in the cell and click the ‘play’ icon, or press Shift+Enter. Some cells take longer to run; a [*] appears to the left of the cell while AE is still working.

Step 0: Setup

These cells import our custom library climakitae, and any other specialized libraries needed for a given notebook.

import panel as pn
pn.extension()
import climakitae as ck

To use climakitae (the Python ‘climate kit’ library containing our AE tools), load a new application:

app = ck.Application()

Step 1: Select data

Now we can call ‘select’ to display an interface from which to select the data to examine. Execute the cell, and read on for more explanation.

Currently, you can select from dynamically downscaled data produced at hourly intervals. If you select ‘daily’ or ‘monthly’ for ‘Timescale’, you will receive averages of the hourly data over each day or month. The spatial resolution options, on the other hand, are each the output of a different simulation, nesting to higher resolution over smaller areas.
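
As a rough illustration of what the ‘daily’ option implies, the sketch below averages synthetic hourly values into daily means with xarray. The array, its values, and the variable names are made up for this example. This is not the code AE runs internally, only the general idea of averaging over each day.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Three days of synthetic hourly values (0, 1, ..., 71); not real AE output.
time = pd.Timestamp("2000-01-01") + pd.to_timedelta(np.arange(72), unit="h")
hourly = xr.DataArray(np.arange(72, dtype=float), coords={"time": time}, dims="time")

# Average all hourly values that fall within each calendar day.
daily = hourly.resample(time="1D").mean()
print(daily.values)
```

Each daily value is the mean of 24 hourly values, so the first day (hours 0 through 23) comes out as 11.5.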

Future projections are available through 2100 under one greenhouse gas emissions scenario (Shared Socioeconomic Pathway, or SSP), SSP 3-7.0, for 4 General Circulation Models (GCMs).

At the 45 km and 9 km resolutions, more GCMs are to come, and one GCM was also downscaled for a higher and a lower SSP. (Later, statistical downscaling will also be available at 3 km for more GCMs.)

“Historical Climate” includes data from 1980-2014 simulated by the same GCMs used to produce the SSPs. This data can be appended to an SSP time series using the “Append historical” option. Because this historical data is obtained through simulations, it represents average weather during the historical period and is not meant to capture the historical time series as it actually occurred.
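
Conceptually, appending the historical simulation to an SSP projection joins the two series end to end along the time axis. The sketch below shows that idea with xarray on placeholder annual values. The 2015 start year for the projection and all names here are assumptions for illustration, and the “Append historical” option may work differently internally.

```python
import numpy as np
import xarray as xr

hist_years = np.arange(1980, 2015)  # 1980 through 2014
ssp_years = np.arange(2015, 2101)   # assumed to start in 2015, through 2100

# Placeholder values: 0 marks historical years, 1 marks projected years.
historical = xr.DataArray(np.zeros(hist_years.size), coords={"year": hist_years}, dims="year")
projection = xr.DataArray(np.ones(ssp_years.size), coords={"year": ssp_years}, dims="year")

# Join the two series along the shared time-like dimension.
combined = xr.concat([historical, projection], dim="year")
print(combined.sizes["year"], int(combined.year.min()), int(combined.year.max()))
```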

“Historical Reconstruction” provides a reference downscaled reanalysis dataset based on atmospheric models fit to satellite and station observations, and as a result will reflect observed historical time-evolution of the weather.

app.select()

You do not need to do anything to confirm these selections; simply move on to Step 2.

However, if you want to preview what has been selected, you can type “app.selections” or “app.location” alone in a new cell. These store your selections behind the scenes.

(+ will create a new cell below the currently selected one.)

Step 2: Retrieve data

Call app.retrieve() to assign the subset or combination of data you specified to a variable name of your choosing, as an xarray DataArray.

data_to_use = app.retrieve()

When this is complete, you can preview the retrieved, aggregated dataset.

data_to_use

Step 3: Visualize data

First, some additional imports for plotting.

import hvplot.xarray
import cartopy.crs as ccrs

Preview the data before doing further calculations. This step may take longer than Step 2, because the data is loaded “lazily”: it is not actually read until you output it (by visualizing or exporting).

data_to_use = data_to_use.unify_chunks()
data_to_use.hvplot.quadmesh(
    'lon', 'lat', groupby=['time', 'scenario', 'simulation'],
    crs=ccrs.PlateCarree(), projection=ccrs.Orthographic(-118, 40),
    project=True, rasterize=True, coastline=True, features=['borders'],
)

More plotting helper functions are forthcoming.

See other notebooks for example analyses, or add your own.

# [insert your own code here]

You can load up another variable or resolution by modifying your selections and calling: next_data = app.retrieve()

If you do this a lot, and things are starting to get slow, you might want to try: data_to_use.close()

Step 4: Export data

Use the code below to export a dataset as a NetCDF, GeoTIFF, or CSV file.

app.export_as()

Provide the name of the dataset variable to export, followed by the file name as a quoted string.

If the dataset contains multiple variables, provide an argument specifying which variable to export (e.g. variable="T2").

If you would like to save data as a GeoTIFF or CSV file and the dataset contains scenarios or simulations, additionally provide arguments specifying the scenario (scenario="historical") and the simulation (simulation="cesm2").

app.export_dataset(data_to_use,'my_filename')
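
CSV and GeoTIFF files are essentially flat tables or grids, which is presumably why a single scenario and simulation must be chosen for those formats. The sketch below shows the underlying idea with plain xarray and pandas on a small synthetic array. The coordinate values and the name other_model are placeholders, and this is not how app.export_dataset is implemented.

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Synthetic 4-D array (scenario, simulation, y, x); values are placeholders.
data = xr.DataArray(
    np.arange(2 * 2 * 3 * 4, dtype=float).reshape(2, 2, 3, 4),
    dims=("scenario", "simulation", "y", "x"),
    coords={
        "scenario": ["historical", "ssp370"],
        "simulation": ["cesm2", "other_model"],  # "other_model" is hypothetical
        "y": [0, 1, 2],
        "x": [0, 1, 2, 3],
    },
    name="T2",
)

# Pick one scenario and one simulation so that only a 2-D grid remains.
single = data.sel(scenario="historical", simulation="cesm2")

# Flatten the grid to one row per (y, x) cell and write it to a CSV file.
table = single.to_dataframe()
csv_path = os.path.join(tempfile.mkdtemp(), "my_filename.csv")
table.to_csv(csv_path)
print(len(table), "rows written to", csv_path)
```

Without the selection step, the table would mix rows from every scenario and simulation, which is harder to read back unambiguously.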