The ExtremeWeather Dataset

About the Data

The data is available as one HDF5 file per year. Files are named “climo_yyyy.h5”, for example “climo_1979.h5”.
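As a small illustration of the naming scheme, the sketch below builds the file name for a given year and recovers the year from a file name. The helper names `climo_filename` and `year_from_filename` are hypothetical and not part of the dataset's tooling.

```python
import re


def climo_filename(year):
    """Return the file name for a year, following the climo_yyyy.h5 pattern."""
    return f"climo_{year:04d}.h5"


def year_from_filename(name):
    """Return the year encoded in a climo_yyyy.h5 file name, or None if it does not match."""
    match = re.fullmatch(r"climo_(\d{4})\.h5", name)
    return int(match.group(1)) if match else None


print(climo_filename(1979))
print(year_from_filename("climo_1979.h5"))
```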

Each HDF5 file contains two datasets, "images" and "boxes".

Here is a code snippet that loads the datasets with the Python library h5py:

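import h5py

data_path = "./climo_1979.h5"
h5f = h5py.File(data_path, "r")  # open the file read-only
# Each entry is an h5py dataset; index it (e.g. images[0]) to read a NumPy array
images = h5f["images"]  # shape (1460, 16, 768, 1152)
boxes = h5f["boxes"]    # shape (1460, 15, 5)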

The two variables, "images" and "boxes", are described below:

images

boxes

Download

Additional Info

Variable Values:

The variables are indexed along the second dimension (axis 1) of the "images" dataset in the HDF5 file, in the following order:

0. PRECT

1. PS

2. PSL

3. QREFHT

4. T200

5. T500

6. TMQ

7. TREFHT

8. TS

9. U850

10. UBOT

11. V850

12. VBOT

13. Z100

14. Z200

15. ZBOT
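The ordering above can be turned into a name-to-index lookup so that a variable is selected by name rather than by a hard-coded number. This is a minimal sketch: `VARIABLE_NAMES`, `VARIABLE_INDEX`, and `load_variable` are hypothetical names introduced for illustration, and the sketch assumes the "images" dataset has the layout (time step, variable, height, width) suggested by the shape in the loading snippet.

```python
# Variable names in the order listed above (index along axis 1 of "images").
VARIABLE_NAMES = [
    "PRECT", "PS", "PSL", "QREFHT", "T200", "T500", "TMQ", "TREFHT",
    "TS", "U850", "UBOT", "V850", "VBOT", "Z100", "Z200", "ZBOT",
]

# Map each variable name to its position along the second dimension.
VARIABLE_INDEX = {name: i for i, name in enumerate(VARIABLE_NAMES)}


def load_variable(h5_path, name, time_step):
    """Read one variable at one time step as a 2-D NumPy array.

    Assumes the (time step, variable, height, width) layout described above.
    """
    import h5py  # imported here so the lookup table works without h5py installed

    with h5py.File(h5_path, "r") as h5f:
        # Only the requested 2-D slice is read from disk.
        return h5f["images"][time_step, VARIABLE_INDEX[name]]


print(VARIABLE_INDEX["TMQ"])
```

For example, `load_variable("./climo_1979.h5", "TMQ", 0)` would read the TMQ field for the first time step of 1979, provided the file is present locally.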

More information as to what each variable means is available here

Using the Data in Your Paper

If you use this data in a paper, please cite:

ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events, Racah et al., 2017.

bibtex link