---
license: cc-by-4.0
metrics:
- mse
pipeline_tag: graph-ml
language:
- en
library_name: anemoi
---

# AIFS Single - v1.0

<!-- Provide a quick summary of what the model is/does. -->

Here, we introduce the **Artificial Intelligence Forecasting System (AIFS)**, a data-driven forecast
model developed by the European Centre for Medium-Range Weather Forecasts (ECMWF).

The operational release of AIFS Single v1.0 marks the first operationally supported AIFS model. Version 1.0
supersedes the existing experimental version, [AIFS-single 0.2.1](https://huggingface.co/ecmwf/aifs-single).
The new version brings changes to the AIFS single model including, among many others:

- Improved performance for upper-level atmospheric variables (AIFS-single still uses 13 pressure levels, so this improvement mainly refers to 50 hPa)
- Improved scores for total precipitation.
- Additional output variables, including 100 m winds, snowfall, solar radiation and land variables such as soil moisture and soil temperature.

<div style="display: flex; justify-content: center;">
  <img src="assets/aifs_10days.gif" alt="AIFS 10 days Forecast" style="width: 50%;"/>
</div>

AIFS produces highly skilled forecasts for upper-air variables, surface weather parameters and
tropical cyclone tracks. AIFS-single is run four times daily alongside ECMWF’s physics-based NWP model, and forecasts
are available to the public under [ECMWF’s open data policy](https://www.ecmwf.int/en/forecasts/datasets/open-data).

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

AIFS is based on a graph neural network (GNN) encoder and decoder, and a sliding window transformer processor,
and is trained on ECMWF’s ERA5 re-analysis and ECMWF’s operational numerical weather prediction (NWP) analyses.

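The encoder-processor-decoder layout can be pictured with the short, purely illustrative PyTorch sketch below. It is not the AIFS implementation: the layer sizes, the plain transformer processor standing in for AIFS’s sliding window attention, and the linear stand-ins for the GNN encoder and decoder are assumptions made only to show the data flow.

```python
# Illustrative only: a minimal encoder-processor-decoder skeleton.
# Dimensions and layer choices are assumptions; the real AIFS uses GNN
# encoders/decoders over a grid-to-mesh graph and a sliding window
# transformer processor.
import torch
import torch.nn as nn

class EncoderProcessorDecoder(nn.Module):
    def __init__(self, n_fields: int, hidden: int = 256, n_layers: int = 4):
        super().__init__()
        # Encoder: map grid-point features into a latent representation.
        self.encoder = nn.Linear(n_fields, hidden)
        # Processor: repeated attention blocks acting on the latent space.
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.processor = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Decoder: map the latent representation back to grid-point fields.
        self.decoder = nn.Linear(hidden, n_fields)

    def forward(self, x):                 # x: (batch, n_points, n_fields)
        latent = self.encoder(x)
        latent = self.processor(latent)
        return x + self.decoder(latent)   # predict a residual update of the state

model = EncoderProcessorDecoder(n_fields=8)
state = torch.randn(1, 1024, 8)           # toy batch: 1024 grid points, 8 fields
print(model(state).shape)                  # -> torch.Size([1, 1024, 8])
```
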
<div style="display: flex; justify-content: center;">
  <img src="assets/encoder_graph.jpeg" alt="Encoder graph" style="width: 50%;"/>
  <img src="assets/decoder_graph.jpeg" alt="Decoder graph" style="width: 50%;"/>
</div>

It has a flexible and modular design and supports several levels of parallelism to enable training on
high-resolution input data. AIFS forecast skill is assessed by comparing its forecasts to NWP analyses
and direct observational data.

- **Developed by:** ECMWF
- **Model type:** Encoder-processor-decoder model
- **License:** These model weights are published under a Creative Commons Attribution 4.0 International licence (CC BY 4.0).
  To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** [Anemoi](https://anemoi-docs.readthedocs.io/en/latest/index.html), an open-source framework for creating machine learning (ML) weather forecasting systems, co-developed by ECMWF and a range of national meteorological services across Europe.
- **Paper:** https://arxiv.org/pdf/2406.01465

## How to Get Started with the Model

To generate a new forecast using AIFS, you can use [anemoi-inference](https://github.com/ecmwf/anemoi-inference). The [following notebook](run_AIFS_v0_2_1.ipynb) specifies a
step-by-step workflow for running AIFS from the Hugging Face checkpoint (a minimal code sketch follows the list):

1. **Install Required Packages and Imports**
2. **Retrieve Initial Conditions from ECMWF Open Data**
   - Select a date
   - Get the data from the [ECMWF Open Data API](https://www.ecmwf.int/en/forecasts/datasets/open-data)
   - Get input fields
   - Add the single-level fields and pressure-level fields
   - Convert geopotential height into geopotential
   - Create the initial state
3. **Load the Model and Run the Forecast**
   - Download the model's checkpoint from Hugging Face
   - Create a runner
   - Run the forecast using anemoi-inference
4. **Inspect the generated forecast**
   - Plot a field

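A minimal sketch of steps 3 and 4 is shown below, following the pattern used in the example notebook. The `SimpleRunner` import path and call signature, the repository id and the checkpoint filename are assumptions taken from the v0.2.1 notebook and may differ between anemoi-inference versions; building `input_state` from ECMWF open data (step 2) is omitted here.

```python
# Minimal sketch of steps 3-4 (assumptions noted below); preparing `input_state`
# from ECMWF open data (step 2) is not shown, see the notebook for that part.
from huggingface_hub import hf_hub_download
from anemoi.inference.runners.simple import SimpleRunner  # import path as used in the example notebook

# Download the checkpoint file (repo id and filename follow the v0.2.1 notebook;
# adjust them to the files actually listed in this repository).
checkpoint = hf_hub_download(repo_id="ecmwf/aifs-single", filename="aifs_single_v0.2.1.ckpt")

# `input_state` is a dict holding the initial date and input fields, built in step 2.
input_state = {"date": None, "fields": {}}  # placeholder only

runner = SimpleRunner(checkpoint, device="cuda")

# The runner yields one output state per 6 h step up to the requested lead time (here 10 days).
for state in runner.run(input_state=input_state, lead_time=240):
    print(state["date"])  # inspect or plot fields from `state["fields"]` here
```
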
🚨 **Note**: we train AIFS using `flash_attention` (https://github.com/Dao-AILab/flash-attention).
The Flash Attention package imposes certain software and hardware requirements; these are listed under "Installation and Features" at https://github.com/Dao-AILab/flash-attention.

🚨 **Note**: the `aifs_single_v0.2.1.ckpt` checkpoint contains only the model’s weights.
It does not contain any information about the optimizer states, lr-scheduler states, etc.

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

AIFS is trained to produce 6-hour forecasts. It receives as input a representation of the atmospheric states
at \\(t_{-6h}\\) and \\(t_{0}\\), and then forecasts the state at time \\(t_{+6h}\\).

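In other words, a single model call maps the pair \\((x_{t-6h}, x_{t_0})\\) to \\(x_{t+6h}\\), and longer forecasts are obtained by feeding each prediction back in autoregressively. A hypothetical sketch of that data flow (the `model` call signature is an assumption used only for illustration):

```python
# Hypothetical data-flow sketch: one 6 h step takes the two most recent states;
# longer lead times are reached by rolling the model forward on its own output.
def rollout(model, x_prev, x_curr, n_steps):
    """Yield forecast states at +6h, +12h, ... given states at t-6h and t0."""
    for _ in range(n_steps):
        x_next = model(x_prev, x_curr)   # one 6 h forecast step
        yield x_next
        x_prev, x_curr = x_curr, x_next  # feed the prediction back in

# Example: 40 steps of 6 h = a 10-day forecast.
# forecast = list(rollout(model, x_minus_6h, x_0h, n_steps=40))
```
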
<div style="display: flex; justify-content: center;">
  <img src="assets/aifs_diagram.png" alt="AIFS diagram" style="width: 80%;"/>
</div>

The full list of input and output fields is shown below:

| Field | Level type | Input/Output |
|-------|------------|--------------|
| Geopotential, horizontal and vertical wind components, specific humidity, temperature | Pressure level: 50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000 | Both |
| Surface pressure, mean sea-level pressure, skin temperature, 2 m temperature, 2 m dewpoint temperature, 10 m horizontal wind components, total column water | Surface | Both |
| Total precipitation, convective precipitation | Surface | Output |
| Land-sea mask, orography, standard deviation of sub-grid orography, slope of sub-scale orography, insolation, latitude/longitude, time of day/day of year | Surface | Input |

Input and output states are normalised to unit variance and zero mean for each level. Some of
the forcing variables, like orography, are min-max normalised.

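As a rough illustration of the two normalisation schemes mentioned above, here is a hypothetical NumPy sketch; in practice the statistics are pre-computed over the training dataset rather than from a single array.

```python
import numpy as np

def standardise(x, mean, std):
    """Zero-mean / unit-variance scaling, as used for input and output states."""
    return (x - mean) / std

def min_max(x, x_min, x_max):
    """Min-max scaling to [0, 1], as used for forcings such as orography."""
    return (x - x_min) / (x_max - x_min)

# Toy example with per-level statistics (levels, grid points):
field = np.random.rand(13, 1000)
mean = field.mean(axis=1, keepdims=True)
std = field.std(axis=1, keepdims=True)
print(standardise(field, mean, std).mean(axis=1))  # ~0 for every level
```
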
### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- **Pre-training**: Pre-training was performed on ERA5 for the years 1979 to 2020 with a cosine learning rate (LR) schedule and a total
of 260,000 steps. The LR is increased from 0 to \\(10^{-4}\\) during the first 1000 steps, then it is annealed to a minimum
of \\(3 × 10^{-7}\\) (a schematic of this schedule is sketched after this list).
- **Fine-tuning I**: Pre-training is then followed by rollout training on ERA5 for the years 1979 to 2018, this time with an LR
of \\(6 × 10^{-7}\\). As in Lam et al. [2023] (doi: 10.21957/slk503fs2i), we increase the
rollout every 1000 training steps up to a maximum of 72 h (12 auto-regressive steps).
- **Fine-tuning II**: Finally, to further improve forecast performance, we fine-tune the model on operational real-time IFS NWP
analyses. This is done via another round of rollout training, this time using IFS operational analysis data
from 2019 and 2020.

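A schematic of the pre-training learning-rate schedule described in the first bullet, assuming a linear warm-up over the first 1000 steps followed by cosine annealing to the minimum over the remaining steps (the exact schedule used in training may differ in detail):

```python
import math

def pretraining_lr(step, total_steps=260_000, warmup=1_000,
                   lr_max=1e-4, lr_min=3e-7):
    """Linear warm-up from 0 to lr_max, then cosine annealing down to lr_min."""
    if step < warmup:
        return lr_max * step / warmup
    progress = (step - warmup) / (total_steps - warmup)   # 0 -> 1 after warm-up
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

print(pretraining_lr(500), pretraining_lr(1_000), pretraining_lr(260_000))
# -> 5e-05 0.0001 3e-07 (approximately)
```
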
#### Training Hyperparameters

- **Optimizer:** We use *AdamW* (Loshchilov and Hutter [2019]) with the \\(\beta\\)-coefficients set to 0.9 and 0.95.

- **Loss function:** The loss function is an area-weighted mean squared error (MSE) between the target atmospheric state
and prediction.

- **Loss scaling:** A loss scaling is applied for each output variable. The scaling was chosen empirically such that
all prognostic variables have roughly equal contributions to the loss, with the exception of the vertical velocities,
for which the weight was reduced. The loss weights also decrease linearly with height, which means that levels in
the upper atmosphere (e.g., 50 hPa) contribute relatively little to the total loss value (a sketch of the weighted loss follows this list).

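A hypothetical sketch of such an area- and variable-weighted MSE is given below; the tensor layout and the example weight values are illustrative assumptions, not the AIFS implementation.

```python
import torch

def weighted_mse(pred, target, area_w, var_w):
    """Area- and variable-weighted MSE.

    pred, target: (batch, points, variables)
    area_w:       (points,)    e.g. proportional to grid-cell area
    var_w:        (variables,) per-variable loss scaling
    """
    sq_err = (pred - target) ** 2
    w = area_w[None, :, None] * var_w[None, None, :]
    return (w * sq_err).sum() / w.sum() / pred.shape[0]

# Toy example: 2 samples, 4 grid points, 3 variables.
pred, target = torch.randn(2, 4, 3), torch.randn(2, 4, 3)
area_w = torch.tensor([1.0, 0.9, 0.9, 0.8])   # illustrative area weights
var_w = torch.tensor([1.0, 1.0, 0.5])         # e.g. down-weight vertical velocity
print(weighted_mse(pred, target, area_w, var_w))
```
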
#### Speeds, Sizes, Times

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

Data parallelism is used for training, with a batch size of 16. One model instance is split across four 40 GB A100
GPUs within one node. Training is done using mixed precision (Micikevicius et al. [2018]), and the entire process
takes about one week, with 64 GPUs in total. The checkpoint size is 1.19 GB and, as mentioned above, it does not include the optimizer
state.

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

AIFS is evaluated against the ECMWF IFS (Integrated Forecasting System) for 2022. The results of this evaluation are summarised in
the scorecard below, which compares different forecast skill measures across a range of
variables. For verification, each system is compared against the operational ECMWF analysis from which the forecasts
are initialised. In addition, the forecasts are compared against radiosonde observations of geopotential, temperature
and wind speed, and SYNOP observations of 2 m temperature, 10 m wind and 24 h total precipitation. The definitions
of the metrics, such as ACC (ccaf), RMSE (rmsef) and forecast activity (standard deviation of forecast anomaly,
sdaf), can be found in e.g. Ben Bouallegue et al. [2024].

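For orientation, generic area-weighted forms of these scores are given below, where \\(f_i\\), \\(a_i\\) and \\(c_i\\) denote the forecast, the verifying analysis and the climatology at grid point \\(i\\), and \\(w_i\\) are area weights; the reference above gives the exact conventions used in the scorecard.

\\[
\mathrm{rmsef} = \sqrt{\frac{\sum_i w_i\,(f_i - a_i)^2}{\sum_i w_i}},
\qquad
\mathrm{ccaf} = \frac{\sum_i w_i\,(f_i - c_i)(a_i - c_i)}{\sqrt{\sum_i w_i\,(f_i - c_i)^2}\,\sqrt{\sum_i w_i\,(a_i - c_i)^2}}
\\]

Forecast activity (sdaf) is the corresponding area-weighted standard deviation of the forecast anomaly \\(f_i - c_i\\).
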
<div style="display: flex; justify-content: center;">
  <img src="assets/aifs_v021_scorecard.png" alt="Scorecard comparing forecast scores of AIFS versus IFS (2022)" style="width: 80%;"/>
</div>

Forecasts are initialised at 00 and 12 UTC. The scorecard shows relative score changes as a function of lead time (day 1 to 10) for the northern extra-tropics (n.hem),
southern extra-tropics (s.hem), tropics and Europe. Blue colours mark score improvements and red colours score
degradations. Purple colours indicate an increase in standard deviation of forecast anomaly, while green colours
indicate a reduction. Framed rectangles indicate the 95% significance level. Variables are geopotential (z), temperature
(t), wind speed (ff), mean sea level pressure (msl), 2 m temperature (2t), 10 m wind speed (10ff) and 24 h total
precipitation (tp). Numbers behind variable abbreviations indicate variables on pressure levels (e.g., 500 hPa), and
the suffix indicates verification against IFS NWP analyses (an) or radiosonde and SYNOP observations (ob). Scores
shown are anomaly correlation (ccaf), SEEPS (seeps, for precipitation), RMSE (rmsef) and standard deviation of
forecast anomaly (sdaf, see text for more explanation).

Additional evaluation analysis, including tropical cyclone performance and comparisons against other popular data-driven models, can be found in section 4 of the AIFS preprint (https://arxiv.org/pdf/2406.01465v1).

## Known limitations

- This version of AIFS shares certain limitations with some of the other data-driven weather forecast models that are trained with a weighted MSE loss, such as blurring of the forecast fields at longer lead times.
- AIFS exhibits reduced forecast skill in the stratosphere owing to the linear loss scaling with height.
- AIFS currently underestimates the intensity of some high-impact systems such as tropical cyclones.

## Technical Specifications

### Hardware

<!-- {{ hardware_requirements | default("[More Information Needed]", true)}} -->

We acknowledge PRACE for awarding us access to Leonardo, CINECA, Italy. In particular, this version of the AIFS has been trained
on 64 A100 GPUs (40 GB).

### Software

The model was developed and trained using the [AnemoI framework](https://anemoi-docs.readthedocs.io/en/latest/index.html).
AnemoI is a framework for developing machine learning weather forecasting models. It comprises components or packages
for preparing training datasets, conducting ML model training, and a registry for datasets and trained models. AnemoI
provides tools for operational inference, including interfacing to verification software. As a framework it seeks to
handle many of the complexities that meteorological organisations will share, allowing them to easily train models from
existing recipes but with their own data.

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

If you use this model in your work, please cite it as follows:

**BibTeX:**

```
@article{lang2024aifs,
  title={AIFS - ECMWF's data-driven forecasting system},
  author={Lang, Simon and Alexe, Mihai and Chantry, Matthew and Dramsch, Jesper and Pinault, Florian and Raoult, Baudouin and Clare, Mariana CA and Lessig, Christian and Maier-Gerber, Michael and Magnusson, Linus and others},
  journal={arXiv preprint arXiv:2406.01465},
  year={2024}
}
```

**APA:**

```
Lang, S., Alexe, M., Chantry, M., Dramsch, J., Pinault, F., Raoult, B., ... & Rabier, F. (2024). AIFS - ECMWF's data-driven forecasting system. arXiv preprint arXiv:2406.01465.
```

## More Information

[Find the paper here](https://arxiv.org/pdf/2406.01465)