jordancaraballo committed
Commit 5e2e65c
1 Parent(s): 046eae9

Adding wrf components to production

README.md CHANGED
@@ -13,7 +13,7 @@ app_port: 7860
 
 Wildfire occurrence modeling using Terrestrial Ecosystem Models and Artificial Intelligence
 
-[CG Lightning Probability Forecast](https://jordancaraballo-alaska-wildfire-occurrence.hf.space/)
+[CG Lightning Probability Forecast](https://huggingface.co/spaces/jordancaraballo/alaska-wildfire-occurrence)
 
 ## Objectives
 
@@ -23,50 +23,27 @@ Wildfire occurrence modeling using Terrestrial Ecosystem Models and Artificial I
 - 30m local Alaska models, 1km circumpolar models
 - Integration of precipitation, temperature and lightning datasets
 
-## Datasets
-
-1. Daily Fire Ignition Points
-
-```bash
-```
-
-2. Daily Area Burned
-
-The dataset comes from https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1559 for 2001-2019. This dataset
-will be extended for 2020-2025. Dataset is located under /explore/nobackup/projects/ilab/projects/LobodaTFO/data/raw_data/ABoVE_DoB.
-
-```bash
-python DAACDataDownload.py -dir /explore/nobackup/projects/ilab/projects/LobodaTFO/data/raw_data/ABoVE_DoB -f URL_FROM_ORDER
-```
-
-3. Annual Fuel Composition
-
-```bash
-```
-
-4. Human Accesibility
-
-```bash
-```
+## Containers
 
-5. Topographic Influence
+### Python Container
 
 ```bash
+module load singularity
+singularity build --sandbox /lscratch/$USER/container/wildfire-occurrence docker://nasanccs/wildfire-occurrence:latest
 ```
 
-All datasets described above will be delivered in the 1 km modeling grid for tundra ecoregions.
-
-## Containers
+## Quickstart
 
-### Python Container
+### Executing WRF
 
 ```bash
-module load singularity
-singularity build --sandbox /lscratch/$USER/container/wildfire-occurrence docker://nasanccs/wildfire-occurrence:latest
+singularity exec --env PYTHONPATH="/explore/nobackup/people/$USER/development/wildfire-occurrence" --nv -B /explore/nobackup/projects/ilab,$NOBACKUP,/lscratch,/explore/nobackup/people /lscratch/$USER/container/wildfire-occurrence python /explore/nobackup/people/$USER/development/wildfire-occurrence/wildfire_occurrence/view/wrf_pipeline_cli.py -c /explore/nobackup/people/$USER/development/wildfire-occurrence/wildfire_occurrence/templates/config.yaml --pipeline-step all --start-date 2023-06-05 --forecast-lenght 10
 ```
 
 ## Extracting variables from WRF
 
+Run the following to extract variables from WRF and perform lightning inference:
+
 ```bash
 singularity shell --nv -B /explore/nobackup/projects/ilab,/explore/nobackup/projects/3sl,$NOBACKUP,/lscratch,/explore/nobackup/people /lscratch/jacaraba/container/wildfire-occurrence/
 python wrf_analysis.py
@@ -78,8 +55,11 @@ python wrf_analysis.py
 singularity exec --env PYTHONPATH="/explore/nobackup/people/jacaraba/development/wildfire-occurrence" --nv -B /explore/nobackup/projects/ilab,/explore/nobackup/projects/3sl,$NOBACKUP,/lscratch,/explore/nobackup/people /lscratch/jacaraba/container/wildfire-occurrence python /explore/nobackup/people/jacaraba/development/wildfire-occurrence/wildfire_occurrence/model/lightning/lightning_model.py
 ```
 
-(base) [jacaraba@gpu021 ~]$ singularity exec --env PYTHONPATH="/explore/nobackup/people/jacaraba/development/wildfire-occurrence" --nv -B /explore/nobackup/projects/ilab,/explore/nobackup/projects/3sl,$NOBACKUP,/lscratch,/explore/nobackup/people /lscratch/jacaraba/container/wildfire-occurrence python /explore/nobackup/people/jacaraba/development/wildfire-occurrence/wildfire_occurrence/model/lightning/lightning_model.py
+Full Data Pipeline Command
 
+```bash
+singularity exec --env PYTHONPATH="/explore/nobackup/people/jacaraba/development/wildfire-occurrence" --nv -B /explore/nobackup/projects/ilab,/explore/nobackup/projects/3sl,$NOBACKUP,/lscratch,/explore/nobackup/people /lscratch/jacaraba/container/wildfire-occurrence python /explore/nobackup/people/jacaraba/development/wildfire-occurrence/wildfire_occurrence/model/lightning/lightning_model.py
+```
 
 ## Contributors
 
wildfire_occurrence/model/__init__.py ADDED
File without changes
wildfire_occurrence/model/data_download/__init__.py ADDED
File without changes
wildfire_occurrence/model/data_download/ncep_fnl.py ADDED
@@ -0,0 +1,222 @@
+import os
+import re
+import sys
+import logging
+import requests
+import datetime
+import pandas as pd
+from datetime import date
+from typing import List, Literal
+from multiprocessing import Pool, cpu_count
+
+__data_source__ = 'https://rda.ucar.edu/datasets/ds083.2'
+
+
+class NCEP_FNL(object):
+
+    def __init__(
+        self,
+        output_dir: str,
+        start_date: str = date.today(),
+        end_date: str = date.today(),
+        hour_intervals: List = ['00', '06', '12', '18'],
+        n_procs: int = cpu_count()
+    ):
+
+        # output directory
+        self.output_dir = output_dir
+
+        # define start date of download
+        if isinstance(start_date, str):
+            self.start_date = datetime.datetime.strptime(
+                start_date, '%Y-%m-%d').date()
+        else:
+            self.start_date = start_date
+
+        # define end date of download
+        if isinstance(end_date, str):
+            self.end_date = datetime.datetime.strptime(
+                end_date, '%Y-%m-%d').date()
+        else:
+            self.end_date = end_date
+
+        # define hour intervals
+        self.hour_intervals = hour_intervals
+
+        # make sure we do not download data into the future; compare
+        # date to date, since comparing date to datetime raises TypeError
+        now = datetime.datetime.now()
+        if self.end_date > now.date():
+            self.end_date = now.date()
+            self.hour_intervals = [
+                d for d in self.hour_intervals
+                if int(d) <= now.hour - 6]
+        logging.info(
+            f'Downloading data from {self.start_date} to {self.end_date}')
+
+        # check for email and password environment variables
+        if "NCEP_FNL_EMAIL" not in os.environ \
+                or "NCEP_FNL_KEY" not in os.environ:
+            sys.exit(
+                "ERROR: You need to set NCEP_FNL_EMAIL and NCEP_FNL_KEY " +
+                "to enable data downloads. If you do not have an " +
+                "account, go to https://rda.ucar.edu/ and create one."
+            )
+
+        # define email and password fields
+        self.email = os.environ['NCEP_FNL_EMAIL']
+        assert re.search(r'[\w.]+\@[\w.]+', self.email), \
+            f'{self.email} is not a valid email.'
+
+        self.password = os.environ['NCEP_FNL_KEY']
+
+        # define cookie filename to store auth
+        self.cookie_filename = f'/home/{os.environ["USER"]}/.ncep_cookie'
+
+        # define login url
+        self.auth_url = 'https://rda.ucar.edu/cgi-bin/login'
+        self.auth_request = {
+            'email': self.email,
+            'passwd': self.password,
+            'action': 'login'
+        }
+
+        # define data url
+        self.data_url = 'https://rda.ucar.edu'
+
+        if self.start_date.year < 2008:
+            self.grib_format = 'grib1'
+        else:
+            self.grib_format = 'grib2'
+
+        self.dataset_path = f'/data/OS/ds083.2/{self.grib_format}'
+
+        # number of processors to use
+        self.n_procs = n_procs
+
+    def _authenticate(self, action: Literal["auth", "cleanup"] = "auth"):
+
+        if action == "cleanup":
+            # cleanup cookie filename
+            os.remove(self.cookie_filename)
+        else:
+            # attempt to authenticate
+            ret = requests.post(self.auth_url, data=self.auth_request)
+            if ret.status_code != 200:
+                sys.exit('Bad Authentication. Check email and password.')
+
+            logging.info('Authenticated')
+
+            os.system(
+                f'wget --save-cookies {self.cookie_filename} ' +
+                '--delete-after --no-verbose ' +
+                f'--post-data="email={self.email}&' +
+                f'passwd={self.password}&action=login" {self.auth_url}'
+            )
+        return
+
+    def _download_file(self, wget_request: str):
+        logging.info(wget_request)
+        os.system(wget_request)
+        return
+
+    def download(self):
+
+        # authenticate against NCEP
+        self._authenticate(action="auth")
+
+        # get list of filenames to download
+        filenames = self._get_filenames()
+
+        # setup list for parallel downloads
+        download_requests = []
+        for filename in filenames:
+
+            # get year from the filename
+            year = re.search(r'\d{4}', filename).group(0)
+
+            # set full output directory and create it
+            output_dir = os.path.join(self.output_dir, year)
+            os.makedirs(output_dir, exist_ok=True)
+
+            # set full url and output filename
+            full_url = self.data_url + filename
+            output_filename = os.path.join(
+                output_dir, os.path.basename(filename))
+            logging.info(f'Downloading {full_url} to {output_filename}')
+
+            # queue request for parallel download, skipping complete files
+            if not os.path.isfile(output_filename) or \
+                    os.path.getsize(output_filename) == 0:
+                download_requests.append(
+                    f'wget --load-cookies {self.cookie_filename} ' +
+                    f'--no-verbose -O {output_filename} {full_url}'
+                )
+
+        # set pool, start parallel multiprocessing
+        p = Pool(processes=self.n_procs)
+        p.map(self._download_file, download_requests)
+        p.close()
+        p.join()
+
+        # remove the cached authentication cookie
+        self._authenticate(action="cleanup")
+
+        return
+
+    def _get_filenames(self):
+        filenames_list = []
+        daterange = pd.date_range(self.start_date, self.end_date)
+        for single_date in daterange:
+            year = single_date.strftime("%Y")
+            for hour in self.hour_intervals:
+                filename = os.path.join(
+                    self.dataset_path,
+                    f'{year}/{single_date.strftime("%Y.%m")}',
+                    f'fnl_{single_date.strftime("%Y%m%d")}_' +
+                    f'{hour}_00.{self.grib_format}'
+                )
+                filenames_list.append(filename)
+        return filenames_list
+
+
+# -----------------------------------------------------------------------------
+# Invoke the main
+# -----------------------------------------------------------------------------
+if __name__ == "__main__":
+
+    dates = [
+        '2003-06-23',
+        '2005-06-11',
+        '2005-06-29',
+        '2005-08-16',
+        '2007-07-04',
+        '2007-07-11',
+        '2008-06-25',
+        '2009-06-09',
+        '2010-07-01',
+        '2013-06-20',
+        '2013-08-16',
+        '2015-07-14',
+        '2015-06-21',
+        '2015-07-23',
+        '2016-07-11',
+        '2022-01-10',
+        '2022-07-03',
+        '2018-02-25',
+        '2019-08-04',
+        '2019-08-19',
+        '2020-09-03',
+        '2022-05-09',
+        '2023-06-04'
+    ]
+
+    for init_date in dates:
+
+        start_date = datetime.datetime.strptime(init_date, "%Y-%m-%d")
+        end_date = (start_date + datetime.timedelta(days=10))
+
+        downloader = NCEP_FNL(
+            output_dir='/explore/nobackup/projects/ilab/projects/LobodaTFO/data/WRF_Data/NCEP_FNL',
+            start_date=start_date.strftime('%Y-%m-%d'),
+            end_date=end_date.strftime('%Y-%m-%d')
+        )
+        downloader.download()
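
A minimal usage sketch of this downloader, assuming valid RDA credentials are exported as NCEP_FNL_EMAIL and NCEP_FNL_KEY; the output path and dates below are illustrative:

```python
from wildfire_occurrence.model.data_download.ncep_fnl import NCEP_FNL

# Download a single 10-day window of NCEP FNL files
# (four analysis times per day by default).
downloader = NCEP_FNL(
    output_dir='/tmp/ncep_fnl',  # illustrative output location
    start_date='2023-06-05',
    end_date='2023-06-15'
)
downloader.download()
```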
wildfire_occurrence/model/pipelines/wrf_pipeline.py ADDED
@@ -0,0 +1,71 @@
+import os
+import logging
+import datetime
+from wildfire_occurrence.model.config import Config
+from wildfire_occurrence.model.common import read_config
+from wildfire_occurrence.model.data_download.ncep_fnl import NCEP_FNL
+
+
+class WRFPipeline(object):
+
+    def __init__(
+        self,
+        config_filename: str,
+        start_date: datetime.date,
+        forecast_lenght: int
+    ):
+
+        # Configuration file initialization
+        self.conf = read_config(config_filename, Config)
+        logging.info(f'Loaded configuration from {config_filename}')
+
+        # Set value for forecast start and end date
+        self.start_date = start_date
+        self.end_date = self.start_date + datetime.timedelta(
+            days=forecast_lenght)
+        logging.info(f'WRF start: {self.start_date}, end: {self.end_date}')
+
+        # Generate working directories
+        os.makedirs(self.conf.working_dir, exist_ok=True)
+        logging.info(f'Created working directory {self.conf.working_dir}')
+
+        # Setup output directory named after the forecast start and end dates
+        self.output_dir = os.path.join(
+            self.conf.working_dir,
+            f'{self.start_date.strftime("%Y-%m-%d")}_' +
+            f'{self.end_date.strftime("%Y-%m-%d")}'
+        )
+        os.makedirs(self.output_dir, exist_ok=True)
+        logging.info(f'Created output directory {self.output_dir}')
+
+        # Setup data_dir
+        self.data_dir = os.path.join(self.output_dir, 'data')
+
+    # -------------------------------------------------------------------------
+    # download
+    # -------------------------------------------------------------------------
+    def download(self):
+
+        # Working on the setup of the project
+        logging.info('Starting download pipeline step')
+
+        # Generate subdirectories to work with WRF
+        os.makedirs(self.data_dir, exist_ok=True)
+        logging.info(f'Created data directory {self.data_dir}')
+
+        # Generate data downloader
+        data_downloader = NCEP_FNL(
+            self.data_dir,
+            self.start_date,
+            self.end_date
+        )
+        data_downloader.download()
+
+        return
+
+    # -------------------------------------------------------------------------
+    # geogrid
+    # -------------------------------------------------------------------------
+    def geogrid(self):
+        # WPS geogrid step, still to be implemented
+        logging.info('Running geogrid')
+        return
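
A short sketch of driving the pipeline directly instead of through the CLI, assuming a configuration file whose working_dir points at a writable location; 'config.yaml' is an illustrative path:

```python
import datetime
from wildfire_occurrence.model.pipelines.wrf_pipeline import WRFPipeline

# Stage NCEP FNL input data for a 10-day forecast starting 2023-06-05.
pipeline = WRFPipeline('config.yaml', datetime.date(2023, 6, 5), 10)
pipeline.download()
```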
wildfire_occurrence/view/wrf_pipeline_cli.py ADDED
@@ -0,0 +1,91 @@
+import sys
+import time
+import logging
+import argparse
+from datetime import date
+from wildfire_occurrence.model.common import valid_date
+from wildfire_occurrence.model.pipelines.wrf_pipeline import WRFPipeline
+
+
+# -----------------------------------------------------------------------------
+# main
+#
+# python wrf_pipeline_cli.py -c config.yaml -d 2023-04-05 -l 10 -s all
+# -----------------------------------------------------------------------------
+def main():
+
+    # Process command-line args.
+    desc = 'Use this application to run the WRF data pipeline.'
+    parser = argparse.ArgumentParser(description=desc)
+
+    parser.add_argument('-c',
+                        '--config-file',
+                        type=str,
+                        required=True,
+                        dest='config_file',
+                        help='Path to the configuration file')
+
+    parser.add_argument('-d',
+                        '--start-date',
+                        type=valid_date,
+                        required=False,
+                        default=date.today(),
+                        dest='start_date',
+                        help='Start date for WRF')
+
+    parser.add_argument('-l',
+                        '--forecast-lenght',
+                        type=int,
+                        required=False,
+                        default=10,
+                        dest='forecast_lenght',
+                        help='Length of WRF forecast in days')
+
+    parser.add_argument(
+        '-s',
+        '--pipeline-step',
+        type=str,
+        nargs='*',
+        required=True,
+        dest='pipeline_step',
+        help='Pipeline step to perform',
+        default=[
+            'download', 'geogrid',
+            'ungrib', 'real', 'wrf', 'all'],
+        choices=[
+            'download', 'geogrid',
+            'ungrib', 'real', 'wrf', 'all'])
+
+    args = parser.parse_args()
+
+    # Setup logging
+    logger = logging.getLogger()
+    logger.setLevel(logging.INFO)
+    ch = logging.StreamHandler(sys.stdout)
+    ch.setLevel(logging.INFO)
+    formatter = logging.Formatter(
+        "%(asctime)s; %(levelname)s; %(message)s", "%Y-%m-%d %H:%M:%S"
+    )
+    ch.setFormatter(formatter)
+    logger.addHandler(ch)
+
+    # Setup timer to monitor script execution time
+    timer = time.time()
+
+    # Initialize pipeline object
+    pipeline = WRFPipeline(
+        args.config_file, args.start_date, args.forecast_lenght)
+
+    # WRF pipeline steps
+    if "download" in args.pipeline_step or "all" in args.pipeline_step:
+        pipeline.download()
+
+    logging.info(f'Took {(time.time()-timer)/60.0:.2f} min.')
+
+
+# -----------------------------------------------------------------------------
+# Invoke the main
+# -----------------------------------------------------------------------------
+if __name__ == "__main__":
+    sys.exit(main())
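
For reference, a sketch of exercising the CLI in-process, assuming valid_date accepts YYYY-MM-DD strings; the config path is illustrative:

```python
import sys
from wildfire_occurrence.view.wrf_pipeline_cli import main

# Equivalent to:
# python wrf_pipeline_cli.py -c config.yaml -d 2023-06-05 -l 5 -s download
sys.argv = [
    'wrf_pipeline_cli.py',
    '-c', 'config.yaml',
    '-d', '2023-06-05',
    '-l', '5',
    '-s', 'download'
]
main()
```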