Young, A. M. et al. 2017. Climatic thresholds shape northern high-latitude fire regimes and imply
vulnerability to future climate change. Ecography 40:606-617.

Ecography DOI: 10.1111/ecog.02205
Dryad Repository DOI: 10.5061/dryad.r217r
NSF Arctic Data Center DOI: 10.18739/A2MP8P

#######################################################################################################################################
This set of files contains the data and scripts needed to recreate the results from Young et al. 2017.
#######################################################################################################################################

-------------------------------------------------------------------------------------------------------------------------
AncillaryData - This folder contains files and data that support running the analysis and making the figures in
Young et al. (2017), and contains four subsets of data files:

1. The folder (train_test_data) contains all of the fire and climate data created by randomly selecting 30 years
   from 1950-2009. These are the data that are used to train and test the BRTs in 1_AK_BRTS.R, 2_BOREAL_BRTS.R,
   and 3_TUNDRA_BRTS.R.
2. The folder (Supp_future_projection_files) contains three MATLAB (.mat) files needed to run the '8_calc_proportions.m' script
   (see 'Scripts' below). These files are ONLY needed if the '8_calc_proportions.m' script is run independently of the
   previous seven analysis scripts. Running '7_summarize_future_projections.m' also generates these files and saves them
   in this folder.
3. The folder (Shapefiles) contains two shapefiles: (1) the outline of the 22 Alaskan ecoregions used in this
   study (ecor_outlines_for_fig1.shp), and (2) a general coastline map of Alaska that excludes islands and southeast
   Alaska (ak_shape_noislands_noSE.shp).
4. There is a collection of 10 supporting files:

	* AnnDEF_1950_2009.tif - 1950 - 2009 mean total annual moisture availability per pixel (mm).
	* climlims.mat - climatological limits for each explanatory variable in each modelling domain (AK, BOREAL, and TUNDRA).
	  Used for plotting.
	* ecor_info.csv and ecor_info.xlsx - Classification information for each ecoregion at Level I (boreal forest or tundra)
          and Level III (specific ecoregion).
	  The columns in these files are:
		(1) Ecoregion name,
		(2) Original (i.e., Nowacki et al. 2001) Level III classification value,
		(3) Level I classification (1 = tundra, 2 = boreal forest),
		(4) Modified Level III classification,
		(5) Latitude of approximate center of ecoregion,
		(6) Longitude of approximate center of ecoregion.
	* fire_occ_1950_2009.tif - Fire occurrence for each pixel from 1950-2009 (-9999 = NaN/no fire, 1 = fire occurrence).
	* Lat.tif - latitude in decimal degrees at the center of each pixel. Used for plotting.
	* Lon.tif - longitude in decimal degrees at the center of each pixel. Used for plotting.
	* masks.mat - 725 x 687 x 3 MATLAB array. Contains the spatial masks for each model (AK, BOREAL, and TUNDRA).
          The first two dimensions are the rows and columns of the gridded Alaska map and the third dimension contains
          information on whether each 2-km pixel is considered within each of the three spatial domains
          (NaN = not in study area, 1 = in study area).
	* obsfrp.tif - observed 1950-2009 fire rotation period (FRP) for each ecoregion in Alaska (-9999 = NaN, -1 = no FRP estimate)
	* TempWarm_1950_2009.tif -  1950 - 2009 mean temperature of the warmest month per pixel (°C).
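
As a minimal sketch of how the NoData value and the spatial masks described above might be combined (this is not code
from this repository): the Python function below sets a pixel to NaN when it equals -9999 or lies outside a domain mask
(mask value NaN). The function name apply_domain_mask and the nested-list input are illustrative assumptions. Reading the
.tif and .mat files themselves is left to whichever raster and MATLAB-file readers are available.

```python
import math

NODATA = -9999.0  # NoData value documented above for the .tif files


def apply_domain_mask(raster, mask, nodata=NODATA):
    """Return a copy of `raster` with NoData and out-of-domain pixels set to NaN.

    `raster` and `mask` are equally shaped lists of rows (hypothetical inputs).
    Mask convention from the masks.mat description: NaN = not in study area, 1 = in study area.
    """
    masked = []
    for raster_row, mask_row in zip(raster, mask):
        out_row = []
        for value, in_domain in zip(raster_row, mask_row):
            if value == nodata or math.isnan(in_domain):
                out_row.append(math.nan)
            else:
                out_row.append(float(value))
        masked.append(out_row)
    return masked
```

For a full 725 x 687 grid, an array library would do the same job with a single vectorized comparison; the loop is used
here only to keep the sketch free of dependencies.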

-------------------------------------------------------------------------------------------------------------------------
AK_FINAL_RESULTS - This folder contains output for the AK model generated by the 1_AK_BRTS.R, 4_Identify_Climatic_Thresholds.R,
and 6_corr_results.m scripts:

	* brt_1.RData, brt_2.RData, ..., brt_100.RData - raw results/output from the 100 gbm objects created using the gbm
          package in R.

	* pred_map_1.tif, pred_map_2.tif, ..., pred_map_100.tif - These data are gridded maps of Alaska containing the predicted
          30-yr probability of fire occurrence for each pixel during the historical period. Predictions are made using the gbm
          models on a holdout sample of the data not used to train the gbms.

	* AUC.csv - AUC results from ROC analysis of 100 BRT models. There is a single column of values representing the AUC in this
	  CSV file.

	* class_rates.csv - classification rates from ROC analysis (TPR: true positive rate, TNR: true negative rate, FPR: false
          positive rate, FNR: false negative rate). Probability thresholds for classifications were determined using methods
          described in Young et al. (2017).

	* climThresholds_AnnDEF.csv - climatic thresholds determined from 4_Identify_Climatic_Thresholds.R for annual moisture
          availability (units = mm). There is a single column containing the threshold values for the 100 individual BRTs.
	  ***NOTE*** - For the TUNDRA models there are two columns, representing a lower and an upper threshold, because this
	  relationship is non-monotonic.

	* climThresholds_TempWarm.csv - climatic thresholds determined from 4_Identify_Climatic_Thresholds.R for temperature of
          the warmest month (units = °C). There is a single column containing the threshold values for the 100 individual BRTs.

	* frp_o.csv - 22 x 100 matrix of observed 30-yr fire rotation periods for each ecoregion (rows) for each BRT iteration 1-100
          (columns).

	* frp_p.csv - 22 x 100 matrix of predicted 30-yr fire rotation periods for each ecoregion (rows) for each BRT iteration 1-100
          (columns).

	* frp_t.csv - Data from frp_o.csv and frp_p.csv available in a two-column matrix (observed and predicted). All FRPs among
	  all ecoregions are combined in this CSV file.

	* model_train_err.csv - 5000 x 100 matrix of training error data for each BRT. The 5000 rows refer to the 5000 regression trees
	  in each BRT model.

	* model_valid_err.csv - 5000 x 100 matrix of cross-validation error for each BRT. The 5000 rows refer to the 5000 regression trees
	  in each BRT model.

	* nTrees.csv - Optimal number of regression trees determined by cross validation for each BRT model. The 100 values refer to the
	  100 BRT models.

	* partDep_AnnDEF.csv - Partial dependence results for annual moisture availability. These data are used to make Fig. 4.
	  Column labels:
		(1) The first column in this CSV file is just an index and not used in plotting.
		(2) The second column contains the AnnDEF values (i.e., explanatory variable values). [Units = mm]
		(3) Column #s 3-102 represent the predicted probabilities for plotting the partial dependence plots for the 1-100 BRTs.

	* partDep_TempWarm.csv - Partial dependence results for temperature of the warmest month. These data are used to make Fig. 4.
		(1) The first column in this CSV file is just an index and not used in plotting.
		(2) The second column contains the T_WARM value (i.e., explanatory variable values). [Units = °C]
		(3) Column #s 3-102 represent the predicted probabilities for plotting the partial dependence plots for the 1-100 BRTs.

	* partDep_TR.csv - Partial dependence results for topographic ruggedness.
		(1) The first column in this CSV file is just an index and not used in plotting.
		(2) The second column contains the topographic ruggedness values (i.e., explanatory variable values). [Units = meters].
		(3) Column #s 3-102 represent the predicted probabilities for plotting the partial dependence plots for the 1-100 BRTs.

	* partDep_Veg.csv - Partial dependence results for vegetation type.
		(1) The first column in this CSV file is just an index and not used in plotting.
		(2) The second column contains the Vegetation classifications (i.e., explanatory variable values). 1 = Wetland Tundra,
		    2 = Shrub Tundra, 3 = Graminoid Tundra, 4 = Barren Tundra, 5 = Boreal Forest
		(3) Column #s 3-102 represent the predicted probabilities for plotting the partial dependence plots for the 1-100 BRTs.

	* relInf.csv - Relative influence results from BRT analysis for each explanatory variable. These data are used to make Fig. 3.
	  Explanatory variables are labeled in each column. The 100 rows correspond to the 100 BRTs.

	* TempWarm_AnnDEF_int.csv - Partial dependence results for the two-way interaction between temperature of the warmest month
          and annual moisture availability. These data are used to make Fig. 5.
		(1) The first column in this CSV file is just an index and not used in plotting.
		(2) The second column contains the T_WARM values (i.e., explanatory variable values). [Units = °C].
		(3) The third column contains the AnnDEF values (i.e., explanatory variable values). [Units = mm].
		(4) Column #s 4-103 represent the predicted probabilities for plotting the partial dependence plots for the 1-100 BRTs.

	* thresholds.csv - Probability thresholds determined by maximizing the sum of the true positive rate (TPR) and the true
          negative rate (TNR). There are 100 values corresponding to the 100 BRTs.

	* corresults.mat - Pearson correlation results between predicted and observed FRPs. Calculated using the 6_corr_results.m
          script. These data are presented in Fig. 2.

	* med_pred_prob.mat - median predicted historical probability of fire occurrence per pixel. Created using the 100 predicted
          probability maps (i.e., pred_map_1.tif, pred_map_2.tif, ..., pred_map_100.tif). Calculated using the 6_corr_results.m script.
          These data are presented in Fig. 2.
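
The thresholds.csv entry above describes choosing a probability cutoff that maximizes TPR + TNR. A minimal Python sketch
of that rule follows. It assumes observed labels coded 1 (fire) and 0 (no fire), and that a pixel is classified as fire
when its predicted probability is at or above the cutoff. The function name best_threshold and the tie-breaking rule
(the lowest cutoff wins) are assumptions for illustration; the repository's own ROC calculations are in aucroc.R.

```python
def best_threshold(probs, labels):
    """Return (threshold, tpr, tnr) for the cutoff that maximizes TPR + TNR.

    `probs` are predicted probabilities and `labels` are 0/1 observations
    (hypothetical inputs). Candidate cutoffs are the distinct predicted values.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both fire and no-fire observations are required")

    best = None
    for cutoff in sorted(set(probs)):
        true_pos = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= cutoff)
        true_neg = sum(1 for p, y in zip(probs, labels) if y == 0 and p < cutoff)
        tpr = true_pos / n_pos
        tnr = true_neg / n_neg
        # Strict comparison keeps the lowest cutoff when several tie.
        if best is None or tpr + tnr > best[1] + best[2]:
            best = (cutoff, tpr, tnr)
    return best
```

The remaining two rates in class_rates.csv follow directly from these: FPR = 1 - TNR and FNR = 1 - TPR. The quadratic
loop is acceptable for a sketch but would be slow on a full map of pixels.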

-----------------------------------------------------------------------------------------------------------------------------
BOREAL_FINAL_RESULTS - This folder contains the same set of output files as the AK_FINAL_RESULTS folder, but for the BOREAL model.

-----------------------------------------------------------------------------------------------------------------------------
TUNDRA_FINAL_RESULTS - This folder contains the same set of output files as the AK_FINAL_RESULTS folder, but for the TUNDRA model.
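
Each of the three results folders holds observed and predicted FRPs (frp_o.csv and frp_p.csv) and the Pearson
correlation between them (corresults.mat). As a rough illustration of that statistic, not a port of 6_corr_results.m,
the Python sketch below computes Pearson's r for one pair of observed and predicted FRP vectors. Skipping pairs that
contain NaN is an assumption about how missing FRP estimates might be handled, and the function name pearson_r is
hypothetical.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equally long sequences, skipping NaN pairs."""
    pairs = [
        (o, p)
        for o, p in zip(observed, predicted)
        if not (math.isnan(o) or math.isnan(p))
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two complete pairs are required")

    mean_o = sum(o for o, _ in pairs) / n
    mean_p = sum(p for _, p in pairs) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in pairs)
    var_o = sum((o - mean_o) ** 2 for o, _ in pairs)
    var_p = sum((p - mean_p) ** 2 for _, p in pairs)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)
```

Applied to one column of frp_o.csv and the matching column of frp_p.csv, this would give the correlation for a single
BRT iteration across the ecoregions.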

-----------------------------------------------------------------------------------------------------------------------------
Scripts - There are three subsets of files in the Scripts folder: (1) eight R (.R) and MATLAB (.m) files, sequentially numbered, that
reproduce the results in Young et al. (2017), (2) six MATLAB (.m) files used to make the figures in Young et al. (2017), and (3) a
folder containing custom-written functions that support the scripts. If the scripts are run in the order in which they are
numbered, the exact results and figures from the main text will be reproduced.

	Analysis Scripts:

	* 1_AK_BRTS.R - Script that uses randomized climatologies, vegetation data, and topographic ruggedness data in
          conjunction with the gbm package (v2.1.1) to create the set of 100 BRTs that comprise the AK model.
	* 2_BOREAL_BRTS.R - Script that uses randomized climatologies, vegetation data, and topographic ruggedness data in
          conjunction with the gbm package (v2.1.1) to create the set of 100 BRTs that comprise the BOREAL model.
	* 3_TUNDRA_BRTS.R - Script that uses randomized climatologies, vegetation data, and topographic ruggedness data in
          conjunction with the gbm package (v2.1.1) to create the set of 100 BRTs that comprise the TUNDRA model.
	* 4_Identify_Climatic_Thresholds.R - Script that uses the segmented package (v0.5-1.4) to identify climatic thresholds
          from BRT partial dependence results. Partial dependence results were exported as .csv files in the three BRT scripts
          (i.e., #1-3).
	* 5_project_21stCentury_fire.R - Script that uses the BRTs that comprise the AK model (created in 1_AK_BRTS.R) to
          project the 30-yr probability of fire occurrence per pixel for Alaska in the 21st century.
	* 6_corr_results.m - Main purpose is to calculate the Pearson correlation between predicted and observed fire rotation
          period (FRP) estimates for Alaskan ecoregions. FRP estimates were exported in the BRT scripts (i.e., #1-3).
	* 7_summarize_future_projections.m - Takes BRT-generated future projections of the 30-yr probability of fire occurrence
          and calculates (1) the median predicted 30-yr probability of fire occurrence of all 100 BRT projections for all five GCMs
          and for three time periods, and (2) the median ratio between future and historical FRPs for all GCMs and time periods.
	* 8_calc_proportions.m - Calculates the percentage of pixels that occur in each of eight discrete classes of relative
          change in the FRP (i.e., ratio).

	Figure Scripts:

	* 9_FIG_2.m - Creates Figure 2 in Young et al. (2017).
	* 10_FIG_3.m - Creates Figure 3 in Young et al. (2017).
	* 11_FIG_4.m - Creates Figure 4 in Young et al. (2017).
	* 12_FIG_5.m - Creates Figure 5 in Young et al. (2017).
	* 13_FIG_6.m - Creates Figure 6 in Young et al. (2017).
	* 14_FIG_7.m - Creates Figure 7 in Young et al. (2017).
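
The two summary steps described for 7_summarize_future_projections.m and 8_calc_proportions.m (a per-pixel median
across the BRT projections, then the percentage of pixels falling in discrete classes of FRP ratio) can be sketched in
Python as below. This is an illustration under stated assumptions, not a translation of the MATLAB scripts: the function
names are hypothetical, the class edges are supplied by the caller because the eight classes used in the study are
not listed in this README, and inputs to the median are assumed to be finite.

```python
import math
from bisect import bisect_right
from statistics import median


def median_per_pixel(projections):
    """Per-pixel median across projections.

    `projections` is a list with one flat list of pixel values per BRT
    (hypothetical layout); all inner lists must have the same length.
    """
    return [median(values) for values in zip(*projections)]


def class_percentages(ratios, edges):
    """Percentage of pixels in each class of future:historical FRP ratio.

    `edges` are ascending inner class boundaries, so k edges define k + 1
    classes. A value equal to an edge is counted in the class above it.
    NaN ratios (e.g., masked pixels) are ignored.
    """
    counts = [0] * (len(edges) + 1)
    for ratio in ratios:
        if math.isnan(ratio):
            continue
        counts[bisect_right(edges, ratio)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no finite ratios to classify")
    return [100.0 * count / total for count in counts]
```

Eight classes would correspond to seven inner edges passed as `edges`.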

	Functions:

	* aucroc.R - Calculates AUC value and classification rates. This function was inspired by and based on the "evaluate"
		     function in the 'dismo' package (Hijmans et al. 2016).
	* createClimatologies.m - Function that takes gridded GeoTiff files at a monthly or annual resolution and calculates
				  climatological averages per pixel for a user-defined time period. The climate data in this
				  study were obtained from the Scenarios Network for Alaska and Arctic Planning (2015).
	* akaxes.m - Plots a background map of Alaska.
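
To illustrate the idea behind createClimatologies.m and the randomized 30-year climatologies used by the BRT scripts,
here is a small Python sketch (not the MATLAB function itself). It draws 30 distinct years from 1950-2009 and averages
annual grids per pixel over a chosen set of years. The function names, the dictionary-of-grids input, and the use of a
seed are assumptions made for the example.

```python
import random


def sample_years(n_years=30, first=1950, last=2009, seed=None):
    """Draw `n_years` distinct years from first..last (inclusive), sorted."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(first, last + 1), n_years))


def climatology(annual_grids, years):
    """Per-pixel mean of annual grids over the given years.

    `annual_grids` maps a year to a grid stored as a list of rows
    (hypothetical layout); all grids must share the same shape.
    """
    years = list(years)
    if not years:
        raise ValueError("at least one year is required")
    first_grid = annual_grids[years[0]]
    n_rows, n_cols = len(first_grid), len(first_grid[0])
    return [
        [
            sum(annual_grids[year][i][j] for year in years) / len(years)
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]
```

Calling climatology(grids, sample_years(seed=1)) on a dictionary holding one grid per year from 1950 to 2009 would give
one randomized 30-year climatology; repeating with different seeds gives different ones.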

-------------------------------------------------------------------------------------------------------------------------
References -

Nowacki, G., Spencer, P., Brock, T., Fleming, M. & Jorgenson, T. (2001). Ecoregions of Alaska and neighboring territories -
	US Geological Survey Open-File Report 02-297 (map). Online at http://agdc.usgs.gov/data/projects/fhm/#H, USGS Reston, VA.
Hijmans, R. J., Phillips, S., Leathwick, J. & Elith, J. (2016). dismo: Species Distribution Modeling.
	R package version 1.0-15. http://CRAN.R-project.org/package=dismo
Scenarios Network for Alaska and Arctic Planning, University of Alaska (2015). Projected Monthly Temperature and Precipitation
	- 2 km CMIP5/AR5. Retrieved January 2015 from https://www.snap.uaf.edu/tools/data-downloads.