This markdown illustrates the process of estimating growing season length and growing season mean temperatures for Arctic Alaska using PRISM temperature data and NIC IMS snow cover data. The motivation is to generate a raster map of growing season mean temperatures that is comparable to Paulsen & Körner’s (2014) definition of treeline growing seasons constrained by snow cover and temperature thresholds for xylogenesis.

Wrangling the snow cover data

The National Snow and Ice Data Center hosts amazing daily maps of snow cover data for the northern hemisphere. These data are manually generated by analysts as part of the National Ice Center’s Interactive Multisensor Snow and Ice Mapping System (IMS). The system integrates high-resolution visible band and lower resolution passive microwave satellite imagery along with other snow and ice products to produce the 4 km snow and ice cover data. Details on the procedure can be found here: https://nsidc.org/data/g02156#nesdis-derivation

Each daily file is a polar projection of Northern Hemisphere snow and ice cover (the one shown here is for 1 Jan 2019). The values are categorical: 1 = Sea, 2 = Land with no snow, 3 = Sea ice, 4 = Snow-covered land:

After downloading all of these files from the NSIDC FTP site, I can start processing them. The first step is to crop the full hemisphere raster down before doing anything else.

snow.ex.crop <- crop(snow.ex, extent(-4.55e+06, -1.5e+06, -2e+06, 0))
plot(snow.ex.crop)

Now we are zoomed in to Alaska and can re-project the raster to WGS84 lat/long. I further trim the raster down to just Arctic Alaska and remove the Sea and Ice categories (no trees there!).
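The crop-and-reproject step can be sketched as follows. This is a minimal, self-contained example with the raster package; the polar stereographic CRS string and the dummy grid stand in for the real IMS raster and are not the exact values used here.

```r
library(raster)

# A dummy raster standing in for the cropped IMS grid (polar stereographic)
snow.ex.crop <- raster(nrows = 50, ncols = 50,
                       xmn = -4.55e+06, xmx = -1.5e+06, ymn = -2e+06, ymx = 0,
                       crs = '+proj=stere +lat_0=90 +lon_0=-80 +datum=WGS84')
snow.ex.crop <- setValues(snow.ex.crop, sample(1:4, ncell(snow.ex.crop), replace = TRUE))

# Re-project to WGS84 lat/long; nearest neighbor ('ngb') preserves the categorical codes
snow.wgs <- projectRaster(snow.ex.crop, crs = '+proj=longlat +datum=WGS84', method = 'ngb')

# Drop the Sea (1) and Sea ice (3) categories -- no trees there
snow.wgs[snow.wgs %in% c(1, 3)] <- NA
```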

Note that this map indicates that all of Arctic Alaska is snow covered on this day, which makes sense.

The above procedure is repeated for all days within each year of snow cover data. The result is a raster stack for each year containing 365 layers. Each year is then processed separately into a single raster that gives the Julian day on which the snow first disappeared. This is accomplished with the following code (I don’t actually run it here because it is quite intensive):

calc(snow.stack, fun = function(x){try(detect_index(.x = na.omit(x), .f = function(z){z < 4}, .dir = 'forward')[1], silent = T)})

This operation is done by calc() in the raster package, which applies a function across a raster object. The super handy detect_index() function comes from the purrr package, which is part of the tidyverse. detect_index() returns the index position identified by a user-defined predicate, which in our case is an integer between 1 and 365 (i.e., a day of the year). The simple predicate z < 4 finds the first day a pixel changes value from 4 (snow-covered land) to 2 (bare land).
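As a minimal illustration on a toy pixel series (the values are invented IMS codes for one pixel across five days):

```r
library(purrr)

# Snow codes for one pixel: snow-covered land (4) until day 4, then bare land (2)
px <- c(4, 4, 4, 2, 2)

# First day the pixel is no longer snow-covered land
snowfree_day <- detect_index(px, function(z) z < 4, .dir = 'forward')
snowfree_day  # 4
```

Note that detect_index() returns 0 when no element matches, so pixels that never go snow free need special handling downstream (hence the 365 sentinel described below).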

Here’s the result for 2019: a raster showing the snowfree dates for all pixels. I also resample the 4 km snow cover grids to the 800 m resolution of the PRISM data using the nearest neighbor method.

The pixels that never became snow free are shown as NA values here, though I left these values as 365 in the saved .tif files. This seems odd, but it facilitates the generation of a central tendency metric for snowfree date across the 14 years available (2006-2019). I needed a numeric value to avoid excluding these pixels from calculations of central tendency - they carry the crucial information that some pixels are occasionally snow covered all year. But because 365 is an arbitrary stand-in for ‘never snow free’, a mean computed from it would be invalid. I used the median snowfree date instead.
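A toy example shows why the median holds up while the mean does not once an arbitrary sentinel like 365 is in play (the numbers are invented):

```r
# Snowfree DOYs for one pixel over 5 years; 365 is the 'never snow free' sentinel
snowfree <- c(160, 162, 158, 161, 365)

mean(snowfree)    # 201.2 -- dragged upward by the arbitrary sentinel value
median(snowfree)  # 161   -- insensitive to the sentinel's exact value
```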

A question that arises at this stage is whether there are meaningful trends in the 2006-2019 snowfree dates. A simple way to check is to take a set of points on the map and plot the snowfree dates as a time series for each point.

There is a hint of a trend toward progressively earlier snowfree dates at some of the strip locations, but nothing strong over this period. The strip with the really late snowfree dates in 2017-2018 is Red Sheep Creek. When I fit a linear regression that lets the snowfree ~ year slope vary by strip (snowfree ~ year:strip + strip), I find no significant decreasing trends at any of the strips; the only significant trend is the increasing one at Red Sheep. While this is a small subset of points (maybe there are decreasing trends somewhere), the results seem to justify using the median snowfree date as a fair characterization of the whole period. Of course, I still can’t know about the pre-2006 snowfree dates.
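The model specification can be sketched on simulated data (the strip names, trends, and values here are invented purely to show the formula at work):

```r
set.seed(42)
strip <- factor(rep(c('RedSheep', 'StripA', 'StripB'), each = 14))
year <- rep(2006:2019, times = 3)

# Simulated snowfree DOYs: Red Sheep trends later, the other strips are flat noise
snowfree <- ifelse(strip == 'RedSheep', 140 + 1.5 * (year - 2006), 150) +
  rnorm(length(year), sd = 4)

# Per-strip slopes (year:strip) on top of per-strip intercepts (strip)
fit <- lm(snowfree ~ year:strip + strip)
round(coef(fit), 2)
```

Because there is no main `year` term, each strip gets its own `year:strip` slope, which is exactly the "trend dependent on strip" structure described above.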

Defining potential growing season length from PRISM temperature

The next step is to figure out how long growing seasons are across Arctic Alaska. The essential method of this process is from Paulsen & Körner (2014): take the monthly PRISM mean temperature data and temporally interpolate it to a daily series using splines, then use pre-defined daily temperature thresholds to delineate the growing period.

The PRISM data is read in as a raster stack with 12 layers, one for each month of the year. These are monthly normals for 1981-2010. Here’s an illustration of the process on a single pixel.

The result mimics actual daily values, only smoother:
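In code, the single-pixel interpolation looks like this (the mid-month DOYs are approximate and the monthly mean temperatures are invented for illustration):

```r
# Approximate mid-month days of year, DOY 16 (mid-Jan) through 350 (mid-Dec)
mos <- c(16, 45, 75, 105, 136, 166, 197, 228, 258, 289, 319, 350)

# Invented monthly normal temperatures (deg C) for one Arctic pixel
tmean.mo <- c(-24, -23, -19, -10, 0, 8, 12, 10, 4, -7, -17, -22)

# Natural spline through the 12 monthly values, evaluated at 335 daily steps
daily <- spline(mos, tmean.mo, n = length(16:350), method = 'natural')
length(daily$y)  # 335
range(daily$x)   # 16 350
```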

The powerful calc() function makes it possible to apply the spline process across the whole monthly raster stack.

calc(x = tmean.stack, fun = function(x){try(spline(mos, na.omit(x), n = length(16:350), method = 'natural')$y, silent = T)})

The result of this code is a new raster stack with 335 layers, one for each day from the 16th to the 350th day of the year. You’ll notice in the spline above that the curve starts in the middle of the 1st month and ends in the middle of the 12th month. I approximated this cutoff by trimming 15 days from each end of the year.

Now that we have a raster stack that represents the ‘daily’ mean temperature progression through a year, we can start to define constraints for the growing season. Rossi et al. (2008) made careful observations of xylogenesis in treeline conifers and found that growth starts when daily minima rise above 4.1 ± 1.3 °C and stops when daily minima fall below 5.3 ± 2.0 °C. These two values were not statistically different, and the authors summarize them as falling between 4 and 5 °C. I used the lower end of this interval, 4 °C, as the threshold to define both the beginning and the end of the thermal growing season. In contrast, Paulsen and Körner (2014) used a threshold of 0.9 °C.

begin.tmean <- calc(spline.tmean, fun = function(x){try(detect_index(.x = na.omit(x), .f = function(x){x >= 4}, .dir = 'forward'), silent = T)})

end.tmean <- calc(spline.tmean,fun = function(x){try(detect_index(.x = na.omit(x), .f = function(x){x >= 4}, .dir = 'backward'), silent = T)})

Because the splines are monotonic on either end of the year, I told detect_index to reverse the direction of the threshold search for the end relative to the beginning. In both cases, detect_index returns the index of the first value that is greater than or equal to 4 °C. The outputs of the above code are 2 rasters that have the index position as values. To convert the index values to DoY, I add 15 to each of these rasters. The geographic variation in the DoY of the peak temperature is also interesting to look at. Note that this reflects the warmest month rather than the actual average warmest day (due to the nature of the spline interpolation method).
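The forward/backward search can be shown on a single toy pixel series (values invented):

```r
library(purrr)

# A toy 'daily' temperature curve: cold shoulders, warm middle
temps <- c(-5, -2, 1, 3, 5, 8, 9, 7, 5, 2, 0, -3)

begin.idx <- detect_index(temps, function(x) x >= 4, .dir = 'forward')   # first day >= 4
end.idx   <- detect_index(temps, function(x) x >= 4, .dir = 'backward')  # last day >= 4

begin.idx  # 5
end.idx    # 9
```

In the real stack these indices run 1:335, so adding 15 converts them to DoY.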

Adjust the start of the growing season with the first snowfree date and compute growing season length.

begin <- max(stack(begin.tmean, snowfree))
plot(begin, main = 'Growing season start date')

# subtract the end from the beginning
gsl <- end - begin

# A negative growing season length doesn't make sense. This is because of the higher temp threshold
# for ending versus beginning (in pixels that aren't delayed by snow).
gsl[gsl < 0] <- 0
# This step gives the land values that are listed as NAs a real value of 0
gsl[is.na(gsl)] <- 0
gsl <- mask(gsl, BRmask)

The last step involves computing growing season means for every pixel, each of which has its own start date, end date, and set of spline-interpolated temperatures. I found it easier to turn the raster values into a data.frame for this step. This allowed me to use the slick (and fast) adply() function from the plyr package (the forerunner of dplyr) to compute the conditional means for each row (i.e., pixel) in the data.frame. My original version used a for loop for this step, which worked well, but very slowly - somewhere in the ballpark of 12 hours. The new adply() method does the same thing in less than 1.5 hours.
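The conditional row-mean logic can be demonstrated self-contained on a tiny data.frame (the column names mimic the real ones; the values are invented):

```r
library(plyr)

# Two 'pixels': row 1 has a valid season (begin < end), row 2 does not
toy <- data.frame(layer.1 = c(2, 5), layer.2 = c(6, 7), layer.3 = c(8, 9),
                  begin = c(1, 3), end = c(3, 2))

toy$gstmean <- adply(toy, .margins = 1, .fun = function(x){
  if (is.na(x[,"begin"]) | is.na(x[,"end"]) | x[,"begin"] >= x[,"end"]){
    NA  # no growing season for this pixel
  } else {
    days <- x[,"begin"]:x[,"end"]
    mean(as.numeric(x[,paste('layer', days, sep = '.')]))
  }
})$V1

toy$gstmean  # row 1: mean of layers 1-3; row 2: NA (begin >= end)
```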

A key element of the approach is labelling as NAs the pixels that have a growing season begin date that is later than the end date (this happened sometimes because of snowpack delays).

# Make a data.frame from a subsetted spline.tmean raster stack. The result is two position columns and several
# values columns for each layer. If I include begin and end columns I can use the values of those vars to bracket
# the growing season for each pixel (represented as a row in the data.frame). The gs.tmean is then calculated as
# the mean of the bracketed values.
# Subsetting makes sense because the ends of the year won't be part of any growing season, 
# and the full thing is too big for how much memory I have.
# Also, because begin and end are now in units of doy, I have to subtract 15 from the min and max values to
# match the index values of the layers, which go from 1:335.
gs_df <- as.data.frame(spline.tmean[[(minValue(begin)-15):(maxValue(end)-15)]], xy = T)

# Also make data frames for the begin and end rasters
begin_df <- as.data.frame(begin, xy = T)
colnames(begin_df) <- c("x", "y","begin")
begin_df$begin <- begin_df$begin - 15 # subtract 15 to get the index value from doy

end_df <- as.data.frame(end, xy = T)
colnames(end_df) <- c("x", "y", "end")
end_df$end <- end_df$end - 15 # subtract 15 to get the index value from doy

# Merge them together into a single data frame
gs_merge <- merge(gs_df, begin_df, by = c('x', 'y'), all.x = T)
gs_merge <- merge(gs_merge, end_df, by = c('x', 'y'), all.x = T)
head(gs_merge)

## Calculate the mean for each row (which is a pixel)
gs_merge$gstmean <- adply(gs_merge, .margins = 1, .progress = "text", 
      .fun = function(x){ # x is a row of gs_merge in this context.
        # The if statement here is to filter out the pixels that don't have growing seasons
        # These will either have no begin, no end, or will have a later begin than end
  if (is.na(x[,"begin"]) | is.na(x[,"end"]) | x[,"begin"] >= x[,"end"]){ 
    NA
  } else {
    days <- x[,"begin"]:x[,"end"]
    r.cols <- paste('layer', days, sep = '.')
    mean(as.numeric(x[,r.cols]))
  }
} # end of function
)$V1 # V1 is the default name of the new variable that gets glommed on to the original data.frame

# Lastly, put the tmean back into a raster
gs.tmean <- rasterFromXYZ(gs_merge[,c('x','y','gstmean')], 
                          crs = CRS('+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0'))

And now for the good part:

References

National Ice Center. 2008, updated daily. IMS Daily Northern Hemisphere Snow and Ice Analysis at 1 km, 4 km, and 24 km Resolutions, Version 1. 4km dataset. Boulder, Colorado USA. NSIDC: National Snow and Ice Data Center. doi: https://doi.org/10.7265/N52R3PMC.

Paulsen, J., & Körner, C. (2014). A climate-based model to predict potential treeline position around the globe. Alpine Botany, 124(1), 1–12. doi: 10.1007/s00035-014-0124-0

PRISM Climate Group, Oregon State University, http://prism.oregonstate.edu, created 4 Feb 2004.

Rossi, S., Deslauriers, A., Griçar, J., Seo, J. W., Rathgeber, C. B. K., Anfodillo, T., … Jalkanen, R. (2008). Critical temperatures for xylogenesis in conifers of cold climates. Global Ecology and Biogeography, 17(6), 696–707. doi: 10.1111/j.1466-8238.2008.00417.x