Hi all,
I am trying to extract temperature values from a raster stack for about 400 municipalities in Brazil. My final goal is to create a data frame that is going to be used as a database for an interactive map server - probably using shiny and leaflet.
The final data frame would look like this:
head(df)
Location Var         Cut Year Month Freq
Campinas temperature 10  2010 1     11
Campinas temperature 10  2010 2     19
Campinas temperature 10  2010 3     30
Campinas temperature 10  2010 4     29
Campinas temperature 10  2010 5     31
Campinas temperature 10  2010 6     30
I have global raster stacks with daily data, and for each month in the stack I am counting the number of days above a given temperature threshold. Please see below:
library(raster)
library(zoo)
library(maptools)
# Create a rasterStack similar to my data - same dimensions and layer names
r <- raster(ncol=360, nrow=180)
s <- stack(lapply(1:730, function(x) setValues(r, runif(ncell(r),min=0,max=30))))
idx <- seq(as.Date("2010/1/1"), by = "day", length.out = 730)
s <- setZ(s, idx)
s
# Define functions for 10, 15, 20 and 25 degrees - Thanks Loïc in my previous question
# na.rm is passed by name so that it is not summed as an extra value
fun1 <- function(x, na.rm) {
  sum(x > 10, na.rm = na.rm)
}
fun2 <- function(x, na.rm) {
  sum(x > 15, na.rm = na.rm)
}
fun3 <- function(x, na.rm) {
  sum(x > 20, na.rm = na.rm)
}
fun4 <- function(x, na.rm) {
  sum(x > 25, na.rm = na.rm)
}
# Count number of days above the threshold temperature
days.above.10 <- zApply(s, by=as.yearmon, fun = fun1)
days.above.15 <- zApply(s, by=as.yearmon, fun = fun2)
days.above.20 <- zApply(s, by=as.yearmon, fun = fun3)
days.above.25 <- zApply(s, by=as.yearmon, fun = fun4)
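The four counting functions differ only in their threshold, so they could probably be generated rather than written out. The sketch below is an illustration in base R, not code from the thread: `make_counter`, `cuts`, `counters`, and the toy vectors are hypothetical names and made-up values. It also shows, on a plain vector, the per-month count that zApply with as.yearmon is expected to perform for each cell.

```r
# Hypothetical factory: returns a function counting values above `threshold`.
make_counter <- function(threshold) {
  force(threshold)  # fix the threshold value when the counter is created
  function(x, na.rm = TRUE) sum(x > threshold, na.rm = na.rm)
}

cuts <- c(10, 15, 20, 25)
counters <- lapply(cuts, make_counter)
names(counters) <- cuts

# Made-up daily values for one cell, including a missing day
temps <- c(5, 12, NA, 26, 18)
counts <- vapply(counters, function(f) f(temps), integer(1))

# Pitfall: an unnamed na.rm is treated as one more value to add up
positional <- sum(temps > 10, TRUE)      # NA is not removed
named      <- sum(temps > 10, na.rm = TRUE)

# Per-month counting on a plain vector (59 made-up days: Jan and Feb 2010)
dates <- seq(as.Date("2010-01-01"), by = "day", length.out = 59)
daily <- rep(c(8, 12, 22, 27), length.out = 59)
monthly <- tapply(daily, format(dates, "%Y-%m"), counters[["10"]])
```

With the raster stack, something like `lapply(counters, function(f) zApply(s, by = as.yearmon, fun = f))` should then give one monthly stack per threshold, assuming zApply passes `na.rm` to `fun` by name.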
Now, what I would like to do is to programmatically extract values for each location in my study area. The locations are defined in a shapefile with the municipal boundaries of the state of Sao Paulo in Brazil.
In this example, however, just for reproducibility's sake, I will be using a world polygon. But keep in mind that in my actual data the polygons will be much smaller.
# Import *sample* polygon data and subset only five "locations"
data(wrld_simpl)
locs <- subset(wrld_simpl, wrld_simpl@data$NAME %in% c("Argentina","Bolivia","Brazil","Paraguay","Uruguay"))
# Plot
plot(days.above.10,1)
plot(locs,add=T)
I feel like half of the work is done, but I am struggling with the conversion to data frames.
Based on the self-contained example above, what would be the best strategy to come up with one data frame per location, like this?
head(Argentina.df)
Location  Var         Cut Year Month Freq
Argentina temperature 10  2010 1     11
Argentina temperature 10  2010 2     19
Argentina temperature 10  2010 3     30
Argentina temperature 10  2010 4     12
Argentina temperature 10  2010 5     17
Argentina temperature 10  2010 6     14
head(Bolivia.df)
Location Var         Cut Year Month Freq
Bolivia  temperature 10  2010 1     29
Bolivia  temperature 10  2010 2     31
Bolivia  temperature 10  2010 3     30
Bolivia  temperature 10  2010 4     17
Bolivia  temperature 10  2010 5     19
Bolivia  temperature 10  2010 6     12
and so on.
Note that "cut" refers to the temperature thresholds defined in the functions above. Each cut should come from the equivalent raster stack: days.above.10, days.above.15 and so on.
I much appreciate any input.
Greetings,
-- Thiago V. dos Santos
PhD student
Land and Atmospheric Science
University of Minnesota
Hi Thiago,
extract() and some dataframe manipulation should do the trick. See
comments in line.
Cheers,
Loïc
On 10/29/2015 09:06 PM, Thiago V. dos Santos wrote:
Hi all,
I am trying to extract temperature values from a raster stack for about 400 municipalities in Brazil. My final goal is to create a data frame that is going to be used as a database for an interactive map server - probably using shiny and leaflet.
Cool project
# additional packages for dataframes manipulation
library(dplyr)
library(tidyr)
# Extract values for all polygons
spdf <- raster::extract(days.above.10, locs, fun = median, na.rm = TRUE,
                        sp = TRUE)
# There is also a df = TRUE option in extract, but I think it returns only
# the extracted raster values, without binding them to the
# SpatialPolygonsDataFrame attributes.
# Get dataframe out of spdf
df <- spdf@data
df1 <- select(df, NAME, Jan.2010:Dec.2010)
df2 <- gather(df1, period, freq, -NAME)
df3 <- separate(df2, period, into = c('month', 'year'))
# Can you add these columns "manually" or does it need to be automated?
df3$variable <- 'temperature'
df3$cut <- 10
# If you do the same for cut == 15, etc, you can then rbind() them
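The "repeat for each cut, then rbind()" step could be automated along these lines. This is a base-R sketch under stated assumptions, not the code used in the thread: `wide_to_long` and `wide_by_cut` are hypothetical names, the small tables are made-up stand-ins for the `@data` slot returned by extract(..., sp = TRUE), and the month columns are assumed to be named like "Jan.2010" with English month abbreviations.

```r
# Hypothetical helper: reshape one wide table (one row per location, one
# column per month) into the Location/Var/Cut/Year/Month/Freq layout.
wide_to_long <- function(wide, cut, var = "temperature") {
  # Keep only columns that look like "Jan.2010"; other attributes are ignored
  period_cols <- grep("^[A-Za-z]+\\.[0-9]{4}$", names(wide), value = TRUE)
  parts <- strsplit(period_cols, ".", fixed = TRUE)
  month_abb <- vapply(parts, `[`, character(1), 1)
  year <- as.integer(vapply(parts, `[`, character(1), 2))
  long <- data.frame(
    Location = rep(as.character(wide$NAME), times = length(period_cols)),
    Var = var,
    Cut = cut,
    Year = rep(year, each = nrow(wide)),
    Month = rep(match(month_abb, month.abb), each = nrow(wide)),
    Freq = unlist(wide[period_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  long[order(long$Location, long$Year, long$Month), ]
}

# Made-up stand-ins for the extracted tables, one per threshold
wide_by_cut <- list(
  "10" = data.frame(NAME = c("Argentina", "Bolivia"),
                    Jan.2010 = c(11, 29), Feb.2010 = c(19, 28)),
  "15" = data.frame(NAME = c("Argentina", "Bolivia"),
                    Jan.2010 = c(5, 20), Feb.2010 = c(9, 17))
)

# One long table for all cuts, then one data frame per location
long_all <- do.call(rbind, lapply(names(wide_by_cut), function(k) {
  wide_to_long(wide_by_cut[[k]], cut = as.numeric(k))
}))
rownames(long_all) <- NULL
by_location <- split(long_all, long_all$Location)
```

In the real workflow each element of `wide_by_cut` would presumably be the `@data` slot of the object returned by `raster::extract()` for the matching stack (days.above.10, days.above.15, and so on), and `by_location[["Argentina"]]` would play the role of `Argentina.df`.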
Thanks a lot again, Loïc. It worked exactly as I had planned.
Greetings,
-- Thiago V. dos Santos
PhD student
Land and Atmospheric Science
University of Minnesota
On Thursday, October 29, 2015 4:15 PM, Loïc Dutrieux <loic.dutrieux at wur.nl> wrote:
Hi Thiago,
extract() and some dataframe manipulation should do the trick. See
comments in line.
Cheers,
Loïc