Back to formatted view
Raw Message

Message-ID: <AANLkTim-aCtCfG1P8+e6xXNaxAoZUuZpBSP2bLbKOxYh@mail.gmail.com>
Date: 2010-10-29T16:52:51Z
From: Robert J. Hijmans
Subject: Advice on raster workflow: subs -> resample
In-Reply-To: <AANLkTin0AX8wBahD6r4fmmkj51RhzMSe31pprKOJ0QOC@mail.gmail.com>

Dear Lyndon,

That looks good to me, except that I do not think it is wise to
resample when the resolutions are different by an order of magnitude.
I think it is better to first use aggregate to get a similar
resolution and then use resample (although I never formally compared
these different strategies). As in the below.

lt.q.test <- raster(paste(path.in, "lt_quin_test1.img", sep = ""))
# Assign values from sum.stats (in y.mn column) to yld.test,
outputting values as integers.
setwd(path.out)
yld.test = subs(lt.q.test, sum.stats[[1]], by="VALUE", which="y.mn",
subsWithNA = TRUE, filename="yld.rst4.img", format='HFA',
datatype="INT2S")

# perhaps first aggregate
agg <- aggregate(yld.test, 10, fun=mean, na.rm=TRUE)

# Resample agg to resolution of modtest
modtest <- raster("MOD_SA_NDVI_amp_albers_test.img")
aggres <- resample(agg, modtest, method = "bilinear", filename =
"yld.resamp.test.img", datatype = "INT2S", format = "HFA")  # This
seems to work


These are functions are not very fast, but I do not think there is a
better alternative with 'raster'. Computers are patient, fortunately.

R


On Fri, Oct 29, 2010 at 7:21 AM, Lyndon Estes <lestes at princeton.edu> wrote:
> Hello,
>
> I am writing to ask for advice as to whether I am following the most
> efficient/fastest workflow for a task using the raster package. This
> involves 3 datasets:
>
> 1. modtest: A ~927 m resolution satellite-derived image:
> nrow ? ? ? ?: 141
> ncol ? ? ? ?: 189
> ncell ? ? ? : 26649
> min value ? : ?
> max value ? : ?
> projection ?: +proj=aea +lat_1=-18 +lat_2=-32 +lat_0=0 +lon_0=24
> +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
> +towgs84=0,0,0
> xmin ? ? ? ?: 415851.2
> xmax ? ? ? ?: 590983.4
> ymin ? ? ? ?: -2670093
> ymax ? ? ? ?: -2539439
> xres ? ? ? ?: 926.6254
> yres ? ? ? ?: 926.6254
>
> 2. lt.q.test: a ~92 m resolution grid representing identifiers for
> unique climate-soil combinations:
> nrow ? ? ? ?: 1403
> ncol ? ? ? ?: 1878
> ncell ? ? ? : 2634834
> min value ? : ?
> max value ? : ?
> projection ?: +proj=aea +lat_1=-18 +lat_2=-32 +lat_0=0 +lon_0=24
> +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
> +towgs84=0,0,0
> xmin ? ? ? ?: 415926.7
> xmax ? ? ? ?: 590091.6
> ymin ? ? ? ?: -2670093
> ymax ? ? ? ?: -2539979
> xres ? ? ? ?: 92.73961
> yres ? ? ? ?: 92.73961
>
> 3. sum.stats: A dataframe within a list with several variables
> relating to outputs from a crop simulations, and one variable
> containing values corresponding to the identifiers in yld.test.
>
> My task is to
> 1. assign crop yield values from sum.stats to lt.q.test
> 2. resample the result of 1 to the resolution of modtest, so that the
> images are lined up, and so that the yield values in yld.test are
> averaged/aggregated to modtest's resolution.
>
> The solution I have come up with so far is:
>
> library(raster)
>
> lt.q.test <- raster(paste(path.in, "lt_quin_test1.img", sep = ""))
>
> # Assign values from sum.stats (in y.mn column) to yld.test,
> outputting values as integers.
> setwd(path.out)
> subs(lt.q.test, sum.stats[[1]], by = "VALUE", which = "y.mn",
> subsWithNA = TRUE,
> ? ? ? ? ? ? ? ? filename = "yld.rst4.img", format = 'HFA', datatype = "INT2S")
>
> # Resample yld.test to resolution of modtest
> yld.test <- raster("yld.rst4.img")
> modtest <- raster("MOD_SA_NDVI_amp_albers_test.img")
> resample(yld.test, modtest, method = "bilinear", filename =
> "yld.resamp.test.img", datatype = "INT2S",
> ? ? ? ? format = "HFA") ?# This seems to work
>
> The above is performed on fairly small test datasets, and I must now
> apply this approach multiple times to the full size versions of the
> input datasets, where the dimensions of lt.q.test are:
>
> nrow ? ? ? ?: 15170
> ncol ? ? ? ?: 17364
> ncell ? ? ? : 263411880
>
> and modtest's dimensions are:
>
> nrow ? ? ? ?: 1651
> ncol ? ? ? ?: 1937
> ncell ? ? ? : 3197987
>
> I was thus wondering if there is a more efficient approach that I
> should adopt, or if there are any other
> concerns of which I should be aware?
>
> Thanks in advance for any advice.
>
> Cheers, Lyndon
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>