Parallel predict now in spatial.tools
R-sig-geo'ers:

I finally got around to building a parallel predict function, "predict_rasterEngine", which is included in version 1.3.7 (or later) of spatial.tools (check http://r-forge.r-project.org/R/?group_id=1492 for the status of the build). It should, in theory, be a direct swap-in for the standard generic predict() function.

Currently, it will work with any predict.* method that has the following features: 1) the data is passed to the predict method as a data frame via a newdata parameter, and 2) the data is returned from the predict method as a vector/matrix.

When using predict_rasterEngine, the object= parameter is your model, and the newdata= parameter is the raster/brick/stack to apply the model to on a pixel-by-pixel basis (note that, in most cases, the names of the layers must match the names of the predictor variables).

I was hoping to get some stress-testing on this, since it is a fairly oft-requested function. If a predict.* method you'd like to use doesn't work, let me know which one it is (with some test data) and I'll see if I can tweak predict_rasterEngine to work with it. So far, I have only confirmed that it works with randomForest.
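To make the two requirements above concrete, here is a minimal base-R sketch of a predict method that meets them. The "weighted_sum" class, the fit_weighted_sum() helper, and the band1/band2 names are all hypothetical and exist only for illustration; they are not part of spatial.tools, and this sketch has not been run through predict_rasterEngine.

```r
# Hypothetical toy model: stores one weight per predictor variable.
fit_weighted_sum <- function(weights) {
  structure(list(weights = weights), class = "weighted_sum")
}

# A predict method with the two features described above:
#   1) the data arrive as a data frame through the newdata parameter
#   2) the result is returned as a plain numeric vector
predict.weighted_sum <- function(object, newdata, ...) {
  vars <- names(object$weights)
  # The column names of newdata must match the predictor names.
  missing_vars <- setdiff(vars, names(newdata))
  if (length(missing_vars) > 0) {
    stop("newdata is missing: ", paste(missing_vars, collapse = ", "))
  }
  # Weighted sum of the predictors, one value per row ("pixel").
  as.vector(as.matrix(newdata[, vars, drop = FALSE]) %*% object$weights)
}

model  <- fit_weighted_sum(c(band1 = 0.5, band2 = 2))
pixels <- data.frame(band1 = c(10, 20), band2 = c(1, 3))
result <- predict(model, newdata = pixels)
result  # 7 16
```

Each row of the data frame plays the role of one pixel, and each column plays the role of one raster layer, which is why the column names have to line up with the predictor names stored in the model.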
Here's an example:

######################
packages_required <- c("spatial.tools","doParallel","randomForest")
lapply(packages_required, require, character.only=TRUE)

# Load up a 3-band image:
tahoe_highrez <- setMinMax(
    brick(system.file("external/tahoe_highrez.tif", package="spatial.tools")))
tahoe_highrez
plotRGB(tahoe_highrez)

# Load up some training points (readOGR() comes from the rgdal package):
tahoe_highrez_training_points <- readOGR(
    dsn=system.file("external", package="spatial.tools"),
    layer="tahoe_highrez_training_points")

# Extract data to train the randomForest model:
tahoe_highrez_training_extract <- extract(
    tahoe_highrez,
    tahoe_highrez_training_points,
    df=TRUE)

# Fuse it back with the SPECIES info:
tahoe_highrez_training_extract$SPECIES <- tahoe_highrez_training_points$SPECIES

# Note the names of the bands:
names(tahoe_highrez_training_extract) # the extracted data
names(tahoe_highrez) # the brick

# Generate a randomForest model:
tahoe_rf <- randomForest(SPECIES~tahoe_highrez.1+tahoe_highrez.2+tahoe_highrez.3,
    data=tahoe_highrez_training_extract)
tahoe_rf

# This will run the predict in parallel:
sfQuickInit()
prediction_rf_class <- predict_rasterEngine(object=tahoe_rf,newdata=tahoe_highrez,type="response")
prediction_rf_prob <- predict_rasterEngine(object=tahoe_rf,newdata=tahoe_highrez,type="prob")
sfQuickStop()
###############

--j
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
259 Computing Applications Building, MC-150
605 East Springfield Avenue
Champaign, IL 61820-6371
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007