-----Original Message-----
From: r-sig-geo-bounces at stat.math.ethz.ch
[mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of Tomislav Hengl
Sent: Thursday, April 01, 2010 3:54 PM
To: r-sig-geo at stat.math.ethz.ch
Cc: Caspar.Hallmann at sovon.nl; bob.macmillan at wur.nl; 'Batjes, Niels'
Subject: [R-sig-Geo] Updated repository of worldmaps (300MB of data at 5km
resolution)
Dear R-sig-geo,
FYI: I have just updated the small repository of publicly available data sets of
interest for global modeling/mapping that I launched about a year ago. It now
contains 62 layers at a resolution of 0.05 arcdegrees with complete world
coverage (it used to be 65S-65N only). The data is available for download at
[http://spatial-analyst.net/worldmaps/]. Each layer comes with a raster
description file (*.rdc), which typically has the same name as the layer it
describes (a description of the fields is available in
[http://spatial-analyst.net/worldmaps/README.txt]). The raster description file
also includes a link to an R script that (should) show all processing steps from
download to export of maps (I advise you to run the scripts step by step because
the data sets are usually huge). If you want to read more about what is available
in this repository (and outside it), please read the complete article
[http://spatial-analyst.net/wiki/index.php?title=Global_datasets]. You can
also browse a gallery of worldmaps here:
http://commons.wikimedia.org/wiki/Publicly_available_global_data_sets#
Note that some maps have limited geographical coverage (e.g. PCEVI, GLC2000),
which usually means that the data for polar regions is missing.
In about 2-3 weeks, I will tidy up the small errors and finalize the maps and
metadata. If you think that I have missed some important (publicly available)
layers, please let me know. For example, I also tried to include the map of
airline flight paths from
[http://commons.wikimedia.org/wiki/File:World_airline_routes.png], but could not
determine its coordinate system, lineage etc. I am sure that much more is
available (on and off the web), but I would at least like to be representative.
My next project will be to prepare the 1 km data (about 70% of the maps listed
are also available at this resolution) and put them into some database format
such as WKT raster or rasdaman. This way anybody will be able to overlay point,
line, and polygon features and fetch only the results of queries from the server.
But it looks as if this will take more time than I initially anticipated.
ARE THESE MAPS JUST COPIES?
Some of the layers listed (ca. 20-30%) are simply resampled and reformatted maps
that are already available from the original providers (e.g. pcclim, GLC2000,
himpact etc.). The great majority, however, are original layers that you will not
be able to find elsewhere. For example, PCEVI1 is the 1st Principal Component of
the complete time series of monthly MODIS EVI bands (this image basically shows
the average long-term 'biomass' in the world). If you wish to cite some of the
maps I have prepared, then you should refer to Chapter 4 of my book
[http://spatial-analyst.net/book/DataSources]; otherwise I advise you to refer to
the original data providers.
Each *.rdc file contains information about the data source, including a link to
where you can find the source data and the peer-reviewed publication that
describes the dataset.
Personally, I find it frustrating that there are several global mapping projects
in the world that overlap (for example, there are at least four global land cover
maps!). In some cases I solve the problem by simply taking the average (e.g.
globedem is an average of two maps), but categorical maps cannot be averaged as
easily. My second frustration is with license and copyright problems. Some data
producers (usually the USA mapping agencies) have a very clear policy and even
support copying and distribution of the maps they produce (provided that the
source is acknowledged); others (e.g. himpacts) are not really clear. I am only
interested in processing and organizing the publicly available data.
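To illustrate why averaging works for continuous layers but not for categorical
ones, here is a minimal Python sketch. It is not the procedure used to build
globedem or any other layer in the repository: the cell values, the class codes,
and the majority-vote rule for categorical maps are all assumptions made up for
this example.

```python
from collections import Counter

def cell_mean(values):
    """Mean of the continuous values available for one cell (None = no data)."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None

def cell_majority(classes):
    """Most frequent class code for one cell (None = no data).

    Ties go to the class that appears first in the input order.
    """
    valid = [c for c in classes if c is not None]
    return Counter(valid).most_common(1)[0][0] if valid else None

# Hypothetical elevations (m) for three cells from two overlapping DEMs.
dem_a = [120.0, 305.0, None]
dem_b = [130.0, 295.0, 48.0]
merged_dem = [cell_mean(pair) for pair in zip(dem_a, dem_b)]

# Hypothetical land cover codes for the same cells from three maps.
# Averaging 11, 11 and 12 would give 11.33, which is not a valid class,
# so this sketch takes a per-cell majority vote instead.
lc_a = [11, 20, 14]
lc_b = [11, 30, 14]
lc_c = [12, 30, 16]
merged_lc = [cell_majority(triple) for triple in zip(lc_a, lc_b, lc_c)]

print(merged_dem)
print(merged_lc)
```

A majority vote only makes sense once the legends of the input maps have been
translated to a common set of classes, which is a separate problem.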
MEMORY LIMIT PROBLEMS
Going from 10 km to 5 km resolution brought me many technical headaches. Just
downloading the input data takes about one week (the input data I used to
generate the 62 layers, now 300MB in size, is about 500GB!). Each layer now has
ca. 26M pixels, which will obviously lead to many memory and computational limit
problems. For example, I doubt that you will be able to load these data into R on
a standard PC (2-4GB RAM) or visualize the maps using spplot. I also tried to
derive some DEM parameters such as SAGA TWI, height above channels etc., but the
maps are simply too big (processing takes >24 hours), so it is very likely that
you will also face memory limit and computational problems on your computers.
PS: I used a 2.8GHz dual-processor Dell running 64-bit Windows XP with 4GB of RAM
to run the processing, and this configuration was already on the edge.
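The pixel count above follows directly from the grid definition, and a rough
memory estimate shows why a 2-4GB machine struggles. The sketch below is in
Python and assumes a full-globe grid (360 by 180 degrees) and 8 bytes per cell,
which is the size of an R numeric (double) value; the function names are made up
for this example.

```python
RES_DEG = 0.05  # cell size of the layers, in arcdegrees

def grid_shape(res_deg, lon_span=360.0, lat_span=180.0):
    """Number of (rows, cols) needed to cover the given extent."""
    cols = round(lon_span / res_deg)
    rows = round(lat_span / res_deg)
    return rows, cols

def layer_megabytes(rows, cols, bytes_per_cell):
    """Uncompressed in-memory size of one layer, in MB (2**20 bytes)."""
    return rows * cols * bytes_per_cell / 1024**2

rows, cols = grid_shape(RES_DEG)
n_cells = rows * cols

print(f"{rows} rows x {cols} cols = {n_cells:,} cells")
print(f"one layer as doubles:  {layer_megabytes(rows, cols, 8):.0f} MB")
print(f"one layer as integers: {layer_megabytes(rows, cols, 4):.0f} MB")
print(f"all 62 layers as doubles: {62 * layer_megabytes(rows, cols, 8) / 1024:.1f} GB")
```

A single layer held as doubles already needs roughly 200 MB before any copies are
made during processing, and all 62 layers together need around 12 GB, so working
layer by layer (or tile by tile) is the only realistic option on such a machine.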
I am really thankful to Frank Warmerdam and colleagues for providing excellent
utilities [http://www.gdal.org/gdal_utilities.html], which I used heavily to
prepare the maps. I actually ran a small comparison between the gdalwarp utility,
Erdas Imagine and ArcGIS, and discovered that the gdal utilities are (1) faster
and (2) easier to script (plus you get support for proj4 strings and the largest
family of GIS data formats). The second on my list was SAGA GIS, which can also
crunch huge data (up to 2GB) and has a large library of GIS operations. I highly
recommend these two programs and would very much support their further
development. In some cases, I could not find the functionality I needed in the
gdal utilities or SAGA, so I used ILWIS GIS (e.g. to run principal component
analysis and to extract the density of line and point features). Unfortunately,
linking R and ILWIS is not as smooth, so I often ended up running part of the
analysis in ILWIS separately. This is just an important note for people who will
focus only on using the R scripts.
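As an illustration of how easily gdalwarp can be scripted, here is a minimal
Python sketch that assembles a resampling call to a global 0.05 arcdegree grid.
It is not copied from the repository scripts: the file names are hypothetical,
and the target extent, proj4 string and resampling method are assumptions chosen
to match the layer description above. The options used (-t_srs, -te, -tr, -r) are
standard gdalwarp options.

```python
import shlex

def gdalwarp_command(src, dst, res_deg=0.05, resampling="bilinear"):
    """Build (but do not run) a gdalwarp call for a global lat/long grid."""
    return [
        "gdalwarp",
        "-t_srs", "+proj=longlat +datum=WGS84",  # target CRS as a proj4 string
        "-te", "-180", "-90", "180", "90",       # target extent: xmin ymin xmax ymax
        "-tr", str(res_deg), str(res_deg),       # target cell size in x and y
        "-r", resampling,                        # resampling method
        src, dst,
    ]

# Hypothetical input and output file names.
cmd = gdalwarp_command("input_1km.tif", "output_5km.tif")
print(shlex.join(cmd))
```

With GDAL installed, the list can be passed to subprocess.run(cmd, check=True).
For categorical maps such as land cover, "near" (nearest neighbour) is the safer
resampling choice, because bilinear interpolation would produce class codes that
do not exist in the legend.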
I am looking forward to your feedback and further comments.
A copy of this mail in html format (you can insert comments below) is also
available here:
http://spatial-analyst.net/book/updated_worldmaps
yours,
T. Hengl
http://home.medewerker.uva.nl/t.hengl/