BIG DATABASE
On Fri, 25 May 2018, Javier Moreira wrote:
Can I use this answer to ask about exactly what it mentions: R and PostGIS, mostly for raster files? Can you point me to books, online courses, tutorials, GitHub pages, or anything else that would help me understand this better? I have been struggling to find information.
For rpostgis, see: https://journal.r-project.org/archive/2018/RJ-2018-025/index.html and the supplementary material linked there to replicate the results in the online article (should be in the 2018-1 issue). Roger
Thanks! On Fri, 25 May 2018 at 1:35, Tom Philippi <tephilippi at gmail.com> wrote:
What Roger said (as always).

Note that if you use tidyverse and magrittr, dplyr and the other tidyverse tools work well with databases via DBI. sqldf also works with multiple SQL database backends if you're an old dog like me and don't use tidyverse much.

Also, since this is r-sig-*GEO*, note that PostgreSQL has PostGIS for spatial data, which does far more than the automatic tiling of large rasters in package raster. I'm seeing wonderful performance working with a 340M-observation, >100GB dataset of bird observations in R via PostGIS, even with "only" 32GB RAM and constrained to running Windows 7, not Linux/Unix.

One alternative, if your database is running on massive hardware (tons of memory, many cores, etc.), is to run R within the database itself. This is possible in both PostgreSQL and now MS SQL Server, the first free, the second an additional-cost add-on, and both usually at the cost of painful negotiations with database administrators for permission to run your ad hoc R code on their SQL server.

If you have the hardware, you can even run R with Hadoop, although I've never done that with spatial data.

Tom

On Thu, May 24, 2018 at 5:04 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Thu, 24 May 2018, Yaya Bamba wrote: Thanks to all of you. I will try with the package RMySQL and see.
Maybe look more generally through the packages depending on and importing from DBI (https://cran.r-project.org/package=DBI) to see what is available; there are many more than RMySQL. Also use the Official Statistics and HPC Task Views:

https://cran.r-project.org/view=OfficialStatistics
https://cran.r-project.org/view=HighPerformanceComputing

to see how typical workflows (not necessarily DB-based) can be handled. The HPC Task View has a section on large-memory and out-of-memory approaches. If your data are spatial in raster format, the raster package provides some out-of-memory functionality. In sf, spatial vector data may be read from databases too.

Roger
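To make the DBI suggestion concrete, here is a minimal sketch of the general pattern: keep the data in the database, and pull only summaries or chunks into R. It is an illustration under assumptions, not anyone's actual workflow from this thread. RSQLite's in-memory database is used purely as a stand-in so the example runs without a server; with MySQL or PostgreSQL only the dbConnect() call would differ. The table name obs and its columns are hypothetical.

```r
# Sketch only: an in-memory SQLite database stands in for a real server
# (MySQL, PostgreSQL, ...). Table and column names are hypothetical.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical observation table; in practice the data already live in the database.
obs <- data.frame(
  species = c("wren", "robin", "wren", "heron", "robin", "wren"),
  lon     = c(5.32, 5.33, 5.35, 5.40, 5.31, 5.36),
  lat     = c(60.39, 60.38, 60.40, 60.37, 60.41, 60.39)
)
dbWriteTable(con, "obs", obs)

# Let the database do the aggregation, so only the small summary reaches R.
counts <- dbGetQuery(
  con,
  "SELECT species, COUNT(*) AS n FROM obs GROUP BY species ORDER BY species"
)

# Alternatively, stream a large result in chunks instead of loading it all at once.
res <- dbSendQuery(con, "SELECT lon, lat FROM obs WHERE species = 'wren'")
n_rows <- 0
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res, n = 2)      # fetch at most 2 rows per iteration
  n_rows <- n_rows + nrow(chunk)    # process each chunk here, then discard it
}
dbClearResult(res)
dbDisconnect(con)
```

The chunk size of 2 is only for illustration; for a real table it would be set to whatever comfortably fits in memory.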
2018-05-24 11:33 GMT+00:00 Andres Diaz Loaiza <madiazl at gmail.com>: Hello Yaya,
Many years ago I worked with a MySQL database connected to R through the package RMySQL. The data were stored in MySQL, and I connected to the database and used the data from R. You should have a look at: https://cran.r-project.org/web/packages/RMySQL/index.html

Cheers, Andres
-- Roger Bivand Department of Economics, Norwegian School of Economics, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no http://orcid.org/0000-0003-2392-6140 https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-geo