
SPATIO-TEMPORAL ANALYSIS AND BIG DATA PROCESSING USING FREE AND OPEN SOURCE SOFTWARE

1 message · Mark Wynter

Hi Giuseppe,

Looks like a really interesting and well-thought-through course.

Many of the teachings (particularly around parallelisation and scaling using Linux and open source software) I'm using on a daily basis, but within a big business context. I'm continually amazed by what you can do once you know how to use these tools, but also dismayed by the complete lack of awareness within the broader business and analytics community. If there's one tool I would recommend you add to the curriculum, it is PostgreSQL/PostGIS, given the multiple roles it can serve: as a database backend for GRASS and R, and as a stand-alone geo-processing engine (which can also be scripted and parallelised using GNU Parallel, xargs, etc.).
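To make the scripting point concrete, here is a minimal sketch of the idea. It uses a Python thread pool rather than GNU Parallel or xargs, but the pattern is the same: split the work into independent chunks and hand each chunk to its own psql process. The database name (gisdb), the tables (parcels, parcels_buffered), their columns and the 100-unit buffer are all hypothetical, chosen only for illustration; the example prints the commands as a dry run instead of executing them.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def id_ranges(min_id, max_id, chunk_size):
    """Split the inclusive range [min_id, max_id] into consecutive chunks."""
    ranges = []
    start = min_id
    while start <= max_id:
        end = min(start + chunk_size - 1, max_id)
        ranges.append((start, end))
        start = end + 1
    return ranges


def build_psql_command(dbname, lo, hi):
    """Build one psql invocation that processes the rows with id in lo..hi.

    Table and column names here are hypothetical placeholders.
    """
    sql = (
        "INSERT INTO parcels_buffered (id, geom) "
        "SELECT id, ST_Buffer(geom, 100) FROM parcels "
        f"WHERE id BETWEEN {lo} AND {hi};"
    )
    return ["psql", "-d", dbname, "-v", "ON_ERROR_STOP=1", "-c", sql]


def run_parallel(commands, workers=4, runner=subprocess.run):
    """Run the commands concurrently, at most `workers` at a time.

    Each command is a separate psql process, so PostgreSQL sees one
    session per chunk. `runner` can be swapped out for testing.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: runner(cmd, check=True), commands))


if __name__ == "__main__":
    commands = [
        build_psql_command("gisdb", lo, hi)
        for lo, hi in id_ranges(1, 1_000_000, 250_000)
    ]
    # Dry run: print the shell-quoted commands rather than executing them.
    for cmd in commands:
        print(shlex.join(cmd))
    # To actually execute against a real database: run_parallel(commands)
```

The printed lines could equally be written to a file and fed to GNU Parallel or xargs; the chunking by id range is what keeps each job independent.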

I've also observed another key skills gap around "getting started" with cloud computing, and how to build a secure geo-analytics computing stack with all the open source packages. Many people don't have an IT department, and those that do probably want to avoid the red tape that often kills innovation initiatives. I taught myself, and started out connecting purely via the SSH command line. I then gained a whole step-change in experience when I installed a remote desktop on AWS cloud servers, plus QGIS and RStudio on the cloud desktop, so I could scale up the backend processing and visualise the outputs dynamically, without being constrained by the upload and download links and without the need to continually shovel data between the cloud servers and my desktop. Setting up a stack can be easy or difficult depending on which version of Linux you use, and whether you have to build some of the packages from source because of dependency hell. I started with Ubuntu, and now have a full, up-to-date stack deployed on RHEL/CentOS because much of my work is with enterprise.

I truly believe that accelerated learning programs in scientific computing can have huge benefits for the broader data science and business analytics community, and can open up career pathways for its members. Many businesses are relative newcomers to the field of Big Data, hence I'm confident analytics professionals can gain a lot from the teachings of environmental science.

Good luck with the course. I'd be keen to get involved should another opportunity arise in the future.

Kind regards

Mark