About size of data frames

Thanks to all of you.

It's great to interact with you: your comments are opportunities to learn more, not only about the specific question posted but also about many related topics.

Most of the comments agree on a single long-format data frame, and Jeff's synthesis was particularly interesting.

I run R on a server, which is well maintained and most likely faster than my PC.


The main variables I am dealing with are snow-pack height and daily snowfall amount; these two measurements are supported by many other meteorological parameters (such as wind direction, wind speed, air temperature, theta-e air temperature, surface snow-pack temperature, incident radiation, reflected radiation).

The sampling frequency of the new sensors keeps getting higher (at present it is 10 minutes, and in case of emergency it can switch to 5 minutes!). I have spent a lot of effort to "normalize" the data to half-hourly frequency.
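As a rough illustration of this kind of normalization, here is a minimal base R sketch that averages 10-minute readings into half-hourly values. The data frame `obs`, its columns `time` and `snow_height`, and all the values are made up for the example; they are not taken from the real stations.

```r
# Hypothetical 10-minute readings for one station (made-up values).
obs <- data.frame(
  time = seq(as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
             by = "10 min", length.out = 12),
  snow_height = c(100, 101, 102, 103, 104, 105,
                  106, 107, 108, 109, 110, 111)
)

# Floor each timestamp to the start of its 30-minute window (1800 seconds).
obs$slot <- as.POSIXct(floor(as.numeric(obs$time) / 1800) * 1800,
                       origin = "1970-01-01", tz = "UTC")

# Average the readings that fall within each half-hour window.
half_hourly <- aggregate(snow_height ~ slot, data = obs, FUN = mean)
```

The same flooring step would also work for 5-minute data. For an accumulated quantity such as snowfall amount, `FUN = sum` would probably be more appropriate than `mean`.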


I use these data for several different purposes; the most important are:

- graphical comparisons for manual validation (these comparisons may take into account different sensors of a single meteorological station or the same sensor for several meteorological stations)

- studying some regressions that may turn out to be important

- climatological studies


A single data frame is easy to handle; this is what I have been doing so far.
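To make the idea concrete, the following is only a sketch of how a single long-format data frame could serve the graphical comparisons mentioned above. The column names (`time`, `station`, `sensor`, `value`), the station labels and the values are all assumptions made for the example.

```r
# Hypothetical long-format table: one row per station, sensor and timestamp.
times <- seq(as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
             by = "30 min", length.out = 2)
long <- expand.grid(time = times,
                    station = c("A", "B"),
                    sensor = c("snow_height", "air_temp"),
                    stringsAsFactors = FALSE)
long$value <- seq_len(nrow(long))  # placeholder values, not real measurements

# The same sensor across several stations.
same_sensor <- subset(long, sensor == "snow_height")

# Different sensors of a single station.
one_station <- subset(long, station == "A")
```

Both comparisons are then plain row filters on the same table, so no separate data frame per station or per sensor is needed.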

Yes, in a few years' time my initial data frame will exceed 20M rows, and that will always be a concern.


Thank you again for everything

Stefano


         (oo)
--oOO--( )--OOo--------------------------------------
Stefano Sofia MSc, PhD
Civil Protection Department - Marche Region - Italy
Meteo Section
Snow Section
Via Colle Ameno 5
60126 Torrette di Ancona, Ancona (AN)
Uff: +39 071 806 7743
E-mail: stefano.sofia at regione.marche.it
---Oo---------oO----------------------------------------