
About size of data frames

Rui, et al.:
"real results probably depend on the functions
you want to apply to the data."

Indeed!
I would presume that one would want to analyze such data as time series of
some sort, for which I think long form is inherently "more sensible". If
so, I also would think that you would want columns for data, sensor*,
season, and day, as Rui suggested. However, note that this presumes no
missing data, which is usually wrong. To handle this, within each day the
rows would need to be in order of the time at which the data was recorded
(I assume twice per hour), with a missing code wherever a reading is absent.
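
For what it's worth, here is a minimal base R sketch of that padding step. The column name "slot", the sensor ID "s01", and the readings are all made up for illustration, and 48 slots per day is only my assumption of twice-per-hour recording:

```r
# Hypothetical readings: one sensor, one day, with the third slot not recorded.
obs <- data.frame(
  sensor = "s01",
  season = "summer",
  day    = 1L,
  slot   = c(1L, 2L, 4L),        # half-hour slot within the day (assumed 1..48)
  data   = c(20.1, 20.4, 19.8)
)

# Every slot that should exist for each sensor/season/day combination.
# merge() with no common columns returns the Cartesian product.
grid <- merge(unique(obs[c("sensor", "season", "day")]),
              data.frame(slot = 1:48))

# Left join: slots without a reading get data = NA, R's missing code.
full <- merge(grid, obs, all.x = TRUE)

# Put the rows in recording order within each day.
full <- full[order(full$sensor, full$season, full$day, full$slot), ]
rownames(full) <- NULL
```

With the rows padded and ordered like this, position within a day corresponds to time of day, which is what most time series functions expect.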

*As an aside, whether <data, season, day> data is in the form of a single
data frame with an additional sensor ID column or a list of 70 frames, one
for each sensor,  is not really much of an issue these days, where
gigabytes and gigaflops are cheap and available, as it is trivial to
convert from one form to another as needed.
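
To illustrate how little is involved, a sketch in base R of going from one form to the other and back. The two sensors and their values are invented for the example; the real data would have 70 sensors:

```r
# Hypothetical long-form data: two sensors, three readings each.
long <- data.frame(
  sensor = rep(c("s01", "s02"), each = 3),
  season = "summer",
  day    = 1L,
  data   = c(20.1, 20.4, NA, 18.7, 18.9, 19.2)
)

# Single data frame -> list of data frames, one per sensor.
by_sensor <- split(long, long$sensor)

# List of data frames -> single data frame again.
back <- do.call(rbind, by_sensor)
rownames(back) <- NULL
```

Because split() keeps the sensor column in each piece, nothing is lost on the round trip.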

Feel free to disagree -- I am just amplifying Rui's comment above; "what I
would presume" and "what I think" don't matter. What matters is Stefano's
response to his comment.

Cheers,
Bert
On Thu, Aug 14, 2025 at 10:54 AM Rui Barradas <ruipbarradas at sapo.pt> wrote: