Message-ID: <CAHpsUFb3x_aCUssnzVm9Ws1yTvbi06CwZ8+DdwGOH8Y-_kBxGg@mail.gmail.com>
Date: 2015-04-10T20:07:32Z
From: Alexandra Catena
Subject: Finding values in a dataframe at a specified hour

Hello,

I have a large dataframe (windHW) of wind speeds (ws) at each hour
from many days over a set of years.  Some of these values are
obviously wrong (600 m/s) and I want to get rid of all the values that
are larger than 5*sigma for each hour.  The 5*sigma values (variable
name sigma5) are stored in a separate dataframe for each season, with
each dataframe named after its season.  For example, in the dataframe
spring, the 5*sigma value for hour 1 is 79.6 m/s.

So my question is as follows: how can I find all the wind speed values
in the dataframe windHW at a specific hour that are higher than the
5*sigma value for that hour?  For example, I would like to find any
wind speed values at hour 1 that are higher than 79.6 m/s and replace
them with NA.

I have something like this but I can't seem to figure out how to get
it for specific hours:

windHW$ws[windHW$ws>=spring$sigma5] <- NA
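
As far as I can tell, the line above compares ws against sigma5
position by position (recycling sigma5 down the rows), not hour by
hour.  A minimal sketch of what I am after, on toy data, might look
like the following.  The column name hour in both dataframes and every
number except 79.6 are assumptions made up for illustration, and the
sketch only applies the spring thresholds:

```r
# Toy data: the `hour` column name and all values except 79.6 are assumptions.
windHW <- data.frame(hour = c(1, 2, 1, 2),
                     ws   = c(5.2, 7.1, 600, 6.4))
spring <- data.frame(hour   = c(1, 2),
                     sigma5 = c(79.6, 80.0))

# Look up the threshold that belongs to each row's hour.
limit <- spring$sigma5[match(windHW$hour, spring$hour)]

# Flag values above their hourly threshold; which() skips rows
# where ws or the looked-up threshold is NA.
bad <- which(windHW$ws > limit)
windHW$ws[bad] <- NA
```

Rows from other seasons would presumably need the same lookup against
their own season's dataframe.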

I imported the data using readLines and put it into the dataframe
windHW.  I am using R version 3.1.1.

Any help would be appreciated!

Thanks,
Alexandra