[FORGED] Re: remove

Sun, Feb 12, 2017 8:19 AM

My understanding was that the discordant names have been identified. So
in the example the OP gave, removing rows with first = "Alex" is done
by:

df[df$first !="Alex",]

If that is not the case, as others have pointed out, various forms of
tapply() (by, ave, etc.) can be used. I agree that that is not so
"basic," so I apologize if my understanding was incorrect.
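
For reference, a minimal sketch of the ave() route mentioned above, for the case where the discordant names are not known in advance. The data frame name "dat" and the helper variables "n_last" and "keep" are illustrative choices, not anything from the original post; the data are the OP's toy example.

```r
# Toy data from the original post; called "dat" (a hypothetical name)
# so that it does not mask the built-in function df().
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "first week last
Alex 1 West
Bob 1 John
Cory 1 Jack
Cory 2 Jack
Bob 2 John
Bob 3 John
Alex 2 Joseph
Alex 3 West
Alex 4 West")

# For every row, count the distinct last names recorded under that
# row's first name. The last names are turned into integer codes first
# so that ave() returns a numeric vector.
n_last <- ave(as.integer(factor(dat$last)), dat$first,
              FUN = function(x) length(unique(x)))

# Keep only the people with exactly one last name, then sort and
# renumber the rows.
keep <- n_last == 1
out <- dat[keep, ]
out <- out[order(out$first, out$week), ]
rownames(out) <- NULL
out
```

Because ave() returns one value per row, the result can be used directly as a row filter, with no explicit loop over names.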

Cheers,
Bert




Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sat, Feb 11, 2017 at 10:04 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:

On 12/02/17 18:36, Bert Gunter wrote:

Basic stuff!

Either subscripting or ?subset.

There are many good R tutorials on the web. You should spend some
(more?) time with some.


Uh, Bert, perhaps I'm being obtuse (a common occurrence) but it doesn't seem
basic to me.  The only way that I can see how to go at it is via
a for loop:

rdln <- function(X) {
# Remove discordant last names.
    ok <- logical(nrow(X))
    for(nm in unique(X$first)) {
        xxx <- unique(X$last[X$first==nm])
        if(length(xxx)==1) ok[X$first==nm] <- TRUE
    }
    Y <- X[ok,]
    Y <- Y[order(Y$first),]
    rownames(Y) <- NULL  # renumber rows 1..n; also safe if no rows remain
    Y
}

Calling the toy data frame "melvin" rather than "df" (since "df" is the name
of the built-in F density function, it is bad form to use it as the name of
another object) I get:

rdln(melvin)

  first week last
1   Bob    1 John
2   Bob    2 John
3   Bob    3 John
4  Cory    1 Jack
5  Cory    2 Jack

which is the desired output.  If there is a "basic stuff" way to do this
I'd like to see it.  Perhaps I will then be toadally embarrassed, but they
say that this is good for one.

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

On Sat, Feb 11, 2017 at 9:02 PM, Val <valkremk at gmail.com> wrote:

Hi all,
I have a big data set and want to remove rows conditionally.
In my data file, each person was recorded for several weeks. Somehow,
during the recording period, their last name was misreported. For
each person, the last name should be the same; otherwise that person
should be removed from the data. For example, in the following data set,
Alex was found to have two last names:

Alex   West
Alex   Joseph

Alex should be removed from the data. If this happens, I want to
remove all rows with Alex. Here is my data set:

df <- read.table(header=TRUE, text='first  week last
Alex    1  West
Bob     1  John
Cory    1  Jack
Cory    2  Jack
Bob     2  John
Bob     3  John
Alex    2  Joseph
Alex    3  West
Alex    4  West ')

Desired output

  first week last
1   Bob    1 John
2   Bob    2 John
3   Bob    3 John
4  Cory    1 Jack
5  Cory    2 Jack
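
One possible way to find out which first names are discordant before removing them is sketched below with tapply(). This is only an illustration under assumptions: the names "dat", "n_last", "bad" and "cleaned" are hypothetical, and the data are the toy example above.

```r
# Toy data from the post; "dat" is a hypothetical name used instead of "df".
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "first week last
Alex 1 West
Bob 1 John
Cory 1 Jack
Cory 2 Jack
Bob 2 John
Bob 3 John
Alex 2 Joseph
Alex 3 West
Alex 4 West")

# Number of distinct last names recorded for each first name.
n_last <- tapply(dat$last, dat$first, function(x) length(unique(x)))

# First names that appear with more than one last name.
bad <- names(n_last)[n_last > 1]
bad

# Drop every row that belongs to one of those people.
cleaned <- dat[!(dat$first %in% bad), ]
cleaned
```

Listing the discordant names first makes it possible to inspect them before anything is deleted. The remaining rows keep their original row names, so rownames(cleaned) <- NULL would be needed to renumber them.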
