
[FORGED] Re: remove

Val

Sun, Feb 12, 2017 10:51 AM

Thank you Rainer,

The question was:
1. Identify those first names that have more than one last name.
2. Once identified (like Alex), exclude them, because their records are
not reliable.
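
The two steps above can be sketched in base R roughly as follows. This is only one possible approach, not something posted in the thread: the object names dat, n_last, bad and clean are made up for illustration, and the toy data is copied from the original post.

```r
# Toy data from the original post; 'dat' is a hypothetical name.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "first week last
Alex 1 West
Bob 1 John
Cory 1 Jack
Cory 2 Jack
Bob 2 John
Bob 3 John
Alex 2 Joseph
Alex 3 West
Alex 4 West")

# Step 1: count the distinct last names recorded for each first name,
# then pick out the first names that have more than one.
n_last <- tapply(dat$last, dat$first, function(x) length(unique(x)))
bad <- names(n_last)[n_last > 1]

# Step 2: drop every row belonging to one of those first names.
clean <- dat[!(dat$first %in% bad), ]
clean
```

Unlike hard-coding "Alex", this does not need to know in advance which names are unreliable. The surviving rows keep their original order and row names.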

On Sun, Feb 12, 2017 at 11:17 AM, Rainer Schuermann <Rainer.Schuermann at gmx.net> wrote:

I may not be understanding the question well enough but for me

df[ df[ , "first"]  != "Alex", ]

seems to do the job:

  first week last

Rainer




On Sonntag, 12. Februar 2017 19:04:19 CET Rolf Turner wrote:

On 12/02/17 18:36, Bert Gunter wrote:

Basic stuff!

Either subscripting or ?subset.

There are many good R tutorials on the web. You should spend some
(more?) time with some.

Uh, Bert, perhaps I'm being obtuse (a common occurrence) but it doesn't
seem basic to me.  The only way that I can see how to go at it is via
a for loop:

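rdln <- function(X) {
     # Remove discordant last names: keep only the rows whose first name
     # is paired with exactly one distinct last name.
     ok <- logical(nrow(X))
     for(nm in unique(X$first)) {
         xxx <- unique(X$last[X$first==nm])
         if(length(xxx)==1) ok[X$first==nm] <- TRUE
     }
     Y <- X[ok,]
     Y <- Y[order(Y$first),]
     rownames(Y) <- seq_len(nrow(Y))  # unlike 1:nrow(Y), safe if no rows remain
     Y
}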

Calling the toy data frame "melvin" rather than "df" (since "df" is the
name of the built in F density function, it is bad form to use it as the
name of another object) I get:

 > rdln(melvin)

   first week last
1   Bob    1 John
2   Bob    2 John
3   Bob    3 John
4  Cory    1 Jack
5  Cory    2 Jack

which is the desired output.  If there is a "basic stuff" way to do this
I'd like to see it.  Perhaps I will then be toadally embarrassed, but
they say that this is good for one.
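
For comparison, a loop-free variant of the same idea can be sketched with ave(), which returns a per-row group summary. This is an assumption about what a "subscripting" answer might look like, not something posted in the thread. The last names are converted to integer codes first because ave() writes its result back into a vector of the same type as its input, so character input would give counts as strings.

```r
# Toy data from the original post, under the name used above.
melvin <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "first week last
Alex 1 West
Bob 1 John
Cory 1 Jack
Cory 2 Jack
Bob 2 John
Bob 3 John
Alex 2 Joseph
Alex 3 West
Alex 4 West")

# For every row, the number of distinct last names seen for that first name.
last_code <- as.integer(factor(melvin$last))
n_last <- ave(last_code, melvin$first, FUN = function(x) length(unique(x)))

# Keep rows whose first name has exactly one last name, then sort and renumber.
Y <- melvin[n_last == 1, ]
Y <- Y[order(Y$first, Y$week), ]
rownames(Y) <- NULL
Y
```

On the toy data this should give the same five Bob and Cory rows as rdln(melvin).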

cheers,

Rolf

On Sat, Feb 11, 2017 at 9:02 PM, Val <valkremk at gmail.com> wrote:

Hi all,
I have a big data set and want to remove rows conditionally.
In my data file each person was recorded for several weeks. Somehow,
during the recording period, the last name was sometimes misreported. For
each person the last name should be the same; otherwise the person should
be removed from the data. For example, in the following data set Alex was
found to have two last names.

Alex   West
Alex   Joseph

Alex should be removed from the data. If this happens, I want to
remove all rows with Alex. Here is my data set:

df <- read.table(header=TRUE, text='first  week last
Alex    1  West
Bob     1  John
Cory    1  Jack
Cory    2  Jack
Bob     2  John
Bob     3  John
Alex    2  Joseph
Alex    3  West
Alex    4  West ')

Desired output

  first week last
1   Bob    1 John
2   Bob    2 John
3   Bob    3 John
4  Cory    1 Jack
5  Cory    2 Jack

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

