
the first. from SAS in R

17 messages · Marius 't Hart, Joel, Gustavo Carvalho +8 more

#
Is there any similar function in R to the first. in SAS?

What it does is:

Let's say we have this table:

  a b  c
  1 1  5
  1 0  2
  2 0  2
  2 0 NA
  2 9  2
  3 1  3


and then I want to do one thing the first time the number 1 appears in a,
and something else the second time 1 appears in a, and so on.

so 

something similar to:

if first.a {
 a$d<-1
}else{
 a$d<-0
}

This would give me

  a b  c d
  1 1  5 1
  1 0  2 0
  2 0  2 1
  2 0 NA 0
  2 9  2 0
  3 1  3 1

Is there such a function in R or anything similar?


thx

//Joel
#
Hi all,

I'm doing a 3-way ANOVA like this:

summary(aov(formula('FP ~ (lum * obj * man)^3 - Error(vp/(lum * obj * 
man)^3)'),data=dataf))

But in the output I only get 1- and 2-way effects, like this one:

Error: vp:obj:man
          Df     Sum Sq    Mean Sq F value Pr(>F)
obj:man    1 1.5291e-34 1.5291e-34  5.7011 0.0542 .
Residuals  6 1.6093e-34 2.6822e-35
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(And a warning that the Error() model is singular.)

What am I doing wrong so that I don't get the 3-way interactions I want 
to look at?

Thank you!

Marius.
#
Perhaps something like this:

a$d <- ifelse(duplicated(a$a), 0, 1)
On Tue, Nov 23, 2010 at 1:33 PM, Joel <joda2457 at student.uu.se> wrote:
#
On Nov 23, 2010, at 8:33 AM, Joel wrote:

            
The duplicated function returns a logical vector with those features,  
and that vector can easily be coerced to numeric.

df$d <- as.numeric(!duplicated(df$a))
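Applied to the table from the question, this can be checked directly. A minimal sketch: the data frame name df follows the line above, and the values are copied from the original post.

```r
# The example table from the original question.
df <- data.frame(
  a = c(1, 1, 2, 2, 2, 3),
  b = c(1, 0, 0, 0, 9, 1),
  c = c(5, 2, 2, NA, 2, 3)
)

# duplicated() is FALSE the first time a value appears and TRUE afterwards,
# so negating it flags the first row for each value of a.
df$d <- as.numeric(!duplicated(df$a))

print(df)
```

The d column comes out as 1, 0, 1, 0, 0, 1, which is the result asked for in the question.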


I was a bit puzzled about my failure to get coercion by the method  
which I thought was supposed to work, namely adding 0.

df$e <- !duplicated(df$a)+0  # does not coerce

df$e <- 0 + !duplicated(df$a) # pre-adding 0 does coerce

Maybe the rules on coercion were amended.
#
On Tue, 23 Nov 2010, Joel wrote:

            
See

 	?duplicated

then try

 	a$d <- ifelse( duplicated( a$a ), 0 , 1 )

and

 	a$d.2 <- as.numeric( !duplicated( a$a ) )

HTH,

Chuck
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
#
On Tue, 23 Nov 2010, Dennis Murphy wrote:

            
See

 	?Arithmetic

and read the paragraph under Details starting 'Logical vectors'

Chuck
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
#
On Nov 23, 2010, at 11:04 AM, Charles C. Berry wrote:

            
Chuck;

Compare these three, all of which are using binary operators on  
logical vectors which is what is being discussed in ?Arithmetic:

 > duplicated(c("a", "a", "b") ) + 0
[1] 0 1 0
 > !duplicated(c("a", "a", "b") ) + 0
[1]  TRUE FALSE  TRUE
 > 0 + !duplicated(c("a", "a", "b") )
[1] 1 0 1

I believe the proper place to go is ?Syntax, where operator precedence  
is discussed. I think the precedence rules implicitly do this in the  
second instance, because "+" has higher precedence than negation:

! ( duplicated(c("a", "a", "b") ) + 0 )
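The same point as a small script, in case it helps to see the types side by side. This is only a sketch; the variable names r1, r2 and r3 are made up for the illustration.

```r
x <- c("a", "a", "b")

# Unary ! binds more loosely than +, so this parses as !(duplicated(x) + 0):
# the sum is numeric, but ! turns it back into a logical vector.
r1 <- !duplicated(x) + 0

# Parentheses force the negation first; adding 0 then coerces to numeric.
r2 <- (!duplicated(x)) + 0

# Explicit coercion avoids the precedence question altogether.
r3 <- as.integer(!duplicated(x))

str(r1)
str(r2)
str(r3)
```

Writing as.integer() or as.numeric() around the negation is probably the least surprising form, since it does not depend on where the 0 is placed.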
#
On Tue, 23 Nov 2010, David Winsemius wrote:

            
David,

Thanks.

Both you and David Lorenz are correct in pointing to operator precedence 
as the answer to Dennis' question.

Mea culpa for my not reading Dennis' question carefully enough to 
understand what his question really was!

Best,

Chuck
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
#
On Tue, 23 Nov 2010, Seeliger.Curt at epamail.epa.gov wrote:

            
It depends on how you use duplicated()
[1] TRUE
Chuck
Charles C. Berry                            Dept of Family/Preventive Medicine
cberry at tajo.ucsd.edu			    UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
#
I often use code like Curt's encapsulated in the
following isFirstInRun function:

  isFirstInRun <- function(x,...) {
      lengthX <- length(x)
      if (lengthX == 0) return(logical(0))
      retVal <- c(TRUE, x[-1]!=x[-lengthX])
      for(arg in list(...)) {
          stopifnot(lengthX == length(arg))
          retVal <- retVal | c(TRUE, arg[-1]!=arg[-lengthX])
      }
      if (any(missing<-is.na(retVal))) # match rle: NA!=NA
          retVal[missing] <- TRUE
      retVal
  }
E.g.,
   log sqrt first
1    0    1  TRUE
2    0    1 FALSE
3    1    1  TRUE
4    1    2 FALSE
5    1    2 FALSE
6    1    2 FALSE
7    1    2 FALSE
8    2    2  TRUE
9    2    3 FALSE
10   2    3 FALSE

   log sqrt first
1    0    1  TRUE
2    0    1 FALSE
3    1    1  TRUE
4    1    2  TRUE
5    1    2 FALSE
6    1    2 FALSE
7    1    2 FALSE
8    2    2  TRUE
9    2    3  TRUE
10   2    3 FALSE
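
The commands that produced the two tables above did not survive in the archive. One reconstruction that gives the same values is sketched here, assuming the columns were floor(log(1:10)) and floor(sqrt(1:10)). That is a guess from the column names and numbers, not something stated in the post. The function body is copied from the message so the block runs on its own.

```r
isFirstInRun <- function(x, ...) {
    lengthX <- length(x)
    if (lengthX == 0) return(logical(0))
    retVal <- c(TRUE, x[-1] != x[-lengthX])
    for (arg in list(...)) {
        stopifnot(lengthX == length(arg))
        retVal <- retVal | c(TRUE, arg[-1] != arg[-lengthX])
    }
    if (any(missing <- is.na(retVal))) # match rle: NA!=NA
        retVal[missing] <- TRUE
    retVal
}

# Assumed input: two step functions of 1:10 (a guess, see above).
d <- data.frame(log = floor(log(1:10)), sqrt = floor(sqrt(1:10)))

# A run starts whenever log changes ...
first1 <- isFirstInRun(d$log)
# ... or whenever either log or sqrt changes.
first2 <- isFirstInRun(d$log, d$sqrt)

print(cbind(d, first = first1))
print(cbind(d, first = first2))
```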

To do isLastInRun, put the TRUE after the x[-1]!=x[-length(x)]:

isLastInRun <- function(x,...) {
    lengthX <- length(x)
    if (lengthX == 0) return(logical(0))
    retVal <- c(x[-1]!=x[-lengthX], TRUE)
    for(arg in list(...)) {
        stopifnot(lengthX == length(arg))
        retVal <- retVal | c(arg[-1]!=arg[-lengthX], TRUE)
    }
    if (any(missing<-is.na(retVal))) # match rle: NA!=NA
        retVal[missing] <- TRUE
    retVal
}
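
For a key that is already sorted, the same "last" flag can also be had from duplicated() with its fromLast argument. A small sketch with made-up values; the names last_run and last_dup are only for the illustration.

```r
x <- c(1, 1, 2, 2, 2, 3)   # sorted key, illustrative values only
n <- length(x)

# Run-based flag: TRUE where the next element differs, and for the last element.
last_run <- c(x[-1] != x[-n], TRUE)

# Value-based flag: TRUE for the last occurrence of each distinct value.
last_dup <- !duplicated(x, fromLast = TRUE)

print(data.frame(x, last_run, last_dup))
```

The two flags agree here only because equal values are contiguous; on unsorted data they can differ.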


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Often the purpose of first/last in SAS is to facilitate grouping of
observations in a sequential algorithm. This purpose is better served in R
by using vectorized methods like those in package plyr.

Also, note that first/last has different meanings in the context of "by x;"
versus "by x notsorted;". R "duplicated" does not address the latter, which
splits noncontiguous records with equal x.
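
A small base R sketch of that difference, with made-up values where equal x are not contiguous. The names first_value, first_run, run_id and row_in_run are only illustrative, and base R is used here instead of plyr to keep the example free of dependencies.

```r
x <- c("a", "a", "b", "a", "b", "b")   # equal values are not contiguous
n <- length(x)

# First occurrence of each distinct value, regardless of position.
first_value <- !duplicated(x)

# First element of each contiguous run, as with "by x notsorted;".
first_run <- c(TRUE, x[-1] != x[-n])

# A run identifier lets grouped, vectorized functions work per run.
run_id <- cumsum(first_run)
row_in_run <- ave(seq_along(x), run_id, FUN = seq_along)

print(data.frame(x, first_value, first_run, run_id, row_in_run))
```

Rows 4 and 5 are flagged by first_run but not by first_value, which is exactly the case duplicated() does not cover.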

Regards,
David
4 days later
#
My apologies for coming to the party so late.

I'm sure this question has been answered a couple of times.  The
attached function is one I pulled from the help archives, but I can't
seem to duplicate the search that led me to it.

In any case, I've attached the function I found, and an .Rd file I use
as part of a local package.  I've also attached a pair of accompanying
functions to retrieve the last record and the nth record.  These have the
advantage of not requiring data frames to be sorted prior to
extraction--the function will sort them for you.
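
The attachments are not part of this archive. For readers who want something to start from, here is a minimal base R sketch of the general idea (sort first, then pick the n-th row of each group). It is not the attached function; the name nth_record and its arguments are hypothetical.

```r
# Hypothetical helper, NOT the attached function: return the n-th row of each
# group after sorting by the grouping column and then by an ordering column.
nth_record <- function(data, id, by, n = 1) {
  ord <- order(data[[id]], data[[by]])
  sorted <- data[ord, , drop = FALSE]
  # Position of each row within its group, counted after sorting.
  pos <- ave(seq_len(nrow(sorted)), sorted[[id]], FUN = seq_along)
  sorted[pos == n, , drop = FALSE]
}

# Made-up data, deliberately unsorted.
visits <- data.frame(id = c(2, 1, 2, 1, 2), t = c(3, 2, 1, 1, 2))

first_rows <- nth_record(visits, "id", "t", n = 1)
third_rows <- nth_record(visits, "id", "t", n = 3)

print(first_rows)
print(third_rows)
```

Groups with fewer than n rows simply drop out of the result, which may or may not be what is wanted.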

Benjamin  

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of David Katz
Sent: Wednesday, November 24, 2010 10:17 AM
To: r-help at r-project.org
Subject: Re: [R] the first. from SAS in R



______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

