apply problem

7 messages · ggrothendieck@yifan.net, Brian Ripley, Laurent Gautier +1 more

#
# iris3 is first 3 rows of iris
# z compares row 1 to each row of iris3 and is correctly computed
[1]  TRUE FALSE FALSE

# this should do the same but is incorrect
1     2     3 
FALSE FALSE FALSE 

What's wrong here?

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Sat, 16 Mar 2002 ggrothendieck at yifan.net wrote:

            
Not a good choice of name: iris3 is another R dataset.

Your iris3 is a data frame.
You are not using array as documented. ?apply says

Arguments:

       X: the array to be used.


iris3 is not an array, so it is coerced to one via as.matrix
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 "5.1"        "3.5"       "1.4"        "0.2"       "setosa"
2 "4.9"        "3.0"       "1.4"        "0.2"       "setosa"
3 "4.7"        "3.2"       "1.3"        "0.2"       "setosa"

and that's not the same object as your iris3
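
A minimal sketch of the coercion described above, assuming the poster's object was built as iris[1:3, ] (the original code is not shown in the archive, and the name iris_head is chosen here only to avoid clashing with the iris3 dataset). Each row reaches the function as a named character vector, so it can never be identical() to a one-row data frame:

```r
# Assumption: the poster's "iris3" was the first three rows of iris.
iris_head <- iris[1:3, ]

# apply() coerces a data frame with as.matrix(); the factor column
# forces the whole matrix to character.
m <- as.matrix(iris_head)
is.character(m)

# Each row arrives as a character vector, not a one-row data frame,
# so the comparison fails for every row.
res <- apply(iris_head, 1, function(x) identical(x, iris_head[1, ]))
res
```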
#
On Sat, Mar 16, 2002 at 11:58:51PM -0500, ggrothendieck at yifan.net wrote:
Could this be because 'apply' expects an 'array' as input (try 'help(apply)')?

Hopin' it helps,

Laurent
#
[...]
Thanks.  

Is there a way to iterate over the rows of a data
frame without writing a loop -- that was my 
original objective.

#
On Sun, 17 Mar 2002 ggrothendieck at yifan.net wrote:

            
No.  A row of a data frame is still a data frame and therefore arbitrarily
complex.

In any case, in R apply() does write a loop.  There are lots of legacy
myths about the efficiencies of loops vs *apply(), but the reality is that it
depends on the exact version of your S engine (and perhaps on how much
memory you have).

Just for the record, lapply() is the way to iterate over columns of a data
frame.
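
A short sketch of both points, again assuming the data is iris[1:3, ] (iris_head is an illustrative name). sapply() is lapply() with simplification, so it walks the columns; rows can be visited by index, which is still a loop underneath:

```r
# Assumption: the data frame is the first three rows of iris.
iris_head <- iris[1:3, ]

# A data frame is a list of columns, so lapply()/sapply() iterate over columns.
col_classes <- sapply(iris_head, class)
col_classes

# Rows can be visited by index; each iris_head[i, ] is a one-row data frame.
row_matches <- sapply(seq_len(nrow(iris_head)),
                      function(i) identical(iris_head[i, ], iris_head[1, ]))
row_matches
```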
#
OK.  Efficiency aside, I believe it would still be nice to have this ability 
for compactness.  Here are some ideas:

1. Have an option on t(), the transpose function, that specifies 
that it should return a list of one row data frames.  The above 
becomes:
   sapply( t(iris3,list.out=T), function(x) identical( x, iris3[1,] ) )

2. Allow dist() to have two arguments and a distance function that is 0 
for identical rows and 1 for others (or even allow user specifiable 
distance functions).  The above becomes:
   dist(iris3[1,],iris3,method="identical")
or with user specifiable distance functions:
   dist(iris3[1,],iris3,user.method=identical)

3.  Have a right crossprod such that crossprod(a,b,right=T) is the 
same as a+.*t(b) and also user specifiable inner products:
   crossprod(iris3,iris3[1,],inner.prod=identical)
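
For what it is worth, idea 1 can probably be approximated with existing functions. The sketch below is not the proposed t(..., list.out=T) option, which does not exist; it assumes the data is iris[1:3, ] (named iris_head here) and uses split() to get a list of one-row data frames:

```r
# Assumption: the data frame is the first three rows of iris.
iris_head <- iris[1:3, ]

# split() on the row index yields a list of one-row data frames.
rows <- split(iris_head, seq_len(nrow(iris_head)))

# Compare each one-row data frame with the first row.
same_as_first <- sapply(rows, function(x) identical(x, iris_head[1, ]))
same_as_first
```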

1 day later
#
On Sun, 17 Mar 2002 ggrothendieck at yifan.net wrote:

            
[snip]
How about:
+ y=iris3)
[1]  TRUE FALSE FALSE