Why does matrix selection behave differently when using which?

David Winsemius · 2012-12-17T20:00:09Z

On Dec 17, 2012, at 11:22 AM, Asis Hallab wrote:

> Dear R community,
>
> I have a medium sized matrix stored in variable "t" and a simple function "countRows" (see below) to count the number of rows in which a selected column "C" matches a given value. If I count all rows matching all pairwise distinct values in the column "C" and sum these counts up, I get the number or rows of "t". If I delete the "which" calls from function "countRows" the resulting sum of matching row numbers is


What part of "minimal" example are you having difficulty understanding? That zip file expands to a 1.8 MB file!

Since the file has a header line, reading it without header=TRUE will turn every column into a factor, and it's doubtful you are getting what you want.

Instead:

 t <- read.table("test.tbl", header=TRUE)
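As a small illustration of the difference, here is a sketch on made-up file contents that merely stand in for test.tbl (the column names are hypothetical). `stringsAsFactors = TRUE` is spelled out because it was the default in 2012 but has not been since R 4.0.0:

```r
# Made-up contents standing in for "test.tbl" (hypothetical column names).
txt <- "Protein Distance\nA 0.5\nB 0.9\n"

# Header line not declared: it is read as a data row, so every column
# holds strings and is converted to a factor.
no_header <- read.table(text = txt, stringsAsFactors = TRUE)

# Header line declared: "Distance" is parsed as a numeric column.
with_header <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

sapply(no_header, class)
sapply(with_header, class)
```

Comparing a factor column against a number is unlikely to do what was intended, which is why the column types are worth checking before anything else.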

'ps'? What is ps????

I suspect that it is not `which` that is the problem, but rather your understanding of how `if` processes vectors. (This also should be simplified greatly to avoid stepping through vectors one element at a time.)
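To make the point about `if` concrete: `if` expects a single TRUE or FALSE. Given a longer logical vector, older versions of R used only the first element (with a warning), and R 4.2.0 and later stop with an error. The sketch below uses made-up values, not the poster's data or function, and contrasts an element-by-element loop with the vectorized count:

```r
# Made-up values; NA stands for a missing measurement.
x <- c(0.2, 0.9, NA, 0.9)

# A comparison is vectorized: one logical per element (NA stays NA).
cond <- x == 0.9

# Element-by-element counting: `if` sees one value at a time, and
# isTRUE() treats NA as "no match".
n_loop <- 0
for (ok in cond) {
  if (isTRUE(ok)) n_loop <- n_loop + 1
}

# Vectorized counting: TRUE counts as 1, NAs are dropped.
n_vec <- sum(cond, na.rm = TRUE)

c(loop = n_loop, vectorized = n_vec)
```

Both approaches give the same count here; the vectorized form is shorter and avoids passing a vector to `if` by accident.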

You didn't do anything with that result!

That value will not depend in any manner on what preceded it. ???? It will simply be the number of rows in the local copy of "t".

Your goal is _only_ to get a count?

Why not just this:

 sum( tbl[!is.na(tbl$Domain.Architecture.Distance), "Domain.Architecture.Distance" ] == x )
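A minimal sketch of that idiom on a made-up data frame (only the column name comes from the thread; the `id` column and the values are invented). It also shows one well-known way in which row selection with and without `which` can differ when the column contains NAs. Whether that is what happened in the original post cannot be determined from what is shown here.

```r
# Made-up data; only the column name is taken from the thread.
tbl <- data.frame(
  id = 1:5,
  Domain.Architecture.Distance = c(1, 0.5, NA, 1, 0.25)
)
x <- 1  # value whose occurrences are counted

# Count matches after dropping NAs explicitly.
keep <- !is.na(tbl$Domain.Architecture.Distance)
n_filtered <- sum(tbl[keep, "Domain.Architecture.Distance"] == x)

# Same count without the explicit filter.
n_na_rm <- sum(tbl$Domain.Architecture.Distance == x, na.rm = TRUE)

# Row selection: an NA in a logical index yields an all-NA row,
# whereas which() returns only the positions that are TRUE.
cond <- tbl$Domain.Architecture.Distance == x
rows_logical <- nrow(tbl[cond, ])         # matches plus one NA row
rows_which   <- nrow(tbl[which(cond), ])  # matches only

c(n_filtered, n_na_rm, rows_logical, rows_which)
```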

E.g.:

[1] 3440

You should probably be creating a factor variable with `cut` to create reasonable intervals for grouping, and if you do not know this it suggests you need to do more study of the text or introductory materials. To get a quick look at the distribution this is useful:

plot( density(tbl[!is.na(tbl$Domain.Architecture.Distance), "Domain.Architecture.Distance" ] ))

(125 KB file so not attached)

(0,0.1] (0.1,0.2] (0.2,0.3] (0.3,0.4] (0.4,0.5] (0.5,0.6] (0.6,0.7] (0.7,0.8] (0.8,0.9]   (0.9,1] 
      616      1864       328       103       923      1763      1151      2490      3709     38563
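The call that produced the counts above is not preserved in this archive. A minimal sketch of how `cut` and `table` could give a table of that shape, on made-up values and assuming ten equal-width intervals on (0, 1]:

```r
# Made-up distances; NA stands for a missing value.
d <- c(0.05, 0.15, 0.95, 0.95, 1, NA)

# cut() turns the numeric vector into a factor of right-closed intervals:
# (0,0.1], (0.1,0.2], ..., (0.9,1]
bins <- cut(d, breaks = seq(0, 1, by = 0.1))

# table() counts the values per interval; NAs are left out by default.
counts <- table(bins)
counts
```

The intervals are right-closed by default, so a value of exactly 1 lands in the last interval, while a value of exactly 0 would fall outside the lowest one unless `include.lowest = TRUE` is given.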

The question, as yet unanswered, is _how_ exactly you are calling that function. You posted a link to data "t" but there is no code that calls that function with the data. I do not see anything that would resemble a "ps"-object.

(See above.)

Please read the Posting Guide and learn to post in plain text.

David Winsemius
Alameda, CA, USA
