
How to build a matrix of number of appearance?

6 messages · UriB, David Winsemius, jim holtman

#
I have a matrix of claims at year1 that I get simply by 

claims<-read.csv(file="Claims.csv")
qq1<-claims[claims$Year=="Y1",]

I have a MemberID and a ProviderID for every claim in qq1; both are integers.

An example of the type of question that I want to answer is:
how many times does ProviderID number 345 appear together with MemberID 23 in
the table qq1?

To answer such questions for every possible ProviderID and every
possible MemberID,
I would like a matrix whose first column is MemberID, with every
MemberID in qq1 appearing only once, and whose other columns give the number
of appearances of ProviderID==i for every i with
sum(qq1$ProviderID==i)>0

My question is whether there is a simple way to do this in R.
Thanks in advance,

Uri

--
View this message in context: http://r.789695.n4.nabble.com/How-to-build-a-matrix-of-number-of-appearance-tp3643248p3643248.html
Sent from the R help mailing list archive at Nabble.com.
#
On Jul 4, 2011, at 5:48 AM, UriB wrote:

            
A really quick way of finding this would be:

as.data.frame(xtabs(~ ProviderID + MemberID, data = qq1))
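
To see what this returns, here is a minimal sketch on made-up data; the toy `qq1` below is an assumption standing in for the real claims table. `xtabs` builds the member-by-provider count table, and `as.data.frame` turns it into one row per pair with the count in a column named `Freq`.

```r
# Toy stand-in for the real claims data (values are made up)
qq1 <- data.frame(
  MemberID   = c(23, 23, 23, 7, 7, 51),
  ProviderID = c(345, 345, 12, 345, 12, 12)
)

# Wide form: one row per MemberID, one column per ProviderID
wide <- xtabs(~ MemberID + ProviderID, data = qq1)

# How many times does ProviderID 345 appear with MemberID 23?
wide["23", "345"]

# Long form: one row per (MemberID, ProviderID) pair; Freq holds the count
long <- as.data.frame(wide)

# Keep only the pairs that actually occur
long[long$Freq > 0, ]
```

Note that the long form also contains the zero-count pairs, so it has (number of distinct members) x (number of distinct providers) rows before filtering.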
#
Here is another way:
+             , list(count = length(id))
+             , by = list(M, P)
+           ]
Classes ‘data.table’ and 'data.frame':  24 obs. of  3 variables:
 $ M    : int  1 1 1 1 1 2 2 2 2 2 ...
 $ P    : int  1 2 3 4 5 1 2 3 4 5 ...
 $ count: int  5 4 3 2 9 3 3 6 3 7 ...
   M P count
   1 1     5
   1 2     4
   1 3     3
   1 4     2
   1 5     9
   2 1     3
   2 2     3
   2 3     6
   2 4     3
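
The first lines of the call above did not survive in the archive, so the following is not the original code. It is a minimal sketch of the same kind of grouped count with the data.table package, on made-up data whose column names `M` and `P` are borrowed from the output above; `.N` is data.table's built-in per-group row count.

```r
library(data.table)

# Made-up member (M) and provider (P) IDs, not the data from the post
dt <- data.table(
  M = c(1L, 1L, 1L, 2L, 2L, 3L),
  P = c(1L, 1L, 2L, 1L, 1L, 2L)
)

# Count rows for each (M, P) pair that actually occurs
counts <- dt[, list(count = .N), by = list(M, P)]
counts
```

Only pairs that occur in the data get a row, so nothing is stored for combinations with a count of zero.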
On Mon, Jul 4, 2011 at 5:48 AM, UriB <uriblass at gmail.com> wrote:

  
    
#
Thanks for your reply.
I guess there are many ProviderID values, because I get the error "cannot
allocate vector of size 2.1 Gb"
(I can use the same trick for most of the other fields).

Is there a way to do the same only for providerID with relatively high
frequency?
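
One possible way to restrict the table to frequent providers, sketched in base R on made-up data: count claims per ProviderID with `table`, keep the IDs at or above a cutoff, and tabulate only those rows. The cutoff `min_claims` and the toy `qq1` are assumptions for illustration.

```r
# Toy stand-in for the real claims data (values are made up)
qq1 <- data.frame(
  MemberID   = c(23, 23, 7, 7, 51, 51, 8),
  ProviderID = c(345, 345, 345, 12, 12, 99, 77)
)

min_claims <- 2  # hypothetical cutoff for "relatively high frequency"

# Number of claims per provider
freq <- table(qq1$ProviderID)

# Provider IDs (as character labels) that reach the cutoff
keep <- names(freq)[freq >= min_claims]

# Tabulate only the claims from those providers
sub <- qq1[qq1$ProviderID %in% keep, ]
tab <- xtabs(~ MemberID + ProviderID, data = sub)
tab
```

The resulting table has one column per retained provider, so its size depends on the cutoff and not on the total number of distinct ProviderID values.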

--
View this message in context: http://r.789695.n4.nabble.com/How-to-build-a-matrix-of-number-of-appearance-tp3643248p3645550.html
#
On Jul 5, 2011, at 5:45 AM, UriB wrote:

            
What code?
You are posting to a mailing list from a non-official web mirror/interface.
Those of us using this list with mail clients cannot tell
who you are responding to and what code is throwing an error without
opening up a browser and following the link below (and, speaking from
prior failed efforts at figuring out context on Nabble, maybe not even
then).

Get with the program. Read the Posting Guide. As the sign says:
If you persist in posting to R-help then ... Learn to include context.

  
    
#
Provide some more information about the size of the data and the
number of different ID combinations. I have found that in some cases
like this, using the 'sqldf' package helps, since it can deal with a large
number of combinations.
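
As a minimal sketch of what that could look like (not code from this thread), a `GROUP BY` query run through sqldf returns one row per MemberID/ProviderID pair that actually occurs, so nothing is allocated for combinations that never appear. The toy `qq1` below is made up.

```r
library(sqldf)

# Toy stand-in for the real claims data (values are made up)
qq1 <- data.frame(
  MemberID   = c(23, 23, 7, 7, 51),
  ProviderID = c(345, 345, 345, 12, 12)
)

# One row per pair that occurs, with its number of claims in column n
pair_counts <- sqldf("
  SELECT MemberID, ProviderID, COUNT(*) AS n
  FROM qq1
  GROUP BY MemberID, ProviderID
")
pair_counts
```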
On Tue, Jul 5, 2011 at 5:45 AM, UriB <uriblass at gmail.com> wrote: