Support Counting

4 messages · Petr Savicky, psombe

#
Hi,
   I'm new to R and trying to do some simple analysis. I have a data set with
about 88000 transactions and I want to perform a simple support count
analysis of an itemset that is not a complete transaction but a subset
of a transaction.
For example,

{A,B,D} is a transaction and I want to find the support of {A,B}, even though
{A,B} never occurs on its own as a transaction in the entire set.


To do this I needed to create a new object of class "itemsets" and then use
the support function, but somehow the answers never seem to tally.

Thanks in advance
Srinivas

--
View this message in context: http://r.789695.n4.nabble.com/Support-Counting-tp3424730p3424730.html
Sent from the R help mailing list archive at Nabble.com.
#
On Mon, Apr 04, 2011 at 01:11:37AM -0500, psombe wrote:
Hi.

The answer depends on the representation of the data set. Can you
describe the representation?

A possible representation of a data set for itemset counting is a 0/1 matrix
with one row per transaction and one column per item. Using this
representation, computing the support may be done as follows.

  db <- matrix(0, nrow=5, ncol=5, dimnames=list(NULL, LETTERS[1:5]))
  db[1, c("A", "B", "D")] <- 1
  db[2, c("A", "B")] <- 1
  db[3, c("A", "D", "E")] <- 1
  db[4, c("B", "C", "D")] <- 1
  db[5, c("A", "B", "C")] <- 1
  db

       A B C D E
  [1,] 1 1 0 1 0
  [2,] 1 1 0 0 0
  [3,] 1 0 0 1 1
  [4,] 0 1 1 1 0
  [5,] 1 1 1 0 0

  itemset <- c("A", "B")
 
  # for each transaction, whether it contains c("A", "B")
  rowSums(db[, itemset]) == length(itemset)

  [1]  TRUE  TRUE FALSE FALSE  TRUE
 
  # the number of transactions containing c("A", "B")
  sum(rowSums(db[, itemset]) == length(itemset))

  [1] 3
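
The same computation can be wrapped in a small function that also reports the relative support, that is, the fraction of transactions containing the itemset. This is only a sketch: the name "support_of" is made up for illustration, and drop = FALSE is added so that an itemset with a single item still yields a matrix.

```r
# Hypothetical helper, not part of any package: support of `itemset`
# in a 0/1 matrix `db` with one row per transaction.
support_of <- function(db, itemset) {
  # drop = FALSE keeps a matrix even when `itemset` has a single item
  contains <- rowSums(db[, itemset, drop = FALSE]) == length(itemset)
  c(absolute = sum(contains), relative = mean(contains))
}

# The same example data as above
db <- matrix(0, nrow = 5, ncol = 5, dimnames = list(NULL, LETTERS[1:5]))
db[1, c("A", "B", "D")] <- 1
db[2, c("A", "B")] <- 1
db[3, c("A", "D", "E")] <- 1
db[4, c("B", "C", "D")] <- 1
db[5, c("A", "B", "C")] <- 1

support_of(db, c("A", "B"))
```

With the example data, the absolute support agrees with the count of 3 obtained above, and the relative support is that count divided by the 5 transactions.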

Hope this helps.

Petr Savicky.
1 day later
#
Well, I'm using the "arules" package and I'm trying to use the support
function. My data is read from a file using the "read.transactions" command,
and a line of data looks something like this. There are about 88000 rows and
16000 different items.
  items
1 {33,
   34,
   35}
  items
1 {0, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 23, 24,
   25, 26, 27, 28, 29, 3, 4, 5, 6, 7, 8, 9}

So in order to use support I have to make an object of class "itemsets", and
I'm struggling with the "new" command.
I made an object of class "itemsets" by first creating a presence/absence
matrix, and with something like 16000 items this is really tedious. I
wonder if there is a better way.

# Currently I'm doing this

library(arules)

# the length goes up to the largest item number I am concerned with
avec <- numeric(400)
avec[27] <- 1
avec[63] <- 1  # and so on for all the items I want

amat <- matrix(data = avec, ncol = 400)
aset <- as(amat, "transactions")  # coerce the matrix to class "transactions"

Then, say my data is "dat", I can use
[1] 0.001406470


There has to be a better way.
Thanks once again.

--
View this message in context: http://r.789695.n4.nabble.com/Support-Counting-tp3424730p3428062.html
#
On Tue, Apr 05, 2011 at 08:43:34AM -0500, psombe wrote:
Hi.

R-help can provide help for some of the frequently used CRAN packages,
but not for all. There are too many of them. It is not clear whether
there is someone on R-help who uses "arules". One of my students is using
Eclat for association rules directly, but not from R. I am using R, but
not for association rules.

Try to determine whether your question is indeed specific to "arules".
If the question can be formulated without "arules", it has a good chance
of being answered here. Otherwise, send a query to the package maintainer.
Package maintainers usually welcome feedback.

The construction of the 0/1 matrix in your code may be simplified if the
required indices are stored in a vector, say, "indices". For example

  indices <- c(3, 5, 6, 10)
  avec <- array(0, dim=14)
  avec[indices] <- 1
  amat <- rbind(avec)

or

  amat <- matrix(0, nrow=1, ncol=14)
  amat[1, indices] <- 1
  amat

       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
  [1,]    0    0    1    0    1    1    0    0    0     1     0     0     0     0
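
For readers who do use "arules", it may be possible to skip the 0/1 matrix entirely by encoding the itemset against the item labels of the data. The following is a sketch only, assuming that encode(), itemLabels() and support() from "arules" behave as described in the package documentation. The object "dat" here is a small made-up data set standing in for the result of read.transactions(), and "query" is a name chosen for illustration.

```r
library(arules)

# Made-up toy data standing in for the result of read.transactions()
dat <- as(list(c("A", "B", "D"),
               c("A", "B"),
               c("A", "D", "E"),
               c("B", "C", "D"),
               c("A", "B", "C")),
          "transactions")

# Encode the itemset of interest using the item labels of the data,
# so that no presence/absence vector has to be filled in by hand
query <- encode(list(c("A", "B")), itemLabels(dat))

# Fraction of transactions that contain all items of the query
rel <- support(query, dat)

# Number of transactions that contain all items of the query
cnt <- support(query, dat, type = "absolute")

rel
cnt
```

Several itemsets can be passed in one call by putting more character vectors into the list given to encode().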

Hope this helps.

Petr Savicky.