1. Please cc to the list, as I have here, unless your comments are off topic.

2. Use dput() (?dput) to include **small** amounts of data in your message, as attachments are generally stripped from r-help.

3. I have no experience with itemsets or the arules package, but a quick glance at the docs there said that your data argument must be in a specific form coercible into an S4 "transactions" class. I suspect that neither your initial data frame nor the list deriving from split is, but maybe someone familiar with the package can tell you for sure. That's why you need to cc to the list.

-- Bert
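To illustrate the dput() advice in point 2, here is a minimal sketch. The data frame `dat` and its values are made up for illustration and are not the poster's data; the point is only that the text printed by dput() can be pasted into a message and rebuilt by any reader.

```r
# Hypothetical stand-in for a few rows of the poster's data
dat <- data.frame(CIN = c(9079954, 9079954, 12441087),
                  TRN_TYP = c(1L, 2L, 19L))

# dput() prints R code that rebuilds the object; paste that text into the email
dput(dat)

# A reader can turn the pasted text back into an object like this
txt  <- paste(capture.output(dput(dat)), collapse = "\n")
dat2 <- eval(parse(text = txt))

# Check that the round trip preserved the data
all.equal(dat, dat2)
```

For a large data set, something like dput(head(dat, 20)) keeps the message small.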
On Sun, Mar 10, 2013 at 7:04 AM, Dhiman Biswas <crazydhimu at gmail.com> wrote:
Dear Bert,

My intention is to mine frequent itemsets of TRN_TYP for individual CIN out of that data. But the problem is using eclat after splitting gives the following error:

Error in eclat(list) : internal error in trio library

PS: I have attached my dataset.

On Sat, Mar 9, 2013 at 8:27 PM, Bert Gunter <gunter.berton at gene.com> wrote:
I **suggest** that you explain what you wish to accomplish using a reproducible example rather than telling us what packages you think you should use. I believe you are making things too complicated; e.g. what do you mean by "frequent patterns"? Moreover, "basket format" is rather unclear -- and may well be unnecessary. But using lists, it could be simply accomplished by

?split ## as in
the_list <- with(yourdata, split(TYP, CIN.TRN))

or possibly

the_list <- with(yourdata, tapply(TYP, CIN.TRN, FUN = table))

Of course, these may be irrelevant and useless, but without knowing your purpose ...?

-- Bert

On Sat, Mar 9, 2013 at 4:37 AM, Dhiman Biswas <crazydhimu at gmail.com> wrote:
I have data in the following form:
CIN TRN_TYP
9079954 1
9079954 2
9079954 3
9079954 4
9079954 5
9079954 4
9079954 5
9079954 6
9079954 7
9079954 8
9079954 9
9079954 9
. .
. .
. .
There are 100 distinct values of CIN (9079954, 12441087, 15246633, ...), each with its respective TRN_TYP values.
First of all, I want this data to be grouped into basket format:
9079954 1, 2, 3, 4, 5, ....
12441087 19, 14, 21, 3, 7, ...
.
.
.
and then apply eclat from the arules package to find frequent patterns.
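The grouping step described above can be sketched in base R alone. The data frame `dat` below is hypothetical and only mimics the two-column layout shown; split() gives one list element per CIN, and unique() keeps each TRN_TYP once per basket.

```r
# Hypothetical rows in the CIN / TRN_TYP layout described above
dat <- data.frame(
  CIN     = c(9079954, 9079954, 9079954, 12441087, 12441087),
  TRN_TYP = c(1, 2, 2, 19, 14)
)

# One list element per CIN, holding that CIN's TRN_TYP values
baskets <- split(dat$TRN_TYP, dat$CIN)

# Keep each item only once within a basket
baskets <- lapply(baskets, unique)
baskets
```

The list names are the CIN values converted to character strings, so a single basket can be looked up with, for example, baskets[["9079954"]].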
1) I ran the following code:
file<-read.csv("D:/R/Practice/Data_Input_NUM.csv")
file <- file[!duplicated(file),]
eclat(split(file$TRN_TYP,file$CIN))
but it gave me the following error:
Error in asMethod(object) : can not coerce list with transactions with duplicated items
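One plausible reading of this error, offered as an assumption rather than a confirmed diagnosis: duplicated() on the whole data frame compares entire rows, so if other columns differ, repeated (CIN, TRN_TYP) pairs survive and the split still contains repeated items. The sketch below uses a made-up data frame with a hypothetical extra column AMOUNT to show the effect and a way to check for it.

```r
# Hypothetical data: the extra column makes every row distinct
dat <- data.frame(
  CIN     = c(1, 1, 1, 2),
  TRN_TYP = c(4, 5, 4, 7),
  AMOUNT  = c(10, 20, 30, 40)   # stand-in for the file's other columns
)

# Whole-row de-duplication removes nothing in this case
dedup_rows <- dat[!duplicated(dat), ]
nrow(dedup_rows)

# So the split still has a repeated item for CIN 1
baskets <- split(dedup_rows$TRN_TYP, dedup_rows$CIN)
has_dup <- vapply(baskets, function(x) anyDuplicated(x) > 0, logical(1))
has_dup

# De-duplicating on just the two columns of interest removes the repeat
pairs <- unique(dat[, c("CIN", "TRN_TYP")])
nrow(pairs)
```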
2) I ran this code:
file<-read.csv("D:/R/Practice/Data_Input_NUM.csv")
file_new<-file[,c(3,6)] # because my file Data_Input_NUM has many other columns as well, so I am selecting only CIN and TRN_TYP
file_new <- file_new[!duplicated(file_new),]
eclat(split(file_new$TRN_TYP,file_new$CIN))
but again:
Error in eclat(split(file_new$TRN_TYP, file_new$CIN)) :
internal error in trio library
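The cause of this second error is not established in this thread. One thing that may be worth trying, in line with the hint that the input must be coercible to the "transactions" class, is to do the coercion explicitly and inspect the result before calling eclat(). The sketch below assumes the arules package is installed; the data frame `dat`, the choice to convert items to character, and the support value 0.5 are all illustrative assumptions, not a known fix.

```r
library(arules)

# Hypothetical data in the CIN / TRN_TYP layout
dat <- data.frame(CIN = c(1, 1, 2, 2, 3), TRN_TYP = c(4, 5, 4, 7, 4))

# Build baskets with character items and no repeats within a basket
baskets <- lapply(split(as.character(dat$TRN_TYP), dat$CIN), unique)

# Coerce explicitly so any coercion problem shows up here, before mining
trans <- as(baskets, "transactions")
summary(trans)

# Mine frequent itemsets; the support threshold is an arbitrary example value
itemsets <- eclat(trans, parameter = list(support = 0.5))
inspect(itemsets)
```

If the explicit coercion succeeds on a small subset but eclat() still fails on the full data, a small dput() of the failing subset would give the list something reproducible to look at.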
PLEASE HELP
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm