(no subject)
6 messages · Achim Zeileis, zhihua li, Christian Schulz
Hi netters,

I have a set of discrete variables that form a network, and I want to learn the network structure from some training data. I could use a package like deal, but there are two problems.

First, I have 10000 variables, so the space of possible network structures is enormous. I don't know how long it would take my PC to find the highest-scoring network... maybe a month?

Second, I have prior knowledge that only 500 of the 10000 variables are possible parents. In other words, only arrows starting from the 500 variables and pointing to the remaining 9500 variables are allowed in the network. In deal, an assignment to "banlist" should let me rule out the impossible arrows. But in my case the number of "impossible arrows" is 500*499 + 9500*9999, so the "banlist" would get unacceptably long.

Is there any way (in deal or another package) to specify the parent set in advance?

Thanks a lot!
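To make the size of the problem concrete, here is a minimal base-R sketch (not part of the original post) that counts the arcs a ban list would have to contain, assuming 10000 variables of which 500 may act as parents. It also enumerates the banned arcs as a two-column from/to matrix for a toy case. The helper names `count_banned` and `ban_matrix` are hypothetical, and whether deal accepts exactly this matrix layout is an assumption to check against the package documentation.

```r
# Count the directed arcs that must be banned when only arcs from
# `n_parents` designated parent variables to the other variables are allowed.
count_banned <- function(n_vars, n_parents) {
  n_children <- n_vars - n_parents
  total   <- n_vars * (n_vars - 1)      # all ordered pairs (from, to)
  allowed <- n_parents * n_children     # parent -> child arcs only
  total - allowed
}

# Enumerate the banned arcs as a two-column (from, to) integer matrix.
# Only practical for small n_vars; shown here for illustration.
ban_matrix <- function(n_vars, parents) {
  arcs <- expand.grid(from = seq_len(n_vars), to = seq_len(n_vars))
  arcs <- arcs[arcs$from != arcs$to, ]                      # drop self-loops
  ok   <- arcs$from %in% parents & !(arcs$to %in% parents)  # allowed arcs
  as.matrix(arcs[!ok, ])
}

count_banned(10000, 500)                  # size of the full ban list
small <- ban_matrix(5, parents = 1:2)     # toy case: 5 variables, 2 parents
nrow(small) == count_banned(5, 2)         # enumeration agrees with the count
```

For the full problem the count is 95,240,000 arcs, which matches 500*499 + 9500*9999: arcs among the 500 parents, plus every arc that starts from one of the 9500 non-parents.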
This is the second time within 24 hours that you have cross-posted the same question to two of the R mailing lists. Please read the posting guide linked at the bottom of this mail on how to ask your questions properly.

As for your question: I'm not aware of an R package that does what you are looking for, but you might also ask the maintainer of the package you're specifically interested in for more details.

Z
On Fri, 25 Mar 2005, zhihua li wrote:
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Sorry, I didn't mean to break the posting rules. I just thought that r-help and r-sig-gr were two separate mailing lists. The reason I posted my messages twice within 24 hours was that I forgot to add subjects to my first postings, so I was afraid they would be ignored altogether. Thank you.
From: Achim Zeileis <Achim.Zeileis at wu-wien.ac.at>
To: zhihua li <lzhtom at hotmail.com>
CC: r-help at stat.math.ethz.ch
Subject: Re: [R] learning networks with a large number of variables and pre-set parents.
Date: Fri, 25 Mar 2005 11:40:46 +0100 (CET)
Hi, you have 10000 variables, but how many cases? In my experience you need a lot of memory when working with data of this kind and size in deal!
dim(pk.df)
[1] 7321 24
pk <- network(pk.df)
pk.prior <- jointprior(pk)
Error in rep.default(data, length.out = vl) : cannot allocate vector of length 577368000

Perhaps this is useful for you?
Ines - Induction of Network Structure (learning probabilistic and possibilistic graphical models)
http://fuzzy.cs.uni-magdeburg.de/~borgelt/ines.html

regards,
Christian
I have 100 cases, so I think the dimensions are (100, 10000). The PC has a Pentium 4 CPU with 512 MB of memory. Is that enough?
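As a rough check on whether 512 MB could suffice, the following base-R sketch estimates the storage needed for the data and for a full ban list. It assumes 4-byte integer storage (how R stores integer vectors) and 95240000 banned arcs, that is, all ordered pairs of 10000 variables minus the 500 x 9500 permitted parent-to-child arcs. deal's own working memory is not modelled here and is likely to be larger.

```r
bytes_per_int <- 4                       # R stores integers in 4 bytes

# Data: 100 cases x 10000 discrete variables coded as integers.
data_bytes <- 100 * 10000 * bytes_per_int

# Ban list: one (from, to) pair of integers per banned arc.
n_banned  <- 10000 * 9999 - 500 * 9500
ban_bytes <- n_banned * 2 * bytes_per_int

to_mib <- function(bytes) bytes / 1024^2
round(to_mib(data_bytes), 1)   # data size in MiB
round(to_mib(ban_bytes), 1)    # ban-list size in MiB
to_mib(ban_bytes) > 512        # does the ban list alone exceed 512 MiB?
```

Under these assumptions the data itself is small (about 4 MB), while an explicit ban list comes to roughly 727 MiB, which is more than the machine's total memory before any learning starts.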
From: Christian Schulz <ozric at web.de>
To: zhihua li <lzhtom at hotmail.com>
CC: r-help at stat.math.ethz.ch
Subject: Re: [R] learning networks with a large number of variables and pre-set parents.
Date: Sat, 26 Mar 2005 08:13:34 +0100