NA/NaN values in bnlearn package R

4 messages · Alexsandro Cândido de Oliveira Silva, David Winsemius, Marco Scutari +1 more

#
Hello,

I am using the bnlearn package in R to handle large amounts of data in
Bayesian networks. The variables are discrete, and the data set has more
than 3 million observations.
With the bn.fit function I could easily get the conditional probability
distributions. However, some variables have unobserved values (i.e.,
NA or NaN). In some variables, almost 1 million values are unobserved.
That is too many to simply delete.

In tests, I've got this:
Error in check.data (date): the data set contains null / NaN / NA values.

So, how could I deal with the data and get the conditional probability  
distribution?
Could someone help me?


Regards.
Alexsandro Cândido de Oliveira Silva
#
On Jun 8, 2014, at 6:27 PM, Alexsandro Cândido de Oliveira Silva wrote:

            
You are requested not to crosspost at multiple R mailing lists (and by extension of the reasoning behind that request, crossposting on Rhelp and StackOverflow is also considered discourteous.)

Both Rhelp and SO expect you to provide enough information to comment intelligently, but you have not lived up to that expectation.
#
Dear Alexsandro,

On 9 June 2014 02:27, Alexsandro Cândido de Oliveira Silva
<acos at dpi.inpe.br> wrote:
As of the current release (3.5), all functions in bnlearn require
complete data, so that error message is expected. However, you can
estimate the CPTs from incomplete data using table() and prop.table()
and assemble them in a fitted BN with custom.fit(). On the other hand,
maybe it would be better to write an EM wrapper around bn.fit() to
make the best of the dependence structure of the data?
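
The table()/prop.table()/custom.fit() route can be sketched as below. This is
only a minimal illustration under stated assumptions: the toy data frame, the
column names A and B, and the structure A -> B are all hypothetical, and each
CPT is estimated from the rows in which the node and its parents are observed.
The custom.fit() step runs only if bnlearn is installed.

```r
# Toy data with missing values; columns A and B are hypothetical.
d <- data.frame(
  A = factor(c("a1", "a1", "a2", "a2", NA,   "a1", "a2", "a1")),
  B = factor(c("b1", "b2", "b1", NA,   "b2", "b1", "b2", "b1"))
)

# table() drops NA cells by default (useNA = "no"), so each table below
# counts only the rows where every variable involved is observed.
cpt_A <- prop.table(table(A = d$A))                       # P(A)
cpt_B <- prop.table(table(B = d$B, A = d$A), margin = 2)  # P(B | A), columns sum to 1

# A parent configuration with no complete rows would yield a NaN column
# (0/0) and would need separate handling before building the network.
stopifnot(!anyNA(cpt_A), !anyNA(cpt_B))

if (requireNamespace("bnlearn", quietly = TRUE)) {
  dag <- bnlearn::model2network("[A][B|A]")  # assumed structure: A -> B
  fitted <- bnlearn::custom.fit(dag, dist = list(A = cpt_A, B = cpt_B))
  print(fitted)
}
```

Note that this uses a different subset of rows for each node, so it ignores
whatever the observed variables say about the missing ones; that is the
information an EM-style approach would try to exploit.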

Cheers,
    Marco
1 day later