Dear all,
I have been trying to perform machine learning/feature selection tasks
in R using various packages (e.g. mlr and FSelector).
However, when giving larger data frames as input for the functions, I
get a segmentation fault (memory not mapped).
This first happened when using the mlr benchmark function with
data frames on the order of 200 rows x 10,000 columns (all integer values).
I prepared a minimal working example where I get a segmentation fault
when trying to calculate the information gain with the FSelector package:
library(FSelector)
# Random data frame of 0/1 values: 200 rows x 25,000 columns (X1 ... X25000)
large.df <- data.frame(replicate(25000, sample(0:1, 200, replace = TRUE)))
weights <- information.gain(X24978~., large.df)
print(weights)
I am using R version 3.3.0 (64-bit) on Ubuntu 14.04.4 LTS with FSelector
v0.20 and rJava v0.9.8 on a machine with a 32-core Intel i7 and 250 GB
RAM. Java is OpenJDK 1.7 64-bit.
I would highly appreciate it if you could give me any hints on how to solve
the problem.
Best
ssalentin
Sebastian Salentin, PhD student
Bioinformatics Group
Technische Universität Dresden
Biotechnology Center (BIOTEC)
Tatzberg 47/49
01307 Dresden, Germany