The data table (and, obviously, the split) contains only character and
numeric data.
I found that 4 regressions in a row work if I use 2 of the original columns
as variables instead of the calculated columns.
RAM usage then stays below 3 GB!
--> Why does R have such problems with the calculated columns? Their
calculation is already finished before the regression starts.
It's like this:
Create the calculated columns:
Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD
Perform the split of the dataset, including the calculated columns (the
criteria for the split have a hierarchy):
Datasplit <- split(Dataset, paste(Dataset$ColumnE, Dataset$ColumnE))
Perform the regression on the split data:
Regression1 <- lapply(Datasplit, function(d)
  lm(ExtraColumn1 ~ ExtraColumn2, data = d,
     na.action = na.omit, singular.ok = TRUE))
BTW: There are no NA values in the data source.
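To make the steps above reproducible, here is a minimal sketch on made-up data. The column values, the three groups "x", "y" and "z", the single split criterion and the name Coefs are assumptions for illustration, not my real data. It keeps only the coefficients of each fit instead of the full lm objects, which might be one way to check whether the stored models are what fills the RAM:

```r
# Made-up data standing in for the real Dataset (all values are hypothetical).
set.seed(1)
n <- 1000
Dataset <- data.frame(
  ColumnA = runif(n, 1, 10),
  ColumnB = runif(n, 1, 10),
  ColumnC = runif(n, 1, 10),
  ColumnD = runif(n, 1, 10),
  ColumnE = sample(c("x", "y", "z"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Calculated columns, as in the steps above.
Dataset$ExtraColumn1 <- Dataset$ColumnA / Dataset$ColumnB
Dataset$ExtraColumn2 <- Dataset$ColumnC / Dataset$ColumnD

# Split by a single grouping column (simplified for this sketch).
Datasplit <- split(Dataset, Dataset$ColumnE)

# Fit one model per group, but return only the coefficients.
# model = FALSE tells lm() not to store the model frame in the fit.
Coefs <- lapply(Datasplit, function(d) {
  fit <- lm(ExtraColumn1 ~ ExtraColumn2, data = d, model = FALSE)
  coef(fit)
})

# Size of what is actually kept in memory.
print(object.size(Coefs), units = "Kb")
```

If memory stays low with this variant, the full lm objects kept by lapply would be a likely suspect; if not, the problem is probably elsewhere.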
What is my mistake?