Please reply to the list!
I think the warnings are an edge-case/false positive.
When reading a character vector, R doesn't convert "NA" to a
not-available (NA) value (because people might have character vectors
denoting, say, North America (NA) or Nabisco (NA)). These provoke a
warning when converting. Try this function instead to convert:
as_num <- function(x) {
  x <- as.character(x)
  x[!is.na(x) & (x=="NA")] <- NA_character_
  as.numeric(x)
}
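As a quick check of what the function does, here is a minimal sketch on a made-up vector (the input values are hypothetical, not taken from your data). The literal string "NA" should end up as a real missing value, and the other entries should be parsed as numbers:

```r
# as_num() as defined above, repeated so that the snippet runs on its own
as_num <- function(x) {
  x <- as.character(x)
  x[!is.na(x) & (x == "NA")] <- NA_character_
  as.numeric(x)
}

# hypothetical input: numbers stored as text, with one literal "NA" string
bmi_chr <- c("23.4", "NA", "31")
bmi_num <- as_num(bmi_chr)

is.numeric(bmi_num)   # check that the result is a numeric vector
is.na(bmi_num)        # check which elements are missing
```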
On 8/1/21 5:00 PM, mina jahan wrote:
Dear Ben,
Thank you for your reply.
age and bmi are quantitative variables.
How can I define them as numeric variables?
I used as.numeric, but I got a warning message: NAs introduced by coercion
cheers
Mina
On Mon, 2 Aug 2021 at 01:23, Ben Bolker <bbolker at gmail.com
<mailto:bbolker at gmail.com>> wrote:
   Since you made your data available to me via google drive, I was
able to figure out the problem; I wouldn't have been able to if you
hadn't shared the data, although if you had presented the results of
summary(B) or str(B) I (or someone) would probably have been able to
diagnose the issue.
   The problem is that your 'bmi' variable is of type *character*; that
means that when building the model matrix for the fixed-effect part of
the model, we end up treating it as a categorical variable and trying to
build a model matrix that is of size
8.0*nr*((n_bmi-1)+(n_age-1) + 1 + 2)/2^30
in GB (n_bmi, n_age are the number of unique values of bmi and age);
this comes out to about 30.5 GB. On my machine it fails immediately by
running out of memory; the proximal problem you are probably having is
when the program tries to compute the rank of the matrix to check for
multicollinearity. In any case, though, you probably *don't* want to fit
a model with 42,000 fixed parameters ...
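To make the size estimate concrete, here is a small sketch of the formula above as an R function. I'm assuming nr is the number of rows in the data set, and the counts plugged in below are made up for illustration (they are not the values from your file):

```r
# approximate size in GB of a dense model matrix of doubles (8 bytes each)
# nr: number of rows (assumed meaning); n_bmi, n_age: numbers of unique values
model_matrix_gb <- function(nr, n_bmi, n_age) {
  n_col <- (n_bmi - 1) + (n_age - 1) + 1 + 2
  8.0 * nr * n_col / 2^30
}

# hypothetical counts, for illustration only
size_categorical <- model_matrix_gb(nr = 1e5, n_bmi = 40000, n_age = 2000)

# same hypothetical nr with bmi and age numeric: the model then has five
# fixed-effect columns (intercept, age, sex, bmi, time)
size_numeric <- 8.0 * 1e5 * 5 / 2^30

c(categorical = size_categorical, numeric = size_numeric)
```

The point of the comparison is only the order of magnitude: the column count, not the row count, is what blows up when a quantitative variable is read as character.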
   If we address this problem by converting age and bmi to numeric, or
by using readr::read_csv() to read in the file in the first place,
everything works. (Note usual cautions about applying as.numeric()
directly to a *factor*; in this case (with R >= 4.0) we are starting with
type character, so it's OK.)
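A minimal sketch of the conversion, using a toy data frame as a stand-in for B (the values are invented). It also illustrates the caution about factors: as.numeric() on a factor returns the internal level codes, so a factor needs the detour through as.character():

```r
# toy stand-in for B: age and bmi stored as text (values are invented)
B_toy <- data.frame(age = c("34", "51", "47"),
                    bmi = c("22.5", "30.1", "27.8"),
                    stringsAsFactors = FALSE)
sapply(B_toy, class)   # inspect the column types before converting

# convert the two columns in place
B_toy$age <- as.numeric(B_toy$age)
B_toy$bmi <- as.numeric(B_toy$bmi)
sapply(B_toy, class)   # inspect the column types after converting

# the caution about factors
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)                 # internal level codes
values <- as.numeric(as.character(f))   # the original values
```

Running str(B) or sapply(B, class) on the real data before fitting is a cheap way to catch this kind of problem early.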
   (To my surprise data.table::fread() also mis-categorizes these
columns. I haven't been able to figure out why read.csv() and
(especially) fread() get fooled ... the usual reasons (non-numeric
entries, large numbers of NAs at the start of the column, etc.) don't
seem to be present.)
   cheers
     Ben Bolker
On 7/31/21 6:09 PM, mina jahan wrote:
>
>   I have a data set containing 20 imputed data sets. I want to use the lmer
> function for computing regression coefficients for each imputation. But
> I got the following error:
> Error in qr.default(X, tol = tol, LAPACK = FALSE) :
>    too large a matrix for LINPACK
> I cannot understand this error. I think that this issue is related to
> the optimization algorithms used for inference.
> R code is as follows:
> library(lme4)
> B<-read.csv("C:/Users/USER/Desktop/micemd2.csv",header=TRUE,na.string="")
> names(B)
> names(B)[names(B) == ".imp"] <- "imp"
> B<-B[ , -2]
> names(B)
> B<-B[which(B$imp!=0),]
> head(B)
> tail(B)
>
> ###################split dataset by imp
> list_df <- split(B, B$imp)
>
> ###################Coefficient for each imputation of lmer
> result1_df <- as.data.frame(matrix(ncol=5,nrow=length(list_df))) # make an empty dataframe
> colnames(result1_df)<-c("intercept","age","sex","bmi","time") #give the dataframe column names
> for (i in 1:length(list_df)){ #run a loop over the dataframes in the list
>    mod<-lmer(dbp~age +factor( sex) + bmi + time+(1|id),data=list_df[[i]]) #mixed model
>    result1_df[i,]<-fixef(mod) #extract coefficients to dataframe
>    rownames(result1_df)[i]<-names(list_df)[i] #assign rowname to results from data used
> }
> result1_df
> mean(result1_df$intercept)
> mean(result1_df$age)
> mean(result1_df$sex)
> mean(result1_df$bmi)
> mean(result1_df$time)
>
>
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
Graduate chair, Mathematics & Statistics