
Declaring All Variables as Factors in GLM()

2 messages · Preetam Pal, Leonardo Ferreira Fontenelle

#
Hi guys,

I am running glm(y ~ ., data = history, family = binomial), essentially
logistic regression for credit scoring (y = 0 or 1). The dataset 'history'
is read with
history <- read.csv("history.csv", header = TRUE)
and has 14 variables. A few examples:
1> 'income' = 100, 200, 300 (these are numbers in my dataset; however, the
interpretation is that they are just tags or labels: every observation's
income gets assigned one of these tags)
2> 'job' = 'private', 'government', 'unemployed', 'student'

I want to declare all the regressors and the y variable *as factors*
programmatically. It would be great if anyone could help me with this (the
idea is to loop over the variable names and use as.factor, but I am not sure
how to do this). Thanks

Regards,
Preetam
#
This should do the trick:

history2 <- as.data.frame(lapply(history, as.factor))

Mind you that read.csv() reads string vectors as factors by default
(stringsAsFactors = TRUE; the default changed to FALSE in R 4.0.0), so
declaring the variables as factors should only be necessary for the
numeric ones, like income. Calling as.factor() on a variable that is
already a factor returns it unchanged (it is factor() that drops unused
levels), so in your case it won't be a problem.
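
For illustration, here is a minimal sketch of the conversion on a toy data
frame. The values below are made up and only stand in for the real
'history' file; the names history3, not_factor and f are hypothetical
helpers, not part of the original data. It shows converting every column
while keeping the data frame, converting only the columns that are not
factors yet, and a quick check of how as.factor() and factor() treat an
unused level.

```r
# Toy stand-in for 'history'; the values are invented for illustration only.
history <- data.frame(
  y      = c(0, 1, 0, 1, 1, 0),
  income = c(100, 200, 300, 100, 200, 300),
  job    = c("private", "government", "unemployed",
             "student", "private", "government"),
  stringsAsFactors = FALSE
)

# Convert every column to a factor; assigning into history2[] keeps the
# data frame structure and the column names.
history2 <- history
history2[] <- lapply(history2, as.factor)

# Alternative: convert only the columns that are not factors yet.
history3 <- history
not_factor <- !vapply(history3, is.factor, logical(1))
history3[not_factor] <- lapply(history3[not_factor], as.factor)

# Inspect the result: every column should now be reported as a Factor.
str(history2)

# Compare how the two functions handle an unused level ("c" never occurs).
f <- factor(c("a", "b"), levels = c("a", "b", "c"))
nlevels(as.factor(f))  # number of levels after as.factor()
nlevels(factor(f))     # number of levels after factor()
```

The converted data frame can then be passed to glm() exactly as in the
original call, e.g. glm(y ~ ., data = history2, family = binomial).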

HTH,

Leonardo Ferreira Fontenelle
http://lattes.cnpq.br/9234772336296638

On Sat, 30 Apr 2016, at 04:25, Preetam Pal wrote: