Dear Team,
I need help with the below code in R:
gender_rec<- c('Dr','Father','Mr'=1, 'Miss','MS','Mrs'=2, 3)
reasons$salutation<- gender_rec[reasons$salutation]
This code gives me the correct output but it overwrites the
reasons$salutation variable. I need to create a new variable gender to
capture gender details and leave salutation as it is.
I tried the below syntax but it is converting all to 1.
reasons$gender<- ifelse(reasons$salutation== "Mr" & reasons$salutation==
"Father","Male", ifelse(reasons$salutation=="Mrs" & reasons$salutation==
"Miss","Female",1))
Please suggest.
Creating Dummy Var in R for regression?
7 messages · Shivi Bhatia, Rui Barradas, Bert Gunter +2 more
Hello,
Your ifelse will never work because reasons$salutation == "Mr" & reasons$salutation == "Father" is always FALSE, and so is reasons$salutation == "Mrs" & reasons$salutation == "Miss": a salutation cannot equal two different values at once. Try | (or) instead of & (and).
Hope this helps,
Rui Barradas
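A minimal sketch of the fix described above, not the poster's actual data: the `reasons` data frame below is made up, and the "Other" label for unmatched salutations is an assumption. It shows `%in%` as a compact equivalent of chaining `==` with `|`, and a named lookup vector that fills a new `gender` column while leaving `salutation` untouched.

```r
# Hypothetical sample data standing in for the poster's `reasons` data frame
reasons <- data.frame(
  salutation = c("Mr", "Mrs", "Dr", "Miss", "Father", "MS"),
  stringsAsFactors = FALSE
)

# Option 1: nested ifelse with %in%, equivalent to chaining == with | (or)
reasons$gender <- ifelse(
  reasons$salutation %in% c("Mr", "Father"), "Male",
  ifelse(reasons$salutation %in% c("Mrs", "Miss"), "Female", "Other")
)

# Option 2: a named lookup vector; every salutation needs its own name
gender_rec <- c(Mr = "Male", Father = "Male", Mrs = "Female", Miss = "Female")
# as.character() guards against indexing by factor codes instead of labels
reasons$gender2 <- unname(gender_rec[as.character(reasons$salutation)])
# Salutations missing from the lookup (here "Dr" and "MS") come back as NA
reasons$gender2[is.na(reasons$gender2)] <- "Other"

print(reasons)
```

Both options write to new columns, so `reasons$salutation` keeps its original values.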
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Just commenting on the email subject, not the content (which you have already been helped with): there is no need to *ever* create a dummy variable for regression in R, if what you mean by this is what is conventionally meant. R will create the model matrix with appropriate "dummy variables" for factors as needed. See ?contrasts and ?C for relevant details and/or consult an appropriate R tutorial.
Of course, if this is not what you meant, then ignore this.
Cheers,
Bert

Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
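To illustrate the point about factors, here is a small sketch on simulated data. The data frame `d` and its columns `y` and `gender` are hypothetical, and the output assumes R's default contrast settings.

```r
set.seed(42)
# Simulated data: a numeric response and a three-level factor predictor
d <- data.frame(
  y = rnorm(90),
  gender = factor(rep(c("Female", "Male", "Other"), each = 30))
)

# The coding R will use for this factor (treatment contrasts by default)
contrasts(d$gender)

# No hand-made dummy columns: lm() expands the factor itself
fit <- lm(y ~ gender, data = d)
names(coef(fit))

# The model matrix that lm() built behind the scenes
head(model.matrix(fit))
```

The first level ("Female") becomes the baseline, and the other two levels each get a 0/1 column.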
On Fri, Aug 5, 2016 at 1:49 PM, <ruipbarradas at sapo.pt> wrote:
[...]
Thank you all for the assistance. This really helps.
Hi Bert: While searching Nabble I learned that with factor variables in R there is no need to create dummy variables. However, please consider this situation: I am building a logistic regression model on NPS data. The outcome variable is CE, i.e. customer experience, which has 3 ratings, so ordinal logistic regression will be used. However, most of my variables are categorical. For instance, one of the variables is agent knowledge, which is on a 10-point scale. It is in turn rated on 3 levels (high, medium, low), hence I need to group these 10 values into 3 groups, and then, as you suggested, I can enter them directly in the model without creating n-1 categories.
I have worked on SAS extensively, hence I found this a bit confusing. Thanks for the help.
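One way the plan above might look, sketched on simulated data with `polr()` from the MASS package (a recommended package that ships with standard R installations). The column names `CE` and `knowledge`, the cut points, and the data itself are all assumptions for illustration, not the poster's NPS data.

```r
library(MASS)  # provides polr() for proportional-odds logistic regression

set.seed(1)
n <- 300
# Hypothetical 10-point agent-knowledge score
score <- sample(1:10, n, replace = TRUE)

# Group the 10 values into 3 levels (the cut points are an assumption)
knowledge <- cut(score, breaks = c(0, 3, 7, 10),
                 labels = c("low", "medium", "high"))

# Simulated 3-level ordered outcome that tends to rise with the score
latent <- 0.4 * score + rlogis(n)
CE <- cut(latent, breaks = c(-Inf, 1.5, 3, Inf),
          labels = c("poor", "average", "good"), ordered_result = TRUE)

d <- data.frame(CE, knowledge)

# The factor goes into the formula directly; R builds the n-1 dummies
fit <- polr(CE ~ knowledge, data = d, Hess = TRUE)
summary(fit)
```

The predictor is left as an unordered factor here, so it enters the model as two 0/1 columns with "low" as the baseline.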
On Sat, Aug 6, 2016 at 2:30 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
[...]
Something like:
d = data.frame(score = sample(1:10, 100, replace = TRUE))
# group the 10-point score: 1-3 low, 4-7 medium, 8-10 high
d$score_t = "low"
d$score_t[d$score > 3] = "medium"
d$score_t[d$score > 7] = "high"
# ordered = TRUE gives polynomial contrasts (.L, .Q columns);
# set ordered = FALSE for 0/1 dummy variables
d$score_t = factor(d$score_t, levels = c("low", "medium", "high"),
                   ordered = TRUE)
X = model.matrix(~ score_t, data = d)
X
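To make the remark about `ordered` concrete, the following sketch compares the model matrix columns for the same three-level grouping coded both ways. The tiny fixed vector and the names `f_ordered` and `f_unordered` are made up for illustration, and the output assumes R's default contrast settings.

```r
# Same three-level grouping as above, on a tiny fixed example
score_t <- c("low", "medium", "high", "medium", "low", "high")
lv <- c("low", "medium", "high")

f_ordered   <- factor(score_t, levels = lv, ordered = TRUE)
f_unordered <- factor(score_t, levels = lv, ordered = FALSE)

# Ordered factor: polynomial contrasts, columns .L (linear) and .Q (quadratic)
colnames(model.matrix(~ f_ordered))

# Unordered factor: treatment contrasts, one 0/1 dummy per non-baseline level
X <- model.matrix(~ f_unordered)
colnames(X)
X
```

Both codings span the same model space, so the fitted values agree; only the interpretation of the coefficients differs.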
On Fri, Aug 5, 2016 at 3:23 PM, Shivi Bhatia <shivipmp82 at gmail.com> wrote:
[...]
1 day later
Thank you Jeremiah and all others for the assistance. This really helped.
On Sat, Aug 6, 2016 at 5:01 AM, jeremiah rounds <roundsjeremiah at gmail.com> wrote:
[...]
Hi, please also have a look at the 'cut' function. It is a very handy function for these types of situations.
Best,
Fredrik
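For reference, a short sketch of how `cut()` could do the same grouping in one call. The data frame `d`, the column names, and the break points (1-3 low, 4-7 medium, 8-10 high) are assumptions chosen to mirror the thresholds used earlier in the thread.

```r
set.seed(123)
d <- data.frame(score = sample(1:10, 100, replace = TRUE))

# Intervals are right-closed by default: (0,3], (3,7], (7,10]
d$score_t <- cut(d$score,
                 breaks = c(0, 3, 7, 10),
                 labels = c("low", "medium", "high"))

# Cross-tabulate to check that each score landed in the intended group
table(d$score, d$score_t)
```

`cut()` returns an unordered factor unless `ordered_result = TRUE` is passed, so the result can go straight into a model formula.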
On Sun, Aug 7, 2016 at 8:10 PM, Shivi Bhatia <shivipmp82 at gmail.com> wrote:
[...]
"Life is like a trumpet - if you don't put anything into it, you don't get anything out of it."