I constantly define variable lists from a data frame (e.g., to define a
regression equation). Line 3 below does just that. Placing each variable
name in quotation marks is too much work, especially for a long list, so I
do that with line 4. Is there an easier way to accomplish this, that is, to
define a list of variable names containing "a","c","e"? Thank you!
> data<-as.data.frame(matrix(1:30,nrow=6))
> colnames(data)<-c("a","b","c","d","e"); data
  a  b  c  d  e
1 1  7 13 19 25
2 2  8 14 20 26
3 3  9 15 21 27
4 4 10 16 22 28
5 5 11 17 23 29
6 6 12 18 24 30
> x1<-c("a","c","e"); x1 # line 3
[1] "a" "c" "e"
> x2<-colnames(subset(data,select=c(a,c,e))); x2 # line 4
[1] "a" "c" "e"
Defining partial list of variables
12 messages · Jeff Newmiller, Eric Berger, Steven Yen +2 more
See below.
Steven Yen wrote on 05.01.2021 08:14:
[...]
What about:
x3 <- names(data)[c(1,3,5)]
x3
[1] "a" "c" "e"
If I have to compile longer vectors of variable names I do it as follows:
First I use:
dput(names(data))
resulting in a vector of names.
c("a", "b", "c", "d", "e")
Then I edit the output by hand, e.g.
x4 <- c("a", "b", "c", "d", "e")
x4 <- c("a", "c", "e")
This is especially useful with long names, where I could easily make
typing errors.
regards,
Heinz
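The two suggestions above (indexing names(data) by position, and trimming dput() output by hand) can be complemented by selecting names by exclusion or by pattern. This is a minimal sketch, assuming the toy data frame from the original post; setdiff() and grep() are base R, and the choice of which columns to drop is purely illustrative.

```r
# Toy data frame from the original post
data <- as.data.frame(matrix(1:30, nrow = 6))
colnames(data) <- c("a", "b", "c", "d", "e")

# By position, as suggested above
x3 <- names(data)[c(1, 3, 5)]

# By exclusion: keep every name except the listed ones
x5 <- setdiff(names(data), c("b", "d"))

# By pattern: keep names matching a regular expression
x6 <- grep("^[ace]$", names(data), value = TRUE)

print(x3)
print(x5)
print(x6)
```

Exclusion tends to be the shorter route when most columns are wanted and only a few are dropped.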
IMO if you want to hardcode a formula then simply hardcode a formula. If you want 20 formulas, write 20 formulas. Is that really so bad?
If you want to have an abbreviated way to specify sets of variables without conforming to R syntax then put them into data files and read them in using a format of your choice.
But using NSE (non-standard evaluation) to avoid using quotes for entering what amounts to in-script data is abuse of the language justified by laziness... the amount of work you put yourself and anyone else who reads your code through is excessive relative to the benefit gained.
NSE has its strengths... but as a method of creating data objects it sucks. Note that even the tidyverse (now) requires you to use quotes when you are not directly referring to something that already exists. And if you were... you might as well be creating a formula.
On January 4, 2021 11:14:54 PM PST, Steven Yen <styen at ntu.edu.tw> wrote:
[...]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Sent from my phone. Please excuse my brevity.
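Both suggestions in this message can be tried with base R alone. The sketch below is an illustration, not the poster's code: it recovers variable names from a one-sided formula with all.vars(), and reads a name list from a plain-text file with readLines(). The temporary file and its one-name-per-line layout are assumptions made for the example.

```r
# 1. Write the variables once as a formula, then recover the names.
rhs <- ~ a + c + e
x_from_formula <- all.vars(rhs)

# 2. Keep the names in a text file, one per line (hypothetical layout),
#    and read them back in.
f <- tempfile(fileext = ".txt")
writeLines(c("a", "c", "e"), f)
x_from_file <- readLines(f)
unlink(f)

print(x_from_formula)
print(x_from_file)
```

Neither route needs quotation marks around each name in the script itself.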
Thank you, Jeff. IMO, we are all here to make R work better to suit our
various needs. All I am asking is an easier way to define the variable list
zx, differently from the way z0, x0, and treat are defined.
> zx<-colnames(subset(mydata,select=c(
+ age,exercise,income,white,black,hispanic,base,somcol,grad,employed,
+     unable,homeowner,married,divorced,widowed)))
> z0<-c("fruit","highblood")
> x0<-c("vgood","poor")
> treat<-"depression"
> eq1 <-my.formula(y="depression",x=zx,z0)
> eq2 <-my.formula(y="bmi",       x=zx,x0)
> eq2t<-my.formula(y="bmi",       x=zx,treat)
> eqs<-list(eq1,eq2); eqs
[[1]]
depression ~ age + exercise + income + white + black + hispanic +
    base + somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + fruit + highblood
[[2]]
bmi ~ age + exercise + income + white + black + hispanic + base +
    somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + vgood + poor
> eqt<-list(eq1,eq2t); eqt
[[1]]
depression ~ age + exercise + income + white + black + hispanic +
    base + somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + fruit + highblood
[[2]]
bmi ~ age + exercise + income + white + black + hispanic + base +
    somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + depression
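The thread never shows the definition of my.formula(). This is a minimal sketch of what such a helper might look like, built on base R's reformulate(). It assumes the helper takes a response name y, a character vector x, and any further character vectors through "...". The name my_formula and its signature are hypothetical.

```r
# Hypothetical stand-in for the poster's my.formula(); not the original.
my_formula <- function(y, x, ...) {
  terms <- c(x, ...)                 # combine all right-hand-side names
  reformulate(terms, response = y)   # builds y ~ term1 + term2 + ...
}

zx <- c("age", "exercise", "income")
z0 <- c("fruit", "highblood")
eq1 <- my_formula(y = "depression", x = zx, z0)
print(eq1)
```

Because reformulate() expects a character vector of term labels, a helper like this only behaves as intended when x really is a character vector.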
On 2021/1/5 04:18 PM, Jeff Newmiller wrote:
[...]
zx<-strsplit("age,exercise,income,white,black,hispanic,base,somcol,grad,employed,unable,homeowner,married,divorced,widowed",",")
On Tue, Jan 5, 2021 at 11:01 AM Steven Yen <styen at ntu.edu.tw> wrote:
[...]
Here we go! BUT, it works great for a continuous line. With line
break(s), I got the nuisance "\n" inserted.
> x<-strsplit("hhsize,urban,male,gov,nongov,married",","); x
[[1]]
[1] "hhsize"  "urban"   "male"    "gov"     "nongov"  "married"
> x<-strsplit("hhsize,urban,male,
+             gov,nongov,married",","); x
[[1]]
[1] "hhsize"            "urban"             "male"              "\n            gov"
[5] "nongov"            "married"
On 2021/1/5 05:34 PM, Eric Berger wrote:
[...]
If your column names have no spaces, the following should work:
x<-strsplit(gsub("[\n ]","",
"hhsize,urban,male,
 gov,nongov,married"),","); x
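An alternative to deleting characters with gsub() is to split on the separators themselves. The sketch below is a variation on the suggestion above, not the poster's code. It treats any run of commas and whitespace (including newlines) as one separator, and trims the ends first so that no empty element appears. Like the gsub() version, it assumes column names contain no spaces.

```r
txt <- "hhsize,urban,male,
        gov,nongov,married"

# Split on runs of commas and/or whitespace; trimws() avoids an empty
# first element if the string starts with blanks.
x <- strsplit(trimws(txt), "[,[:space:]]+")[[1]]
print(x)
```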
On Tue, Jan 5, 2021 at 11:47 AM Steven Yen <styen at ntu.edu.tw> wrote:
[...]
Thanks Eric. Perhaps I should know when to stop. The approach produces a
slightly different variable list (note the [[1]]). Consequently, I was
not able to use xx in defining my regression formula.
> x<-colnames(subset(mydata,select=c(
+    hhsize,urban,male,
+    age3045,age4659,age60, # age1529
+    highsc,tert,           # primary
+    gov,nongov,            # unemp
+    married))); x
 [1] "hhsize"  "urban"   "male"    "age3045" "age4659" "age60"   "highsc"  "tert"
 [9] "gov"     "nongov"  "married"
> xx<-strsplit(gsub("[\n ]","",
+    "hhsize,urban,male,
+     age3045,age4659,age60,
+     highsc,tert,
+     gov,nongov,
+     married"
+ ),","); xx
[[1]]
 [1] "hhsize"  "urban"   "male"    "age3045" "age4659" "age60"   "highsc"  "tert"
 [9] "gov"     "nongov"  "married"
> eq1<-my.formula(y="cig",x=x); eq1
cig ~ hhsize + urban + male + age3045 + age4659 + age60 + highsc +
    tert + gov + nongov + married
> eq2<-my.formula(y="cig",x=xx); eq2
cig ~ c("hhsize", "urban", "male", "age3045", "age4659", "age60",
    "highsc", "tert", "gov", "nongov", "married")
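The odd right-hand side in eq2 is consistent with strsplit() returning a list rather than a character vector. Since my.formula() is not shown, the following is only a plausible illustration of the mechanism: when a list holding one character vector is pasted or coerced to character, R deparses the whole vector into a single string.

```r
x  <- c("hhsize", "urban", "male")          # character vector
xx <- strsplit("hhsize,urban,male", ",")    # list of length 1

# A character vector collapses into separate terms ...
rhs_vec <- paste(x, collapse = " + ")

# ... but a list element is deparsed as a whole before pasting.
rhs_list <- paste(xx, collapse = " + ")

print(is.list(xx))
print(rhs_vec)
print(rhs_list)
```

Extracting the first element with xx[[1]] gives back an ordinary character vector.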
On 2021/1/5 06:01 PM, Eric Berger wrote:
[...]
Wrap it in unlist:
xx <- unlist(strsplit( .... ))
On Tue, Jan 5, 2021 at 12:59 PM Steven Yen <styen at ntu.edu.tw> wrote:
[...]
Thanks Eric. Yes, "unlist" makes a difference. Below, I run summary
rather than a regression to keep the example simple.
> set.seed(123)
> data<-matrix(runif(1:25),nrow=5)
> colnames(data)<-c("x1","x2","x3","x4","x5"); data
??????????? x1??????? x2??????? x3???????? x4??????? x5
[1,] 0.2875775 0.0455565 0.9568333 0.89982497 0.8895393
[2,] 0.7883051 0.5281055 0.4533342 0.24608773 0.6928034
[3,] 0.4089769 0.8924190 0.6775706 0.04205953 0.6405068
[4,] 0.8830174 0.5514350 0.5726334 0.32792072 0.9942698
[5,] 0.9404673 0.4566147 0.1029247 0.95450365 0.6557058
> j<-strsplit(gsub("[\n ]","","x1,x3,x5"),",")
> j<-unlist(j); j
[1] "x1" "x3" "x5"
> summary(data[,j])
       x1               x3               x5
 Min.   :0.2876   Min.   :0.1029   Min.   :0.6405
 1st Qu.:0.4090   1st Qu.:0.4533   1st Qu.:0.6557
 Median :0.7883   Median :0.5726   Median :0.6928
 Mean   :0.6617   Mean   :0.5527   Mean   :0.7746
 3rd Qu.:0.8830   3rd Qu.:0.6776   3rd Qu.:0.8895
 Max.   :0.9405   Max.   :0.9568   Max.   :0.9943
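For anyone who wants to reuse the strsplit/gsub/unlist recipe, it could be wrapped in a small helper along these lines. This is only a sketch: parse_names() and its "available" argument are hypothetical names, not something from the thread, and it uses trimws() instead of gsub("[\n ]", "", ...) so that spaces inside a name would survive.

```r
# Hypothetical helper (not from the thread): turn one comma-separated
# string, possibly spanning several lines, into a character vector.
parse_names <- function(spec, available = NULL) {
  # Split on commas, then strip spaces, tabs and newlines around each piece
  parts <- trimws(unlist(strsplit(spec, ",", fixed = TRUE)))
  parts <- parts[nzchar(parts)]  # drop empty pieces, e.g. from a doubled comma
  # Optionally check the names against the columns that actually exist
  if (!is.null(available)) {
    unknown <- setdiff(parts, available)
    if (length(unknown) > 0) {
      stop("unknown variable name(s): ", paste(unknown, collapse = ", "))
    }
  }
  parts
}

set.seed(123)
data <- matrix(runif(25), nrow = 5)
colnames(data) <- paste0("x", 1:5)

# The line break inside the string is deliberate
j <- parse_names("x1, x3,
                  x5", available = colnames(data))
j
summary(data[, j])
```

Passing the column names as "available" is optional, but it turns a mistyped name into an immediate error rather than a failure later in the model call.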
On 2021/1/5 07:08 PM, Eric Berger wrote:
wrap it in unlist
xx <- unlist(strsplit( .... ))
On Tue, Jan 5, 2021 at 12:59 PM Steven Yen <styen at ntu.edu.tw> wrote:
Thanks Eric. Perhaps I should know when to stop. The approach
produces a slightly different variable list (note the [[1]]).
Consequently, I was not able to use xx in defining my regression
formula.
> x<-colnames(subset(mydata,select=c(
+    hhsize,urban,male,
+    age3045,age4659,age60, # age1529
+    highsc,tert,           # primary
+    gov,nongov,            # unemp
+    married))); x
 [1] "hhsize"  "urban"   "male"    "age3045" "age4659" "age60"   "highsc"  "tert"
 [9] "gov"     "nongov"  "married"
> xx<-strsplit(gsub("[\n ]","",
+    "hhsize,urban,male,
+     age3045,age4659,age60,
+     highsc,tert,
+     gov,nongov,
+     married"
+ ),","); xx
[[1]]
 [1] "hhsize"  "urban"   "male"    "age3045" "age4659" "age60"   "highsc"  "tert"
 [9] "gov"     "nongov"  "married"
> eq1<-my.formula(y="cig",x=x); eq1
cig ~ hhsize + urban + male + age3045 + age4659 + age60 + highsc +
    tert + gov + nongov + married
> eq2<-my.formula(y="cig",x=xx); eq2
cig ~ c("hhsize", "urban", "male", "age3045", "age4659", "age60",
    "highsc", "tert", "gov", "nongov", "married")
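The [[1]] in the output above is the sign that strsplit() returned a list holding one character vector, not the character vector itself. The sketch below shows the difference. my.formula() is not shown in the thread, so stats::reformulate() is used here as a stand-in; that substitution is an assumption, as is the guess that my.formula() pastes its arguments together.

```r
# strsplit() returns a list with one element per input string
xx <- strsplit("hhsize,urban,male", ",")
is.list(xx)

# Pasting a list deparses each element, which would explain a formula
# containing c("hhsize", ...) if my.formula() uses paste() internally
paste(xx, collapse = " + ")

# Extract the character vector; unlist(xx) is equivalent for a single string
x_vec <- xx[[1]]

# Stand-in for my.formula(): build a formula from a vector of term labels
eq <- reformulate(x_vec, response = "cig")
eq
```

With the plain character vector the terms are joined with "+" as intended, which matches what unlist() achieves.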
On 2021/1/5 06:01 PM, Eric Berger wrote:
If your column names have no spaces, the following should work:
x<-strsplit(gsub("[\n ]","",
"hhsize,urban,male,
gov,nongov,married"),","); x
On Tue, Jan 5, 2021 at 11:47 AM Steven Yen <styen at ntu.edu.tw> wrote:
Here we go! BUT, it works great for a continuous line. With
line break(s), I got the nuisance "\n" inserted.
> x<-strsplit("hhsize,urban,male,gov,nongov,married",","); x
[[1]]
[1] "hhsize"  "urban"   "male"    "gov"     "nongov"  "married"
> x<-strsplit("hhsize,urban,male,
+             gov,nongov,married",","); x
[[1]]
[1] "hhsize"            "urban"             "male"              "\n            gov"
[5] "nongov"            "married"
On 2021/1/5 05:34 PM, Eric Berger wrote:
zx<-strsplit("age,exercise,income,white,black,hispanic,base,somcol,grad,employed,unable,homeowner,married,divorced,widowed",",")
On Tue, Jan 5, 2021 at 11:01 AM Steven Yen <styen at ntu.edu.tw> wrote:
Thank you, Jeff. IMO, we are all here to make R work better to suit our
various needs. All I am asking is an easier way to define variable list
zx, differently from the way z0, x0, and treat are defined.
> zx<-colnames(subset(mydata,select=c(
+ age,exercise,income,white,black,hispanic,base,somcol,grad,employed,
+ unable,homeowner,married,divorced,widowed)))
> z0<-c("fruit","highblood")
> x0<-c("vgood","poor")
> treat<-"depression"
> eq1 <-my.formula(y="depression",x=zx,z0)
> eq2 <-my.formula(y="bmi", x=zx,x0)
> eq2t<-my.formula(y="bmi", x=zx,treat)
> eqs<-list(eq1,eq2); eqs
[[1]]
depression ~ age + exercise + income + white + black + hispanic +
    base + somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + fruit + highblood
[[2]]
bmi ~ age + exercise + income + white + black + hispanic + base +
    somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + vgood + poor
> eqt<-list(eq1,eq2t); eqt
[[1]]
depression ~ age + exercise + income + white + black + hispanic +
    base + somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + fruit + highblood
[[2]]
bmi ~ age + exercise + income + white + black + hispanic + base +
    somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + depression
On 2021/1/5 04:18 PM, Jeff Newmiller wrote:
> IMO if you want to hardcode a formula then simply hardcode a formula.
> If you want 20 formulas, write 20 formulas. Is that really so bad?
>
> If you want to have an abbreviated way to specify sets of variables
> without conforming to R syntax then put them into data files and read
> them in using a format of your choice.
>
> But using NSE to avoid using quotes for entering what amounts to
> in-script data is abuse of the language justified by laziness... the
> amount of work you put yourself and anyone else who reads your code
> through is excessive relative to the benefit gained.
>
> NSE has its strengths... but as a method of creating data objects it
> sucks. Note that even the tidyverse (now) requires you to use quotes
> when you are not directly referring to something that already exists.
> And if you were... you might as well be creating a formula.
What about the Cs() function in Hmisc?
library(Hmisc)
Cs(a,b,c)
[1] "a" "b" "c"
Steven Yen wrote on 05.01.2021 13:29:
Thanks Eric. Yes, "unlist" makes a difference. Below, I am doing not regression but summary to keep the example simple.
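Cs() lives in the Hmisc package. Where adding that dependency is not wanted, something similar can be sketched in base R with substitute(). The name bare_names() is hypothetical, and this is only meant to illustrate the idea, not to reproduce how Hmisc implements it.

```r
# Hypothetical base-R stand-in for Hmisc::Cs(): capture the unevaluated
# arguments and return them as strings.
bare_names <- function(...) {
  # substitute() yields the call list(a, c, e); as.character() converts
  # each part to a string, and [-1] drops the leading "list"
  as.character(substitute(list(...)))[-1]
}

x <- bare_names(a, c, e)
x
```

Like Cs(), this relies on non-standard evaluation, so a mistyped name is not caught here; comparing the result against names(data) would be one way to check it.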
I may not properly understand the context of this discussion, and, in
particular, what the my.formula() function does. But if I do, the following,
from ?formula, seems relevant and would indicate that the discussion is
unnecessary:
"There are two special interpretations of . in a formula. The usual one is
in the context of a data argument of model fitting functions and means 'all
columns not otherwise in the formula':"
This means you can fit different models just by indexing the columns -- by
number -- you wish to use in a data argument, viz:
y <- runif(100)
dat <- data.frame(matrix(runif(500), ncol = 5))
names(dat) <- letters[1:5]
head(dat)
## Use columns 1,3, and 5 only
mdl1 <- lm(y ~ ., data = dat[,c(1,3,5)])
## Result:
summary(mdl1)
Call:
lm(formula = y ~ ., data = dat[, c(1, 3, 5)])
Residuals:
     Min       1Q   Median       3Q      Max
-0.52334 -0.27494  0.01245  0.28637  0.51998
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.51461    0.08236   6.248 1.14e-08 ***
a            0.01516    0.10928   0.139    0.890
c            0.03517    0.10399   0.338    0.736
e           -0.09437    0.10967  -0.861    0.392
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.299 on 96 degrees of freedom
Multiple R-squared:  0.008256, Adjusted R-squared:  -0.02274
F-statistic: 0.2664 on 3 and 96 DF,  p-value: 0.8495
If I have misunderstood and this is unhelpful, just ignore without comment.
You don't need to waste time explaining it to me.
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
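The "." approach also works when the columns are picked by name instead of by position, which ties it back to the name vectors discussed earlier in the thread. A small sketch with simulated data follows; keeping the response y inside the data frame and the name "vars" are choices made for this illustration, not part of the example above.

```r
set.seed(1)
dat <- data.frame(matrix(runif(500), ncol = 5))
names(dat) <- letters[1:5]
dat$y <- runif(100)  # response stored alongside the predictors

# Choose predictors by name, then let "." expand to every column of the
# subsetted data frame other than the response
vars <- c("a", "c", "e")
mdl <- lm(y ~ ., data = dat[, c("y", vars)])
names(coef(mdl))
```

Only the vector of names changes from one model to the next; the formula y ~ . stays the same.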
On Tue, Jan 5, 2021 at 4:49 AM Heinz Tuechler <tuechler at gmx.at> wrote:
What about the Cs() function in Hmisc?
library(Hmisc)
Cs(a,b,c)
[1] "a" "b" "c"