Thanks Eric. Yes, "unlist" makes a difference. To keep the example
simple, I use summary() below rather than a regression.
> set.seed(123)
> data<-matrix(runif(25),nrow=5)
> colnames(data)<-c("x1","x2","x3","x4","x5"); data
x1 x2 x3 x4 x5
[1,] 0.2875775 0.0455565 0.9568333 0.89982497 0.8895393
[2,] 0.7883051 0.5281055 0.4533342 0.24608773 0.6928034
[3,] 0.4089769 0.8924190 0.6775706 0.04205953 0.6405068
[4,] 0.8830174 0.5514350 0.5726334 0.32792072 0.9942698
[5,] 0.9404673 0.4566147 0.1029247 0.95450365 0.6557058
> j<-strsplit(gsub("[\n ]","","x1,x3,x5"),",")
> j<-unlist(j); j
[1] "x1" "x3" "x5"
> summary(data[,j])
x1 x3 x5
Min. :0.2876 Min. :0.1029 Min. :0.6405
1st Qu.:0.4090 1st Qu.:0.4533 1st Qu.:0.6557
Median :0.7883 Median :0.5726 Median :0.6928
Mean :0.6617 Mean :0.5527 Mean :0.7746
3rd Qu.:0.8830 3rd Qu.:0.6776 3rd Qu.:0.8895
Max. :0.9405 Max. :0.9568 Max. :0.9943
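For anyone reusing this, the steps in the session above can be wrapped in one small function. The sketch below is only that: parse_vars is a made-up name, not something from base R or a package, and it assumes the variable names themselves contain no spaces or commas.

```r
# parse_vars() is a hypothetical helper name, not part of base R or any package.
# It turns a comma-separated string (possibly spanning several lines)
# into a plain character vector of variable names.
parse_vars <- function(s) {
  s <- gsub("[\n ]", "", s)               # drop newlines and spaces
  unlist(strsplit(s, ",", fixed = TRUE))  # strsplit() returns a list; flatten it
}

set.seed(123)
data <- matrix(runif(25), nrow = 5)
colnames(data) <- paste0("x", 1:5)

# The string may be broken across lines for readability.
j <- parse_vars("x1,x3,
                 x5")
j
summary(data[, j])
```

Because j is an ordinary character vector, it can be used directly for column indexing, as in the last line.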
On 2021/1/5 07:08, Eric Berger wrote:
wrap it in unlist
xx <- unlist(strsplit( .... ))
On Tue, Jan 5, 2021 at 12:59 PM Steven Yen <styen at ntu.edu.tw> wrote:
Thanks Eric. Perhaps I should know when to stop. The approach
produces a slightly different variable list (note the [[1]]).
Consequently, I was not able to use xx in defining my regression
formula.
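The [[1]] is the sign that strsplit() returned a list rather than a character vector. The definition of my.formula is not shown in this thread, so what follows is a guess at the mechanism, not a diagnosis: pasting a list deparses each element, which would produce the c("hhsize", ...) text that shows up in eq2 further down in this message. The sketch uses reformulate() from the stats package purely as a stand-in for my.formula, with a shortened variable list.

```r
xx <- strsplit("hhsize,urban,male", ",")
is.list(xx)                    # strsplit() always returns a list

# Pasting a list turns each element into its deparsed text:
paste(xx, collapse = " + ")

# Extracting the first element (or calling unlist()) gives a character vector,
# which pastes term by term:
vars <- xx[[1]]
paste(vars, collapse = " + ")

# reformulate() is used here only as a stand-in for the unseen my.formula().
f <- reformulate(vars, response = "cig")
f
```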
> x<-colnames(subset(mydata,select=c(
+ hhsize,urban,male,
+ age3045,age4659,age60, # age1529
+ highsc,tert, # primary
+ gov,nongov, # unemp
+ married))); x
[1] "hhsize" "urban" "male" "age3045" "age4659" "age60" "highsc" "tert"
[9] "gov" "nongov" "married"
> xx<-strsplit(gsub("[\n ]","",
+ "hhsize,urban,male,
+ age3045,age4659,age60,
+ highsc,tert,
+ gov,nongov,
+ married"
+ ),","); xx
[[1]]
[1] "hhsize" "urban" "male" "age3045" "age4659" "age60" "highsc" "tert"
[9] "gov" "nongov" "married"
> eq1<-my.formula(y="cig",x=x); eq1
cig ~ hhsize + urban + male + age3045 + age4659 + age60 + highsc +
tert + gov + nongov + married
> eq2<-my.formula(y="cig",x=xx); eq2
cig ~ c("hhsize", "urban", "male", "age3045", "age4659", "age60",
"highsc", "tert", "gov", "nongov", "married")
On 2021/1/5 06:01, Eric Berger wrote:
If your column names contain no spaces, the following should work:
x<-strsplit(gsub("[\n ]","",
"hhsize,urban,male,
gov,nongov,married"),","); x
On Tue, Jan 5, 2021 at 11:47 AM Steven Yen <styen at ntu.edu.tw> wrote:
Here we go! BUT, it works great only for a single continuous line.
With line break(s), I get the nuisance "\n" inserted.
> x<-strsplit("hhsize,urban,male,gov,nongov,married",","); x
[[1]]
[1] "hhsize" "urban" "male" "gov" "nongov" "married"
> x<-strsplit("hhsize,urban,male,
+ gov,nongov,married",","); x
[[1]]
[1] "hhsize" "urban" "male" "\n gov"
[5] "nongov" "married"
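One way to avoid the stray "\n" without deleting every space is to trim each piece after splitting. This is a sketch of an alternative, not something proposed in the thread. It relies on trimws() from base R, whose default strips spaces, tabs and newlines at both ends of each string, so a name with an internal space would survive.

```r
s <- "hhsize,urban,male,
      gov,nongov,married"

# Split on commas, take the single list element, then trim
# leading/trailing whitespace (including the newline) from each name.
x <- trimws(strsplit(s, ",", fixed = TRUE)[[1]])
x
```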
On 2021/1/5 05:34, Eric Berger wrote:
zx<-strsplit("age,exercise,income,white,black,hispanic,base,somcol,grad,employed,unable,homeowner,married,divorced,widowed",",")
On Tue, Jan 5, 2021 at 11:01 AM Steven Yen <styen at ntu.edu.tw> wrote:
Thank you, Jeff. IMO, we are all here to make R work better to suit
our various needs. All I am asking for is an easier way to define the
variable list zx, different from the way z0, x0, and treat are defined.
> zx<-colnames(subset(mydata,select=c(
+ age,exercise,income,white,black,hispanic,base,somcol,grad,employed,
+ unable,homeowner,married,divorced,widowed)))
> z0<-c("fruit","highblood")
> x0<-c("vgood","poor")
> treat<-"depression"
> eq1 <-my.formula(y="depression",x=zx,z0)
> eq2 <-my.formula(y="bmi", x=zx,x0)
> eq2t<-my.formula(y="bmi", x=zx,treat)
> eqs<-list(eq1,eq2); eqs
[[1]]
depression ~ age + exercise + income + white + black + hispanic +
    base + somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + fruit + highblood
[[2]]
bmi ~ age + exercise + income + white + black + hispanic + base +
    somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + vgood + poor
> eqt<-list(eq1,eq2t); eqt
[[1]]
depression ~ age + exercise + income + white + black + hispanic +
    base + somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + fruit + highblood
[[2]]
bmi ~ age + exercise + income + white + black + hispanic + base +
    somcol + grad + employed + unable + homeowner + married +
    divorced + widowed + depression
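my.formula is the poster's own function and its definition does not appear anywhere in this thread. For readers who want to reproduce the idea, here is a minimal stand-in, assuming it simply joins a shared variable list with some equation-specific extras. The name make_formula and its arguments are hypothetical; it is built on reformulate() from the stats package, and the variable lists are shortened.

```r
# make_formula() is a hypothetical stand-in for the unseen my.formula().
# y:     name of the response variable (character string)
# x:     character vector of shared regressors
# extra: optional character vector of equation-specific regressors
make_formula <- function(y, x, extra = NULL) {
  reformulate(c(x, extra), response = y)
}

zx    <- c("age", "exercise", "income")   # shortened list for illustration
z0    <- c("fruit", "highblood")
x0    <- c("vgood", "poor")
treat <- "depression"

eq1  <- make_formula("depression", zx, z0)
eq2  <- make_formula("bmi", zx, x0)
eq2t <- make_formula("bmi", zx, treat)

eqs <- list(eq1, eq2)
eqs
```

Keeping the shared list in one vector means a change to zx propagates to every equation built from it.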
On 2021/1/5 04:18, Jeff Newmiller wrote:
> IMO if you want to hardcode a formula then simply hardcode a formula.
> If you want 20 formulas, write 20 formulas. Is that really so bad?
>
> If you want to have an abbreviated way to specify sets of variables
> without conforming to R syntax then put them into data files and read
> them in using a format of your choice.
>
> But using NSE to avoid using quotes for entering what amounts to
> in-script data is abuse of the language justified by laziness... the
> amount of work you put yourself and anyone else who reads your code
> through is excessive relative to the benefit gained.
>
> NSE has its strengths... but as a method of creating data objects it
> sucks. Note that even the tidyverse (now) requires you to use quotes
> when you are not directly referring to something that already exists.
> And if you were... you might as well be creating a formula.
> On January 4, 2021 11:14:54 PM PST, Steven Yen <styen at ntu.edu.tw> wrote:
>> I constantly define variable lists from a data frame [...]
>> regression equation). Line 3 below does just that. [...] variable
>> name in quotation marks is too much work especially [...] I
>> do that with line 4. Is there an easier way to
>> define a list of variable names containing [...]
>>> data<-as.data.frame(matrix(1:30,nrow=6))
>>> colnames(data)<-c("a","b","c","d","e"); data
>> a b c d e
>> 1 1 7 13 19 25
>> 2 2 8 14 20 26
>> 3 3 9 15 21 27
>> 4 4 10 16 22 28
>> 5 5 11 17 23 29
>> 6 6 12 18 24 30
>>> x1<-c("a","c","e"); x1 # line 3
>> [1] "a" "c" "e"
>>> x2<-colnames(subset(data,select=c(a,c,e))); x2 # line 4
>> [1] "a" "c" "e"
>>
>> ______________________________________________
>> R-help at r-project.org