
variable selection --- reduce the number of initial variables

8 messages · bbslover, Ricardo Gonçalves, Frank E Harrell Jr +1 more

#
hello, 

my problem is like this: after preprocessing the variables, I am left with
160 independent variables and one dependent variable y. Using the PLS method
with 10 components I can obtain a good r2, but I do not know how to express
my equation using fewer of the variables; it would be better to use fewer
independent variables. That is, how can I select my independent variables?
Maybe GA (a genetic algorithm) is a good method, but I do not grasp it yet.
Can you suggest other good variable-selection methods? And in R, which
methods can be used to pick out the most influential variables, so that the
selected variables give an equation with higher r2 and q2 and a lower RMSEP?

thank you!
#
Hi,

Nowadays there are many new variable-selection methods, especially ones
based on the Bayesian paradigm.
For your problem, I think you could try the BMA (Bayesian Model Averaging)
package.
Or, you can reduce the dimension of your data with PCA, which also lets you
see the weight of each original variable in each principal component.

HTH

Rick
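A minimal sketch of the BMA suggestion, assuming the CRAN package BMA is installed; the data below are simulated stand-ins for the real 54-row matrix, not the poster's data:

```r
## Bayesian Model Averaging over linear models via the BIC approximation.
library(BMA)
set.seed(2)
X <- matrix(rnorm(54 * 8), nrow = 54,
            dimnames = list(NULL, paste0("x", 1:8)))  # stand-in predictors
y <- X[, 1] - 2 * X[, 2] + rnorm(54)                  # only x1, x2 matter

fit <- bicreg(X, y)   # averages over the best models by posterior probability
summary(fit)          # shows posterior inclusion probability for each variable
```

The `probne0` component of the fit gives, per variable, the posterior probability that its coefficient is nonzero, which is what you would use to rank variables.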

--------------------------------------------------
From: "bbslover" <dluthm at yeah.net>
Sent: Wednesday, November 04, 2009 10:23 AM
To: <r-help at r-project.org>
Subject: [R]  variable selection --- reduce the number of initial variables

#
thank you. I can try the Bayesian approach. With the PCA method I have used,
I can get some PCs, but I do not know how to use the original variables in
that equation. Maybe I should select the variables with high weights and
delete the low-weight ones, right?
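The loading-inspection idea can be sketched in base R (with simulated stand-in data); note that a variable with a small loading on PC1 may still carry weight on a later component, so dropping variables by loadings alone discards information:

```r
## Inspect PCA loadings to see which original variables carry the most
## weight in each principal component.
set.seed(1)
X <- matrix(rnorm(54 * 10), nrow = 54,
            dimnames = list(NULL, paste0("V", 1:10)))  # stand-in for 54 x 160

pc <- prcomp(X, scale. = TRUE)   # PCA on standardized variables

round(pc$rotation[, 1:2], 2)     # loadings (weights) on PC1 and PC2

# Rank original variables by the size of their loading on PC1
sort(abs(pc$rotation[, 1]), decreasing = TRUE)
```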
Ricardo Gonçalves Silva wrote:

#
Yes, right. But I still prefer using BMA.
Best,

Rick

--------------------------------------------------
From: "bbslover" <dluthm at yeah.net>
Sent: Wednesday, November 04, 2009 11:28 PM
To: <r-help at r-project.org>
Subject: Re: [R] variable selection --- reduce the number of initial variables

#
Ricardo Gonçalves Silva wrote:
If you are entertaining only one model family, then BMA is a long,
tedious, complex way to obtain shrinkage, and the resulting averaged
model is very difficult to interpret.  Consider a more direct approach.

Frank
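As an illustration only (not necessarily the approach Frank has in mind), one direct way to obtain shrinkage is penalized regression; here is a minimal ridge fit in closed form with base R, on made-up data:

```r
## Ridge regression: beta = (X'X + lambda I)^{-1} X'y.
## The penalty lambda shrinks all coefficients toward zero at once,
## with no model-averaging machinery needed.
set.seed(3)
n <- 54; p <- 20
X <- scale(matrix(rnorm(n * p), n, p))   # standardized predictors
y <- X[, 1] + 0.5 * X[, 2] + rnorm(n)

ridge <- function(X, y, lambda) {
  solve(crossprod(X) + diag(lambda, ncol(X)), crossprod(X, y))
}

b_ols   <- ridge(X, y, 0)    # lambda = 0 reproduces least squares
b_ridge <- ridge(X, y, 10)   # heavier penalty, smaller coefficients
sum(b_ridge^2) < sum(b_ols^2)   # the ridge coefficients are shrunken
```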
#
Hi Guys,

Of course, backward, forward, or other stepwise methods can be used directly.
But concerning BMA, the model interpretation is fairly simple:

"Bayesian Model Averaging accounts for the model uncertainty inherent in the 
variable selection problem by averaging over the best models in the model 
class according to approximate posterior model probability."

If you want to learn a bit more before continuing, take a look at the BMA
homepage:

http://www2.research.att.com/~volinsky/bma.html

But of course, you must do what you think is best for your problem.
By the way, what is the dimension of your problem?

HTH,

Rick
--------------------------------------------------
From: "Frank E Harrell Jr" <f.harrell at vanderbilt.edu>
Sent: Thursday, November 05, 2009 4:12 PM
To: "Ricardo Gonçalves Silva" <ricardogs at terra.com.br>
Cc: "bbslover" <dluthm at yeah.net>; <r-help at r-project.org>
Subject: Re: [R] variable selection --- reduce the number of initial variables
#
There is also a sparse PLS model in the spls package. It uses
lasso-like regularization to reduce the number of variables. I've had
a lot of success with it.

Max
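A minimal sketch of the spls suggestion, assuming the CRAN package spls is installed; the data are simulated stand-ins, and K and eta are illustrative values (in practice they are tuned, e.g. with cv.spls):

```r
## Sparse PLS: eta in (0, 1) controls sparsity (closer to 1 = fewer
## variables kept), K is the number of latent components.
library(spls)
set.seed(4)
X <- matrix(rnorm(54 * 30), 54, 30)
y <- X[, 1:3] %*% c(1, -1, 0.5) + rnorm(54)

fit <- spls(X, y, K = 2, eta = 0.7)
print(fit)    # lists which predictors were selected
coef(fit)     # coefficient estimates; unselected variables are exactly zero
```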


2009/11/5 Ricardo Gonçalves Silva <ricardogs at terra.com.br>:

#
thanks to all the friends discussing this problem. My data is a 54 x 160
matrix. PLS is a good method, but can it give an equation between y and a
few selected variables? For example:
y	Sv	Sp	Ms	nCIR	nAB	nC	nN	nO	nX	ZM1V
1	7.62	31.45	33.44	2.37	2	12	18	6	2	0
2	8.34	32.26	33.92	2.36	3	18	20	6	2	0
3	7.79	30.97	32.89	2.41	2	12	18	5	3	0
4	7.83	27.75	29.47	2.37	2	12	16	6	1	0
5	7.63	29.35	31.23	2.33	2	12	17	6	1	0
6	7.45	30.94	32.99	2.3	2	12	18	6	1	0
7	7.97	33.35	35.23	2.29	3	18	21	6	1	0
8	8.93	24.47	25.64	2.46	4	12	14	6	2	0
9	8.67	24.95	26.19	2.41	4	12	14	7	1	0
10	9.36	25.04	26.83	2.38	4	12	14	6	1	0
11	8.93	24.47	25.64	2.46	4	12	14	6	2	0
12	7.46	33.05	35.2	2.34	2	12	19	6	2	0
13	7.54	34.05	36.2	2.29	3	12	20	6	2	0
14	8.34	27.66	29.16	2.38	4	12	16	6	2	0
15	8.1	29.26	30.92	2.35	4	12	17	6	2	0
16	8.69	27.06	28.4	2.35	5	12	16	6	2	0
17	7.39	34.53	36.76	2.26	3	12	20	7	1	0
18	9.48	21.15	22.58	2.35	2	12	12	5	0	1
19	8.3	20.35	22.1	2.36	1	6	10	5	0	1
20	9.21	18.15	19.58	2.33	2	6	9	5	0	1
21	7.85	24.54	26.63	2.16	2	6	13	5	0	1
22	9.05	22.75	24.34	2.31	2	12	13	5	0	1
23	8.9	19.26	20.8	2.44	1	6	9	5	1	1
24	9.75	22.66	24.03	2.54	2	12	13	5	1	1
25	8.37	20.45	21.72	2.27	2	12	12	5	0	0
26	7.77	23.64	25.24	2.2	2	12	14	5	0	0
27	7.54	25.24	27.01	2.17	2	12	15	5	0	0
28	7.88	24.64	26.24	2.13	3	12	15	5	0	0
29	10.59	21.85	23.44	2.42	2	12	12	5	0	2
30	9.25	23.26	24.8	2.37	2	12	13	5	1	1
31	8.4	25.94	27.86	2.26	2	12	15	5	0	1
32	8.14	27.54	29.63	2.24	2	12	16	5	0	1
33	12.02	22.23	23.93	2.35	2	12	12	5	0	2
34	10.27	22.57	23.73	2.74	2	12	12	6	2	1
35	9.96	22.55	23.82	2.51	2	12	13	6	0	1
36	8.89	30.45	32.32	2.26	3	18	19	5	1	1
37	9.15	26.37	28.01	2.5	2	12	15	5	2	1
38	8.64	25.34	27.11	2.29	2	12	14	6	0	1
39	8.61	28.45	30.32	2.23	3	12	16	6	1	1
40	9.23	25.25	26.8	2.5	2	12	14	6	1	1
41	9.36	22.14	23.59	2.41	2	12	12	6	0	1
42	9.04	32.96	34.78	2.4	3	18	20	6	2	1
43	9.05	22.75	24.34	2.31	2	12	13	5	0	1
44	9.25	23.26	24.8	2.37	2	12	13	5	1	1
45	9.05	22.75	24.34	2.31	2	12	13	5	0	1
46	10.59	21.85	23.44	2.42	2	12	12	5	0	2
47	9.07	25.37	27.01	2.39	2	12	14	5	2	1
48	11.7	22.55	24.3	2.48	2	12	12	5	0	3
49	11.7	22.55	24.3	2.48	2	12	12	5	0	3
50	9.41	25.76	27.26	2.54	2	12	14	6	2	1
51	9.07	27.36	29.02	2.5	2	12	15	6	2	1
52	8.52	30.56	32.54	2.45	2	12	17	6	2	1
53	8.36	37.75	40.06	2.33	3	18	23	6	2	1
54	8.33	31.04	33.09	2.41	2	12	17	7	1	1

I want to obtain an equation with fewer variables, e.g. y ~ Sv + Sp + Ms +
nCIR. Which method can give a result in that form, not only statistics like
r2, q2, RMS, etc.? thank you!
Max Kuhn wrote:
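For completeness, one base-R way to get an explicit reduced formula of that kind is step(); bear in mind Frank Harrell's caution above about stepwise-style selection. A toy sketch with made-up data using the same column names:

```r
## step() returns a fitted lm whose formula names only the kept variables.
set.seed(5)
d <- data.frame(Sv = rnorm(54), Sp = rnorm(54),
                Ms = rnorm(54), nCIR = rnorm(54))
d$y <- 2 * d$Sv - d$Sp + rnorm(54, sd = 0.2)  # only Sv and Sp truly matter

full <- lm(y ~ ., data = d)      # start from all candidate variables
red  <- step(full, trace = 0)    # backward elimination by AIC
formula(red)                     # explicit formula over the kept variables
```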