I have a data frame with several columns representing variables, and
the variable names are given in the top row of the data frame. That
is, I had a csv file with the variable names stored in the top row,
and when I imported the csv file into R, R created a data frame, which
I named rwrdatafile, where I can see all the variables with their
names in the top row in RStudio. For example, one of the columns
stores wage data, and I can create a stand-alone data frame (shall I
call it a vector data frame?) for wage, but I want to do this for all
variables.
That is, I can execute the command
wage = rwrdatafile[,1,drop=FALSE]
which nicely creates wage. RStudio shows it as data in its
environment window, and if I click on it, I can inspect it in a
spreadsheet-like view and work with that data, say, in regression
analysis. The problem is that there are many variables stored in the
data frame rwrdatafile, and it is very tedious to repeat the
above-mentioned routine for each variable. Hence I attempted to write
a for loop for this, but to no avail.
In particular, I tried
for (i in 1:k){
assign(names(rwrdatafile)[i],rwrdatafile[,i])
}
and in fact this nicely assigns each column in the data frame to a
name, but I do not see the variables as data in the environment
section. But what I need are variables that I can work with in matrix
operations.
I also tried
for(i in 1:k){
names(rwrdatafile)[i] = rwrdatafile[,i,drop=FALSE]
}
thinking that this for loop would just repeat what I do for
wage = rwrdatafile[,1,drop=FALSE]
for all the variables in rwrdatafile.
Please note that I do need to use a for loop, and in fact I need to
translate the MATLAB code below, which does the job in MATLAB, and
imitate it as closely as possible in R.
# MATLAB code generating variables from structure array rwrdatafile
[N,k] = size(rwrdatafile.data);
for i = 1:k
eval([cell2mat(rwrdatafile.textdata(i)) '= rwrdatafile.data(:,i);'])
end
How to automatically create data frames from an existing one?
4 messages · Tunga Kantarcı, Sarah Goslee, William Dunlap +1 more
I do not understand why you want to take a perfectly good data frame and split it into a whole bunch of single-column data frames instead of working with it as-is. The latter seems like an awkward and unnecessary thing to do.

If you explain what you're trying to do, we can help. Referencing MATLAB code isn't useful, because R does not have the same underlying way of working. You can readily use columns of a data frame in other operations without doing this.

Sarah
On Wed, Jan 11, 2017 at 6:53 AM, Tunga Kantarcı <tungakantarci at gmail.com> wrote:
[...]
Sarah Goslee http://www.functionaldiversity.org
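As a rough illustration of the point about using columns of a data frame directly, the sketch below uses them in matrix operations without first copying them into separate objects. The data frame is a made-up stand-in for rwrdatafile: the column names wage, educ and exper and all values are assumptions, not the poster's data.

```r
# Hypothetical stand-in for rwrdatafile; names and values are invented.
rwrdatafile <- data.frame(
  wage  = c(10, 12, 15, 11, 18),
  educ  = c(12, 12, 16, 10, 18),
  exper = c(5, 8, 3, 12, 6)
)

# A single column, extracted by name as a numeric vector.
y <- rwrdatafile$wage

# Several columns as a numeric matrix, with an intercept column added.
X <- cbind(intercept = 1, as.matrix(rwrdatafile[, c("educ", "exper")]))

# Least squares through the normal equations: solve (X'X) b = X'y.
b <- solve(crossprod(X), crossprod(X, y))

# The same fit through the formula interface; the data frame stays intact.
fit <- lm(wage ~ educ + exper, data = rwrdatafile)
```

Both routes should give the same coefficients, and neither needs wage, educ or exper to exist as top-level variables.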
You can use the 'with' function or the 'data' argument to many functions to use the variables in the data frame without copying them out to the global environment. Leaving them in the data.frame keeps them from getting lost among the temporary variables in the global environment.
> Data <- read.csv(header=TRUE, text=
+ "Name,Education,Wage
+ Abe,PhD,105
+ Bob,MS,108
+ Chuck,BS,118
+ Dave,PhD,102")
> with(Data, tapply(Wage, Education, mean))
   BS    MS   PhD
118.0 108.0 103.5
> lm(data=Data, Wage ~ Education - 1)

Call:
lm(formula = Wage ~ Education - 1, data = Data)

Coefficients:
 EducationBS   EducationMS  EducationPhD
       118.0         108.0         103.5
Bill Dunlap
TIBCO Software
wdunlap tibco.com
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
I don't know what the MATLAB eval() function does, but this example might help you get started with the way R does things:

lapply(rwrdatafile, summary)

This will apply the summary() function to every column of the data frame.

As others have mentioned, it is bad R practice to create separate variables for each column of the data frame. Anything you want to do with your variable named wage you can do with rwrdatafile$wage. For regression analysis, even more so, you should NOT take the variables out of the data frame. Instead, use things like

lm(wage ~ year, data = rwrdatafile)

But if you insist on it, then try this:

for (nm in names(rwrdatafile)) assign(nm, rwrdatafile[[nm]], '.GlobalEnv')

(assuming I got the parentheses matched correctly)
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
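If separate objects really are required, one possible variant of the loop above keeps each column as a one-column data frame, which is what rwrdatafile[,1,drop=FALSE] produced in the original question. This is only a sketch. The data frame contents are made up, and the objects are written into a scratch environment called vars (a name chosen here for illustration) rather than the global environment; passing envir = .GlobalEnv instead would create them at top level.

```r
# Hypothetical stand-in for rwrdatafile; names and values are invented.
rwrdatafile <- data.frame(wage = c(10, 12, 15), educ = c(12, 12, 16))

# Scratch environment so the example does not clutter the workspace.
vars <- new.env()

for (nm in names(rwrdatafile)) {
  # Single-bracket indexing by name keeps the data frame class, like
  # rwrdatafile[, i, drop = FALSE]; [[nm]] would give a plain vector.
  assign(nm, rwrdatafile[nm], envir = vars)
}

ls(vars)                           # names of the objects created
is.data.frame(vars$wage)           # TRUE: a one-column data frame
is.numeric(rwrdatafile[["wage"]])  # TRUE: [[ ]] extracts a plain vector

# Loop-free alternative that creates plain vectors, one per column.
vecs <- list2env(as.list(rwrdatafile), envir = new.env())
```

The difference between a plain vector and a one-column data frame is probably also why the first attempt in the question did not appear as data in RStudio: rwrdatafile[,i] without drop=FALSE returns a vector, and the environment pane lists vectors separately from data frames.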