Reorganize(stack data) a dataframe inducing names
3 messages · Francesca PANCOTTO, Jim Lemon, David Winsemius
On 07/27/2011 06:28 PM, Francesca wrote:
Dear Contributors,
thanks for collaboration.
I am trying to reorganize a data frame that looks like this:
  n1.Index     Date PX_LAST n2.Index   Date.1 PX_LAST.1 n3.Index   Date.2 PX_LAST.2
1       NA 04/02/07    1.34       NA 04/02/07      1.36       NA 04/02/07      1.33
2       NA 04/09/07    1.34       NA 04/09/07      1.36       NA 04/09/07      1.33
3       NA 04/16/07    1.34       NA 04/16/07      1.36       NA 04/16/07      1.33
4       NA 04/30/07    1.36       NA 04/30/07      1.40       NA 04/30/07      1.37
5       NA 05/07/07    1.36       NA 05/07/07      1.40       NA 05/07/07      1.37
6       NA 05/14/07    1.36       NA 05/14/07      1.40       NA 05/14/07      1.37
7       NA 05/22/07    1.36       NA 05/22/07      1.40       NA 05/22/07      1.37
What I would like to obtain is stacked data like this:
n1.Index Date PX_LAST
n1.Index 04/02/07 1.34
n1.Index 04/09/07 1.34
n1.Index 04/16/07 1.34
n1.Index 04/30/07 1.36
n1.Index 05/07/07 1.36
n1.Index 05/14/07 1.36
n1.Index 05/22/07 1.36
n2.Index 04/02/07 1.36
n2.Index 04/16/07 1.36
n2.Index 04/16/07 1.36
n2.Index 04/30/07 1.40
n2.Index 05/07/07 1.40
n2.Index 05/14/07 1.40
n2.Index 05/22/07 1.40
n3.Index 04/02/07 1.33
n3.Index 04/16/07 1.33
n3.Index 04/16/07 1.33
n3.Index 04/30/07 1.37
I have tried the function stack, but it uses only one argument. Then I
have tested the melt function from the package reshape, but it
seems not to be reproducing the correct organization of the data, as
it takes date as the id values.
PS: the n1 index names are not ordered in the original database, so
I cannot fill in the NA with the names using a recursive formula.
Hi Francesca,
Oddly enough, I answered a similar question a few days ago. The function
below turns one or more columns in a data frame into two columns, one a
factor that defaults to the name(s) of the columns and the other the
data that was in that column. It also "stretches" the remaining columns
in the data frame to the same number of rows and sticks the two
together. It doesn't do exactly what you show above, but it might be
good enough. A bit of coding could get the factor levels the way you want.
stretch.var<-function(data,to.stretch,
 stretch.names=c("newvar","scores")) {

 datadim<-dim(data)
 # columns that are repeated ("stretched") rather than stacked
 to.rep<-which(!(1:datadim[2] %in% to.stretch))
 nrep<-length(to.rep)
 newDF<-data.frame(rep(data[,to.rep[1]],length(to.stretch)))
 if(nrep > 1) {
  for(repvar in 2:nrep)
   newDF[[repvar]]<-rep(data[[to.rep[repvar]]],length(to.stretch))
 }
 # data[to.stretch] stays a data frame even when only one column is
 # stretched, so names() still returns the column name
 newDF<-cbind(newDF,rep(names(data[to.stretch]),each=datadim[1]),
  unlist(data[to.stretch]))
 names(newDF)<-c(names(data[to.rep]),stretch.names)
 rownames(newDF)<-NULL
 return(newDF)
}
# read in the data
fp<-read.table("fp.dat",header=TRUE)
# pass only the columns that you want in the result
stretch.var(fp[,c(2,3,6,9)],2:4,c("n1.index","PX_LAST"))
Date n1.index PX_LAST
1 04/02/07 PX_LAST 1.34
2 04/09/07 PX_LAST 1.34
3 04/16/07 PX_LAST 1.34
4 04/30/07 PX_LAST 1.36
5 05/07/07 PX_LAST 1.36
6 05/14/07 PX_LAST 1.36
7 05/22/07 PX_LAST 1.36
8 04/02/07 PX_LAST.1 1.36
9 04/09/07 PX_LAST.1 1.36
10 04/16/07 PX_LAST.1 1.36
11 04/30/07 PX_LAST.1 1.40
12 05/07/07 PX_LAST.1 1.40
13 05/14/07 PX_LAST.1 1.40
14 05/22/07 PX_LAST.1 1.40
15 04/02/07 PX_LAST.2 1.33
16 04/09/07 PX_LAST.2 1.33
17 04/16/07 PX_LAST.2 1.33
18 04/30/07 PX_LAST.2 1.37
19 05/07/07 PX_LAST.2 1.37
20 05/14/07 PX_LAST.2 1.37
21 05/22/07 PX_LAST.2 1.37
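The relabelling mentioned above ("a bit of coding could get the factor levels the way you want") might look roughly like the sketch below. It assumes that the k-th PX_LAST column belongs to the k-th *.Index column; the data frame `res` and the `lookup` vector are hypothetical stand-ins built from the first rows of the output, not part of the original code.

```r
# Toy stand-in for the stretched result (values copied from the output above)
res <- data.frame(
  Date = rep(c("04/02/07", "04/09/07"), times = 3),
  n1.index = rep(c("PX_LAST", "PX_LAST.1", "PX_LAST.2"), each = 2),
  PX_LAST = c(1.34, 1.34, 1.36, 1.36, 1.33, 1.33),
  stringsAsFactors = FALSE
)

# Assumed mapping: the k-th PX_LAST column belongs to the k-th *.Index column
lookup <- c(PX_LAST = "n1.Index", PX_LAST.1 = "n2.Index", PX_LAST.2 = "n3.Index")

# Replace the PX_LAST labels with the corresponding index names
res$n1.index <- unname(lookup[as.character(res$n1.index)])

# Put the columns in the order requested in the question
res <- res[, c("n1.index", "Date", "PX_LAST")]
```

The `as.character()` call keeps the lookup working whether the label column is a factor or a character vector.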
Jim
On Jul 27, 2011, at 4:28 AM, Francesca wrote:
PS: the n1 index names are not ordered in the original database, so
I cannot fill in the NA with the names using a recursive formula.
Thank you for any help you can provide.
(only on the last point, since you already have been offered a solution ...)
You should read more rhelp questions and answers. This thread yesterday had three different ways that you could have replaced the values of those *.Index columns with their names:
[R] Recoding Multiple Variables in a Data Frame in One Step
Ehlers liked Dunlap's solution, but I thought those two were equally clever. Mine was clearly not the best.
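For illustration only, one way to replace the all-NA *.Index columns with their own names and then stack the triplets in base R is sketched below. This is not necessarily any of the approaches from the thread mentioned above; it assumes every *.Index column is immediately followed by its Date and PX_LAST columns, and the toy `fp` reproduces only the first two rows of the posted data.

```r
# Toy data with the same column layout as the question (first two rows only)
fp <- data.frame(
  n1.Index = NA, Date = c("04/02/07", "04/09/07"), PX_LAST = c(1.34, 1.34),
  n2.Index = NA, Date.1 = c("04/02/07", "04/09/07"), PX_LAST.1 = c(1.36, 1.36),
  n3.Index = NA, Date.2 = c("04/02/07", "04/09/07"), PX_LAST.2 = c(1.33, 1.33),
  stringsAsFactors = FALSE
)

# Replace each all-NA *.Index column with its own column name
idx <- grep("\\.Index$", names(fp))
fp[idx] <- lapply(names(fp)[idx], function(nm) rep(nm, nrow(fp)))

# Stack the (Index, Date, PX_LAST) triplets, assuming each *.Index column
# is immediately followed by its Date and PX_LAST columns
stacked <- do.call(rbind, lapply(idx, function(i) {
  setNames(fp[, i:(i + 2)], c("Index", "Date", "PX_LAST"))
}))
rownames(stacked) <- NULL
```

Because the names are taken from the column headers rather than from their position in a sorted list, this does not depend on the index names being ordered.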
Francesca
--
Francesca Pancotto, PhD
Dipartimento di Economia
Università di Bologna
Piazza Scaravilli, 2
40126 Bologna
Office: +39 051 2098135
Cell: +39 393 6019138
Web: http://www2.dse.unibo.it/francesca.pancotto/
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD West Hartford, CT