
Help with data management

5 messages · Jim Lemon, David L Carlson, André Luis Neves

#
Hi Andre,
As far as I am aware, merges can only be accomplished between two data
frames, so I think you would have to do it one by one. It is probably
possible to program this to operate on your list of data frames, but I
suspect that it would take as much time as a bit of copying and
pasting. If your data is being extracted from an external database, it
may be possible to perform the operation in SQL, but I don't have the time
to work that out at the moment.
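
For what it's worth, the "program this to operate on your list" route might look roughly like the sketch below. It assumes each data frame has a Family and a Hits column (names taken from the example data posted later in this thread); `renamed` and `wide` are made-up names for illustration. merge() still only joins two data frames at a time, and Reduce() just applies it pairwise along the list.

```r
# Toy data with the column names used later in this thread (assumed here)
A <- data.frame(Family = c("c", "d", "e"), Hits = c(1, 2, 3))
B <- data.frame(Family = c("c", "f", "a"), Hits = c(1, 4, 3))
C <- data.frame(Family = c("q", "o", "f"), Hits = c(10, 4, 5))
mylist <- list(A = A, B = B, C = C)

# Rename Hits to the data frame's own name so the merged columns stay distinct
renamed <- Map(function(d, nm) setNames(d, c("Family", nm)),
               mylist, names(mylist))

# merge() joins two data frames; Reduce() chains it across the whole list
wide <- Reduce(function(x, y) merge(x, y, by = "Family", all = TRUE), renamed)

# Families missing from a data frame come back as NA; set them to 0 if wanted
wide[is.na(wide)] <- 0
wide
```

One caveat with this sketch: if a Family occurs more than once within a data frame, merge() will duplicate rows rather than add the Hits together.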

Jim
On Fri, Feb 24, 2017 at 10:53 AM, André Luis Neves <andrluis at ualberta.ca> wrote:
#
You can also combine the data frames into a single one and use xtabs:

ID <- names(mylist)
mylist <- Map(data.frame, mylist, dfn=ID)
mydf <- do.call(rbind, mylist)
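# Sort the Family levels; as.character() keeps this working when Family is
# stored as character rather than factor (the default since R 4.0)
mydf$Family <- factor(mydf$Family, levels=sort(unique(as.character(mydf$Family))))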
xtabs(Hits~Family+dfn, mydf)
#       dfn
# Family  A  B  C
#      a  0  3  0
#      c  1  1  0
#      d  2  0  0
#      e  3  0  0
#      f  0  4  5
#      o  0  0  4
#      q  0  0 10
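
In case the Map() step looks opaque, here is a small sketch of what it does, using a made-up two-element list (the data and the name `tagged` are only for illustration). Map() calls data.frame() once per list element and hands it the matching name as a new dfn column, so after rbind() every row still records which data frame it came from.

```r
# Hypothetical cut-down list, only to show the mechanics
mylist <- list(
  A = data.frame(Family = c("c", "d"), Hits = c(1, 2)),
  B = data.frame(Family = c("c", "f"), Hits = c(1, 4))
)

# Equivalent to data.frame(mylist[[i]], dfn = names(mylist)[i]) for each i
tagged <- Map(data.frame, mylist, dfn = names(mylist))

# Stack the tagged pieces into one long data frame
mydf <- do.call(rbind, tagged)
mydf
```

With the dfn column in place, xtabs(Hits~Family+dfn, mydf) has everything it needs to lay out one column per original data frame.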


-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352




-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Lemon
Sent: Thursday, February 23, 2017 6:00 PM
To: André Luis Neves <andrluis at ualberta.ca>; r-help mailing list <r-help at r-project.org>
Subject: Re: [R] Help with data management

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Hi, David:

Thank you so much for your answer.

I just added some commands and got what I wanted.

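The final code would be something like this:


# Three example data frames with the same columns
A <- data.frame(c("c", "d", "e"), 4.4:6.8, c(1, 2, 3))   # 4.4:6.8 gives 4.4, 5.4, 6.4
colnames(A) <- c("Family", "NormalizedCount", "Hits")
A
B <- data.frame(c("c", "f", "a"), c(3.2, 6.4, 4.4), c(1, 4, 3))
colnames(B) <- c("Family", "NormalizedCount", "Hits")
B
C <- data.frame(c("q", "o", "f"), c(7.2, 9.4, 41.4), c(10, 4, 5))
colnames(C) <- c("Family", "NormalizedCount", "Hits")
C
mylist <- list(A=A, B=B, C=C)
mylist
ID <- names(mylist)
mylist <- Map(data.frame, mylist, dfn=ID)   # tag each row with its data frame's name
mydf <- do.call(rbind, mylist)              # stack everything into one data frame
# Sort the Family levels; as.character() keeps this working when Family is
# stored as character rather than factor (the default since R 4.0)
mydf$Family <- factor(mydf$Family, levels=sort(unique(as.character(mydf$Family))))
z <- xtabs(Hits~Family+dfn, mydf)           # Family-by-dfn table of summed Hits
x <- as.data.frame(z)                       # long format: Family, dfn, Freq
x
library(reshape2)
y <- dcast(x, Family ~ dfn, value.var = "Freq")   # back to wide, one column per data frame
y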


Thank you very much.

Andre
On Fri, Feb 24, 2017 at 8:40 AM, David L Carlson <dcarlson at tamu.edu> wrote:
#
You can also get there without reshape2:

z <- xtabs(Hits~Family+dfn, mydf)
x <- as.data.frame.matrix(z) # Convert the table without changing the format
y <- data.frame(Family=dimnames(z)$Family, as.data.frame.matrix(z)) # Add Family column
rownames(y) <- NULL # Optional, but it replaces the Family row names with plain row numbers
str(y)
# 'data.frame':   7 obs. of  4 variables:
#  $ Family: Factor w/ 7 levels "a","c","d","e",..: 1 2 3 4 5 6 7
#  $ A     : num  0 1 2 3 0 0 0
#  $ B     : num  3 1 0 0 4 0 0
#  $ C     : num  0 0 0 0 5 4 10
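
To make the difference between the two conversions concrete, here is a small sketch on made-up data (the values and the names `long` and `wide` are only for illustration). as.data.frame() on a table gives the long layout that dcast() then has to reshape, while as.data.frame.matrix() keeps the two-way layout, with Family left in the row names.

```r
# Hypothetical stacked data, already carrying the dfn tag
mydf <- data.frame(
  Family = c("c", "d", "c", "f"),
  Hits   = c(1, 2, 1, 4),
  dfn    = c("A", "A", "B", "B")
)
z <- xtabs(Hits ~ Family + dfn, mydf)

long <- as.data.frame(z)          # one row per Family/dfn cell, counts in Freq
wide <- as.data.frame.matrix(z)   # same shape as the table, Family as row names

dim(long)
dim(wide)
```

That is why the wide version only needs Family copied from the row names into a proper column.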

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



From: André Luis Neves [mailto:andrluis at ualberta.ca]
Sent: Friday, February 24, 2017 10:14 AM
To: David L Carlson <dcarlson at tamu.edu>
Cc: Jim Lemon <drjimlemon at gmail.com>; r-help mailing list <r-help at r-project.org>
Subject: Re: [R] Help with data management

#
Thank you, David, for your help.

I'm so thankful for this R mailing list, and to the whole R community.

Andre
On Fri, Feb 24, 2017 at 11:00 AM, David L Carlson <dcarlson at tamu.edu> wrote: