Message-ID: <7838C5BB-D2AD-45DF-9F26-9D44376BA2A2@comcast.net>
Date: 2009-04-22T11:55:19Z
From: David Winsemius
Subject: Merging data frames, or one column/vector with a data frame, filling out empty rows with NA's
In-Reply-To: <49EEFDF3.70FB.00EE.0@dansksvineproduktion.dk>

On Apr 22, 2009, at 5:22 AM, Johannes G. Madsen wrote:

> Hello
>
> I have two data frames, SNP4 and SNP1:
>
>> head(SNP4)
>          Animal     Marker        Y
> 3213 194073197  P1001 0.021088
> 1295 194073197  P1002 0.021088
> 915   194073197  P1004 0.021088
> 2833 194073197  P1005 0.021088
> 1487 194073197  P1006 0.021088
> 1885 194073197  P1007 0.021088
>
>> head(SNP1)
>           Animal    Marker x
> 3213 194073197  P1001 2
> 1295 194073197  P1002 1
> 915   194073197  P1004 2
> 2833 194073197  P1005 0
> 1487 194073197  P1006 2
> 1885 194073197  P1007 0
>
> I want these two data frames merged by 'Marker', but when I try
>
>> SNP5 <- merge(SNP4, SNP1, by = 'Marker', all = TRUE)
> Error: cannot allocate vector of size 2.4 Gb
> In addition: Warning messages:
> 1: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
>  Reached total allocation of 1535Mb: see help(memory.size)
> 2: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
>  Reached total allocation of 1535Mb: see help(memory.size)
> 3: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
>  Reached total allocation of 1535Mb: see help(memory.size)
> 4: In merge.data.frame(SNP4, SNP1, by = "Marker", all = TRUE) :
>  Reached total allocation of 1535Mb: see help(memory.size)
>
> An error occurs.

So what are the results of:

str(SNP4); str(SNP1)    # this will tell us how large these objects are
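
A rough sketch of that kind of check, using small made-up stand-ins
(the data frames a and b, and their values, are hypothetical and not
your real SNP4 and SNP1). It prints the structure and memory footprint,
then estimates how many rows a merge on Marker alone would produce by
multiplying the per-marker row counts of the two frames:

```r
# Hypothetical toy data: 3 animals x 2 markers in each frame
a <- data.frame(Animal = rep(1:3, each = 2),
                Marker = rep(c("P1001", "P1002"), 3),
                Y = 0.021088, stringsAsFactors = FALSE)
b <- data.frame(Animal = rep(1:3, each = 2),
                Marker = rep(c("P1001", "P1002"), 3),
                x = c(2, 1, 2, 0, 2, 0), stringsAsFactors = FALSE)

str(a); str(b)                       # dimensions and column types
print(object.size(a), units = "Kb")  # memory used by one object

# Matched rows produced by a merge on Marker alone:
# for each shared marker, (rows in a) * (rows in b).
# With all = TRUE, unmatched rows from either side are added on top.
n_a <- table(a$Marker)
n_b <- table(b$Marker)
common <- intersect(names(n_a), names(n_b))
est_rows <- sum(as.numeric(n_a[common]) * as.numeric(n_b[common]))
est_rows
```

If the estimate for the real data is far larger than nrow(SNP4), the
merge key is probably not unique enough.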

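And are you sure you don't want the merge to occur by Animal as well?
If each Marker occurs once per animal, merging by Marker alone pairs
every SNP4 row for a marker with every SNP1 row for that marker, which
could explain the 2.4 Gb allocation.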
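
To illustrate the difference, here is a minimal sketch on invented toy
data (snp4_toy and snp1_toy are hypothetical names, and it assumes that
Animal plus Marker together identify a row, which I cannot confirm from
head() alone):

```r
# Hypothetical toy data: 3 animals, 2 markers each
snp4_toy <- expand.grid(Animal = 1:3, Marker = c("P1001", "P1002"),
                        stringsAsFactors = FALSE)
snp4_toy$Y <- 0.021088
snp1_toy <- expand.grid(Animal = 1:3, Marker = c("P1001", "P1002"),
                        stringsAsFactors = FALSE)
snp1_toy$x <- c(2, 1, 2, 0, 2, 0)

# Marker alone: every animal is paired with every animal per marker
by_marker <- merge(snp4_toy, snp1_toy, by = "Marker", all = TRUE)

# Animal and Marker together: one output row per input row
by_both <- merge(snp4_toy, snp1_toy, by = c("Animal", "Marker"),
                 all = TRUE)

nrow(by_marker)  # 18
nrow(by_both)    # 6
```

Note that the merge on Marker alone also returns both Animal columns,
renamed Animal.x and Animal.y.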

>
>
> What I want is the column SNP1$x merged together with SNP4 by Marker,
> so some markers will have NA's in the 'x'-column in the SNP5 dataset.
>
>
> I also tried this
>
>> SNP5 <- merge(SNP4, SNP1$x, by.x = 'Marker', by.y = 'Marker', all = TRUE)
> Error in fix.by(by.y, y) : 'by' must specify valid column(s)
>
> It won't work either.
>
> Does anyone have any idea how to solve this?

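The second error is easier to explain. SNP1$x is a bare vector: once
you extract it, it no longer carries a "Marker" column, so merge() has
nothing to match by.y = 'Marker' against. Pass a data frame that still
contains the key column(s) along with x.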
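
One way to bring over only x while keeping the keys is sketched below.
The toy frames (snp4_small, snp1_small) are invented for illustration,
and matching on both Animal and Marker is my assumption about what you
want; all.x = TRUE keeps every row of the first frame and fills x with
NA where the second frame has no match:

```r
# Hypothetical toy data: the second frame lacks marker P1002
snp4_small <- data.frame(Animal = 194073197,
                         Marker = c("P1001", "P1002", "P1004"),
                         Y = 0.021088, stringsAsFactors = FALSE)
snp1_small <- data.frame(Animal = 194073197,
                         Marker = c("P1001", "P1004"),
                         x = c(2, 2), stringsAsFactors = FALSE)

# Subset columns with [ , ] so the result is still a data frame
# that contains the key columns, unlike snp1_small$x
snp5_small <- merge(snp4_small,
                    snp1_small[, c("Animal", "Marker", "x")],
                    by = c("Animal", "Marker"),
                    all.x = TRUE)
snp5_small
```

Here the row for P1002 ends up with NA in the x column.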
>
>
> Regards,
>
> Johannes.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT