Message-ID: <BANLkTik1a0tvS60AusnLewus4bsOwDTLYg@mail.gmail.com>
Date: 2011-05-15T21:41:21Z
From: Kenn Konstabel
Subject: how to merge within range?
In-Reply-To: <20110514210812.52843909qgpc5t7g@psy2.psych.tu-dresden.de>
I'd've first said it's "simply"
sapply(df1$time, function(x) if(any(foo <- (x>=df2$from &
x<=df2$to))>0) df2$value[which(foo)] else NA )
but the following are much nicer (except that instead of NA you'll
have 0 but that's easy to change if necessary):
colSums(sapply(df1$time, function(x) (x>=df2$from & x <=df2$to) * df2$value) )
rowSums(outer(df1$time, df2$from, ">=") * outer(df1$time, df2$to,
"<=") * df2$value)
On Sat, May 14, 2011 at 10:08 PM, Ren? Mayer
<mayer at psychologie.tu-dresden.de> wrote:
> sqldf is impressive - compiled it now;
> the trick with findInterval is nice, too.
> thanks guys!!
>
>
>
>
> Zitat von "David Winsemius" <dwinsemius at comcast.net>:
>
>>
>> On May 14, 2011, at 2:27 PM, William Dunlap wrote:
>>
>>> You could use findInterval() along with a trick with c(rbind(...)):
>>>
>>>> i <- findInterval(x=df.1$time, vec=c(rbind(df.2$from, df.2$to)))
>>>> i
>>>
>>> [1] 1 1 1 2 3 3 3 5 5 6
>>
>> That's nice. I was working on a slightly different "trick"
>>
>> findInterval( df.1[,1],t(df.2[,1:2]))
>> ?[1] 1 1 1 2 3 3 3 5 5 6
>>
>> I was then trying to get the right indices with (.)'%%' 2 and (.) '%/%' 2
>>
>>
>>> The even-valued outputs would map to NA's, the odds
>>> to value[(i+1)/2], but you can use the c(rbind(...)) trick again:
>>>
>>>> c(rbind(df.2$value, NA))[i]
>>>
>>> [1] ?1 ?1 ?1 NA ?3 ?3 ?3 ?5 ?5 NA
>>
>> I'd like to understand that. Maybe, maybe... ah, got it. At first I didn't
>> realize those were the final answers since they looked like indices. My t(.)
>> trick doesn't generalize as well.
>>
>>
>> My earlier suggestion tht two merges woul do it was based on my erroneous
>> interpretation of the example, since ?I thought the task was to match on the
>> end points of the intervals.
>>
>>>
>>> Bill Dunlap
>>> Spotfire, TIBCO Software
>>> wdunlap tibco.com
>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org
>>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Ren? Mayer
>>>> Sent: Saturday, May 14, 2011 11:06 AM
>>>> To: David Winsemius
>>>> Cc: r-help at r-project.org
>>>> Subject: Re: [R] how to merge within range?
>>>>
>>>> thanks David and Ian,
>>>> let me make a better example as the first one was flawed
>>>>
>>>> df.1=data.frame(round((1:10)*100+rnorm(10)), value=NA)
>>>> names(df.1) = c("time", "value")
>>>> df.1
>>>> ?time value
>>>> 1 ? 101 ? ?NA
>>>> 2 ? 199 ? ?NA
>>>> 3 ? 301 ? ?NA
>>>> 4 ? 401 ? ?NA
>>>> 5 ? 501 ? ?NA
>>>> 6 ? 601 ? ?NA
>>>> 7 ? 700 ? ?NA
>>>> 8 ? 800 ? ?NA
>>>> 9 ? 900 ? ?NA
>>>> 10 1000 ? ?NA
>>>>
>>>> # from and to define ranges within time,
>>>> # note that from and to may not match the numbers given in time
>>>> df.2=data.frame(from=c(99,500,799),to=c(303,702,950), value=c(1,3,5))
>>>> df.2
>>>> ?from ?to value
>>>> 1 ? 99 303 ? ? 1
>>>> 2 ?500 702 ? ? 3
>>>> 3 ?799 950 ? ? 5
>>>>
>>>> what I want is:
>>>> ?time value
>>>> 1 ? 101 ? ?1
>>>> 2 ? 199 ? ?1
>>>> 3 ? 301 ? ?1
>>>> 4 ? 401 ? ?NA
>>>> 5 ? 501 ? ?3
>>>> 6 ? 601 ? ?3
>>>> 7 ? 700 ? ?3
>>>> 8 ? 800 ? ?5
>>>> 9 ? 900 ? ?5
>>>> 10 1000 ? ?NA
>>>>
>>>> @David I don't know what you mean by 2 merges,
>>>> Ren?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Zitat von "David Winsemius" <dwinsemius at comcast.net>:
>>>>
>>>>>
>>>>> On May 14, 2011, at 9:16 AM, Ian Gow wrote:
>>>>>
>>>>>> If I assume that the third column in data.frame.2 is named
>>>>
>>>> "val" then in
>>>>>>
>>>>>> SQL terms it _seems_ you want
>>>>>>
>>>>>> SELECT a.time, b.val FROM data.frame.1 AS a LEFT JOIN
>>>>
>>>> data.frame.2 AS b ON
>>>>>>
>>>>>> a.time BETWEEN b.start AND b.end;
>>>>>>
>>>>>> Not sure how to do that elegantly using R subsetting/merge,
>>>>>
>>>>> Huh? It's just two merge()'s (... once you fix the error in
>>>>
>>>> the example.)
>>>>>
>>>>> --
>>>>> David
>>>>>
>>>>>> but you might
>>>>>> try a package that allows you to use SQL, such as sqldf.
>>>>>>
>>>>>>
>>>>>> On 5/14/11 8:03 AM, "David Winsemius"
>>>>
>>>> <dwinsemius at comcast.net> wrote:
>>>>>>
>>>>>>>
>>>>>>> On May 14, 2011, at 8:12 AM, Ren? Mayer wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>> how can one merge
>>>>>>>
>>>>>>> And what happened when you typed:
>>>>>>>
>>>>>>> ?merge
>>>>>>>
>>>>>>>> two data frames when in the second data frame one column
>>>>
>>>> defines the
>>>>>>>>
>>>>>>>> start values
>>>>>>>> and another defines the end value of the to be merged range.
>>>>>>>> data.frame.1
>>>>>>>> time ...
>>>>>>>> 13
>>>>>>>> 24
>>>>>>>> 35
>>>>>>>> 46
>>>>>>>> 55
>>>>>>>> ...
>>>>>>>> data.frame.2
>>>>>>>> start end
>>>>>>>> 24 37 ?h? ?
>>>>>>>> ...
>>>>>>>>
>>>>>>>> should result in this
>>>>>>>> 13 NA
>>>>>>>> 24 ?h?
>>>>>>>> 35 ?h?
>>>>>>>> 46 NA
>>>>>>>> 55
>>>>>>>> ?
>>>>>>>
>>>>>>> And _why_ would that be?
>>>>>>>
>>>>>>>
>>>>>>>> thanks,
>>>>>>>> Ren?
>>>>>>>>
>>>>>>>> ______________________________________________
>>>>>>>> R-help at r-project.org mailing list
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>>> PLEASE do read the posting guide
>>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>>> and provide commented, minimal, self-contained,
>>>>
>>>> reproducible code.
>>>>>>>
>>>>>>> David Winsemius, MD
>>>>>>> West Hartford, CT
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>>
>>>>>
>>>>> David Winsemius, MD
>>>>> West Hartford, CT
>>>>>
>>>>>
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>