set dataframe field value from lookup table
Off-list comments. No reply needed; this is just for emphasis and clarification.
On Dec 9, 2010, at 11:19 AM, Jon Erik Ween wrote:
> David, I see how findInterval is a more elegant way of doing 1). I'd need to change the indices in the lookup table, as
>
> findInterval(36, c(0, 17, 19, 24, 29, 34, 44, 54, 64, 69, 74, 79, 84, 89) )
> [1] 6
>
> should be 7, not 6. The age range for the 7th column is 35-44. But that's easy.
As you say, changing the output to agree with your expectations is "easy", but to be clear, R _is_ delivering the correct response to the question "which interval is 36 located in?": the 6th. Any ambiguity is due to your not formulating a good question (and my errors).
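To make the interval convention concrete, here is a minimal sketch. It assumes a recent version of R in which findInterval() accepts the left.open argument, and it assumes (from the "35-44" remark) that the table columns are labelled by the upper end of each age range. The names idx_default, idx_upper, upper and age_label are made up for illustration.

```r
# Break points taken from the findInterval() call quoted above.
breaks <- c(0, 17, 19, 24, 29, 34, 44, 54, 64, 69, 74, 79, 84, 89)

# Default behaviour: intervals are closed on the left, [b_i, b_(i+1)),
# so 36 falls in [34, 44), which is the 6th interval.
idx_default <- findInterval(36, breaks)

# Assumption: columns are labelled by the UPPER age bound, so the
# intervals wanted are (b_i, b_(i+1)] instead.
idx_upper <- findInterval(c(34, 36), breaks, left.open = TRUE)

# Label each age with the upper bound of its interval
# (34 stays 34, 36 becomes 44 under this assumption).
upper <- breaks[-1]
age_label <- upper[idx_upper]
```

Whether an offset of one is then needed depends only on whether the first column of the lookup table holds the scores rather than z values.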
> I can't see how findInterval will help me for 2), though.
"2)" was never very clear. I do think findInterval must be what is needed, but I am repeating my call for you to post a full example and more complete explanation to the list.
> The standard score is integer and not a range.
Which is _not_ how statisticians usually think of a "z score". So it may need some further background or use of less misleading terminology. You are probably tasked with using a table handed to you that at one time was a "z-score" for <something> but has been recast in tabular form.
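For the lookup itself, a minimal sketch of the two-index idea, using only the fragment of the table quoted further down. The matrix zmat, the data frame subj and its values are invented for illustration, and the column breaks assume each column covers ages up to and including its label. Only the use of DSF + DSB as the score comes from your code.

```r
# Fragment of the published table: rows are standard scores 30..26,
# columns are age groups labelled 17, 19, 24, 29, 34, 44.
zmat <- matrix(c( 2.6, 2.6, 2.6, 2.6, 2.6, 2.6,
                  1.8, 1.8, 1.8, 2.0, 2.6, 2.6,
                  1.0, 1.0, 1.8, 1.8, 2.6, 2.6,
                  0.0, 0.5, 1.0, 1.8, 2.6, 2.6,
                 -0.5, 0.0, 0.0, 1.0, 1.8, 2.6),
               nrow = 5, byrow = TRUE,
               dimnames = list(score = c("30", "29", "28", "27", "26"),
                               age   = c("17", "19", "24", "29", "34", "44")))

# Hypothetical subjects; the score is DSF + DSB as in the original attempt.
subj <- data.frame(Age = c(29, 36, 18), DSF = c(14, 15, 13), DSB = c(14, 15, 13))

# Row index: exact match of the integer score against the row labels.
ri <- match(subj$DSF + subj$DSB, as.numeric(rownames(zmat)))

# Column index: age interval, assumed closed on the right, e.g. (24, 29].
ci <- findInterval(subj$Age, c(0, 17, 19, 24, 29, 34, 44), left.open = TRUE)

# A two-column index matrix picks one cell per subject.
subj$DSTz <- zmat[cbind(ri, ci)]
```

With these made-up subjects, age 29 with a score of 28 picks up 1.8, as in your example. A score that is not in the table gives NA from match() and hence NA in the result; ages outside the breaks would need to be handled separately.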
David.
> So it maps 1 to 1. The real problem, though, is setting the value in
> the main dataframe (df) with the value from the lookup table based
> on the identified age and score indices.
>
> My initial guess was:
>
> df$DSTz <- DSTzlook[which(DSTzlook[,1]==df$Agetmp),
>                     which(DSTzlook[1,]==df$DSF+df$DSB)]
>
> which could be rewritten:
>
> df$DSTz <- DSTzlook[
>   which(DSTzlook[,1]==findInterval(df$Age, c(0, 17, 19, 24, 29, 34, 44, 54, 64, 69, 74, 79, 84, 89))),
>   which(DSTzlook[1,]==df$DSF+df$DSB)]
>
> But it is the indirect referencing of the lookup in the main table
> that causes me trouble.
>
> Jon
>
> Soli Deo Gloria
>
> Jon Erik Ween, MD, MS
> Scientist, Kunin-Lunenfeld Applied Research Unit
> Director, Stroke Clinic, Brain Health Clinic, Baycrest Centre
> Assistant Professor, Dept. of Medicine, Div. of Neurology
> University of Toronto Faculty of Medicine
>
> Kimel Family Building, 6th Floor, Room 644
> Baycrest Centre
> 3560 Bathurst Street
> Toronto, Ontario M6A 2E1
> Canada
>
> Phone: 416-785-2500 x3648
> Fax: 416-785-2484
> Email: jween at klaru-baycrest.on.ca
>
>
> Confidential: This communication and any attachment(s) may contain
> confidential or privileged information and is intended solely for
> the address(es) or the entity representing the recipient(s). If you
> have received this information in error, you are hereby advised to
> destroy the document and any attachment(s), make no copies of same
> and inform the sender immediately of the error. Any unauthorized use
> or disclosure of this information is strictly prohibited.
>
>
>
> On 2010-12-09, at 11:06 AM, David Winsemius wrote:
>
>>
>> On Dec 9, 2010, at 10:51 AM, Jon Erik Ween wrote:
>>
>>> Thanks David
>>>
>>> What I am trying to do is set up a script that assigns z-scores to
>>> a large dataframe (2500x300, with Age in years and test scores
>>> as columns) from a published table of age-corrected standard
>>> scores on this cognitive test.
>>>
>>> 1) The age intervals in the lookup table are given and not my
>>> choice.
>>
>> You may want to skip the intermediate translation to the row and
>> column labels and just use the results of findInterval:
>>
>>> findInterval( 16, c(0, 17, 19, 24, 29, 34, 44, 54, 64, 69, 74, 79,
>>> 84, 89) )
>> [1] 1
>>> findInterval( 90, c(0, 17, 19, 24, 29, 34, 44, 54, 64, 69, 74, 79,
>>> 84, 89) )
>> [1] 14
>>
>> Those look like appropriate indices for the column argument.
>>>
>>> 2) Sorry I didn't post an example table, it looks something like
>>> this ("Age" is in the first row, standard scores in the first
>>> column):
>>>
>>> 17 19 24 29 34 44 ....
>>> 30 2.6 2.6 2.6 2.6 2.6 2.6
>>> 29 1.8 1.8 1.8 2.0 2.6 2.6
>>> 28 1.0 1.0 1.8 1.8 2.6 2.6
>>> 27 0.0 0.5 1.0 1.8 2.6 2.6
>>> 26 -.5 0.0 0.0 1.0 1.8 2.6
>>> .
>>> .
>>> .
>>> .
>>>
>>> So, if a subject (row) has age==29 and a standard score of 28, the
>>> value should be 1.8, etc.
>>
>> Looks like a job for two findInterval indices to be used with
>> "[ r , c ] ".
>>
>> --
>> David.
>>
>>>
>>> Thanks
>>>
>>>
>>> Jon
>>>
>>>
>>> On 2010-12-09, at 10:33 AM, David Winsemius wrote:
>>>
>>>>
>>>> On Dec 9, 2010, at 9:34 AM, Jon Erik Ween wrote:
>>>>
>>>>>
>>>>> Hi
>>>>>
>>>>> This is (hopefully) a bit more cogent phrasing of a previous post. I'm
>>>>> trying to compute a z-score for rows in a large dataframe based on
>>>>> values in another dataframe. Here's the script (that does not work).
>>>>> 2 questions:
>>>>>
>>>>> 1) Anyone know of a more elegant way to calculate the "rounded" age
>>>>> value than the nested ifelse's I've used?
>>>>>
>>>>> 2) how to reference the lookup table based on computed indices?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Jon
>>>>>
>>>>> # Define tables
>>>>> DSTzlook <- read.table("/Users/jween/Documents/ResearchProjects/ABC/data/DSTz.txt",
>>>>>   header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
>>>>> df<-stroke
>>>>>
>>>>> # Compute rounded age.
>>>>> df$Agetmp <- ifelse(df$Age>=89, 89,
>>>>>   ifelse(df$Age>=84, 84,
>>>>>   ifelse(df$Age>=79, 79,
>>>>>   ifelse(df$Age>=74, 74,
>>>>>   ifelse(df$Age>=69, 69,
>>>>>   ifelse(df$Age>=64, 64,
>>>>>   ifelse(df$Age>=54, 54,
>>>>>   ifelse(df$Age>=44, 44,
>>>>>   ifelse(df$Age>=34, 34,
>>>>>   ifelse(df$Age>=29, 29,
>>>>>   ifelse(df$Age>=24, 24,
>>>>>   ifelse(df$Age>=19, 19, 17))))))))))))
>>>>
>>>> Ew, painful. If you want categorized ages (since what the above
>>>> coding is producing is not "rounded" in any sense of that word as
>>>> I understand it), then why not use findInterval() as an index into
>>>> the ages you want to label these cases with?
>>>>
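>>>> # runif() stands in for df$Age here; values below 17 give index 0
>>>> # and are dropped by the extraction
>>>> df$Agetmp <- c(17,19,24,29,34,44,54,64,69,74,79,84)[   # note Extract operation
>>>>     findInterval(runif(100,0,100),
>>>>                  c(17,19,24,29,34,44,54,64,69,74,79,84,110) )
>>>>     ]   # close extraction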
>>>>
>>>>
>>>> The other option, of course, and a more "honest" one in this
>>>> instance would be
>>>>
>>>> cut(vec, breaks=c(...), labels=c(...) )
>>>>
>>>> (It's not clear to me why you are not picking midpoint ages
>>>> within those brackets.)
>>>>
>>>>>
>>>>> # Reference the lookup table based on computed indices
>>>>> df$DSTz <- DSTzlook[which(DSTzlook[,1]==df$Agetmp),
>>>>>                     which(DSTzlook[1,]==df$DSF+df$DSB)]
>>>>
>>>> I have not been able to figure out what you are trying to do
>>>> here. Trying to use a 2d lookup looks promising as a way to
>>>> emulate what an Excel user might attempt, but an example (as
>>>> requested in the message at the bottom of every posting) would
>>>> really be of great help in making this more concrete for those of
>>>> us with insufficient abstractive abilities.
>>>>
>>>> --
>>>> David.
>>>>
>>>>>
>>>>> # Cleanup
>>>>> #rm(df)
>>>>> #df$Agetmp<-NULL
>>>>> --
>>>>> View this message in context: http://r.789695.n4.nabble.com/set-dataframe-field-value-from-lookup-table-tp3080245p3080245.html
>>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>>
>>>>
>>>>
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>>