dictionary lookup

3 messages · Duncan Murdoch, Thomas Manke

#
Hi,

I have a character vector (old_names) and want to translate its
entries wherever possible, using a dictionary (dict, a data.frame).
The translation direction is dict$V3 --> dict$V2, but
some values may be undefined (NA). I suppose this is a very basic
task, but I tried in vain to make it more efficient than the code below.
In particular, I would like to avoid the explicit (and slow) loop.
Any help is very much welcome.
Thank you, TM
============================================
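new_names = old_names
m = match(old_names, dict$V3)    # position of each name in dict$V3, NA if absent
N = length(old_names)
for (i in seq_len(N)) {          # seq_len() is safe when N is 0, unlike 1:N
    if (is.na(m[i])) { next }    # no dictionary entry: keep the old name

    nn = as.vector(dict$V2)[m[i]]
    if (nn == "") { next }       # empty translation: keep the old name

    new_names[i] = nn
}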
#
On 06/03/2008 6:45 PM, Thomas Manke wrote:
You can vectorize this, and it should be fast.  Here's a straightforward
replacement for the loop.  It keeps the first two lines and replaces the
rest with two more:

new_names <- old_names
m <- match(old_names, dict$V3)

change <- !is.na(m)
new_names[change] <- dict$V2[m[change]]

Duncan Murdoch
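
To see what the vectorized version does, here is a small sketch on made-up data. The contents of dict and old_names below are hypothetical (the thread never shows the real ones), and V2 is assumed to be a character column rather than a factor.

```r
# Hypothetical toy data; the real dict and old_names are not shown in the thread.
dict <- data.frame(V2 = c("alpha", "beta", ""),
                   V3 = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
old_names <- c("a", "x", "b", "c")

new_names <- old_names
m <- match(old_names, dict$V3)   # row of dict for each name, NA where there is no match

change <- !is.na(m)              # TRUE only for names found in dict$V3
new_names[change] <- dict$V2[m[change]]

print(m)
print(new_names)
```

In this sketch "x" has no entry and keeps its old name, while "c" is replaced by the empty string stored in V2. Whether empty translations should be skipped is a separate decision that this version does not make.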
#
Duncan Murdoch wrote:
Thank you all for your responses; they certainly pointed me in the right
direction.
For my purposes, I only had to slightly modify Duncan's suggestion:

new_names <- rownames(E)
m <- match(rownames(E), dict$V3)
change <- !is.na(m) & dict$V2[m] != ""
new_names[change] <- as.vector(dict$V2[m[change]])

Best wishes, TM
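
A possible variant of the final approach, sketched on hypothetical data. It assumes V2 may be a factor (the default for string columns in data.frame() at the time) and may itself contain NA; neither is confirmed in the thread. Converting V2 to character once up front means labels, not the factor's integer codes, are what gets assigned, and the extra is.na() check keeps NA out of the logical index.

```r
# Hypothetical toy data; V2 is deliberately a factor here.
dict <- data.frame(V2 = factor(c("alpha", "beta", "")),
                   V3 = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
old_names <- c("a", "x", "b", "c")

v2 <- as.character(dict$V2)      # convert once, so labels (not codes) are used
m <- match(old_names, dict$V3)   # NA where a name has no dictionary entry

# Translate only where a match exists and the translation is neither NA nor empty.
change <- !is.na(m) & !is.na(v2[m]) & v2[m] != ""

new_names <- old_names
new_names[change] <- v2[m[change]]
print(new_names)
```

Here "x" (no entry) and "c" (empty translation) both keep their old names, matching the behaviour of the original loop.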