lookup not working properly

3 messages · Sarah Goslee, Dimitri Liakhovitski

#
Hello!

Below is my example. "myref" is my reference data frame with columns a and b.
"temp" is my data, with column c analogous to column a in "myref".
I am trying to create a new variable b in "temp" that holds, for each
value of c, the matching value of b from "myref". If you look at the resulting
data frame (temp, at the bottom), you'll notice that rows 19-24 are
incorrect.
How could one fix it?
Thanks a lot!

# my reference data frame:
a=c("ba ba","ca ca","da da", "lake lake, a", "lake lake, b",
  "lake of","lama ca, a","lama ca, b","ma ma")
b=c("ba ba","ca ca","OTHER", "lake lake, a", "lake lake, b",
  "lake of","lama ca, a","lama ca, b","OTHER")
myref<-data.frame(a=a, b=b)
(myref)

# my data:
c<-c(rep("ba ba",3),rep("ca ca",3),rep("da da",3),rep("lake lake, a",3),
  rep("lake lake, b",3),rep("lake of",3),rep("lama ca, a",3),
  rep("lama ca ,b",3),rep("ma ma",3))
temp<-data.frame(c=c)
(temp)

### Matching:
temp$b<-myref[temp$c,"b"]
(temp)
#
Dimitri,

It isn't clear to me exactly what you are trying to do, but this might
be closer.
Note the stringsAsFactors argument I added to data.frame: I don't think you
are likely to want factors for this application. Also, it's a bad idea
to create a variable named c, since that is the name of a function.
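
To illustrate the point about factors: R's help page for "[" says that indexing by a factor uses its integer codes, not the labels that are printed. The sketch below uses a made-up three-row table ("ref", "key_fac", "codes", "by_factor" and "by_codes" are hypothetical names, not from the thread) to show how that can pick the wrong rows when the factor's level order differs from the row order of the reference table.

```r
# Hypothetical miniature reference table (not the thread's data).
ref <- data.frame(a = c("x", "y", "z"), b = c("X", "Y", "Z"),
                  stringsAsFactors = FALSE)

# A factor whose level order ("z", "x") differs from the row order of ref.
key_fac <- factor(c("z", "x", "z"), levels = c("z", "x"))

# The integer codes behind the labels: z -> 1, x -> 2.
codes <- as.integer(key_fac)

# Indexing rows by the factor selects rows by those codes, so the labels
# "z", "x", "z" end up picking rows 1, 2, 1 of ref.
by_factor <- ref[key_fac, "b"]
by_codes  <- ref[codes, "b"]
```

If the level order of temp$c in the original post agrees with the row order of myref for most values, that would be consistent with only rows 19-24 looking wrong.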

# my reference data frame:
myref<-data.frame(a=c("ba ba","ca ca","da da", "lake lake, a",
  "lake lake, b","lake of","lama ca, a","lama ca, b","ma ma"),
  b=c("ba ba","ca ca","OTHER", "lake lake, a", "lake lake, b",
  "lake of","lama ca, a","lama ca, b","OTHER"), stringsAsFactors=FALSE)

# my data:
temp<-data.frame(c=c(rep("ba ba",3),rep("ca ca",3),rep("da da",3),
  rep("lake lake, a",3),rep("lake lake, b",3),rep("lake of",3),
  rep("lama ca, a",3),rep("lama ca ,b",3),rep("ma ma",3)),
  stringsAsFactors=FALSE)

newdata <- merge(myref, temp, by.x="a", by.y="c", all.x=FALSE, all.y=TRUE)
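
One thing worth checking after a merge like this is whether every key in the data found a partner in the reference table. With all.y=TRUE, rows of the second data frame that have no match are kept and get NA in the columns that come from the first. The sketch below is a cut-down, hypothetical two-row version of the data ("ref", "dat", "out" and "unmatched" are illustrative names); it keeps the "lama ca ,b" spelling that appears in the posted data, which differs from "lama ca, b" in the reference.

```r
# Cut-down reference table, using two of the values from the thread.
ref <- data.frame(a = c("lama ca, a", "lama ca, b"),
                  b = c("lama ca, a", "lama ca, b"),
                  stringsAsFactors = FALSE)

# Cut-down data; the second key is spelled " ,b" as in the posted data.
dat <- data.frame(c = c("lama ca, a", "lama ca ,b"),
                  stringsAsFactors = FALSE)

# Same call shape as in the reply above: keep every row of dat.
out <- merge(ref, dat, by.x = "a", by.y = "c", all.x = FALSE, all.y = TRUE)

# Keys from dat that found no partner in ref show up with NA in column b.
unmatched <- out$a[is.na(out$b)]
```

Listing "unmatched" is a quick way to spot keys that differ only in spacing or punctuation.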

Sarah

On Tue, Apr 12, 2011 at 11:17 AM, Dimitri Liakhovitski
<dimitri.liakhovitski at gmail.com> wrote:

  
    
#
Thank you, Sarah. This seems to be working:
a=c("ba ba","ca ca","da da", "lake lake, a", "lake lake, b",
  "lake of","lama ca, a","lama ca, b","ma ma")
b=c("ba ba","ca ca","OTHER", "lake lake, a", "lake lake, b",
  "lake of","lama ca, a","lama ca, b","OTHER")
myref<-data.frame(a=a, b=b)
myref$a<-as.character(myref$a)
myref$b<-as.character(myref$b)
(myref);str(myref)

for.mydata<-c(rep("ba ba",3),rep("ca ca",3),rep("da da",3),
  rep("lake lake, a",3),rep("lake lake, b",3),rep("lake of",3),
  rep("lama ca, a",3),rep("lama ca, b",3),rep("ma ma",3))
temp<-data.frame(d=for.mydata)
temp$d<-as.character(temp$d)
(temp);str(temp)

# temp$b<-myref[temp$d,2]
# (temp)

newdata <- merge(myref, temp, by.x="a", by.y="d", all.x=FALSE, all.y=TRUE)
(newdata)
dim(newdata)
(myref)
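
As a possible alternative to merge, and not something proposed in the thread, the lookup can also be written with match(). This adds column b to temp directly and leaves temp's row order as it is. A minimal sketch on hypothetical cut-down data, reusing the names myref and temp only for readability:

```r
# Cut-down reference and data, with character (not factor) columns.
myref <- data.frame(a = c("ba ba", "da da", "ma ma"),
                    b = c("ba ba", "OTHER", "OTHER"),
                    stringsAsFactors = FALSE)
temp <- data.frame(d = c("ma ma", "ba ba", "da da", "ba ba"),
                   stringsAsFactors = FALSE)

# match() returns, for each temp$d, the position of its first match in
# myref$a (NA if there is none); those positions then index myref$b.
temp$b <- myref$b[match(temp$d, myref$a)]
```

Keys with no match in myref$a would come back as NA, much like the unmatched rows kept by all.y=TRUE in the merge call.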

Dimitri
On Tue, Apr 12, 2011 at 11:42 AM, Sarah Goslee <sarah.goslee at gmail.com> wrote: