Skip to content

transform dataframe with look-up table

6 messages · Juan Antonio Balbuena, Adams, Jean, William Dunlap +2 more

#
Hello
   I hope that there is a simple solution to this apparently complex problem.
   Any help will be much appreciated:
   I have a dataframe with Left and Right readings (that is, elements in each
   row are paired). For instance,
       Left Right
    [1]  9    8
    [2]  4    3
    [3]  2    1
    [4]  6    5
    [5]  3    1
    [6]  4    1
    [7]  3    2
    [8]  4    2
    [9]  10   8
   [10]  9   10
   I  need  to  produce a new data frame where the values are transformed
   according to a look-up table such as
           input    output
    [1]     5      1
    [2]    10     1
    [3]     4      2
    [4]     8      3
    [5]     6      5
    [6]     5      6
    [7]     7      6
    [8]     2      7
    [9]     9      7
   [10]    10    7
   [11]     2     8
   So  [1, ] in the new dataframe would be 7 3. Quite simple so far, but what
   makes things complicated is the multiple outputs for a single input. In this
   example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must
   yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output
   for  [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are
   missing as inputs the output for [5, ] should be NA NA.
   Thank you very much for your time.
   Juan Antonio Balbuena

   --

   Dr. Juan A. Balbuena
   Marine Zoology Unit
   Cavanilles Institute of Biodiversity and Evolutionary Biology
   University of
   Valencia
   [1]http://www.uv.es/~balbuena
   P.O. Box 22085
   [2]http://www.uv.es/cavanilles/zoomarin/index.htm
   46071 Valencia, Spain
   [3]http://cetus.uv.es/mullpardb/index.html
   e-mail: [4]j.a.balbuena at uv.es    tel. +34 963 543 658    fax +34 963 543 733
   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   NOTE! For shipments by EXPRESS COURIER use the following street address:
   C/ Catedr??tico Jos?? Beltr??n 2, 46980 Paterna (Valencia), Spain.
   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

References

   1. http://www.uv.es/%7Ebalbuena
   2. http://www.uv.es/cavanilles/zoomarin/index.htm
   3. http://cetus.uv.es/mullpardb/index.html
   4. mailto:j.a.balbuena at uv.es
#
It would be helpful if you included the expected output for your example, but I think the following does what you want by using merge() for each lookup:

f0 <- function(inputDF, lookupDF)
{
    tmp1 <- merge(inputDF, lookupDF, by.x="Left", by.y="input",all.x=TRUE)
    tmp2 <- merge(tmp1, lookupDF, by.x="Right", by.y="input", all.x=TRUE)
    with(tmp2, data.frame(ID=ID, Right=output.x, Left=output.y)[order(ID), ])
}
# Your example data with an ID column added to track where the output rows came from
myInputDF <- data.frame(
    ID = 1:10,
    Left = c(9, 4, 2, 6, 3, 4, 3, 4, 10, 9),
    Right = c(8, 3, 1, 5, 1, 1, 2, 2, 8, 10))
myLookupDF <- data.frame(
    input = c(5, 10, 4, 8, 6, 5, 7, 2, 9, 10, 2),
    output = c(1, 1, 2, 3, 5, 6, 6, 7, 7, 7, 8))

f0(myInputDF, myLookupDF)
#    ID Right Left
# 12  1     7    3
# 9   2     2   NA
# 1   3     7   NA
# 2   3     8   NA
# 10  4     5    6
# 11  4     5    1
# 3   5    NA   NA
# 4   6     2   NA
# 5   7    NA    7
# 6   7    NA    8
# 7   8     2    7
# 8   8     2    8
# 13  9     1    3
# 14  9     7    3
# 15 10     7    1
# 16 10     7    7

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On 7/25/2013 8:13 AM, Juan Antonio Balbuena wrote:
merge can handle both of these requirements.

First, making the two datasets reproducible:

Start <- data.frame(Left=c(9,4,2,6,3,4,3,4,10,9),
                     Right=c(8,3,1,5,1,1,2,2,8,10))

transformer <- data.frame(input=c(5,10,4,8,6,5,7,2,9,10,2),
                           output=c(1,1,2,3,5,6,6,7,7,7,8))

Then add a marker of the original row numbers so that the work can be 
checked more easily later (not really needed for the calculations):

Start$rownum <- seq_len(nrow(Start))

Two merge statements with the columns specified and all.x set to TRUE 
(to keep cases even without a match):

End <- merge(merge(Start, transformer, by.x="Left", by.y="input", 
all.x=TRUE),
              transformer, by.x="Right", by.y="input", all.x=TRUE)

Then we can look at the output, resorted by the original row numbers:

End[order(End$rownum),]

which gives

    Right Left rownum output.x output.y
12     8    9      1        7        3
9      3    4      2        2       NA
1      1    2      3        7       NA
2      1    2      3        8       NA
10     5    6      4        5        6
11     5    6      4        5        1
3      1    3      5       NA       NA
4      1    4      6        2       NA
5      2    3      7       NA        7
6      2    3      7       NA        8
7      2    4      8        2        7
8      2    4      8        2        8
13     8   10      9        1        3
14     8   10      9        7        3
15    10    9     10        7        1
16    10    9     10        7        7

  
    
#
Here's an approach that seems to work. I added an 11th case to
your data since you did not have a case where both Left and
Right had multiple values in the lookup table. This creates an
id value so that we can merge left and right separately and
then merge them back together:

# Create test data frames
Left <- c(9, 4, 2, 6, 3, 4, 3, 4, 10, 9, 2)
Right <- c(8, 3, 1, 5, 1, 1, 2, 2, 8, 10, 5)
ID <- 1:11
Pair <- data.frame(ID, Left, Right)
input <- c(5, 10, 4, 8, 6, 5, 7, 2, 9, 10, 2)
output <- c(1, 1, 2, 3, 5, 6, 6, 7, 7, 7, 8)
Lookup <- data.frame(input, output)
# Merges
Lout <- merge(Pair, Lookup, by.x="Left", by.y="input", 
	all.x=TRUE)[,c("ID", "Left", "output")]
Rout <- merge(Pair, Lookup, by.x="Right", by.y="input", 
	all.x=TRUE)[, c("ID", "Right", "output")]
names(Rout)[3] <- "outputR"
names(Lout)[3] <- "outputL"
merge(Lout, Rout, all=TRUE)[,c(1, 2, 4, 3, 5)]

   ID Left Right outputL outputR
1   1    9     8       7       3
2   2    4     3       2      NA
3   3    2     1       7      NA
4   3    2     1       8      NA
5   4    6     5       5       6
6   4    6     5       5       1
7   5    3     1      NA      NA
8   6    4     1       2      NA
9   7    3     2      NA       7
10  7    3     2      NA       8
11  8    4     2       2       8
12  8    4     2       2       7
13  9   10     8       1       3
14  9   10     8       7       3
15 10    9    10       7       1
16 10    9    10       7       7
17 11    2     5       7       6
18 11    2     5       7       1
19 11    2     5       8       6
20 11    2     5       8       1

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org
[mailto:r-help-bounces at r-project.org] On Behalf Of Juan
Antonio Balbuena
Sent: Thursday, July 25, 2013 10:13 AM
To: r-help at r-project.org
Subject: [R] transform dataframe with look-up table


   Hello
   I hope that there is a simple solution to this apparently
complex problem.
   Any help will be much appreciated:
   I have a dataframe with Left and Right readings (that is,
elements in each
   row are paired). For instance,
       Left Right
    [1]  9    8
    [2]  4    3
    [3]  2    1
    [4]  6    5
    [5]  3    1
    [6]  4    1
    [7]  3    2
    [8]  4    2
    [9]  10   8
   [10]  9   10
   I  need  to  produce a new data frame where the values are
transformed
   according to a look-up table such as
           input    output
    [1]     5      1
    [2]    10     1
    [3]     4      2
    [4]     8      3
    [5]     6      5
    [6]     5      6
    [7]     7      6
    [8]     2      7
    [9]     9      7
   [10]    10    7
   [11]     2     8
   So  [1, ] in the new dataframe would be 7 3. Quite simple
so far, but what
   makes things complicated is the multiple outputs for a
single input. In this
   example, 10 corresponds to 1 and 7 so [9, ] in the input
dataframe must
   yield two rows in its output counterpart: 1 3 and 7 3.
Likewise the output
   for  [10, ] should be 7 1 and 7 7. In addition, given that
3 and 1 are
   missing as inputs the output for [5, ] should be NA NA.
   Thank you very much for your time.
   Juan Antonio Balbuena

   --

   Dr. Juan A. Balbuena
   Marine Zoology Unit
   Cavanilles Institute of Biodiversity and Evolutionary
Biology
   University of
   Valencia
   [1]http://www.uv.es/~balbuena
   P.O. Box 22085
   [2]http://www.uv.es/cavanilles/zoomarin/index.htm
   46071 Valencia, Spain
   [3]http://cetus.uv.es/mullpardb/index.html
   e-mail: [4]j.a.balbuena at uv.es    tel. +34 963 543 658
fax +34 963 543 733

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   NOTE! For shipments by EXPRESS COURIER use the following
street address:
   C/ Catedr??tico Jos?? Beltr??n 2, 46980 Paterna (Valencia),
Spain.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

References

   1. http://www.uv.es/%7Ebalbuena
   2. http://www.uv.es/cavanilles/zoomarin/index.htm
   3. http://cetus.uv.es/mullpardb/index.html
   4. mailto:j.a.balbuena at uv.es
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible
code.
#
Hello
   First thank you very much to Jean, Bill, Brian and David for the answers and
   code. I very extremely grateful.
   I am eventually going to adapt Brian's code with a very minor alteration. If
   one follows the original syntax
End <- merge(merge(Start, transformer, by.x="Left", by.y="input", 
all.x=TRUE),
              transformer, by.x="Right", by.y="input", all.x=TRUE)

   the "Left" variables are listed right and vice versa, which seems odd. Just
   swapping "Left" and "Right" will put it right (pun intentional):
End <- merge(merge(Start, transformer, by.x="Right", by.y="input", 
all.x=TRUE),
              transformer, by.x="Left", by.y="input", all.x=TRUE)

   Best wishes
   Juan Antonio

   El 25/07/2013 18:02, Brian Diggs escribi??:
On 7/25/2013 8:13 AM, Juan Antonio Balbuena wrote:
Hello
    I hope that there is a simple solution to this apparently complex problem.
    Any help will be much appreciated:
    I have a dataframe with Left and Right readings (that is, elements in each
    row are paired). For instance,
        Left Right
     [1]  9    8
     [2]  4    3
     [3]  2    1
     [4]  6    5
     [5]  3    1
     [6]  4    1
     [7]  3    2
     [8]  4    2
     [9]  10   8
    [10]  9   10
    I  need  to  produce a new data frame where the values are transformed
    according to a look-up table such as
            input    output
     [1]     5      1
     [2]    10     1
     [3]     4      2
     [4]     8      3
     [5]     6      5
     [6]     5      6
     [7]     7      6
     [8]     2      7
     [9]     9      7
    [10]    10    7
    [11]     2     8
    So  [1, ] in the new dataframe would be 7 3. Quite simple so far, but what
    makes things complicated is the multiple outputs for a single input. In thi
s
    example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must
    yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output
    for  [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are
    missing as inputs the output for [5, ] should be NA NA.
    Thank you very much for your time.
    Juan Antonio Balbuena

merge can handle both of these requirements.

First, making the two datasets reproducible:

Start <- data.frame(Left=c(9,4,2,6,3,4,3,4,10,9),
                     Right=c(8,3,1,5,1,1,2,2,8,10))

transformer <- data.frame(input=c(5,10,4,8,6,5,7,2,9,10,2),
                           output=c(1,1,2,3,5,6,6,7,7,7,8))

Then add a marker of the original row numbers so that the work can be 
checked more easily later (not really needed for the calculations):

Start$rownum <- seq_len(nrow(Start))

Two merge statements with the columns specified and all.x set to TRUE 
(to keep cases even without a match):

End <- merge(merge(Start, transformer, by.x="Left", by.y="input", 
all.x=TRUE),
              transformer, by.x="Right", by.y="input", all.x=TRUE)

Then we can look at the output, resorted by the original row numbers:

End[order(End$rownum),]

which gives

    Right Left rownum output.x output.y
12     8    9      1        7        3
9      3    4      2        2       NA
1      1    2      3        7       NA
2      1    2      3        8       NA
10     5    6      4        5        6
11     5    6      4        5        1
3      1    3      5       NA       NA
4      1    4      6        2       NA
5      2    3      7       NA        7
6      2    3      7       NA        8
7      2    4      8        2        7
8      2    4      8        2        8
13     8   10      9        1        3
14     8   10      9        7        3
15    10    9     10        7        1
16    10    9     10        7        7


    --

    Dr. Juan A. Balbuena
    Marine Zoology Unit
    Cavanilles Institute of Biodiversity and Evolutionary Biology
    University of
    Valencia
    [1][1]http://www.uv.es/~balbuena
    P.O. Box 22085
    [2][2]http://www.uv.es/cavanilles/zoomarin/index.htm
    46071 Valencia, Spain
    [3][3]http://cetus.uv.es/mullpardb/index.html
    e-mail: [[4]4]j.a.balbuena at uv.es    tel. +34 963 543 658    fax +34 963 543
 733
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    NOTE! For shipments by EXPRESS COURIER use the following street address:
    C/ Catedr????tico Jos???? Beltr????n 2, 46980 Paterna (Valencia), Spain.
    ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

References

    1. [5]http://www.uv.es/%7Ebalbuena
    2. [6]http://www.uv.es/cavanilles/zoomarin/index.htm
    3. [7]http://cetus.uv.es/mullpardb/index.html
    4. [8]mailto:j.a.balbuena at uv.es


   --

   Dr. Juan A. Balbuena
   Marine Zoology Unit
   Cavanilles Institute of Biodiversity and Evolutionary Biology
   University of
   Valencia
   [9]http://www.uv.es/~balbuena
   P.O. Box 22085
   [10]http://www.uv.es/cavanilles/zoomarin/index.htm
   46071 Valencia, Spain
   [11]http://cetus.uv.es/mullpardb/index.html
   e-mail: [12]j.a.balbuena at uv.es    tel. +34 963 543 658    fax +34 963 543
   733
   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   NOTE! For shipments by EXPRESS COURIER use the following street address:
   C/ Catedr??tico Jos?? Beltr??n 2, 46980 Paterna (Valencia), Spain.
   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

References

   1. http://www.uv.es/~balbuena
   2. http://www.uv.es/cavanilles/zoomarin/index.htm
   3. http://cetus.uv.es/mullpardb/index.html
   4. mailto:4]j.a.balbuena at uv.es
   5. http://www.uv.es/%7Ebalbuena
   6. http://www.uv.es/cavanilles/zoomarin/index.htm
   7. http://cetus.uv.es/mullpardb/index.html
   8. mailto:j.a.balbuena at uv.es
   9. http://www.uv.es/%7Ebalbuena
  10. http://www.uv.es/cavanilles/zoomarin/index.htm
  11. http://cetus.uv.es/mullpardb/index.html
  12. mailto:j.a.balbuena at uv.es