Message-ID: <4B491F20.3040504@ucdavis.edu>
Date: 2010-01-10T00:28:16Z
From: Greg Hirson
Subject: string functions
In-Reply-To: <82FE69CD-42A4-4C26-9F14-A8ABEDC64388@gmx.ch>

Laetitia,

One approach:

lettermatch <- function(stringA, stringB) {
     # count the distinct characters of stringA that also occur in stringB
     sum(unique(unlist(strsplit(stringA, ""))) %in%
         unique(unlist(strsplit(stringB, ""))))
}

lettermatch("Hello World","Hello Peter")
yields 6 (H, e, l, o, the space, and r), as the l is counted only once.

This treats uppercase and lowercase as different letters and counts how 
many of the unique letters in stringA show up in stringB.
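
To see where the 6 comes from, the intermediate steps can be inspected one at a time. This is only a sketch; the variable names a, b and shared are made up for illustration and are not part of the function above.

```r
# Split each string into single characters and keep one copy of each.
a <- unique(unlist(strsplit("Hello World", "")))
b <- unique(unlist(strsplit("Hello Peter", "")))

# Distinct characters of a that also occur in b.
# Note that the space counts as a character here.
shared <- a[a %in% b]
print(shared)
print(length(shared))
```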

In another approach, the letters are converted to lowercase first, and each
shared character is counted as many times as it occurs in both strings.
This, I think, gives you what you want:

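lettermatch2 <- function(stringA, stringB) {
     # per-character counts for each string, joined on the character (Var1)
     tb <- merge(as.data.frame(table(strsplit(tolower(stringA), ""))),
                 as.data.frame(table(strsplit(tolower(stringB), ""))),
                 by="Var1")
     # for each shared character, take the smaller of the two counts
     sum(apply(tb[-1], 1, min))
}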

lettermatch2("Hello World","Hello Peter")
yields 7.
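
The same count (the smaller of the two per-character frequencies, summed over the shared characters) can also be sketched without building data frames. The name lettermatch3 is hypothetical and not from the original thread; treat this as one possible variant rather than a tested replacement.

```r
# Hypothetical variant: shared characters counted with multiplicity,
# ignoring case.
lettermatch3 <- function(stringA, stringB) {
    a <- unlist(strsplit(tolower(stringA), ""))
    b <- unlist(strsplit(tolower(stringB), ""))
    # distinct characters present in both strings
    common <- intersect(a, b)
    # for each shared character, the smaller of its two occurrence counts
    counts <- vapply(common,
                     function(ch) min(sum(a == ch), sum(b == ch)),
                     integer(1))
    sum(counts)
}

print(lettermatch3("Hello World", "Hello Peter"))
```

For the example strings this should again give 7: h, e, o, r and the space are each matched once, and l twice.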

Greg

On 1/9/10 1:51 PM, Laetitia Schmid wrote:
> Hi!
> Does anybody know a string function that would calculate how many 
> characters two strings share? I.e. ("Hello World","Hello Peter") would 
> be 7.
> Thanks.
> Laetitia
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Greg Hirson
ghirson at ucdavis.edu

Graduate Student
Agricultural and Environmental Chemistry

1106 Robert Mondavi Institute North
One Shields Avenue
Davis, CA 95616