
anonymizing subject identifiers for survival analysis

3 messages · William Dunlap, Christopher W. Ryan

#
I would like to conduct a survival analysis, examining a subject's
time to *next* appearance in a database, after their first appearance.
It is a database of dated events.
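
A rough sketch of how the time-to-next-appearance data might be laid out, using base R only. The column names (`code`, `event_date`), the toy dates, and the censoring date `study_end` are all hypothetical and are not taken from the actual database; subjects with no second appearance are treated as censored at `study_end`.

```r
# Hypothetical event log: one row per dated event, keyed by an anonymous code.
events <- data.frame(
  code = c("Person1", "Person2", "Person1", "Person3", "Person2", "Person1"),
  event_date = as.Date(c("2015-01-10", "2015-02-01", "2015-03-15",
                         "2015-04-20", "2015-08-01", "2015-09-30")),
  stringsAsFactors = FALSE
)
study_end <- as.Date("2015-12-31")  # assumed administrative censoring date

# Sort each subject's events chronologically, then group the dates by subject.
events <- events[order(events$code, events$event_date), ]
by_subject <- split(events$event_date, events$code)

# One row per subject: days from first to second appearance (status = 1),
# or days from first appearance to study_end if there is no second one (status = 0).
surv_input <- do.call(rbind, lapply(names(by_subject), function(id) {
  d <- by_subject[[id]]
  recurred <- length(d) >= 2
  end <- if (recurred) d[2] else study_end
  data.frame(code = id,
             time = as.numeric(end - d[1]),
             status = as.integer(recurred),
             stringsAsFactors = FALSE)
}))
surv_input
```

A data frame shaped like `surv_input` could then be handed to a survival routine, for example via `Surv(time, status)` from the survival package.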

I need to obfuscate or anonymize or mask the subject identifiers (a
combination of name and birthdate). And obviously any given subject
should have the same anonymous code every time he/she appears in the
database.  I'm not talking "safe from the NSA" here. And I won't be
releasing it. It's just sensitive data and I don't want to be working
every day with cleartext versions of it.

I've looked at packages digest, anonymizer, and anonymize.  What do
you think of this approach:

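# running R 3.1.1 on Windows 7 Enterprise
library(digest)

# toy data: rows 1 and 6 are the same subject
dd <- data.frame(id = 1:6,
                 name = c("Harry", "Ron", "Hermione", "Luna", "Ginny", "Harry"),
                 dob = c("1990-01-01", "1990-06-15", "1990-04-08",
                         "1999-11-26", "1990-07-21", "1990-01-01"),
                 stringsAsFactors = FALSE)

# cleartext key: lower-cased name followed by date of birth
dd.2 <- transform(dd, code = paste0(tolower(name), dob))

# hash each distinct key once, then look up the hash for every row
anonymize <- function(x, algo = "sha256") {
  x <- as.character(x)  # a factor would index by integer level code, not by name
  unq_hashes <- vapply(unique(x), function(object) digest(object, algo = algo),
                       FUN.VALUE = "", USE.NAMES = TRUE)
  unname(unq_hashes[x])
}
dd.2$codex <- anonymize(dd.2$code)
dd.2
table(duplicated(dd.2$codex))  # one TRUE: the repeated subject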

Thanks.

--Chris Ryan
Broome County Health Department
#
You can also use match(code, unique(code)), as in
  transform(dd.2, codex2 = paste0("Person", match(code, unique(code))))
It is not guaranteed that x != y implies digest(x) != digest(y), although a
collision is extremely unlikely.  The match idiom does guarantee that
distinct inputs get distinct codes.
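
As a small illustration of the match idiom, here is a sketch in base R on the toy data from the original post. The names `key` and `dd_anon` are made up for this example. It checks the two properties discussed above: a repeated subject gets the same code, and distinct subjects get distinct codes.

```r
# Toy data mirroring the example in the original post.
dd <- data.frame(
  name = c("Harry", "Ron", "Hermione", "Luna", "Ginny", "Harry"),
  dob  = c("1990-01-01", "1990-06-15", "1990-04-08",
           "1999-11-26", "1990-07-21", "1990-01-01"),
  stringsAsFactors = FALSE
)

# Build the cleartext key, then replace it with a sequential label:
# match() returns the position of each key among the distinct keys.
key <- paste0(tolower(dd$name), dd$dob)
dd$codex2 <- paste0("Person", match(key, unique(key)))

# Same subject -> same code; different subjects -> different codes.
stopifnot(dd$codex2[1] == dd$codex2[6])
stopifnot(length(unique(dd$codex2)) == length(unique(key)))

# Keep only the anonymous code for day-to-day work.
dd_anon <- dd[, "codex2", drop = FALSE]
dd_anon
```

Note that the labels depend on the order in which subjects first appear, so if more records arrive later the original key-to-label lookup would need to be kept (somewhere secure) and extended rather than recomputed.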

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, May 12, 2016 at 1:06 PM, Christopher W Ryan <cryan at binghamton.edu>
wrote:

  
  
#
Excellent, thanks. Much simpler.

--Chris

Christopher W. Ryan, MD, MS
cryanatbinghamtondotedu
https://www.linkedin.com/in/ryancw

Early success is a terrible teacher. You're essentially being rewarded
for a lack of preparation, so when you find yourself in a situation
where you must prepare, you can't do it. You don't know how.
--Chris Hadfield, An Astronaut's Guide to Life on Earth