Creating unique code

2 messages · Hannah Van Impe, Jim Lemon

#
Hello,

I need some help creating a new variable: a 'couple identifier' that gives a unique code to every couple/triple/... in a household, so that I can identify couples. To do this, I should use 4 variables:

  *   SERIAL = a unique numeric code for each household
  *   PERNUM = a unique numeric code for each person
  *   SPLOC = the numeric code of the spouse in the household, it is equal to the PERNUM code of the spouse
  *   SPRULE = rules for linking a spouse, numeric code from 00 to 06


To create the couple identifier, I need these conditions:

  *   SERIAL needs to be equal for the persons in the couple
  *   SPLOC > 0
  *   SPLOC of one person = PERNUM of the other
  *   SPRULE = 01 or 02

What I already did is this:

attach(ipumsi_00008_dta)
library(tinytex)
library(dplyr)
library(ggplot2)
library(tidyr)
library(knitr)
library(forcats)
library(mice)
library(pander)
library(ggcorrplot)
library(lubridate)
# true/false code when sploc is greater than zero
ipumsi_00008_dta <- mutate(ipumsi_00008_dta, sploc_greater_than_zero = sploc>0)
# true/false code when sploc is greater than zero and sprule is equal to 1 or 2
# (%in% is used because & binds tighter than |, so "a & b | c" means "(a & b) | c")
ipumsi_00008_dta <- mutate(ipumsi_00008_dta, rule_union = sploc>0 & sprule %in% c(1, 2))

=> Now I want to create a numeric code for the TRUE values of rule_union that is shared by persons with the same serial, i.e. persons in the same household.
What method should I use to do this?

Thank you very much!!
#
Hi Hannah,
Without knowing how the data are organized and what each numeric
code means, it is a bit difficult. If it is assumed that each row in the
data frame(?) ipumsi_00008_dta is a case (individual) and an individual may
have zero or more spouses, there would have to be more than one field for
"sploc" for those who had more than one "spouse". I would approach it by
creating a variable named "relcode" that was unique for each "union", so
that if more than one individual had the same non-zero "relcode" they would
all be in the same "relationship". That still leaves us with exclusive
relationships, so there would have to be multiple fields for "relcode" for
groups of people who were in different relationships in the same household.
I know that this is being pedantic, but it looks like a set intersection
problem of the Bob and Carol and Ted and Alice variety.
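
A minimal sketch of the "relcode" idea for the simple two-person case, assuming one row per person and that spouses point at each other (each person's sploc equals the partner's pernum within the same serial). The toy data frame `people` and the helper columns `in_union` and `pair_key` are made up for illustration; only serial, pernum, sploc and sprule come from the original post.

```r
library(dplyr)

# Hypothetical toy data, one row per person.
people <- tibble(
  serial = c(1, 1, 1, 2, 2, 2, 2),
  pernum = c(1, 2, 3, 1, 2, 3, 4),
  sploc  = c(2, 1, 0, 2, 1, 4, 3),
  sprule = c(1, 1, 0, 2, 2, 1, 1)
)

people <- people %>%
  mutate(
    # Person has a spouse in the household, linked under rule 1 or 2.
    in_union = sploc > 0 & sprule %in% c(1, 2),
    # Both partners get the same key: household, lower pernum, higher pernum.
    pair_key = if_else(
      in_union,
      paste(serial, pmin(pernum, sploc), pmax(pernum, sploc), sep = "_"),
      NA_character_
    ),
    # Turn the key into a running integer code; people outside a union stay NA.
    relcode = match(pair_key, unique(na.omit(pair_key)))
  )

print(people)
```

This only covers reciprocal pairs. If several people in a household point at the same spouse (the "triple" case), their keys will differ, and the rows would need to be grouped some other way, for example by treating the sploc links as edges and labelling connected groups within each serial.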

Jim

On Wed, Oct 28, 2020 at 6:39 AM Hannah Van Impe <hannahvanimpe at outlook.com>
wrote: