Message-ID: <CAAxdm-7x_VCKcxmHAC4Fg3OHk9jq_wK94z3krJBPmFNWdUESog@mail.gmail.com>
Date: 2012-11-03T18:30:47Z
From: jim holtman
Subject: Replacing NAs in long format
In-Reply-To: <CALrjt7_v6uf5LvckNbNGcZsSa3gUtwSbohWdEOEtsKuB0ef2EQ@mail.gmail.com>
> x <- read.table(text = "idr schyear year
+ 1 8 0
+ 1 9 1
+ 1 10 NA
+ 2 4 NA
+ 2 5 -1
+ 2 6 0
+ 2 7 1
+ 2 8 2
+ 2 9 3
+ 2 10 4
+ 2 11 NA
+ 2 12 6
+ 3 4 NA
+ 3 5 -2
+ 3 6 -1
+ 3 7 0
+ 3 8 1
+ 3 9 2
+ 3 10 3
+ 3 11 NA", header = TRUE)
> # you did not specify if there might be multiple contiguous NAs,
> # so there are a lot of checks to be made
> x.l <- lapply(split(x, x$idr), function(.idr){
+     # check for all NAs -- just return indeterminate state
+     if (sum(is.na(.idr$year)) == nrow(.idr)) return(.idr)
+     # repeat until all NAs have been fixed; takes care of contiguous ones
+     while (any(is.na(.idr$year))){
+         # find all the NAs
+         for (i in which(is.na(.idr$year))){
+             if ((i == 1L) && (!is.na(.idr$year[i + 1L]))){
+                 .idr$year[i] <- .idr$year[i + 1L] - 1
+             } else if ((i > 1L) && (!is.na(.idr$year[i - 1L]))){
+                 .idr$year[i] <- .idr$year[i - 1L] + 1
+             } else if ((i < nrow(.idr)) && (!is.na(.idr$year[i + 1L]))){
+                 .idr$year[i] <- .idr$year[i + 1L] - 1
+             }
+         }
+     }
+     return(.idr)
+ })
> do.call(rbind, x.l)
     idr schyear year
1.1    1       8    0
1.2    1       9    1
1.3    1      10    2
2.4    2       4   -2
2.5    2       5   -1
2.6    2       6    0
2.7    2       7    1
2.8    2       8    2
2.9    2       9    3
2.10   2      10    4
2.11   2      11    5
2.12   2      12    6
3.13   3       4   -3
3.14   3       5   -2
3.15   3       6   -1
3.16   3       7    0
3.17   3       8    1
3.18   3       9    2
3.19   3      10    3
3.20   3      11    4
>
>
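
If schyear always goes up by exactly 1 from one row to the next within an idr (true in the sample data, but an assumption about the full data set), then year - schyear is a single constant per subject and the loop can be avoided. A minimal sketch under that assumption follows; fill_year is a made-up helper name, not something from the original post, and groups with no known offset or with conflicting offsets are returned unchanged.

```r
# Same sample data as in the transcript above.
x <- read.table(text = "idr schyear year
1 8 0
1 9 1
1 10 NA
2 4 NA
2 5 -1
2 6 0
2 7 1
2 8 2
2 9 3
2 10 4
2 11 NA
2 12 6
3 4 NA
3 5 -2
3 6 -1
3 7 0
3 8 1
3 9 2
3 10 3
3 11 NA", header = TRUE)

# ASSUMPTION: within one idr, year - schyear is the same for every row.
# fill_year is a hypothetical helper name used only for this sketch.
fill_year <- function(d) {
    offset <- d$year - d$schyear
    known <- offset[!is.na(offset)]
    # Nothing to anchor on, or the offsets disagree: leave the group as is.
    if (length(known) == 0L || length(unique(known)) != 1L) return(d)
    # Fill only the missing years; existing values are kept untouched.
    d$year <- ifelse(is.na(d$year), d$schyear + known[1L], d$year)
    d
}

filled <- do.call(rbind, lapply(split(x, x$idr), fill_year))
print(filled)
```

Unlike the loop, this does not care how many NAs sit next to each other, but it gives different answers from the loop if schyear ever skips a value, so check which rule matches your data.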
On Sat, Nov 3, 2012 at 1:14 PM, Christopher Desjardins
<cddesjardins at gmail.com> wrote:
> Hi,
> I have the following data:
>
>> data[1:20,c(1,2,20)]
> idr schyear year
> 1 8 0
> 1 9 1
> 1 10 NA
> 2 4 NA
> 2 5 -1
> 2 6 0
> 2 7 1
> 2 8 2
> 2 9 3
> 2 10 4
> 2 11 NA
> 2 12 6
> 3 4 NA
> 3 5 -2
> 3 6 -1
> 3 7 0
> 3 8 1
> 3 9 2
> 3 10 3
> 3 11 NA
>
> What I want to do is replace the NAs in the year variable with the
> following:
>
> idr schyear year
> 1 8 0
> 1 9 1
> 1 10 2
> 2 4 -2
> 2 5 -1
> 2 6 0
> 2 7 1
> 2 8 2
> 2 9 3
> 2 10 4
> 2 11 5
> 2 12 6
> 3 4 -3
> 3 5 -2
> 3 6 -1
> 3 7 0
> 3 8 1
> 3 9 2
> 3 10 3
> 3 11 4
>
> I have no idea how to do this. Within each subject (idr), an NA should
> become the preceding year value plus 1 if there is one, or otherwise the
> following year value minus 1.
>
> Does that make sense? I could do this in Excel but I am at a loss for how
> to do this in R. Please reply to me as well as the list if you respond.
>
> Thanks!
> Chris
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.