
Can R replicate this data manipulation in SAS?

On Apr 21, 2011, at 16:00 , Bert Gunter wrote:

            
Hum, there is a point, though: If you take the crude translation approach, you will soon realize that there is very little that SAS (or SPSS, or...) can do that you literally can't do in R. 

It is often the case that there is a much neater and better-structured approach in R, but the flip side is that there are cases where the neat solution is hard to find, and maybe some cases where it doesn't really exist (e.g. not everything can be vectorized). This is the sort of thing that in some circles gives R a reputation for being poorly suited for data handling, compared to the DATA step in SAS. Do notice the circular logic that occurs when defining "typical statistical task" as "something you can do in SAS", though.

(One example is "last observation carried forward", a rather dubious technique for filling in missing observations in longitudinal studies, which probably stems directly from the RETAIN statement in SAS.

In R, you may find yourself doing something like 

  x[is.na(x)] <- x[!is.na(x)][cumsum(!is.na(x))[is.na(x)]]

which isn't even completely failsafe (it goes wrong if x starts with an NA). However, you'll get the result soon enough with

  t <- NA  # last non-missing value seen so far
  for (i in seq_along(x)) if (is.na(x[i])) x[i] <- t else t <- x[i]

and this time, you can actually read the code.

Of course, approx() will do the trick much more swiftly than either of the above.)
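
For what it's worth, here is a minimal sketch of how the approx() route might look for a numeric vector. The wrapper name locf_approx is made up for illustration, and the choice of rule = c(1, 2) (leave leading NAs alone, carry the last value to the end) is an assumption about what one wants at the edges, not something stated above.

```r
# Hypothetical helper (not from the original post): LOCF via approx().
# Assumes x is numeric and has at least one non-missing value.
locf_approx <- function(x) {
  idx <- seq_along(x)
  # approx() drops the NA points before interpolating.
  # method = "constant" with f = 0 is a left-continuous step function,
  # i.e. each position gets the nearest non-missing value to its left.
  # rule = c(1, 2): positions before the first observation stay NA,
  # positions after the last observation take the last observed value.
  approx(idx, x, xout = idx, method = "constant", f = 0, rule = c(1, 2))$y
}

x <- c(NA, 1, NA, NA, 4, NA)
locf_approx(x)
# NA  1  1  1  4  4
```

Since approx() interpolates numbers, this only applies to numeric x; for factors or character vectors one would be back to the loop or the indexing trick.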