
urgent: question concerning data manipulation

4 messages · David Studer, MacQueen, Don, ONKELINX, Thierry +1 more

#
Here is one way. There will be many ways to do it; I offer this one
because it is very general.

-Don


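# Split the data into one data frame per offender
tmp <- split(testdata, testdata$personId)

# Flag every row of an offender with 1 if any of their rows is an SVG violation
myfun <- function(df) {
  dfo <- df
  dfo$svg <- if (any(df$law == 'SVG')) 1 else 0
  dfo
}

tmpo <- lapply(tmp, myfun)

# Recombine the per-offender pieces and print the result
testout <- do.call('rbind', tmpo)
testout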
     personId  law article svg
1.1         1  SVG      10   1
1.2         1  SVG      10   1
2.3         2 StGB     123   1
2.4         2 StGB     122   1
2.5         2  SVG      10   1
2.6         2  AuG      40   1
2.7         2 StGB     126   1
3           3  SVG      10   1
4.9         4 StGB     111   0
4.10        4  AuG      40   0
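
The function above adds only the svg column. The same split/lapply/rbind pattern can be extended to all three laws; what follows is a minimal sketch, assuming the testdata frame from the original question. The helper name add_law_flags and the choice to name the new columns after the laws are illustrative assumptions, not part of the code above.

```r
# Test data from the original question
testdata <- data.frame(
  personId = c(1, 1, 2, 2, 2, 2, 2, 3, 4, 4),
  law      = c("SVG", "SVG", "StGB", "StGB", "SVG", "AuG", "StGB", "SVG", "StGB", "AuG"),
  article  = c(10, 10, 123, 122, 10, 40, 126, 10, 111, 40),
  stringsAsFactors = FALSE
)

laws <- c("SVG", "StGB", "AuG")

# Hypothetical helper: add one 0/1 column per law to a single offender's rows
add_law_flags <- function(df) {
  for (l in laws) {
    # any() is TRUE if at least one of this offender's rows violates law l;
    # the single value is recycled to every row of df
    df[[l]] <- as.integer(any(df$law == l))
  }
  df
}

# Split by offender, flag each piece, then stack the pieces again
testout <- do.call(rbind, lapply(split(testdata, testdata$personId), add_law_flags))
testout
```

Looping over a vector of law names keeps the per-offender function unchanged if more laws are added later.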
#
Have a look at cast() from the reshape package.

library(reshape)
cast(personId ~ law, data = testdata, value = "article", fun = length)
cast(personId ~ law, data = testdata, value = "article", fun = function(x){1 * (length(x) > 0)})
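
Note that these cast() calls return one row per offender, whereas the question asks for the flags on every conviction row. A rough base-R sketch of the same cross-tabulation, joined back onto testdata with merge(), is given here. It uses table() instead of cast(), so it is an alternative under that assumption rather than the reshape-based method itself; the names tab, flags and merged are illustrative.

```r
# Test data from the original question
testdata <- data.frame(
  personId = c(1, 1, 2, 2, 2, 2, 2, 3, 4, 4),
  law      = c("SVG", "SVG", "StGB", "StGB", "SVG", "AuG", "StGB", "SVG", "StGB", "AuG"),
  article  = c(10, 10, 123, 122, 10, 40, 126, 10, 111, 40),
  stringsAsFactors = FALSE
)

# Count convictions per offender (rows) and law (columns)
tab <- table(testdata$personId, testdata$law)

# Turn counts into 0/1 indicators: 1 if the offender has at least one conviction
flags <- as.data.frame(unclass(tab > 0) * 1)

# The offender ids are stored as row names; make them a proper key column
flags$personId <- as.numeric(rownames(flags))

# Attach the per-offender flags to every conviction row
merged <- merge(testdata, flags, by = "personId")
merged
```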

________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of David Studer [studerov at gmail.com]
Sent: Monday, 4 March 2013 16:44
To: r-help at r-project.org
Subject: [R] urgent: question concerning data manipulation

Hello everyone!

Does anyone know how I could solve the following problem?
I guess it is not a very difficult question, but I simply lack the
right idea:

I have a dataset containing data on convictions. This dataset contains 3
columns:
- personId: individual number that identifies the offender
- law: law which has been violated
- article: article which has been violated

# Testdata:
personId<-c(1,1,2,2,2,2,2,3,4,4)
law<-c("SVG", "SVG", "StGB", "StGB", "SVG", "AuG", "StGB", "SVG", "StGB",
"AuG")
article<-c(10, 10, 123, 122, 10, 40, 126, 10, 111, 40)
testdata<-data.frame(personId, law, article)

Now I'd like to create three additional dummy-coded columns, one for each law
(SVG, StGB, AuG).
For each offender (all rows belonging to one offender share the same personId),
it should be checked whether there are any violations of each of the three laws.
If there are any violations of SVG (for example), then the column SVG should
have the value 1 in all rows of this offender (otherwise 0).

For example, offender 2 has violated the law "SVG" once; therefore all five of
his entries should have the value 1 in the column "SVG".

I hope you can understand my problem. I'd really appreciate any hints and
solutions!

Thank you!
David


#
There's more than one way to skin a cat; here is another:

mm<-model.matrix(~personId+law+0,testdata)
merge(testdata,aggregate(mm[,-1],list(personId=mm[,"personId"]),max))
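
The two lines above pack several steps together. An unrolled sketch of the same idea follows, assuming the testdata frame from the original question; the intermediate names flags and result are illustrative, and law is converted to a factor explicitly so the sketch does not depend on the stringsAsFactors default.

```r
# Test data from the original question
testdata <- data.frame(
  personId = c(1, 1, 2, 2, 2, 2, 2, 3, 4, 4),
  law      = c("SVG", "SVG", "StGB", "StGB", "SVG", "AuG", "StGB", "SVG", "StGB", "AuG"),
  article  = c(10, 10, 123, 122, 10, 40, 126, 10, 111, 40),
  stringsAsFactors = FALSE
)
testdata$law <- factor(testdata$law)

# Without an intercept (+ 0), the factor law is expanded into one 0/1
# indicator column per level (lawAuG, lawStGB, lawSVG), next to personId
mm <- model.matrix(~ personId + law + 0, testdata)
colnames(mm)

# Drop the personId column, then take the maximum of each indicator per
# offender: the maximum is 1 if any of the offender's rows has that law
flags <- aggregate(mm[, -1], by = list(personId = mm[, "personId"]), FUN = max)

# merge() joins on the shared personId column, repeating the flags on every row
result <- merge(testdata, flags)
result
```

The indicator columns carry the prefix law (for example lawSVG) because model.matrix() builds the names from the variable name and the level.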

cheers

On 04.03.2013 16:44, David Studer wrote: