Here is another way. For data analysis, the long ("idiomatic") result is usually more useful, though the wide result may be preferable for presentation in a final report.

library(dplyr)
library(tidyr)

dat <- read.csv(text=
"Year,Sex,string
2002,F,15 xc Ab
2003,F,14
2004,M,18 xb 25 35 21
2005,M,13 25
2006,M,14 ac 256 AV 35
2007,F,11"
, header=TRUE, stringsAsFactors=FALSE )  # keep string as character so strsplit() also works on R < 4.0

idiomatic <- (
    dat
    %>% mutate( string = strsplit( string, " " ) )  # list-column: one character vector per row
    %>% unnest( cols = string )                     # one row per token
    %>% group_by( Year, Sex )
    %>% mutate( s_name = paste0( "S", seq_along( string ) ) )  # S1, S2, ... within each Year/Sex
    %>% ungroup()
)
idiomatic # each row has unique Year, Sex, and s_name

wide <- (
    idiomatic
    %>% spread( s_name, string )
)
wide
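
As a side note, the tidyr documentation marks spread() as superseded by pivot_wider(), so the same reshaping could be sketched as below. This is only an illustration, assuming tidyr >= 1.0.0; the names long and wide2 are mine, not from the original post, and the data are rebuilt so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Same sample data as above, repeated so this block is self-contained
dat <- read.csv(text =
"Year,Sex,string
2002,F,15 xc Ab
2003,F,14
2004,M,18 xb 25 35 21
2005,M,13 25
2006,M,14 ac 256 AV 35
2007,F,11",
  header = TRUE, stringsAsFactors = FALSE)

# Long form: one row per token, labelled S1, S2, ... within each Year/Sex
long <- dat %>%
  mutate(string = strsplit(string, " ")) %>%
  unnest(cols = string) %>%
  group_by(Year, Sex) %>%
  mutate(s_name = paste0("S", row_number())) %>%
  ungroup()

# Wide form: one column per s_name; rows with fewer tokens are padded with NA
wide2 <- long %>%
  pivot_wider(names_from = s_name, values_from = string)
wide2
```

One difference worth knowing: pivot_wider() creates the new columns in order of first appearance in the data, whereas spread() sorts them, so with ten or more tokens per row the two can order the S columns differently.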
On July 19, 2024 11:23:48 AM PDT, Val <valkremk at gmail.com> wrote: