dataframe: string operations on columns
Well, my solution with the loop might be slower (even though I don't see
any difference with my system, at least with up to 100 lines and 3
strings to separate), but it works whatever the number of strings.
But I should have renamed the columns outside of the loop:
names(df)[2:3] <- paste("a", 1:2, sep="") ## or a more general solution for the indexes
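A minimal sketch of what such a more general version might look like, assuming every row splits into the same number of pieces as the first row. The variable n is introduced here for illustration only and is not part of the original code:

```r
# Sample data as in the thread
df <- data.frame(a = c("A B", "C D", "A C", "A D", "B D"),
                 stringsAsFactors = FALSE)
df_split <- strsplit(df$a, split = " ")

# Number of pieces, taken from the first row
# (assumption: all rows split into the same number of pieces)
n <- length(df_split[[1]])

# Fill the new columns; x[i] gives NA if a row has fewer than i pieces
for (i in seq_len(n)) {
  df[i + 1] <- vapply(df_split, function(x) x[i], character(1))
}

# Rename all new columns at once, outside of the loop
names(df)[1 + seq_len(n)] <- paste("a", seq_len(n), sep = "")
```

With the sample data this should leave df with the columns a, a1 and a2, without hard-coding the 2:3 index range.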
Ivan
On 1/19/2011 01:42, Niels Richard Hansen wrote:
On 2011-01-18 08:14, Ivan Calandra wrote:
Hi,
I guess it's not the nicest way to do it, but it should work for you:
#create some sample data
df<- data.frame(a=c("A B", "C D", "A C", "A D", "B D"),
stringsAsFactors=FALSE)
#split the column by space
df_split<- strsplit(df$a, split=" ")
#place the first element into column a1 and the second into a2
for (i in 1:length(df_split[[1]])){
df[i+1]<- unlist(lapply(df_split, FUN=function(x) x[i]))
names(df)[i+1]<- paste("a",i,sep="")
}
I hope people will give you more compact solutions.
HTH,
Ivan
You can replace the loop with
df <- transform(df, a1 = sapply(df_split, "[[", 1),
a2 = sapply(df_split, "[[", 2))
df <- cbind(df, do.call(rbind, df_split))
seems to do the same (up to column names) but faster. However, all the
solutions rely on there being exactly two strings when you split. The
different solutions behave differently if this assumption is violated,
and none of them really checks it. You can, for instance, check this with
all(sapply(df_split, length) == 2)
Best,
Niels R. Hansen
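To illustrate the point about the two-strings assumption, here is a hedged sketch of one possible way to handle rows that split into fewer pieces. The input vector x and the names parts, padded, m and res are hypothetical and not from the thread. It pads short rows with NA before binding, since rbind would otherwise recycle the shorter vectors. Note that lengths() needs R 3.2.0 or later; sapply(parts, length) is the equivalent used in the thread.

```r
# Hypothetical input: the second row has only one piece
x <- c("A B", "C", "D E")
parts <- strsplit(x, split = " ")

# The check suggested in the thread is FALSE for this input
ok <- all(lengths(parts) == 2)

# Pad every row with NA up to the longest length before binding,
# so that rbind does not recycle the short rows
n <- max(lengths(parts))
padded <- lapply(parts, function(p) c(p, rep(NA_character_, n - length(p))))
m <- do.call(rbind, padded)
colnames(m) <- paste("a", seq_len(n), sep = "")

# Combine the original column with the split pieces
res <- data.frame(a = x, m, stringsAsFactors = FALSE)
```

Whether padding with NA is the right behaviour depends on the data; stopping with an error when ok is FALSE may be the safer choice.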
Peter Ehlers
On 1/18/2011 16:30, boris pezzatti wrote:
Dear all,
how can I perform a string operation like strsplit(x," ") on a column of
a dataframe, and put the first or the second item of the split into a new
dataframe column? (so that on each row it is consistent)
Thanks,
Boris
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calandra at uni-hamburg.de
**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php