Split
Another way to make columns out of the stuff before and after the
underscore, with NAs if there is no underscore, is
utils::strcapture("([^_]*)_(.*)", F1$text,
    proto = data.frame(Before_ = character(), After_ = character()))
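As a minimal sketch of what that call returns, assuming the F1 data frame from the original post further down the thread (rebuilt here so the snippet runs on its own; the names parts and F1_split are only illustrative):

```r
# Rebuild the sample data from the original post.
F1 <- read.table(text = "ID1 ID2 text
A1 B1 NONE
A1 B1 cf_12
A1 B1 NONE
A2 B2 X2_25
A2 B3 fd_15", header = TRUE, stringsAsFactors = FALSE)

# Rows whose text has no underscore do not match the pattern,
# so both captured columns are NA for those rows.
parts <- utils::strcapture("([^_]*)_(.*)", F1$text,
                           proto = data.frame(Before_ = character(),
                                              After_ = character()))

# Bind the captured pieces back onto the original columns.
F1_split <- cbind(F1, parts)
F1_split
```

If "." is wanted instead of NA, as in the desired output, the NA entries would still have to be replaced afterwards.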
-Bill
On Tue, Sep 22, 2020 at 4:25 PM Bert Gunter <bgunter.4567 at gmail.com> wrote:
To be clear, I think Rui's solution is perfectly fine and probably better than what I offer below. But just for fun, I wanted to do it without the lapply(). Here is one way. I think my comments suffice to explain.
## which are the non "_" indices?
wh <- grep("_",F1$text, fixed = TRUE, invert = TRUE)
## paste "_." to these
F1[wh,"text"] <- paste(F1[wh,"text"],".",sep = "_")
## Now strsplit() and unlist() them to get a vector
z <- unlist(strsplit(F1$text, "_"))
## now cbind() to the data frame
F1 <- cbind(F1, matrix(z, ncol = 2, byrow = TRUE))
F1
  ID1 ID2   text    1  2
1  A1  B1 NONE_. NONE  .
2  A1  B1  cf_12   cf 12
3  A1  B1 NONE_. NONE  .
4  A2  B2  X2_25   X2 25
5  A2  B3  fd_15   fd 15
## You can change the names of the 2 columns yourself
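One way to do that renaming, as a sketch only: the names X1 and X2 are borrowed from the desired output in the original post, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the sample data from the original post.
F1 <- read.table(text = "ID1 ID2 text
A1 B1 NONE
A1 B1 cf_12
A1 B1 NONE
A2 B2 X2_25
A2 B3 fd_15", header = TRUE, stringsAsFactors = FALSE)

# Same steps as above: pad the entries without "_", split, and bind.
wh <- grep("_", F1$text, fixed = TRUE, invert = TRUE)
F1[wh, "text"] <- paste(F1[wh, "text"], ".", sep = "_")
z <- unlist(strsplit(F1$text, "_", fixed = TRUE))
F1 <- cbind(F1, matrix(z, ncol = 2, byrow = TRUE))

# cbind() named the two new columns "1" and "2"; rename the last two.
names(F1)[ncol(F1) - 1:0] <- c("X1", "X2")
F1
```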
Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Sep 22, 2020 at 12:19 PM Rui Barradas <ruipbarradas at sapo.pt> wrote:
Hello,
A base R solution with strsplit, like in your code.
F1$Y1 <- +grepl("_", F1$text)
tmp <- strsplit(as.character(F1$text), "_")
tmp <- lapply(tmp, function(x) if(length(x) == 1) c(x, ".") else x)
tmp <- do.call(rbind, tmp)
colnames(tmp) <- c("X1", "X2")
F1 <- cbind(F1[-3], tmp) # remove the original column
rm(tmp)
F1
# ID1 ID2 Y1 X1 X2
#1 A1 B1 0 NONE .
#2 A1 B1 1 cf 12
#3 A1 B1 0 NONE .
#4 A2 B2 1 X2 25
#5 A2 B3 1 fd 15
Note that cbind dispatches on F1, an object of class "data.frame".
Therefore it's the method cbind.data.frame that is called and the result
is also a df, though tmp is a "matrix".
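A small self-contained check of that behaviour, using made-up objects (df, m, res_df and res_mat are illustrative names, not part of the code above):

```r
# A one-column data frame and a character matrix with named columns.
df <- data.frame(id = 1:2)
m <- matrix(c("a", "b", "c", "d"), ncol = 2,
            dimnames = list(NULL, c("X1", "X2")))

# At least one argument is a data frame, so the data frame method
# is used and the result is a data frame.
res_df <- cbind(df, m)

# With matrices only, the default method returns a matrix.
res_mat <- cbind(m, m)

class(res_df)
is.matrix(res_mat)
```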
Hope this helps,
Rui Barradas
At 20:07 on 22/09/20, Rui Barradas wrote:
Hello,
Something like this?
F1$Y1 <- +grepl("_", F1$text)
F1 <- F1[c(1, 2, 4, 3)]
F1 <- tidyr::separate(F1, text, into = c("X1", "X2"), sep = "_",
                      fill = "right")
F1
Hope this helps,
Rui Barradas
At 19:55 on 22/09/20, Val wrote:
Hi All,
I am trying to create new columns based on the string content of
another column. First I want to identify the rows that contain a
particular string. If a row contains it, I want to split the string
and create two variables.
Here is my sample of data.
F1<-read.table(text="ID1 ID2 text
A1 B1 NONE
A1 B1 cf_12
A1 B1 NONE
A2 B2 X2_25
A2 B3 fd_15", header = TRUE, stringsAsFactors = FALSE)
If the variable "text" contains "_", I want to create an indicator
variable, as shown below.
F1$Y1 <- ifelse(grepl("_", F1$text),1,0)
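For comparison, a short sketch showing that the ifelse() wrapper is optional: grepl() already returns a logical vector, which converts directly to 0/1. The vector txt and the names has_us, y_ifelse, y_int and y_plus are illustrative stand-ins, not objects from the post.

```r
# Stand-in for F1$text from the sample data.
txt <- c("NONE", "cf_12", "NONE", "X2_25", "fd_15")

# TRUE where the string contains a literal underscore.
has_us <- grepl("_", txt, fixed = TRUE)

y_ifelse <- ifelse(has_us, 1, 0)  # numeric 0/1
y_int    <- as.integer(has_us)    # integer 0/1
y_plus   <- +has_us               # unary plus also coerces to integer
```

All three give the same 0/1 pattern; only the storage type differs (double for ifelse() with numeric constants, integer for the other two).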
Then I want to split that string into two parts, before "_" and after
"_", and create two variables as shown below
x1 <- strsplit(as.character(F1$text), "_", fixed = TRUE)
My problem is how to combine this with the original data frame. The
desired output is shown below,
ID1 ID2 Y1 X1 X2
A1 B1 0 NONE .
A1 B1 1 cf 12
A1 B1 0 NONE .
A2 B2 1 X2 25
A2 B3 1 fd 15
Any help?
Thank you.
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.