Removing values containing a specific character

Hi,
I tried this with a bigger dataset.

set.seed(25)
names <- sample(c("bob", "joe", "craig@gmail.com", "emily", "jane@yahoo.com"), 5e6, replace=TRUE)
set.seed(1651)
emails <- sample(c("bobj@cup.com", "joesmith@gmail.com", "craig@gmail.com",
                   "emily2@yahoo.com", "jane@yahoo.com"), 5e6, replace=TRUE)

df <- data.frame(names, emails)
dim(df)
#[1] 5000000       2
df[] <- lapply(df, as.character)
system.time(df[,1][grep("@",df$names)] <- "")
#   user  system elapsed
#  1.732   0.108   1.844
system.time(dfNew1 <- df[grep("\\w+",df$names),])
#   user  system elapsed
#  0.896   0.024   0.923
system.time(dfNew2 <- df[df$names!="",])
#   user  system elapsed
#  0.460   0.028   0.490
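
If the goal is only to drop the rows whose name contains "@", the blank-then-filter steps could possibly be collapsed into a single logical index with grepl(). This is a minimal sketch on a tiny made-up data frame (the values and the name dfNew3 are illustrative assumptions), and it was not timed against the 5e6-row example above:

```r
# Small illustrative data (hypothetical values, not the 5e6-row benchmark)
df <- data.frame(names  = c("bob", "craig@gmail.com", "emily"),
                 emails = c("bobj@cup.com", "craig@gmail.com", "emily2@yahoo.com"),
                 stringsAsFactors = FALSE)

# grepl() returns a logical vector; fixed = TRUE treats "@" as a literal
# string rather than a regular expression
keep <- !grepl("@", df$names, fixed = TRUE)

# Keep only the rows whose name does not contain "@"
dfNew3 <- df[keep, ]
```

Unlike the version above, this leaves df itself unchanged and never writes "" into the names column.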
A.K.