Function to remove white space
Try the following function to apply gsub to all character or factor
columns of a data.frame (while maintaining the class of every
column):
gsubDataFrame <- function(pattern, replacement, x, ...) {
    stopifnot(is.data.frame(x))
    for (i in seq_len(ncol(x))) {
        if (is.character(x[[i]])) {
            x[[i]] <- gsub(pattern, replacement, x[[i]], ...)
        } else if (is.factor(x[[i]])) {
            levels(x[[i]]) <- gsub(pattern, replacement, levels(x[[i]]), ...)
        } # else do nothing for numeric or other column types
    }
    x
}
E.g.,
> d <- data.frame(stringsAsFactors = FALSE,
+     Int=1:5,
+     Char=c("a a", "baa", "a a ", " aa", "b a a"),
+     Fac=factor(c("x x", "yxx", "x x ", " xx", "y x x")))
> str(d)
'data.frame':   5 obs. of  3 variables:
 $ Int : int  1 2 3 4 5
 $ Char: chr  "a a" "baa" "a a " " aa" ...
 $ Fac : Factor w/ 5 levels " xx","x x","x x ",..: 2 5 3 1 4
> str(gsubDataFrame(" ", "", d)) # delete spaces, use "[[:space:]]" for whitespace
'data.frame':   5 obs. of  3 variables:
 $ Int : int  1 2 3 4 5
 $ Char: chr  "aa" "baa" "aa" "aa" ...
 $ Fac : Factor w/ 2 levels "xx","yxx": 1 2 1 1 2

Bill Dunlap
TIBCO Software
wdunlap tibco.com
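
If only leading and trailing whitespace should be removed (keeping interior spaces), the same column-by-column loop can be combined with base R's trimws(). The following is a minimal sketch under that assumption; the helper name trimDataFrame and the small example data are hypothetical and not part of the thread.

```r
# Hypothetical helper (not from the thread): trim leading/trailing whitespace
# in character and factor columns, leaving all other columns untouched.
trimDataFrame <- function(x) {
    stopifnot(is.data.frame(x))
    for (i in seq_len(ncol(x))) {
        if (is.character(x[[i]])) {
            x[[i]] <- trimws(x[[i]])
        } else if (is.factor(x[[i]])) {
            # Levels that become identical after trimming are merged.
            levels(x[[i]]) <- trimws(levels(x[[i]]))
        }
    }
    x
}

# Made-up example data, for illustration only.
d2 <- data.frame(stringsAsFactors = FALSE,
                 Int = 1:3,
                 Char = c(" a a", "b ", "c"),
                 Fac = factor(c("x ", " x", "y")))
out <- trimDataFrame(d2)
str(out)
```

As with gsubDataFrame, the column classes are preserved: Int stays integer, Char stays character, and Fac stays a factor with fewer levels.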
On Tue, Feb 21, 2017 at 11:35 PM, José Luis <josestadistico at gmail.com> wrote:
Thanks for your answer. I'm using read.csv.
Sent from my iPad

On 22/2/2017, at 3:39, William Michels <wjm1 at caa.columbia.edu> wrote:
Hi José (and Rolf),
It's not entirely clear what type of 'whitespace' you're referring to, but if you're using read.table() or read.csv() to create your dataframe in the first place, setting 'strip.white = TRUE' will remove leading and trailing whitespace 'from unquoted character fields (numeric fields are always stripped).'
?read.table ?read.csv
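
A small sketch of the strip.white suggestion, reading from an in-memory string via the 'text' argument instead of a file. The CSV content and the variable names are made up for illustration; stringsAsFactors = FALSE is set explicitly so the result does not depend on the R version.

```r
# Made-up CSV content with padded, unquoted character fields.
csv_lines <- c("id,name,city",
               "1,  Ana  ,Madrid",
               "2,Luis,  San Sebastian")

# Default: leading/trailing blanks in character fields are kept.
raw <- read.csv(text = csv_lines, stringsAsFactors = FALSE)

# strip.white = TRUE: leading/trailing blanks are dropped at read time.
stripped <- read.csv(text = csv_lines, stringsAsFactors = FALSE,
                     strip.white = TRUE)

raw$name       # padded values are preserved
stripped$name  # padding removed
stripped$city  # the interior space in "San Sebastian" is kept
```

Note that this only affects leading and trailing whitespace; spaces inside a field still need gsub().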
Cheers, Bill
On 2/21/17, Rolf Turner <r.turner at auckland.ac.nz> wrote:
On 22/02/17 12:51, José Luis Aguilar wrote:
Hi all,
I have a dataframe with 34 columns and 1534 observations. In some columns I have strings with spaces, and I want to remove the spaces. Is there a function that removes whitespace from the entire dataframe? I use gsub, but I would need some function to automate this.
Something like
X <- as.data.frame(lapply(X,function(x){gsub(" ","",x)}))
Untested, since you provide no reproducible example (despite being told
by the posting guide to do so).
I do not know what my idea will do to numeric columns or to factors.
However it should give you at least a start.
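
For what it is worth, a small sketch of what the one-liner does to a numeric column: gsub() coerces its input to character, so every column comes back as character (or as factor in R versions where as.data.frame() defaults to stringsAsFactors = TRUE). The example data are made up, and stringsAsFactors = FALSE is set explicitly as an assumption so the result is the same across versions. The second half restricts the substitution to character columns.

```r
# Made-up example data, for illustration only.
X <- data.frame(Int = 1:3,
                Char = c("a a", " b", "c "),
                stringsAsFactors = FALSE)

# The one-liner applied to every column: Int is coerced to character.
Y <- as.data.frame(lapply(X, function(x) gsub(" ", "", x)),
                   stringsAsFactors = FALSE)
sapply(Y, class)

# Variant that touches only the character columns, so Int stays integer.
is_chr <- vapply(X, is.character, logical(1))
X[is_chr] <- lapply(X[is_chr], function(x) gsub(" ", "", x))
sapply(X, class)
```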
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.