R 3.3.1
OS X
Colleagues,
I have encountered an unexpected regex problem.
I have read an Excel file into R using the readxl package. Column names are:
COLNAMES <- c("Study ID", "Test and Biological Matrix", "Subject No. ", "Collection Date",
"Collection Time", "Scheduled Time Point", "Concentration", "Concentration Units",
"LLOQ", "ULOQ", "Comment")
As you can see, there is a trailing space in "Subject No. ". I would like to delete that space. The following works:
sub(" $", "", COLNAMES)
However, I would like a more general approach that removes any trailing whitespace.
I tried variations such as:
sub("[:blank:]$", "", COLNAMES)
(also, without the $ and 'space' instead of 'blank') without success. To my surprise, characters other than the trailing space were deleted but the trailing space remained.
Guidance on the correct syntax would be appreciated.
Dennis
Dennis Fisher MD
P < (The "P Less Than" Company)
Phone / Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com
Regression expression to delete one or more spaces at end of string
4 messages · Dennis Fisher, Marc Schwartz, William Dunlap +1 more
On Aug 2, 2016, at 11:46 AM, Dennis Fisher <fisher at plessthan.com> wrote:
[original message snipped]
Dennis,
There is actually an example in ?gsub:
## trim trailing white space
str <- "Now is the time "
sub(" +$", "", str) ## spaces only
The '+' sign will match the preceding space one or more times at the end of the character string.
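Note that, as per ?regex, [:space:] is the broader class (it also matches newlines and other whitespace characters, whereas [:blank:] covers space and tab), and the brackets need to be doubled in the regex to define the enclosing character group. An example would be: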
sub("[[:space:]]+$", "", str) ## white space, POSIX-style
which is also in ?gsub.
Regards,
Marc Schwartz
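To illustrate the difference between the two patterns above, here is a minimal sketch. The test vector x and the result names spaces_only and any_space are made up for illustration and are not from the thread.

```r
# Hypothetical test strings: a trailing space, a trailing tab,
# a mix of both, and a string with nothing to trim.
x <- c("Subject No. ", "Subject No.\t", "Subject No. \t ", "Study ID")

# Spaces only: a trailing tab is not matched, and in the mixed case
# only the final space after the tab is removed.
spaces_only <- sub(" +$", "", x)

# POSIX whitespace class: spaces, tabs, newlines, etc. are all removed.
any_space <- sub("[[:space:]]+$", "", x)

print(spaces_only)
print(any_space)
```

In both cases the interior space in "Subject No." is left alone, because the $ anchor restricts the match to the end of the string.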
First, use [[:blank:]] instead of [:blank:]. The latter matches any one of the characters colon, b, l, a, n, and k; the former matches whitespace.
Second, put + after [[:blank:]] to match one or more of them.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
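The single-bracket behaviour described above can be checked against the column names from the original post. This is only a sketch: the closing quote on "Comment" is assumed to be a plain double quote, and the names wrong and right are illustrative.

```r
COLNAMES <- c("Study ID", "Test and Biological Matrix", "Subject No. ", "Collection Date",
              "Collection Time", "Scheduled Time Point", "Concentration", "Concentration Units",
              "LLOQ", "ULOQ", "Comment")

# Single brackets: a set of the literal characters : b l a n k,
# so a final "n" is deleted and the trailing space is untouched.
wrong <- sub("[:blank:]$", "", COLNAMES)
print(COLNAMES[wrong != COLNAMES])  # the names that were altered
print(wrong[wrong != COLNAMES])     # what they became

# Doubled brackets plus "+": one or more blanks at the end of the string.
right <- sub("[[:blank:]]+$", "", COLNAMES)
print(right[3])
```

Only names ending in one of those six characters are affected by the single-bracket pattern, which would explain why other characters were deleted while the trailing space remained.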
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Double the [[]] and add a + for one-or-more characters:
sub("[[:blank:]]+$", "", COLNAMES)
--
Dr. David Forrest
drf at vims.edu
804-684-7900w 757-968-5509h 804-413-7125c
#240 Andrews Hall
Virginia Institute of Marine Science
Route 1208, Greate Road
PO Box 1346
Gloucester Point, VA, 23062-1346
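An alternative that was not raised in the thread: base R has provided trimws() since version 3.2.0, so it should be available under R 3.3.1. The sketch below uses a made-up vector x and assumes the default whitespace definition (spaces, tabs, and line breaks) is what is wanted.

```r
# Hypothetical input: trailing space, leading space, trailing tab.
x <- c("Subject No. ", " Study ID", "LLOQ\t")

# which = "right" trims only the end of each string,
# so the leading space in " Study ID" is kept.
trimmed <- trimws(x, which = "right")
print(trimmed)
```

With which = "both" (the default), leading whitespace would be removed as well, which may or may not be desirable for column names.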