String manipulation

3 messages · Luis Ridao Cruz, John Fox, Dimitris Rizopoulos

#
R-help,

I have a data frame that contains a character string column that looks
something like:

II11
II18
II23
III1
III13
III16
III19
III2
III7
IV10
IV11
IV12
IX16
IX4
V12
V18
V2
V20
V23
V4
VII14
VII18
VII21
VII26
VII28
VII33
VII4
VII48
VII5
....
....
....

I want to apply a function (e.g. mean) by grouping according to the
Roman-numeral part of the string, i.e.,

by I
by V
by VII
...
...
and so on.

I have looked at the string manipulation functions (grep, pmatch, ...) but I
can't get them to do what I want.
Can anyone help?

Thanks in advance.
#
Dear Luis,

How about gsub("[0-9]", "", x)? This assumes that x contains character
data and is not a factor, which is what a string column in a data frame would
usually be. If the variable really is a factor, then use as.character(x) in
the call to gsub().
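
As a minimal sketch of the factor case described above (the sample values and the names x and roman are made up for illustration and are not taken from the original data):

```r
# Hypothetical sample values; x is deliberately stored as a factor
x <- factor(c("II11", "II18", "III1", "IV10", "V2"))

# Convert to character first, then delete every digit,
# which leaves only the Roman-numeral prefix
roman <- gsub("[0-9]", "", as.character(x))
roman
```

The result is an ordinary character vector, so it can be used directly as a grouping variable.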

I hope this helps,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
--------------------------------
#
You could use gsub(), e.g.,

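strg <- c("II11", "II18", "II23", "III1", "III13", "III16", "III19",
          "III2", "III7", "IV10", "IV11", "IV12")
# numeric suffix: delete everything that is not a digit
x <- as.numeric(gsub("[^0-9]", "", strg))
# Roman-numeral prefix: delete every digit
y <- gsub("[0-9]", "", strg)
# mean of the numeric suffix within each Roman-numeral group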
tapply(x, y, mean)
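
If the quantity to be averaged lives in a separate column of the data frame rather than in the string itself, the same idea might look like the sketch below. The data frame dat and its columns id, value and group are hypothetical names with made-up values, used only to illustrate the grouping step.

```r
# Hypothetical data frame; column names and values are illustrative only
dat <- data.frame(
  id    = c("II11", "II18", "III1", "III13", "IV10", "V2"),
  value = c(1, 3, 10, 20, 5, 7),
  stringsAsFactors = FALSE
)

# Grouping key: the id with all digits removed
dat$group <- gsub("[0-9]", "", dat$id)

# Mean of value within each Roman-numeral group
means <- tapply(dat$value, dat$group, mean)
means
```

Any other summary function (median, sd, ...) can be passed to tapply() in place of mean.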


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Luis Ridao Cruz" <Luisr at frs.fo>
To: <r-help at stat.math.ethz.ch>
Sent: Thursday, October 20, 2005 3:23 PM
Subject: [R] String manipulation