Message-ID: <1344236323.34136.YahooMailNeo@web142603.mail.bf1.yahoo.com>
Date: 2012-08-06T06:58:43Z
From: arun
Subject: regexpr with accents
In-Reply-To: <F649E9A1-B354-4178-8FF8-E97BE6C4C135@gmail.com>

Hi,

It works for me. I am using R 2.15 on Ubuntu 12.04.

d1 <- data.frame(V1 = 1:5, V2 = c("some text = 9", "some t?xt=9", "s?me t?xt=9",
                                  "s?me text=9", "some t?xt=9"))
d1
#  V1            V2
#1  1 some text = 9
#2  2   some t?xt=9
#3  3   s?me t?xt=9
#4  4   s?me text=9
#5  5   some t?xt=9

d1$V1[regexpr("some t?xt=9", d1$V2) > 0] <- 9
d1$V1[regexpr("s?me text=9", d1$V2) > 0] <- 9
d1$V1[regexpr("some t?xt=9", d1$V2) > 0] <- 9
d1$V1[regexpr("s?me t?xt=9", d1$V2) > 0] <- 9
d1$V1[regexpr("some text = 9", d1$V2) > 0] <- 9

d1
#  V1            V2
#1  9 some text = 9
#2  9   some t?xt=9
#3  9   s?me t?xt=9
#4  9   s?me text=9
#5  9   some t?xt=9
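
If the goal is to match the string whether or not it carries an accent, here is a minimal sketch of two possible approaches. The data, the accented letters (written as \u escapes so the file encoding does not matter) and the names `hits`, `plain` and `hits2` are assumptions made for illustration, not taken from the original data.

```r
# Hypothetical data: "\u00e9" is e-acute and "\u00e8" is e-grave.
d1 <- data.frame(
  V1 = 1:4,
  V2 = c("some text = 9", "some t\u00e9xt=9", "some t\u00e8xt = 9", "other = 1"),
  stringsAsFactors = FALSE
)

# Approach 1: "." matches any single character, so the accented and the
# unaccented spellings both match; " *" allows optional spaces around "=".
hits <- grepl("some t.xt *= *9", d1$V2)

# Approach 2: replace the accented letters with plain ones first (chartr maps
# character by character), then match the unaccented pattern.
plain <- chartr("\u00e9\u00e8", "ee", d1$V2)
hits2 <- grepl("some text *= *9", plain)

# Both approaches select the same rows here; recode them as in the thread.
d1$V1[hits] <- 9L
```

Note that "." also matches letters you may not want (for example "some taxt=9"), so the chartr() route is stricter if only specific accents should be ignored. For a purely literal match, grepl(pattern, x, fixed = TRUE) avoids regular-expression metacharacters altogether.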

A.K.




----- Original Message -----
From: Luca Meyer <lucam1968 at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Monday, August 6, 2012 1:55 AM
Subject: [R] regexpr with accents

Hello,

I have built a syntax to find out whether a given substring is included in a larger string. It works like this:

d1$V1[regexpr("some text = 9",d1$V2)>0] <- 9

This works fine as long as "some text" contains only standard ASCII characters. However, it does not work when accented characters are included, as in the following:

d1$V1[regexpr("some t?xt = 9",d1$V2)>0] <- 9

I have tried replacing "?" with several wildcards, but it did not work. Can anyone suggest how to make the pattern match the string while ignoring the accent?

Thank you in advance,

Luca

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.