________________________________
From: Bert Gunter <bgunter.4567 at gmail.com>
Sent: Wednesday, 26 August 2020 4:26 AM
To: Peter Bishop <bishop_peterj at hotmail.com>
Cc: r-help at r-project.org <r-help at r-project.org>
Subject: Re: [R] Matching backslash in a table's column using R
language
1. I am far from an expert on such matters
2. It is unclear to me what your input is -- I assume a file.
The problem, as you indicate, is that R's parser sees "\B" as an
incorrect escape character, so, for example:
cat("\B")
Error: '\B' is an unrecognized escape in character string starting
""\B"
In any case, I think you should look at ?scan. Here is an example where
I scan from the keyboard first and then remove the "\". You may have to
scan from a file to do this.
z <-scan(file = "", what = "character")
1: A\BCDEFG
2: #CR terminates input
Read 1 item
cat(z)
A\BCDEFG
nchar(z)
[1] 8 ## scan read in the "\" as a single character from the console.
sub("\\\\","",z) ## Yes, 4 backslashes
[1] "ABCDEFG"
There may be better ways to do this, but as I said, I'm no expert.
BTW, in posting here, please post in *plain text,* as the server can
mangle html.
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Tue, Aug 25, 2020 at 9:02 AM Peter Bishop
<bishop_peterj at hotmail.com<mailto:bishop_peterj at hotmail.com>> wrote:
In SQL, I'm using R as a way to filter data based on:
- 20 characters in the range <space> to <tilde>
- excluding <quote>, <apostrophe>, <comma>, <question mark>,
<backslash>, <backtick>
Given a SQL column containing the data:
code
----
A\BCDEFG
and the T-SQL script:
EXEC [sys].[sp_execute_external_script]
@language=N'R',
@script=N'
pattern1 = "^[\x20-\x7e]{1,20}$"
pattern2 = "[\x22\x27\x2c\x3f\x5c\x60]"
outData <- subset(inData, grepl(pattern1, code, perl=TRUE) &
!grepl(pattern2, code, perl=TRUE))',
@input_data_1 = N'SELECT [code] FROM [dbo].[products]',
@input_data_1_name = N'inData',
@output_data_1_name = N'outData'
WITH
RESULT SETS (AS OBJECT [dbo].[products]);
GO
why does the row detailed above get returned? I know that backslash is
a special character but not in the SQL table. Consequently, the T-SQL
code:
SELECT ASCII(SUBSTRING([value], 2, 1)) FROM [table]
returns 92 (the ASCII code for <backslash>) which shows that this is
being recognised as a backslash character and not as an escape
indicator for the following "B".
Can anyone advise how I can filter out the <backslash> in the way that
the other identified characters are being successfully filtered? As the
data is being retrieved from a table, I can?t ask the data provider to
use ?\\? instead of ?\? as that will be invalid for other uses.
Thanks.
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org<mailto:R-help at r-project.org> mailing list -- To
UNSUBSCRIBE and more, see
https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=02%7C01%7C%7C61b6372acf8c4d215faa08d8493093e1%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637339820268206663&sdata=9t4wAFzYNfo%2B%2BITjORuSPUVNDtcSHm5hJN8yYGnEUnU%3D&reserved=0<https://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-help&data=02%7C01%7C%7C61b6372acf8c4d215faa08d8493093e1%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637339820268206663&sdata=9t4wAFzYNfo%2B%2BITjORuSPUVNDtcSHm5hJN8yYGnEUnU%3D&reserved=0>
PLEASE do read the posting guide
https://eur04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.r-project.org%2Fposting-guide.html&data=02%7C01%7C%7C61b6372acf8c4d215faa08d8493093e1%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637339820268206663&sdata=UVGq0iXkHMdLFC2oL1l5WG730HW1fvEJPDFaiHS3a98%3D&reserved=0<https://eur04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.r-project.org%2Fposting-guide.html&data=02%7C01%7C%7C61b6372acf8c4d215faa08d8493093e1%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637339820268206663&sdata=UVGq0iXkHMdLFC2oL1l5WG730HW1fvEJPDFaiHS3a98%3D&reserved=0>
and provide commented, minimal, self-contained, reproducible code.
[[alternative HTML version deleted]]