Sent from my phone. Please excuse my brevity.
On December 15, 2016 8:46:55 AM PST, Steven Nagy <nstefi at gmail.com> wrote:
>I tried to send this email, but it didn't go through. I guess pictures
>cannot be sent in HTML-formatted emails?
>I'm re-sending it without the picture, with just a comment in its place
>as a placeholder.
>
>Thanks,
>Steven
>
>
>From: Steven Nagy [mailto:nstefi at gmail.com]
>Sent: Monday, December 12, 2016 10:50 PM
>To: 'Bert Gunter' <bgunter.4567 at gmail.com>
>Cc: 'R-help' <r-help at r-project.org>
>Subject: RE: [R] Need some help with regular expression
>
>Hi Bert and all,
>
>Sorry, I was too busy at work and didn't have time to continue this
>until now.
>I studied "?regexp" and I can understand your regular expression now:
>sub(".*: *([[:alnum:]]* *-> *STU|STU *-> *[[:alnum:]]*).*","\\1",x)
>
>But I also wanted to split these results into 2 columns. Your previous
>command gives me this result:
>[1] "NMA -> STU" "STU -> REG" "-> STU"
>
>and I wanted to split it further to show this:
>From  To
>NMA   STU
>STU   REG
>      STU
>
>I still don't quite understand the backreferences. How could I have 2
>backreferences, one for the left side of the "->" sign and one for the
>right side?
>
>It seems like I need to apply the "sub" function twice, similar to how I
>used the "strapply" function twice in my original post:
>strapply(strapply(a, "(\\w+ -> STU|STU -> \\w+)", c, backref = -1,
>perl = TRUE), "(\\w+) -> (\\w+)", c, backref = -2, perl = TRUE)
>
>Or maybe there is a simpler way, using only 1 "sub" call and 2
>backreferences?
>
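One way to get both sides from a single pattern is to give it two capture groups, so that "\\1" refers to the text left of "->" and "\\2" to the text right of it. The sketch below is only an illustration: the names `pat`, `changes` and `m` are made up here, and it assumes the input is already in the cleaned-up "NMA -> STU" form quoted above.

```r
# Cleaned-up values, as returned by the earlier sub() call in this thread
x <- c("NMA -> STU", "STU -> REG", "-> STU")

# Two capture groups: group 1 is the (possibly empty) left side,
# group 2 the (possibly empty) right side of "->"
pat <- "^ *([[:alnum:]]*) *-> *([[:alnum:]]*) *$"

# Same pattern, different backreference for each column
changes <- data.frame(
  From = sub(pat, "\\1", x),
  To   = sub(pat, "\\2", x),
  stringsAsFactors = FALSE
)
changes

# Alternative: one matching pass with regexec()/regmatches();
# each list element holds the full match, then group 1, then group 2
m <- do.call(rbind, regmatches(x, regexec(pat, x)))
m[, 2:3]
```

sub() is still called twice here, but with one shared pattern. The regexec() variant returns both groups from a single pass, which may be more convenient when there are more than two pieces to pull out.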
>Also, I'm not sure what to do after I get the data. How could I
>represent the member type changes graphically? We need to analyze the
>behavior of switching from STU to another type or from another type to
>STU.
>Google Analytics has a nice chart under Behavior Flow, or Users Flow,
>and it looks like this:
><here was my picture from Google Analytics - it's from Behavior Flow or
>Users Flow, showing flows from one category to another and then on to
>a further one>
>
>
>
>Is there any graphical representation in R that is similar to this?
>
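Before choosing a chart, it may help to count how often each From/To pair occurs. Flow-style (Sankey or alluvial) diagrams typically take a source/target/count table of this kind as input. A minimal base-R sketch, assuming a hypothetical data frame `changes` with columns From and To, and relabelling the blank type as "(blank)" purely for display:

```r
# Hypothetical extracted transitions (same values as in this thread)
changes <- data.frame(
  From = c("NMA", "STU", ""),
  To   = c("STU", "REG", "STU"),
  stringsAsFactors = FALSE
)

# Give the empty member type a visible label
changes$From[changes$From == ""] <- "(blank)"

# Cross-tabulate: rows are the old type, columns the new type
flows <- table(From = changes$From, To = changes$To)
flows

# Long format (one row per From/To pair with its count)
links <- as.data.frame(flows, responseName = "Count")
links

# A simple base-graphics view of the counts
barplot(flows, beside = TRUE, legend.text = TRUE,
        xlab = "To", ylab = "Number of changes")
```

Add-on packages for Sankey or alluvial diagrams exist on CRAN (networkD3 and alluvial are two I am aware of), but which one best matches the Google Analytics view is something to try out rather than a recommendation.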
>Thanks a lot,
>Steven
>
>-----Original Message-----
>From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Bert Gunter
>Sent: Sunday, November 20, 2016 10:05 PM
>To: Aliz Csonka <mailto:lyzae.ro at gmail.com>
>Cc: R-help <mailto:r-help at r-project.org>
>Subject: Re: [R] Need some help with regular expression
>
>Although others may respond, I think you will do much better studying
>?regexp, which will answer all your questions. I believe the effort you
>will make figuring it out will pay dividends for your future R/regular
>expression usage that you cannot gain from my direct explanation.
>
>Good luck.
>
>Best,
>Bert
>Bert Gunter
>
>"The trouble with having an open mind is that people keep coming along
>and
>sticking things into it."
>-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
>On Sun, Nov 20, 2016 at 6:40 PM, Steven Nagy <mailto:nstefi at gmail.com> wrote:
>> Thanks a lot, Bert. That's amazing. I am very new to both R and regular
>> expressions. I don't really understand the regular expression that you
>> used below.
>> And it looks like I don't even need any special library, like
>> "gsubfn" for the strapply function.
>> I was trying to use the regexr.com website to analyze your regular
>> expression, but it doesn't seem to match any text there.
>> Can you explain the regular expression that you used?
>> ".*: *([[:alnum:]]* *-> *STU|STU *-> *[[:alnum:]]*).*"
>> So the dot in the front means any character, and the star after it
>> means that it can repeat 0 or more times, right?
>> Then comes a colon character ":" and a space. What is the next star
>> after that? Does it mean that the sequence before it can again
>> repeat 0 or more times?
>> And what are the double square brackets?
>> Is ":alnum:" specific to R? I don't think "regexr.com" understands
>> that. Or maybe that site is for regular expressions in JavaScript, and
>> the syntax is different in R?
>>
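The pieces asked about above can be tried one at a time in R. This is a sketch using two of the example strings from this thread; the variable names are arbitrary. Two points it relies on: a star applies only to the single item directly before it (so ": *" means a colon followed by zero or more spaces), and "[:alnum:]" is a POSIX character class that is only valid inside a bracket expression, hence the double brackets. It is not specific to R, but JavaScript regular expressions do not support POSIX classes, which would explain why regexr.com does not match.

```r
x <- c("Name.MEMBER_TYPE: STU -> REG ; CATEGORY: 1 ->",
       "Name.MEMBER_TYPE: -> STU")

# "." is any character; "*" repeats the preceding item 0 or more times
grepl("^.*$", c("", "anything"))

# ": *" = a colon, then zero or more spaces (the star binds to the space only)
grepl("^: *$", c(":", ":   ", ": x"))

# Outer [] is a bracket expression, [:alnum:] the class of letters and digits
grepl("^[[:alnum:]]+$", c("abc123", "a-b"))

# The full pattern: everything up to a colon, optional spaces, then the
# captured group (either "<word> -> STU" or "STU -> <word>"), then the rest
pat <- ".*: *([[:alnum:]]* *-> *STU|STU *-> *[[:alnum:]]*).*"

# "\\1" in an R string is the regex backreference \1: the whole input is
# replaced by whatever the parenthesised group matched
res <- sub(pat, "\\1", x)
res
```

Because the pattern starts and ends with ".*", it matches the entire string, so replacing the match with "\\1" leaves only the captured part.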
>> Thank you,
>> Steven
>>
>> -----Original Message-----
>> From: Bert Gunter [mailto:bgunter.4567 at gmail.com]
>> Sent: Sunday, November 20, 2016 2:15 PM
>> To: Steven Nagy <mailto:nstefi at gmail.com>
>> Cc: R-help <mailto:r-help at r-project.org>
>> Subject: Re: [R] Need some help with regular expression
>>
>> If I understand you correctly, I think you are making it more complex
>> than necessary. Using your example (thanks!!), the following should
>> get you started:
>>
>>
>>> x <- c("Name.MEMBER_TYPE: NMA -> STU ; CATEGORY:  -> 1 ; CITY: MISSISSAUGA -> Mississauga ; ZIP: L5N1H9 -> L5N 1H9 ; COUNTRY: CAN -> ; MEMBER_STATUS:  -> N",
>>>        "Name.MEMBER_TYPE: STU -> REG ; CATEGORY: 1 ->",
>>>        "Name.MEMBER_TYPE: -> STU")
>>>
>>> x
>> [1] "Name.MEMBER_TYPE: NMA -> STU ; CATEGORY:  -> 1 ; CITY: MISSISSAUGA -> Mississauga ; ZIP: L5N1H9 -> L5N 1H9 ; COUNTRY: CAN -> ; MEMBER_STATUS:  -> N"
>> [2] "Name.MEMBER_TYPE: STU -> REG ; CATEGORY: 1 ->"
>> [3] "Name.MEMBER_TYPE: -> STU"
>>>
>>> sub(".*: *([[:alnum:]]* *-> *STU|STU *-> *[[:alnum:]]*).*","\\1",x)
>> [1] "NMA -> STU" "STU -> REG" "-> STU"
>>
>>
>> I am sure that you can get things to the form you desire in one go
>> with some fiddling of the above, but it was easier for me to write the
>> regex to pick out the pieces you wanted and leave the rest to you.
>> Others may have slicker ways to do it, of course.
>>
>> HTH
>>
>> Cheers,
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming
>along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Sat, Nov 19, 2016 at 8:06 PM, Steven Nagy <mailto:nstefi at gmail.com> wrote:
>>> I tried out a regular expression on this website:
>>>
>>> http://regexr.com/3en1m
>>>
>>>
>>>
>>> So the input text is:
>>>
>>> "Name.MEMBER_TYPE:  -> STU"
>>>
>>> The regular expression is: ((?:\w+|\s) -> STU|STU -> (?:\w+|\s))
>>>
>>> And it returns:
>>>
>>> "  -> STU"
>>>
>>> but when I use it in R, it doesn't return the same result:
>>>
>>> strapply(c, "((?:\\w+|\\s) -> STU|STU -> (?:\\w+|\\s))", c, backref = -1, perl = TRUE)
>>>
>>> returns:
>>> "Name.MEMBER_TYPE: -> STU"
>>>
>>>
>>>
>>>
>>>
>>> Here is what I was trying to do:
>>>
>>>
>>>
>>> I need to extract some values from a log table, and I created a
>>> regular expression that helps me with that.
>>>
>>> The log table has cells with values like:
>>>
>>> a = "Name.MEMBER_TYPE: NMA -> STU ; CATEGORY:  -> 1 ; CITY: MISSISSAUGA -> Mississauga ; ZIP: L5N1H9 -> L5N 1H9 ; COUNTRY: CAN -> ; MEMBER_STATUS:  -> N"
>>>
>>> or
>>> b = "Name.MEMBER_TYPE: STU -> REG ; CATEGORY: 1 ->"
>>>
>>> so I needed to extract the values that a STU member type is changing
>>> from and to, so I needed NMA, STU in the 1st case or STU, REG in the
>>> 2nd case.
>>>
>>> I came up with this expression which worked in both cases:
>>>
>>> strapply(strapply(a, "(\\w+ -> STU|STU -> \\w+)", c, backref = -1,
>>> perl = TRUE), "(\\w+) -> (\\w+)", c, backref = -2, perl = TRUE)
>>>
>>>
>>>
>>> But I had a 3rd case when the source member type was blank:
>>>
>>> c = "Name.MEMBER_TYPE: -> STU"
>>>
>>> and in that case it returned an error:
>>>
>>> strapply(strapply(c, "(\\w+ -> STU|STU -> \\w+)", c, backref = -1,
>>> perl = TRUE), "(\\w+) -> (\\w+)", c, backref = -2, perl = TRUE)
>>>
>>> Error: is.character(x) is not TRUE
>>>
>>>
>>>
>>> I found that the error is because this returns NULL:
>>>
>>> strapply(c, "(\\w+ -> STU|STU -> \\w+)", c, backref = -1, perl = TRUE)
>>>
>>>
>>>
>>>
>>>
>>> So I tried to modify the regular expression to match any word or
>>> blank space:
>>>
>>> strapply(c, "((?:\\w+|\\s) -> STU|STU -> (?:\\w+|\\s))", c, backref = -1, perl = TRUE)
>>>
>>>
>>>
>>> but this returned the whole value of "c":
>>>
>>> "Name.MEMBER_TYPE:  -> STU"
>>>
>>> and I only needed "  -> STU", as it shows on the regexr.com website.
>>>
>>> Is the result wrong on the regexr.com website, or does strapply return
>>> the wrong result?
>>>
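One thing worth checking in the calls above: the data is stored in a variable named `c`, and the same name is then passed as the function argument. When `c` appears as an argument rather than in call position, R finds the character variable first, so the string, not the base function c(), is what gets passed. I have not verified how gsubfn's strapply treats a character value in that position, so take this only as a possible cause. The sketch below shows the shadowing in base R, plus a base-R extraction with regmatches(); `txt`, `shadowed`, `pat` and `m` are names invented for the example, and `txt` assumes the blank source type is ordinary whitespace (two spaces after the colon).

```r
# Assumed input: two plain spaces after the colon
txt <- "Name.MEMBER_TYPE:  -> STU"

# Inside this block `c` is a character variable, so using it as an
# argument yields the string rather than the base function c()
shadowed <- local({
  c <- txt
  is.function(c)
})
shadowed   # FALSE

# Base-R extraction of just the matched part, with the same alternation
pat <- "(?:\\w+|\\s) -> STU|STU -> (?:\\w+|\\s)"
m <- regmatches(txt, regexpr(pat, txt, perl = TRUE))
m

# Strip the surrounding whitespace if only "-> STU" is wanted
trimws(m)
```

Using a different variable name for the data (for example `txt` instead of `c`) avoids the ambiguity altogether.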
>>>
>>>
>>> Thanks,
>>>
>>> Steven
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> mailto:R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>