Sent from my phone. Please excuse my brevity.
On May 31, 2016 6:26:31 PM PDT, Val <valkremk at gmail.com> wrote:
>Thank you so much Jeff. It worked for this example.
>
>When I read it from a file (c:\data\test.txt) it did not work
>
>KLEM="c:\data"
>KR=paste(KLEM,"\test.txt",sep="")
>indta <- readLines(KR, skip=46) # not interested in the first 46 lines
>
>pattern <- "^.*group (\\d+)[^:]*: *([-+0-9.eE]*).*$"
>firstlines <- grep( pattern, indta )
># Replace the matched portion (entire string) with the first capture string
>v1 <- as.numeric( sub( pattern, "\\1", indta[ firstlines ] ) )
># Replace the matched portion (entire string) with the second capture string
>v2 <- as.numeric( sub( pattern, "\\2", indta[ firstlines ] ) )
># Convert the lines just after the first lines to numeric
>v3 <- as.numeric( indta[ firstlines + 1 ] )
># put it all into a data frame
>result <- data.frame( Group = v1, Mean = v2, SE = v3 )
>
>result
>[1] Group Mean SE
><0 rows> (or 0-length row.names)
>
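
Two things in the file-reading lines above look like likely culprits, though I cannot test against your actual file. In an R string literal a backslash starts an escape sequence, so "\t" in "\test.txt" is a tab character and "\d" is not a valid escape at all; use forward slashes, doubled backslashes, or file.path() instead. Also, readLines() has no skip argument (see ?readLines), so one way to ignore leading lines is to read everything and then drop them by index. Below is a minimal sketch of that approach. It writes a small temporary file so that it is self-contained; the two junk header lines and the names tmp and n_skip are made up for the illustration, and you would substitute your own path and 46.

```r
# Hypothetical stand-in for c:/data/test.txt: two junk header lines, then data
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "header line to ignore",
  "another header line to ignore",
  "Mean of weight group 1, SE of mean : 72.289037489555276",
  " 11.512956539215610",
  "group 3 mean , SE of Mean : 78.310441258245469",
  " 13.015876679555"
), tmp)

# For the real file, build the path without single backslashes, e.g.
# KR <- file.path("c:/data", "test.txt")
KR <- tmp

n_skip <- 2                      # would be 46 for the file described above
indta <- readLines(KR)           # read every line; readLines() cannot skip
indta <- tail(indta, -n_skip)    # drop the first n_skip lines

pattern <- "^.*group (\\d+)[^:]*: *([-+0-9.eE]*).*$"
firstlines <- grep(pattern, indta)
result <- data.frame(
  Group = as.numeric(sub(pattern, "\\1", indta[firstlines])),
  Mean  = as.numeric(sub(pattern, "\\2", indta[firstlines])),
  SE    = as.numeric(indta[firstlines + 1])
)
result
```

If grep() still finds nothing on the real file, printing head(indta) should show whether the remaining lines really contain "group " followed by a number; the pattern is case-sensitive.
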
>Thank you in advance
>
>
>On Tue, May 31, 2016 at 1:12 AM, Jeff Newmiller
><jdnewmil at dcn.davis.ca.us> wrote:
>> Please learn to post in plain text (the setting is in your email client...
>> somewhere), as HTML is "What We See Is Not What You Saw" on this mailing
>> list. In conjunction with that, try reading some of the fine material
>> mentioned in the Posting Guide about making reproducible examples like this
>> one:
>>
>> # You could read in a file
>> # indta <- readLines( "out.txt" )
>> # but there is no "current directory" in an email
>> # so here I have used the dput() function to make source code
>> # that creates a self-contained R object
>>
>> indta <- c(
>> "Mean of weight group 1, SE of mean : 72.289037489555276",
>> " 11.512956539215610",
>> "Average weight of group 2, SE of Mean : 83.940053900595013",
>> " 10.198495690144522",
>> "group 3 mean , SE of Mean : 78.310441258245469",
>> " 13.015876679555",
>> "Mean of weight of group 4, SE of Mean : 76.967516495101669",
>> " 12.1254882985", "")
>>
>> # Regular expression patterns are discussed all over the internet
>> # in many places OTHER than R
>> # You can start with ?regex, but there are many fine tutorials also
>>
>> pattern <- "^.*group (\\d+)[^:]*: *([-+0-9.eE]*).*$"
>> # For this task the regex has to match the whole "first line" of each set
>> # ^ =match starting at the beginning of the string
>> # .* =any character, zero or more times
>> # "group " =match these characters
>> # ( =first capture string starts here
>> # \\d = any digit (first backslash for R, second backslash for regex)
>> # + =one or more of the preceding (any digit)
>> # ) =end of first capture string
>> # [^:] =any non-colon character
>> # * =zero or more of the preceding (non-colon character)
>> # : =match a colon exactly
>> # " *" =match zero or more spaces
>> # ( =second capture string starts here
>> # [ =start of a set of equally acceptable characters
>> # -+ =either of these characters are acceptable
>> # 0-9 =any digit would be acceptable
>> # . =a period is acceptable (this is inside the [])
>> # eE =in case you get exponential notation input
>> # ] =end of the set of acceptable characters (number)
>> # * =number of acceptable characters can be zero or more
>> # ) =second capture string stops here
>> # .* =zero or more of any character (just in case)
>> # $ =at end of pattern, requires that the match reach the end
>> # of the string
>>
>> # identify indexes of strings that match the pattern
>> firstlines <- grep( pattern, indta )
>> # Replace the matched portion (entire string) with the first capture
>> # string
>> v1 <- as.numeric( sub( pattern, "\\1", indta[ firstlines ] ) )
>> # Replace the matched portion (entire string) with the second capture
>> # string
>> v2 <- as.numeric( sub( pattern, "\\2", indta[ firstlines ] ) )
>> # Convert the lines just after the first lines to numeric
>> v3 <- as.numeric( indta[ firstlines + 1 ] )
>> # put it all into a data frame
>> result <- data.frame( Group = v1, Mean = v2, SE = v3 )
>>
>> Figuring out how to deliver your result (output) is a separate question that
>> depends on where you want it to go.
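
For the table layout requested in the original post, one possible way to deliver the result is sketched below. It assumes a plain text file is what is wanted; the name outfile and the temporary location are made up for the illustration. Note that as.numeric() yields doubles, which hold only about 15 to 17 significant digits, so this sketch keeps the captured values as character strings in order to write the digits back unchanged.

```r
indta <- c(
  "Mean of weight group 1, SE of mean : 72.289037489555276",
  " 11.512956539215610",
  "Average weight of group 2, SE of Mean : 83.940053900595013",
  " 10.198495690144522")

pattern <- "^.*group (\\d+)[^:]*: *([-+0-9.eE]*).*$"
firstlines <- grep(pattern, indta)

# Keep the captures as text so no digits are lost in conversion
out <- data.frame(
  Group = paste0("Gr", sub(pattern, "\\1", indta[firstlines])),
  Mean  = sub(pattern, "\\2", indta[firstlines]),
  SE    = trimws(indta[firstlines + 1]),
  stringsAsFactors = FALSE
)

# Hypothetical output location; replace with a real path as needed
outfile <- tempfile(fileext = ".txt")
write.table(out, outfile, quote = FALSE, row.names = FALSE, col.names = FALSE)
readLines(outfile)
```

If the values are needed for arithmetic rather than for display, converting with as.numeric() as in the code above remains the simpler choice.
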
>>
>>
>> On Mon, 30 May 2016, Val wrote:
>>
>>> Hi all,
>>>
>>> I have a messy text file, and I want to extract some information from it.
>>> Here is the text file (out.txt). Each record has two lines: the mean comes
>>> on the first line and the SE of the mean is on the second line. Here is
>>> a sample of the data.
>>>
>>> Mean of weight group 1, SE of mean : 72.289037489555276
>>> 11.512956539215610
>>> Average weight of group 2, SE of Mean : 83.940053900595013
>>> 10.198495690144522
>>> group 3 mean , SE of Mean : 78.310441258245469
>>> 13.015876679555
>>> Mean of weight of group 4, SE of Mean : 76.967516495101669
>>> 12.1254882985
>>>
>>> I want to produce the following table. How do I read the file first and
>>> then produce the table?
>>>
>>>
>>> Gr1 72.289037489555276 11.512956539215610
>>> Gr2 83.940053900595013 10.198495690144522
>>> Gr3 78.310441258245469 13.015876679555
>>> Gr4 76.967516495101669 12.1254882985
>>>
>>>
>>> Thank you in advance
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> ---------------------------------------------------------------------------