How to parse text file into a table?
3 messages · Daren Tan, Ben Bolker, jim holtman
Daren Tan <darentan76 at gmail.com> writes:
I am given a text file of records to be converted into a table format. I have searched related topics and packages, but can't find any similar cases. Please help. A sample record is given below. Note that the last element doesn't have a colon.

###---------Start of record----------------------
Name : John
Height: 170cm
Weight : 70kg
Age: 30
Status: Married
Children: 2
Employment Engineer
###---------End of record-----------------------

The table should have this header:

Name Height Weight Age Status Children Employment
You can put something together with readLines, grep (to find the appropriate lines), gsub (to strip off the first field) ... if you need an explicit example, ask again ... Ben Bolker
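A rough sketch of that readLines / grep / sub idea, not Ben's actual code. The inline `lines` vector stands in for `readLines()` on the real file, and the names `fields`, `field_re`, `rec` and `tab` are made up for illustration. The colon is treated as optional so that the "Employment Engineer" line is still picked up; this assumes every data line starts with one of the known field names.

```r
# Stand-in for: lines <- readLines("records.txt")   (hypothetical file name)
lines <- c(
  "###---------Start of record----------------------",
  "Name : John",
  "Height: 170cm",
  "Weight : 70kg",
  "Age: 30",
  "Status: Married",
  "Children: 2",
  "Employment Engineer",
  "###---------End of record-----------------------"
)

fields <- c("Name", "Height", "Weight", "Age", "Status", "Children", "Employment")

# Field name, optional spaces, optional colon, optional spaces
field_re <- paste0("^[ \t]*(", paste(fields, collapse = "|"), ")[ \t]*:?[ \t]*")

# grep step: find the lines that carry data
is_field   <- grepl(field_re, lines)
data_lines <- lines[is_field]

# sub step: split each data line into field name and value
keys   <- sub(paste0(field_re, ".*$"), "\\1", data_lines)
values <- sub("[ \t]+$", "", sub(field_re, "", data_lines))

# Record number of each data line = number of "Start of record" markers seen so far
rec <- cumsum(grepl("Start of record", lines))[is_field]

# Fill a records-by-fields character matrix, then turn it into a data frame
tab <- matrix(NA_character_, nrow = max(rec), ncol = length(fields),
              dimnames = list(NULL, fields))
tab[cbind(rec, match(keys, fields))] <- values
tab <- as.data.frame(tab, stringsAsFactors = FALSE)
```

Fields that are missing from a record stay NA, so all rows keep the same columns.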
Here is one way of reading in your data:
input <- readLines(textConnection("###---------Start of record----------------------

Name: John
Height: 170cm
Weight: 70kg
Age: 30

Status: Married
Children: 2

Employment: Engineer

###---------End of record-----------------------"))
closeAllConnections()
result <- list()
recordNo <- 1
for (i in input){
    if (nchar(i) == 0) next    # skip blank lines
    if (length(grep("Start of record", i)) != 0){
        # initialize next element of the list
        result[[recordNo]] <- c(Name=NA, Height=NA, Weight=NA,
                                Age=NA, Status=NA, Children=NA)
    }
    else if (length(grep("End of record", i)) != 0) recordNo <- recordNo + 1
    else {
        # the following assumes consistent naming, with ":" separating
        # the field name from its value; if not, add some error checking code
        name <- sub("^(.*):.*", "\\1", i)
        value <- sub(".*:\\s*(.*)", "\\1", i)
        result[[recordNo]][name] <- value
    }
}
result
[[1]]
      Name     Height     Weight        Age     Status   Children Employment
    "John"    "170cm"     "70kg"       "30"  "Married"        "2" "Engineer"
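To get from this list to the table with the requested header, one option is to bind the records together row by row. This is only a sketch: the `result` list is rebuilt inline so the snippet runs on its own, `header`, `rows` and `tab` are illustrative names, and converting Age and Children to integers assumes those fields always hold whole numbers.

```r
# Stand-in for the `result` list built above: one named character vector per record
result <- list(
  c(Name = "John", Height = "170cm", Weight = "70kg", Age = "30",
    Status = "Married", Children = "2", Employment = "Engineer")
)

header <- c("Name", "Height", "Weight", "Age", "Status", "Children", "Employment")

# Put every record into header order; a field missing from a record becomes NA
rows <- lapply(result, function(r) {
  r <- r[header]
  names(r) <- header
  r
})

# One row per record, one column per field
tab <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)

# Optional: store the purely numeric fields as numbers
tab$Age      <- as.integer(tab$Age)
tab$Children <- as.integer(tab$Children)
```

From there, `write.table(tab, sep = "\t", row.names = FALSE, quote = FALSE)` prints the table with the header line first.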
On Sun, Feb 22, 2009 at 8:45 AM, Daren Tan <darentan76 at gmail.com> wrote:
I am given a text file of records to be converted into a table format. I have searched related topics and packages, but can't find any similar cases. Please help. A sample record is given below. Note that the last element doesn't have a colon.

###---------Start of record----------------------
Name : John
Height: 170cm
Weight : 70kg
Age: 30
Status: Married
Children: 2
Employment Engineer
###---------End of record-----------------------

The table should have this header:

Name Height Weight Age Status Children Employment
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?