Read
It looks like we can look at the last digit of the data and that would be the column number; is that correct?

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
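A minimal sketch of that idea in R, assuming the answer is yes: every value after x1 ends in the digit of the column it belongs to. The helper name place_row is made up for illustration, and the data are the rows from the read.table call in the original post.

```r
# Rows as posted; the exact spacing does not matter for this approach.
txt <- "x1 x2 x3 x4
1 B22
2 C33
322 B22 D34
4 D44
51 D53
60 D62 "

# Split into non-empty, trimmed lines, then into whitespace-separated tokens.
lns <- trimws(strsplit(txt, "\n", fixed = TRUE)[[1]])
lns <- lns[nzchar(lns)]
tokens <- strsplit(lns, " +")
header <- tokens[[1]]

# Hypothetical helper: put each token into the column named by its last digit.
# Assumption: the first token of a row is always the x1 value.
place_row <- function(v) {
  out <- rep(NA_character_, length(header))
  out[1] <- v[1]
  for (tok in v[-1]) {
    col <- as.integer(substring(tok, nchar(tok)))  # last character as column index
    stopifnot(!is.na(col), col >= 2, col <= length(header))
    out[col] <- tok
  }
  out
}

m <- do.call(rbind, lapply(tokens[-1], place_row))
colnames(m) <- header
result <- as.data.frame(m, stringsAsFactors = FALSE)
result$x1 <- as.integer(result$x1)  # x1 is the only numeric column here
result
```

If the real data do not carry the column number in the last digit, this placement rule does not apply and some other marker of the missing positions is needed.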
On Mon, Feb 22, 2021 at 5:34 PM Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
This gets it into a data frame. If you know which columns should be numeric you can convert them.

s <- "x1 x2 x3 x4
1 B22
2 C33
322 B22 D34
4 D44
51 D53
60 D62
"
tc <- textConnection( s )
lns <- readLines(tc)
close(tc)
if ( "" == lns[ length( lns ) ] )
  lns <- lns[ -length( lns ) ]
L <- strsplit( lns, " +" )
m <- do.call( rbind
            , lapply( L[-1]
                    , function(v)
                        if (length(v)<length(L[[1]]))
                          c( v, rep(NA, length(L[[1]]) - length(v) ) )
                        else
                          v
                    )
            )
colnames( m ) <- L[[1]]
result <- as.data.frame( m, stringsAsFactors = FALSE )
result

On February 22, 2021 4:42:57 PM PST, Val <valkremk at gmail.com> wrote:
That is my problem. The spacing between columns is not consistent. It may be a single space or multiple spaces (two or three).

On Mon, Feb 22, 2021 at 6:14 PM Bill Dunlap <williamwdunlap at gmail.com> wrote:
You said the column values were separated by space characters. Copying the text from gmail shows that some column names and column values are separated by single spaces (e.g., between x1 and x2) and some by multiple spaces (e.g., between x3 and x4). Did the mail mess up the spacing or is there some other way to tell where the omitted values are?
-Bill

On Mon, Feb 22, 2021 at 2:54 PM Val <valkremk at gmail.com> wrote:
I tried that one and it did not work. Please see the error message:

Error in read.table(text = "x1 x2 x3 x4\n1 B12 \n2 C23 \n322 B32 D34 \n4 D44 \n51 D53\n60 D62 ", :
  more columns than column names

On Mon, Feb 22, 2021 at 5:39 PM Bill Dunlap <williamwdunlap at gmail.com> wrote:
Since the columns in the file are separated by a space character, " ", add the read.table argument sep=" ".
-Bill

On Mon, Feb 22, 2021 at 2:21 PM Val <valkremk at gmail.com> wrote:
Hi all,

I am trying to read a messy data set but am facing difficulty. The data has several columns separated by blank space(s). Each column value may have different lengths across the rows. The first row (header) has four columns. However, each row may not have all four column values. For instance, the first data row has only the first two column values. The fourth data row has the first and last column values; the second and third column values are missing for this row. How do I read this data set correctly?

Here is my sample data set, output and desired output. To make each data point clear, I have added the row and column numbers to it. I cannot use fixed width format reading because each row may have a different length for a given column.

dat<-read.table(text="x1 x2 x3 x4
1 B22
2 C33
322 B22 D34
4 D44
51 D53
60 D62 ",header=T, fill=T,na.strings=c("","NA"))
Output
   x1  x2   x3 x4
1   1 B12 <NA> NA
2   2 C23 <NA> NA
3 322 B32  D34 NA
4   4 D44 <NA> NA
5  51 D53 <NA> NA
6  60 D62 <NA> NA
Desired output
   x1   x2   x3  x4
1   1  B22 <NA>  NA
2   2 <NA>  C33  NA
3 322  B32   NA D34
4   4 <NA>   NA D44
5  51 <NA>  D53  NA
6  60  D62 <NA>  NA
Thank you,
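
A small sketch of why the output above comes out left-packed. With the default separator, any run of blanks counts as one separator, so a wide gap left by missing values is indistinguishable from a single space. The spacing in the example row is an assumption; the original spacing was not preserved in the mail.

```r
# One data row with a wide gap where x2 and x3 would be (spacing assumed).
row <- "4        D44"

# scan() with the default sep splits on runs of whitespace,
# the same rule read.table() applies when sep is left at its default.
fields <- scan(text = row, what = "", quiet = TRUE)
fields
length(fields)
```

Only two fields come back, so fill=TRUE can do no more than pad the row on the right with NA.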
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
-- Sent from my phone. Please excuse my brevity.