reading in table with different number of elements in each row
This is in the Details section of the help page:

  The number of data columns is determined by looking at the first five lines of input (or the whole file if it has less than five lines), or from the length of col.names if it is specified and is longer. This could conceivably be wrong if fill or blank.lines.skip are true, so specify col.names if necessary.

Try:

  read.table(..., col.names=1:30)

This will assume there are 30 columns of data (you only said a max of 15, but let's double it).

On Tue, May 25, 2010 at 8:05 PM, Johan Jackson <johan.h.jackson at gmail.com> wrote:
Hi all,
This is probably simple, but I haven't been able to locate the answer either
in the Import Manual or from searching the listserv.
I have tab-delimited data with different numbers of elements in each row. I
want to read it into R, such that R fills in "NA" in elements that have no
data. How do I accomplish this?
Example:
DATA on disk:
     1 -0.068191       -0.050729       -0.113982       -0.044363
-0.072445       -0.044516       -0.048597       -0.051866
-0.051563       -0.041576
     2 -0.032645       -0.062389       -0.054491       -0.058061
-0.034690       -0.038044       -0.045332       -0.043785
-0.050639       -0.049617
     3 -0.068191       -0.044207       -0.058061       -0.050729
-0.034991       -0.045360       -0.051563       -0.060290
-0.043785       -0.048757
     4 -0.068191       -0.062389       -0.050729       -0.058579
-0.056481       -0.044363       -0.042347       -0.060290
-0.051563       -0.037216       -0.041576       -0.056476
     5 -0.068191       -0.047649       -0.062389       -0.058061
-0.034227       -0.185829       -0.071855       -0.064096
-0.195645
     6 -0.040208       -0.068191       -0.036475       -0.041268
-0.044207       -0.044363       -0.034991       -0.059810
-0.051619       -0.051563       -0.037216       -0.041576
-0.019762
     7 -0.068191       -0.034227       -0.044363       -0.051563
-0.041576       -0.053823       -0.057023       -0.046083
-0.089374       -0.057436
     8 -0.068191       -0.050731       -0.044207       -0.169714
-0.060025       -0.048597       -0.037827       -0.053823
-0.055154
     9 -0.062389       -0.044207       -0.050729       -0.044363
-0.043785
    10 -0.040208       -0.036716       -0.068191       -0.051466
-0.050731       -0.050729       -0.048095       -0.044363
-0.044817       -0.059810       -0.051563       -0.037827
-0.053985       -0.059573       -0.052893
    11 -0.068191       -0.034227       -0.048597       -0.051563
-0.041576       -0.056512
    12 -0.040208       -0.050731       -0.044207       -0.048095
-0.044363       -0.044817       -0.037827       -0.053985       -0.059573
My attempts:
x <- read.table("DATA",fill=TRUE,sep="\t",colClasses="numeric")
x
          V1        V2        V3        V4        V5        V6        V7        V8        V9       V10       V11       V12       V13
1  -0.068191 -0.050729 -0.113982 -0.044363 -0.072445 -0.044516 -0.048597 -0.051866 -0.051563 -0.041576        NA        NA        NA
2  -0.032645 -0.062389 -0.054491 -0.058061 -0.034690 -0.038044 -0.045332 -0.043785 -0.050639 -0.049617        NA        NA        NA
3  -0.068191 -0.044207 -0.058061 -0.050729 -0.034991 -0.045360 -0.051563 -0.060290 -0.043785 -0.048757        NA        NA        NA
4  -0.068191 -0.062389 -0.050729 -0.058579 -0.056481 -0.044363 -0.042347 -0.060290 -0.051563 -0.037216 -0.041576 -0.056476        NA
5  -0.068191 -0.047649 -0.062389 -0.058061 -0.034227 -0.185829 -0.071855 -0.064096 -0.195645        NA        NA        NA        NA
6  -0.040208 -0.068191 -0.036475 -0.041268 -0.044207 -0.044363 -0.034991 -0.059810 -0.051619 -0.051563 -0.037216 -0.041576 -0.019762
7  -0.068191 -0.034227 -0.044363 -0.051563 -0.041576 -0.053823 -0.057023 -0.046083 -0.089374 -0.057436        NA        NA        NA
8  -0.068191 -0.050731 -0.044207 -0.169714 -0.060025 -0.048597 -0.037827 -0.053823 -0.055154        NA        NA        NA        NA
9  -0.062389 -0.044207 -0.050729 -0.044363 -0.043785        NA        NA        NA        NA        NA        NA        NA        NA
10 -0.040208 -0.036716 -0.068191 -0.051466 -0.050731 -0.050729 -0.048095 -0.044363 -0.044817 -0.059810 -0.051563 -0.037827 -0.053985
11 -0.059573 -0.052893        NA        NA        NA        NA        NA        NA        NA        NA        NA        NA        NA
12 -0.068191 -0.034227 -0.048597 -0.051563 -0.041576 -0.056512        NA        NA        NA        NA        NA        NA        NA
13 -0.040208 -0.050731 -0.044207 -0.048095 -0.044363 -0.044817 -0.037827 -0.053985 -0.059573        NA        NA        NA        NA

The above is almost right, but x has 13 rows instead of 12! WHY?
Row 10 (which has 15 elements) was cut off at 13, and then the last two elements were put in a new row. WHY? I have tried messing with colClasses to no avail.

Any help would be ... umm... helpful!

JJ
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
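
As a rough illustration of the col.names advice in the reply, here is a minimal sketch in R. The temporary file and its contents are made up for the example (they are not the poster's DATA file), and using count.fields() to find the widest row is an assumption added here rather than something suggested in the thread; it simply avoids guessing a number such as 30.

```r
# Hypothetical sample file standing in for the poster's "DATA":
# tab-delimited, rows of different lengths, and the longest row
# (six fields) is NOT among the first five lines.
path <- tempfile(fileext = ".txt")
rows <- list(1:3, 1:2, 1:4, 1:3, 1:2, 1:6, 1:1)
writeLines(vapply(rows, paste, character(1), collapse = "\t"), path)

# Without col.names, the column count is taken from the first five
# lines only, so the six-field row spills over onto an extra row.
bad <- read.table(path, sep = "\t", fill = TRUE)

# count.fields() scans the whole file and returns the number of
# fields found on each line.
n_col <- max(count.fields(path, sep = "\t"))

# Supplying that many column names makes read.table() allocate
# enough columns; short rows are padded with NA because fill = TRUE.
good <- read.table(path, sep = "\t", fill = TRUE,
                   col.names = paste0("V", seq_len(n_col)))

dim(bad)   # more rows than lines in the file
dim(good)  # one row per line in the file
```

On a file like the one in the question, the same pattern (with the real file name in place of the temporary path) should give one row per input line, with NA in the unused trailing columns.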