How to transpose it in a fast way?

Peter Dalgaard · 2013-03-08T10:08:37Z

On Mar 7, 2013, at 01:18 , Yao He wrote:

> Dear all:
>
> I have a big data file of 60000 columns and 60000 rows like that:
>
> AA AC AA AA .......AT
> CC CC CT CT.......TC
> ..........................
> .........................
>
> I want to transpose it and the output is a new like that
> AA CC ............
> AC CC............
> AA CT.............
> AA CT.........
> ....................
> ....................
> AT TC.............
>
> The keypoint is I can't read it into R by read.table()


As others have pointed out, that's a lot of data! 

You seem to have the right idea: If you read the columns line by line there is nothing to transpose. A couple of points, though:

- The cbind() is a potential performance hit since it copies the list every time around. Preallocate with geno_t <- vector("list", 60000) before the loop, and then assign geno_t[[i]] <- <etc> inside it.

- You might use scan() instead of readLines() followed by strsplit().

- Perhaps consider the data type, as you seem to be reading strings with 16 possible values. (I suspect that R already optimizes string storage to make this point moot, though.)
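
The points above could be sketched roughly as follows. This is only an illustration on a tiny stand-in file, not the original poster's code: the file name, its contents, n_rows, n_cols, levels16 and codes are all made up for the example, and the 16 levels are assumed to be the ordered pairs of A, C, G and T. For the full 60000 x 60000 data, the final cbind() may well not fit in memory, so the list would more likely be written out piece by piece.

```r
# Small stand-in for the real file (hypothetical name and contents).
infile <- tempfile(fileext = ".txt")
writeLines(c("AA AC AA AT", "CC CC CT TC", "AG AG GG GA"), infile)

n_rows <- 3L   # would be 60000 for the real file
n_cols <- 4L   # would be 60000 for the real file

# Preallocate the list once instead of growing an object with cbind()
# on every iteration.
geno_t <- vector("list", n_rows)

con <- file(infile, open = "r")
for (i in seq_len(n_rows)) {
  # scan() reads one line from the open connection and splits it on
  # whitespace, replacing the readLines() + strsplit() pair.
  geno_t[[i]] <- scan(con, what = character(), nlines = 1, quiet = TRUE)
}
close(con)

# Each input row is one list element, i.e. one column of the transpose.
# Binding them is fine for the toy data; it may be too large for the real file.
transposed <- do.call(cbind, geno_t)
stopifnot(nrow(transposed) == n_cols)

# Data type: assuming the 16 values are the ordered pairs of A, C, G, T,
# each genotype can be recoded as a small integer code.
bases <- c("A", "C", "G", "T")
levels16 <- as.vector(outer(bases, bases, paste0))
codes <- match(geno_t[[1]], levels16)
```

Whether the integer recoding saves anything in practice depends on how R stores the strings, which is the caveat raised in the last point.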

Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
