Dear R list,

I have a dataset with a column which should be read as character, like this:

  name surname answer
1 xx yyy "00100"
2 rrr hhh "01"

When reading this dataset with read.table, I get

1 xx yyy 100
2 rrr hhh 1

The string column consists of answers to multiple-choice questions, not all having the same number of answers. I could format the answers using formatC, but there are over a hundred different questions in there. I tried quote="\"'" without any luck, and googling for this took me nowhere either. It should be simple but I seem to be missing it...

Can anybody point me in the right direction?

TIA,
Adrian
not suppressing leading zeros when reading a table?
5 messages · Adrian Dusa, Marc Schwartz, Duncan Murdoch +1 more
On Sun, 2005-07-10 at 18:13 +0000, Adrian Dusa wrote:
[...snip...]
With your example data saved in a file called "test.txt":
df <- read.table("test.txt", header = TRUE, colClasses = "character")
df
  name surname answer
1   xx     yyy  00100
2  rrr     hhh     01
str(df)
`data.frame': 2 obs. of 3 variables:
 $ name   : chr "xx" "rrr"
 $ surname: chr "yyy" "hhh"
 $ answer : chr "00100" "01"

See the colClasses argument in ?read.table.

HTH,
Marc Schwartz
Adrian Dusa wrote:
[...snip...]
By default, read.table guesses the column type. Yours looks numeric, even though it is not. Use the colClasses argument of read.table to specify the column type. If you only have the 3 columns above, colClasses="character" should work.

Duncan Murdoch
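For anyone who wants to keep the type guessing for the other columns, here is a minimal sketch of a per-column variant of the advice above. It assumes a named colClasses vector, which read.table matches against the column names and leaves the unnamed columns to be guessed as usual. The inline `lines` vector is only a stand-in for the file in the original post, and `guessed` and `kept` are illustrative names.

```r
# Stand-in for the file from the original post (illustrative, not a real file).
lines <- c(
  'name surname answer',
  '1 xx yyy "00100"',
  '2 rrr hhh "01"'
)

# Default behaviour: the answer column is guessed to be numeric,
# so the leading zeros are lost.
guessed <- read.table(text = lines, header = TRUE)

# Force only the answer column to character; the remaining columns
# are still guessed by read.table.
kept <- read.table(text = lines, header = TRUE,
                   colClasses = c(answer = "character"))

print(guessed$answer)
print(kept$answer)
```

With a file on disk, the same colClasses argument would be passed alongside the file name instead of text=.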
Adrian,
To prevent coercion to numeric, try:
mydata <- read.table("myfile", colClasses="character")
HTH.
alejandro
On 7/10/05, Adrian Dusa <dusa.adrian at gmail.com> wrote:
[...snip...]
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
On 7/10/05, alejandro munoz <guerinche at gmail.com> wrote:
[...snip...]
Thank you all, I got it. This is my favourite super fast, ever helpful help list (gosh, I didn't even expect an answer on a Sunday at 10 pm!).

Best,
Adrian