Dear R list,

I have a dataset with a column which should be read as character, like this:

  name surname answer
1 xx   yyy     "00100"
2 rrr  hhh     "01"

When reading this dataset with read.table, I get

1 xx   yyy     100
2 rrr  hhh     1

The string column consists of answers to multiple choice questions, not all having the same number of answers. I could format the answers using formatC, but there are over a hundred different questions in there.

I tried with quote="\"'" without any luck. Googling for this got me nowhere either. It should be simple, but I seem to be missing it...
Can anybody point me in the right direction?

TIA,
Adrian
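A note on the formatC route mentioned above: once the column has been read as numeric, the original width of each answer is gone, so a padding width would have to be supplied separately for every question. A minimal sketch, assuming only the two values from the example (the names answers and padded are made up for illustration):

```r
# Values as they come back once the column has been read as numeric.
answers <- c(100L, 1L)

# Zero-padding needs a width, and a single width cannot fit both answers:
# the second one was originally "01", not "00001".
padded <- formatC(answers, width = 5, flag = "0")
print(padded)
```

With over a hundred questions of differing lengths, this would mean keeping a table of widths by hand, which is why reading the column as character in the first place is the simpler route.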
Marc Schwartz
2005-Jul-10 20:18 UTC
[R] not suppressing leading zeros when reading a table?
On Sun, 2005-07-10 at 18:13 +0000, Adrian Dusa wrote:
> Dear R list,
>
> I have a dataset with a column which should be read as character, like this:
>
>   name surname answer
> 1 xx   yyy     "00100"
> 2 rrr  hhh     "01"
>
> When reading this dataset with read.table, I get
> 1 xx   yyy     100
> 2 rrr  hhh     1
>
> The string column consists of answers to multiple choice questions, not all
> having the same number of answers. I could format the answers using formatC, but
> there are over a hundred different questions in there.
>
> I tried with quote="\"'" without any luck. Googling for this got me nowhere
> either. It should be simple, but I seem to be missing it...
> Can anybody point me in the right direction?
>
> TIA,
> Adrian

With your example data saved in a file called "test.txt":

  > df <- read.table("test.txt", header = TRUE, colClasses = "character")
  > df
    name surname answer
  1   xx     yyy  00100
  2  rrr     hhh     01
  > str(df)
  `data.frame': 2 obs. of 3 variables:
   $ name   : chr "xx" "rrr"
   $ surname: chr "yyy" "hhh"
   $ answer : chr "00100" "01"

See the colClasses argument in ?read.table.

HTH,

Marc Schwartz
Duncan Murdoch
2005-Jul-10 20:20 UTC
[R] not suppressing leading zeros when reading a table?
Adrian Dusa wrote:
> Dear R list,
>
> I have a dataset with a column which should be read as character, like this:
>
>   name surname answer
> 1 xx   yyy     "00100"
> 2 rrr  hhh     "01"
>
> When reading this dataset with read.table, I get
> 1 xx   yyy     100
> 2 rrr  hhh     1
>
> The string column consists of answers to multiple choice questions, not all
> having the same number of answers. I could format the answers using formatC, but
> there are over a hundred different questions in there.
>
> I tried with quote="\"'" without any luck. Googling for this got me nowhere
> either. It should be simple, but I seem to be missing it...
> Can anybody point me in the right direction?

By default, read.table guesses the type of each column. Yours looks numeric, even though it is not. Use the colClasses argument of read.table to specify the column type. If you only have the 3 columns above, colClasses="character" should work.

Duncan Murdoch
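For files with more columns than the three above, colClasses can also be given per column, with NA leaving a column to the usual guessing. A minimal sketch, assuming a hypothetical fourth column "age" and a temporary file standing in for the real data (neither is from the original post):

```r
# Hypothetical file: the thread's three columns plus an invented numeric "age".
tmp <- tempfile(fileext = ".txt")
writeLines(c('name surname answer age',
             'xx yyy "00100" 34',
             'rrr hhh "01" 27'), tmp)

# Default: every column type is guessed, so "answer" becomes numeric
# and the leading zeros are lost.
guessed <- read.table(tmp, header = TRUE)
str(guessed)

# Per-column classes: NA keeps the guessing, "character" protects "answer".
mixed <- read.table(tmp, header = TRUE,
                    colClasses = c(NA, NA, "character", NA))
str(mixed)

# Remove the temporary file.
unlink(tmp)
```

Note that colClasses is specified per column of the file, so if the first column is being used for row names (as in the original example, where the header has one field fewer than the data rows), that column needs an entry as well.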
alejandro munoz
2005-Jul-10 20:26 UTC
[R] not suppressing leading zeros when reading a table?
Adrian,

To prevent coercion to numeric, try:

  mydata <- read.table("myfile", colClasses="character")

HTH.
alejandro

On 7/10/05, Adrian Dusa <dusa.adrian at gmail.com> wrote:
> Dear R list,
>
> I have a dataset with a column which should be read as character, like this:
>
>   name surname answer
> 1 xx   yyy     "00100"
> 2 rrr  hhh     "01"
>
> When reading this dataset with read.table, I get
> 1 xx   yyy     100
> 2 rrr  hhh     1
>
> The string column consists of answers to multiple choice questions, not all
> having the same number of answers. I could format the answers using formatC, but
> there are over a hundred different questions in there.
>
> I tried with quote="\"'" without any luck. Googling for this got me nowhere
> either. It should be simple, but I seem to be missing it...
> Can anybody point me in the right direction?
>
> TIA,
> Adrian