Dear R users,

I had somehow expected that read.table() would treat the column specified by the row.names argument as of class character. That seems to be the only sensible class allowed for a column containing row names. However, that does not seem to be the case, as the following example shows:

    x <- cbind.data.frame(ID = c("010007787048271871", "1007109516820319",
                                 "10094843652996959", "010145176274075487"),
                          X1 = 1:4, X2 = 4:1)
    write.table(x, "tmp.txt", quote = FALSE, row.names = FALSE)
    y <- read.table("tmp.txt", header = TRUE, row.names = 1)

    > y
                      X1 X2
    10007787048271872  1  4
    1007109516820319   2  3
    10094843652996960  3  2
    10145176274075488  4  1
    > x
                      ID X1 X2
    1 010007787048271871  1  4
    2   1007109516820319  2  3
    3  10094843652996959  3  2
    4 010145176274075487  4  1

The first column was not read in as a string, which mangled the IDs.
I could use colClasses explicitly, but then I would need to know the number and classes of the remaining columns in advance.
Is this a bug or expected behavior?
Any advice would be most helpful.

Thanks,
Markus
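To make the "mangled IDs" concrete, here is a minimal sketch of what happens when the ID strings are turned into numbers. It does not call read.table() at all; it only assumes that the column ends up as a double, which is what the output above suggests. The variable names (ids, num, back, too_big) are made up for this illustration.

```r
# The four IDs from the post, kept as character strings
ids <- c("010007787048271871", "1007109516820319",
         "10094843652996959", "010145176274075487")

# Converting to numeric stores each ID as a double;
# any leading zero is dropped at this point
num <- as.numeric(ids)

# Doubles can represent every integer exactly only up to 2^53 (about 9.007e15)
too_big <- num > 2^53

# Print the stored values back as whole numbers
back <- sprintf("%.0f", num)

# Side-by-side comparison of what went in and what came out
data.frame(original = ids, as_number = back,
           beyond_2_53 = too_big, unchanged = back == ids)
```

Only the second ID is below 2^53 and has no leading zero, so it is the only one that should survive the round trip, which matches the y output in the post.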
Markus Loecher <markus.loecher <at> gmail.com> writes:

> [...]
> > x
>                   ID X1 X2
> 1 010007787048271871  1  4
> 2   1007109516820319  2  3
> 3  10094843652996959  3  2
> 4 010145176274075487  4  1
>
> The first column was not read in as a string, which mangled the IDs.
> I could use colClasses explicitly, but then I would need to know the number
> and classes of the remaining columns in advance.
> Is this a bug or expected behavior ?
> Any advice would be most helpful.

You could use a generic colClasses for all columns, like:

    y <- read.table("tmp.txt", header = TRUE, row.names = 1,
                    colClasses = "character")
    y
                       X1 X2
    010007787048271871  1  4
    1007109516820319    2  3
    10094843652996959   3  2
    010145176274075487  4  1

In this case, all columns are read as character and need to be converted manually, but your row names are appropriate.

Hoping this helps,
Adrian
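The manual conversion mentioned above could look roughly like the sketch below. It recreates the file from the original post in a temporary location (tempfile() is used here instead of "tmp.txt" only to keep the example self-contained) and shows two variants. Option A reads everything as character and converts the data columns afterwards. Option B is an assumption on my part about what would fit the original question: it counts the fields on the first line and forces only the first column to character, leaving the others as NA in colClasses so that read.table() picks their classes itself. The names tmp, n, y and z are illustrative.

```r
# Recreate the file from the original post in a temporary location
x <- data.frame(ID = c("010007787048271871", "1007109516820319",
                       "10094843652996959", "010145176274075487"),
                X1 = 1:4, X2 = 4:1, stringsAsFactors = FALSE)
tmp <- tempfile(fileext = ".txt")
write.table(x, tmp, quote = FALSE, row.names = FALSE)

# Option A: read every column as character, then convert the data columns
y <- read.table(tmp, header = TRUE, row.names = 1, colClasses = "character")
y[] <- lapply(y, type.convert, as.is = TRUE)   # row names are left untouched

# Option B: count the fields on the header line, then force only the
# first column to character; NA lets read.table() choose the class
n <- count.fields(tmp)[1]
z <- read.table(tmp, header = TRUE, row.names = 1,
                colClasses = c("character", rep(NA, n - 1)))

str(y)
str(z)
rownames(z)
```

Either way the number of columns does not have to be typed in by hand, and the classes of the remaining columns do not have to be known in advance.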
Markus Loecher-4 wrote:

> Dear R users,
> I had somehow expected that read.table() would treat the column specified
> by the row.names argument as of class character. That seems to be the only
> sensible class allowed for a column containing row names. However, that
> does not seem to be the case, as the following example shows:
>
> x <- cbind.data.frame(ID = c("010007787048271871", "1007109516820319",
> "10094843652996959", "010145176274075487"), X1 = 1:4, X2 = 4:1)
> [...snip...]

As a better alternative, why not move the first column directly into the row names?

    rownames(x) <- x$ID
    write.table(x[, -1], "tmp.txt")
    y <- read.table("tmp.txt", header = TRUE)
    y
                       X1 X2
    010007787048271871  1  4
    1007109516820319    2  3
    10094843652996959   3  2
    010145176274075487  4  1

In this case, the X1 and X2 variables are read as numeric, while the first column is read as character and assigned directly to the row names.

HTH,
Adrian

--
View this message in context: http://www.nabble.com/read.table%2C-row.names-arg-tp23888975p23892826.html
Sent from the R help mailing list archive at Nabble.com.
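For anyone wondering why this variant needs no row.names argument: the read.table() help page says that if there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names. The sketch below only inspects the file that write.table() produces with its defaults to show that layout; tempfile() and the variable name tmp are used purely for illustration.

```r
# Same data as in the original post, with the IDs moved into the row names
x <- data.frame(ID = c("010007787048271871", "1007109516820319",
                       "10094843652996959", "010145176274075487"),
                X1 = 1:4, X2 = 4:1, stringsAsFactors = FALSE)
rownames(x) <- x$ID

# write.table() defaults (row.names = TRUE, quote = TRUE): the header line
# names only X1 and X2, while every data line also carries the row name
tmp <- tempfile(fileext = ".txt")
write.table(x[, -1], tmp)

readLines(tmp, n = 2)   # header line and first data line
count.fields(tmp)       # one fewer field on the header line

# With that layout the first column becomes the row names automatically
y <- read.table(tmp, header = TRUE)
rownames(y)
str(y)
```

Note that this relies on being able to rewrite the file; if the file comes from elsewhere, a colClasses-based approach is probably the more practical route.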