Dear R users,
I had somehow expected that read.table() would treat the column specified by
the row.names argument as of class character. That seems to be the only
sensible class allowed for a column containing row names. However, that does
not seem to be the case, as the following example shows:
x <- cbind.data.frame(ID = c("010007787048271871", "1007109516820319",
                             "10094843652996959", "010145176274075487"),
                      X1 = 1:4, X2 = 4:1)
write.table(x, "tmp.txt", quote = FALSE, row.names = FALSE)
y <- read.table("tmp.txt", header = TRUE, row.names = 1)
> y
                  X1 X2
10007787048271872  1  4
1007109516820319   2  3
10094843652996960  3  2
10145176274075488  4  1
> x
                  ID X1 X2
1 010007787048271871  1  4
2   1007109516820319  2  3
3  10094843652996959  3  2
4 010145176274075487  4  1
The first column was not read in as a string, which mangled the IDs.
I could use colClasses explicitly, but then I would need to know the number
and classes of the remaining columns in advance.
Is this a bug or expected behavior?
Any advice would be most helpful.
Thanks,
Markus

Markus Loecher <markus.loecher <at> gmail.com> writes:

> [...]
> > x
>                   ID X1 X2
> 1 010007787048271871  1  4
> 2   1007109516820319  2  3
> 3  10094843652996959  3  2
> 4 010145176274075487  4  1
>
> The first column was not read in as a string, which mangled the IDs.
> I could use colClasses explicitly, but then I would need to know the number
> and classes of the remaining columns in advance.
> Is this a bug or expected behavior ?
> Any advice would be most helpful.

You could use a generic colClasses for all columns, like:

y <- read.table("tmp.txt", header = TRUE, row.names = 1, colClasses = "character")
y
                   X1 X2
010007787048271871  1  4
1007109516820319    2  3
10094843652996959   3  2
010145176274075487  4  1

In this case, all columns are read as character and need to be converted
manually, but your row names are appropriate.

Hoping this helps,
Adrian
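
A possible variation on the suggestion above, offered as a sketch rather than a definitive answer: current versions of ?read.table document that a named colClasses vector is matched against the column names, with unnamed columns left to the usual automatic conversion. Under that assumption, only the ID column needs to be declared, and the other columns need no manual conversion. The temporary file path is a stand-in for "tmp.txt", and older R versions may behave differently.

```r
# Recreate the example data from the original post
x <- data.frame(ID = c("010007787048271871", "1007109516820319",
                       "10094843652996959", "010145176274075487"),
                X1 = 1:4, X2 = 4:1, stringsAsFactors = FALSE)

# Stand-in for "tmp.txt" (assumption: any writable path will do)
tmp <- tempfile(fileext = ".txt")
write.table(x, tmp, quote = FALSE, row.names = FALSE)

# Name only the ID column; columns not named are converted automatically
y <- read.table(tmp, header = TRUE, row.names = 1,
                colClasses = c(ID = "character"))

rownames(y)        # IDs kept as strings, leading zeros included
sapply(y, class)   # classes of the remaining columns
```

This only requires knowing the name of the row-name column, not the number or classes of the others.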

Markus Loecher-4 wrote:

> Dear R users,
> I had somehow expected that read.table() would treat the column specified
> by the row.names argument as of class character. That seems to be the only
> sensible class allowed for a column containing row names. However, that
> does not seem to be the case, as the following example shows:
>
> x <- cbind.data.frame(ID = c("010007787048271871", "1007109516820319",
> "10094843652996959", "010145176274075487"), X1 = 1:4, X2 = 4:1)
> [...snip...]

As a better alternative, why not move the first column directly into the
rownames?

rownames(x) <- x$ID
write.table(x[, -1], "tmp.txt")
y <- read.table("tmp.txt", header=T)
y
                   X1 X2
010007787048271871  1  4
1007109516820319    2  3
10094843652996959   3  2
010145176274075487  4  1

In this case, the X1 and X2 variables are read as numeric, while the first
column is read as character and assigned directly to the rownames.

HTH,
Adrian

--
View this message in context: http://www.nabble.com/read.table%2C-row.names-arg-tp23888975p23892826.html
Sent from the R help mailing list archive at Nabble.com.
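
To illustrate why the round trip above keeps the IDs intact, here is a small sketch, assuming the same example data and a temporary file in place of "tmp.txt". When write.table() writes row names, the header line has one field fewer than the data lines, and ?read.table documents that in this case the first column is used as the row names. count.fields() makes the difference in field counts visible.

```r
# Example data from the original post, with the IDs moved into the row names
x <- data.frame(ID = c("010007787048271871", "1007109516820319",
                       "10094843652996959", "010145176274075487"),
                X1 = 1:4, X2 = 4:1, stringsAsFactors = FALSE)
rownames(x) <- x$ID

# Stand-in for "tmp.txt" (assumption: any writable path will do)
tmp <- tempfile(fileext = ".txt")
write.table(x[, -1], tmp)

# Fields per line: the header has one field fewer than the data lines
n <- count.fields(tmp)
n

# No row.names argument needed; the first column becomes the row names
y <- read.table(tmp, header = TRUE)
rownames(y)
sapply(y, class)
```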