Gorjanc Gregor
2005-Feb-14 02:55 UTC
[Rd] corrupt data frame: columns will be truncated or padded with NAs in: format.data.frame(x, digits = digits)
Hello!
I posted a mail with the same subject to r-help on Saturday, seeking
help with my work, but I have now realized that this list is more
appropriate. I think I found a bug. Below are comments
and reproducible examples:
# Create a data frame
(tmp <- data.frame(y1=1:4, f1=factor(c("A", "B",
"C", "D"))))
y1 f1
1 1 A
2 2 B
3 3 C
4 4 D
# Add new column, which is not full (missing some data for last
# records)
tmp[1:2, "y2"] <- 2
tmp
y1 f1 y2
1 1 A 2
2 2 B 2
3 3 C <NA>
4 4 D <NA>
Warning message:
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits)
# Why did I get a corrupt data frame?
# Add new factor column, which is not full (missing some data for last
# records)
tmp[1:2, "f2"] <- tmp[1:2, "f1"]
tmp
y1 f1 y2 f2
1 1 A 2 1
2 2 B 2 2
3 3 C <NA> <NA>
4 4 D <NA> <NA>
Warning message:
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits)
# New column should have class factor, but somehow got converted to integer
class(tmp$f2)
[1] "integer"
# If new column is completely full, everything is OK
tmp$f3 <- tmp$f1
tmp
y1 f1 y2 f2 f3
1 1 A 2 1 A
2 2 B 2 2 B
3 3 C <NA> <NA> C
4 4 D <NA> <NA> D
Warning message:
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits)
# Let's go further and try to convert one of the new numeric columns
# to a factor
tmp$y2 <- factor(tmp$y2, labels="x")
tmp
y1 f1 y2 f2 f3
1 1 A x 1 A
2 2 B x 2 B
3 3 C x <NA> C
4 4 D x <NA> D
Warning message:
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits)
# Why did the NAs also get converted to level x?
# Let's continue and add additional column, which is again not
# full, but missing some data for first records
tmp[3:4, "y3"] <- 1
tmp
y1 f1 y2 f2 f3 y3
1 1 A x 1 A NA
2 2 B x 2 B NA
3 3 C x <NA> C 1
4 4 D x <NA> D 1
Warning message:
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits)
# Notice the difference between <NA> in previous example and
# NA in current one.
# Try to convert this to factor
tmp$y3 <- factor(tmp$y3, labels="y")
tmp
y1 f1 y2 f2 f3 y3
1 1 A x 1 A <NA>
2 2 B x 2 B <NA>
3 3 C x <NA> C y
4 4 D x <NA> D y
Warning message:
corrupt data frame: columns will be truncated or padded with NAs
in: format.data.frame(x, digits = digits)
# Works as expected.
# My configuration:
Version:
platform = i386-pc-mingw32
arch = i386
os = mingw32
system = i386, mingw32
status =
major = 2
minor = 0.1
year = 2004
month = 11
day = 15
language = R
Windows XP Professional (build 2600) Service Pack 0.0
--
Lep pozdrav / With regards,
Gregor GORJANC
---------------------------------------------------------------
University of Ljubljana
Biotechnical Faculty URI: http://www.bfro.uni-lj.si
Zootechnical Department email: gregor.gorjanc <at> bfro.uni-lj.si
Groblje 3 tel: +386 (0)1 72 17 861
SI-1230 Domzale fax: +386 (0)1 72 17 888
Slovenia
Prof Brian Ripley
2005-Feb-14 08:52 UTC
[Rd] corrupt data frame: columns will be truncated or padded with NAs in: format.data.frame(x, digits = digits)
On Mon, 14 Feb 2005, Gorjanc Gregor wrote:

> Hello!
>
> I posted on saturday mail with the same subject on r-help seeking
> for help in my work, but now I realized that this list is more
> appropriate for this. I think I found I bug.

You do not tell us what you think it is, though!

It *is* a bug in your code. You did create a corrupt data frame by using
*replacement* on part of something that did not exist. The simple
workaround is not to do that.

One can argue about what should happen in such a case and currently R
assumes that you know what you are doing and will only treat the data
frame as a list. We could make this an error, but that would add an
overhead to be paid by careful users too.

If you really want to understand what is going on here, please read the
source code: R is a volunteer project and the volunteers do not have time
to explain each and every one of your error messages to you -- we have
already had several goes over including data frames in data frames.

> Bellow are comments
> and reproducible examples:
>
> # Create a data frame
> (tmp <- data.frame(y1=1:4, f1=factor(c("A", "B", "C", "D"))))
> y1 f1
> 1 1 A
> 2 2 B
> 3 3 C
> 4 4 D
>
> # Add new column, which is not full (missing some data for last
> # records)
> tmp[1:2, "y2"] <- 2
> tmp
> y1 f1 y2
> 1 1 A 2
> 2 2 B 2
> 3 3 C <NA>
> 4 4 D <NA>
> Warning message:
> corrupt data frame: columns will be truncated or padded with NAs
> in: format.data.frame(x, digits = digits)
>
> # Why did I get corrupted data frame?

Because you tried to change elements in a non-existent column.

> tmp[[3]]
[1] 2 2

> # Add new factor column, which is not full (missing some data for last
> # records)
> tmp[1:2, "f2"] <- tmp[1:2, "f1"]
> tmp
> y1 f1 y2 f2
> 1 1 A 2 1
> 2 2 B 2 2
> 3 3 C <NA> <NA>
> 4 4 D <NA> <NA>
> Warning message:
> corrupt data frame: columns will be truncated or padded with NAs
> in: format.data.frame(x, digits = digits)
>
> # New column should have class factor, but got somehow converted to integer
> class(tmp$f2)
> [1] "integer"
>
> # If new column is completely full, everything is OK
>> tmp$f3 <- tmp$f1
>> tmp
> y1 f1 y2 f2 f3
> 1 1 A 2 1 A
> 2 2 B 2 2 B
> 3 3 C <NA> <NA> C
> 4 4 D <NA> <NA> D
> Warning message:
> corrupt data frame: columns will be truncated or padded with NAs
> in: format.data.frame(x, digits = digits)
>
> # Let's go further and try to convert one of new numeric column
> # to factor
> tmp$y2 <- factor(tmp$y2, labels="x")
> tmp
> y1 f1 y2 f2 f3
> 1 1 A x 1 A
> 2 2 B x 2 B
> 3 3 C x <NA> C
> 4 4 D x <NA> D
> Warning message:
> corrupt data frame: columns will be truncated or padded with NAs
> in: format.data.frame(x, digits = digits)
>
> # Why did also NAs get converted to level x?

They are *not* NAs: they print as NA with a warning.

> # Let's continue and add additional column, which is again not
> # full, but missing some data for first records
> tmp[3:4, "y3"] <- 1
> tmp
> y1 f1 y2 f2 f3 y3
> 1 1 A x 1 A NA
> 2 2 B x 2 B NA
> 3 3 C x <NA> C 1
> 4 4 D x <NA> D 1
> Warning message:
> corrupt data frame: columns will be truncated or padded with NAs
> in: format.data.frame(x, digits = digits)
>
> # Notice the difference between <NA> in previous example and
> # NA in current one.

Yes, we know. The <NA>s are coming from the print, with the warning.
They are unexpected, hence the headers do not line up. OTOH, for y3 you
need to create a 4-long vector, and that is padded with numeric NAs.

> # Try to convert this to factor
> tmp$y3 <- factor(tmp$y3, labels="y")
> tmp
> y1 f1 y2 f2 f3 y3
> 1 1 A x 1 A <NA>
> 2 2 B x 2 B <NA>
> 3 3 C x <NA> C y
> 4 4 D x <NA> D y
> Warning message:
> corrupt data frame: columns will be truncated or padded with NAs
> in: format.data.frame(x, digits = digits)

--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
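
Editorial sketch, not part of either message: the reply's two points (a corrupt data frame has a column shorter than its row count, and the workaround is to create the column before replacing part of it) might be illustrated roughly as below. The helper name short_columns is hypothetical, and the corrupt data frame is built by hand with structure() only to mimic the report, on the assumption that the R version in use may no longer produce one from tmp[1:2, "y2"] <- 2.

```r
# Hypothetical helper (not part of R): names of columns whose length
# differs from the number of rows implied by the row names.
short_columns <- function(df) {
  lens <- vapply(unclass(df), length, integer(1))
  names(lens)[lens != nrow(df)]
}

# Hand-built corrupt data frame mimicking the report:
# four row names, but column y2 holds only two values.
bad <- structure(list(y1 = 1:4, y2 = c(2, 2)),
                 class = "data.frame", row.names = 1:4)
short_columns(bad)

# Workaround: create a full-length column first, then replace part of it.
tmp <- data.frame(y1 = 1:4, f1 = factor(c("A", "B", "C", "D")))
tmp$y2 <- rep(NA, nrow(tmp))
tmp[1:2, "y2"] <- 2

# For a factor column, set up the levels before the partial replacement,
# so the values are stored as factor codes of the right class.
tmp$f2 <- factor(rep(NA, nrow(tmp)), levels = levels(tmp$f1))
tmp[1:2, "f2"] <- tmp[1:2, "f1"]
short_columns(tmp)
```

With the columns created at full length, the unfilled rows are genuine NAs, so a later factor() call on such a column would see four elements, not two.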