Hello, and thank you in advance, dear R people.
This is quite a basic question, but one I have run into occasionally and
worked around without a satisfying solution. The question is about factors. This
time I would just like to convert a data.frame's NA entries to 0 and get a
data.frame as a result. There might be a way to do that inside the
data.frame, but I think it might be overcomplicated and possibly slow.
With a matrix it is easy and clean:
X <- ifelse(is.na(X), 0, X)   ### Applied to a data.frame, this yields a list
or
"na.to.0" <- function(x)
{
    x <- as.matrix(x)     ### Just to be sure
    x[is.na(x)] <- 0
    x <- data.frame(x)    ### PROBLEM
    x
}
So the problem comes when converting the result back to a data.frame (this is
sometimes also a problem when importing a data.frame!). All character columns
become factors, as documented in the help. One can avoid that by
using I() or by calling type.convert afterwards (convert.col.type in Splus, if I
recall correctly), but somehow I think there should be an easier way, at
least when converting a data.frame to a matrix and back to a data.frame.
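A minimal sketch of such a round trip without factor conversion, on a small made-up data frame (`df`, `name` and `value` are hypothetical). It assumes a reasonably recent R in which `as.data.frame()` accepts `stringsAsFactors` and `type.convert()` accepts `as.is`; this may not match the R 1.4.1 used in the original post.

```r
# Hypothetical example data: one character column, one numeric column with NA.
df <- data.frame(name = c("a", "b", "c"),
                 value = c(1.5, NA, 3),
                 stringsAsFactors = FALSE)

# as.matrix() on a mixed data frame yields a character matrix.
m <- as.matrix(df)
m[is.na(m)] <- "0"

# Convert back without creating factors, then restore the column types.
back <- as.data.frame(m, stringsAsFactors = FALSE)
back[] <- lapply(back, type.convert, as.is = TRUE)

str(back)
```

The detour through a character matrix is kept here only to mirror the question. Working on the columns directly avoids the type loss in the first place.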
The other, related question is odd. This time I have a numeric column in a
data.frame (at least it should be numeric) which I have fetched from Excel through
RODBC (it's great). But when I try to convert NA to 0, as a side effect
these columns get converted to characters:
> is.numeric(as.matrix(KUNTADATA[,15]))
[1] TRUE
> is.numeric(as.data.frame(as.matrix(KUNTADATA[,15])))
[1] FALSE
or
> is.numeric(data.frame(as.matrix(KUNTADATA[,15])))
[1] FALSE
as.numeric works, of course, but that is not the way to write good,
error-robust code.
Please let me know if you have any idea how to avoid the automatic conversion
to factor or (in the last case) to character.
Jussi
Analytics
State Treasury of Finland, Finance
PS.
version:
platform i386-pc-mingw32
arch x86
os Win32
system x86, Win32
status
major 1
minor 4.1
year 2002
month 01
day 30
language R
Platform is Windows NT4 (not my choice..)
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
To answer my own question, at least partially:
"na.to.0" <- function(x)
{
    xx <- data.matrix(x)
    xx[is.na(x)] <- 0
    xx <- data.frame(xx)
    xx
}
seems to work (the idea is/was to replace a data.frame's NAs with 0s and return a
data.frame as a result).
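One caveat with this approach, sketched on a made-up data frame (`df`, `city` and `amount` are hypothetical): `data.matrix()` turns every column into numbers, so factor columns come back as their internal integer codes rather than as text. The example assumes a recent R and sets `stringsAsFactors = TRUE` explicitly so that the text column is a factor, as it would have been by default at the time.

```r
# Hypothetical mixed data frame: a text column and a numeric column with NA.
df <- data.frame(city = c("ESPOO", "HELSINKI", "ESPOO"),
                 amount = c(10.5, NA, 0),
                 stringsAsFactors = TRUE)

na.to.0 <- function(x) {
    xx <- data.matrix(x)   # every column becomes numeric
    xx[is.na(x)] <- 0
    data.frame(xx)
}

res <- na.to.0(df)
res$city     # integer codes of the factor levels, not the original text
res$amount   # NA replaced by 0
```

So the function is safe for all-numeric data frames, but it silently loses the labels of any non-numeric column.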
I am still a little confused by these conversions, but now I can move on.
Sorry for this monologue,
Jussi
________________________________________________
>> To answer at least partially to my own question:
>>
>> "na.to.0" <- function(x)
>> {
>> xx <- data.matrix(x)
>> xx[is.na(x)] <- 0
>> xx <- data.frame(xx)
>> xx
>> }
>>
>> seems to work (Idea is/was replace a data.frame NAs by 0s and return a
>> data.frame as a result).
>>
>> Still I'm a little bit confused with these converts but now I can move on.
>
>I'm confused by your questions. Is this a data frame with only numeric
>columns? If so, your comments about factors/characters make no sense,
>and if not your conversion makes little sense.
>
>For a *numeric* data frame X
>
>X[] <- lapply(X, function(x) {x[is.na(x)] <- 0; x})
>
>seems to be what you want.

Here is a sample of my data.frame (KUNTADATA):
Sorry, my last answer slipped away because of an accidental keyboard shortcut.
Thank you for your reply. Here is a sample of my data.frame (KUNTADATA):
> KUNTADATA[1:10,c(6:8, 20)]
             Kunta Period Name Asunnot 1100 intarr_14
1  ESPOON KAUPUNKI      1993/1        14164  41336.27
2  ESPOON KAUPUNKI      1993/2        14164        NA
3  ESPOON KAUPUNKI      1994/1        14164      0.00
4  ESPOON KAUPUNKI      1994/2        14164    330.29
5  ESPOON KAUPUNKI      1995/1        14164      0.00
6  ESPOON KAUPUNKI      1995/2        14164      0.00
7  ESPOON KAUPUNKI      1996/1        14164  67277.18
8  ESPOON KAUPUNKI      1996/2        14164   7860.26
9  ESPOON KAUPUNKI      1997/1        14164        NA
10 ESPOON KAUPUNKI      1997/2        14164 231701.05
So there are both character and numeric vectors. But truly, I could
just use na.to.0 on the numeric columns when necessary. But because I have
faced this "problem" with factors quite often, I thought it might be of
common interest to ask my question.
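For a data frame like this one, with both text and numeric columns, the `X[] <- lapply(...)` idiom from the reply can be restricted to the numeric columns. A minimal sketch on made-up data (the data frame `df` and its column names are hypothetical, loosely modelled on the sample above):

```r
# Hypothetical data frame mimicking the layout of the sample above.
df <- data.frame(Kunta = c("ESPOON KAUPUNKI", "ESPOON KAUPUNKI"),
                 Period = c("1993/1", "1993/2"),
                 amount = c(41336.27, NA),
                 stringsAsFactors = FALSE)

# Replace NA by 0 in numeric columns only; other columns are left untouched.
num <- sapply(df, is.numeric)
df[num] <- lapply(df[num], function(x) { x[is.na(x)] <- 0; x })

str(df)
```

Because no matrix is involved, no column changes its type: the result is still a data.frame with the same character (or factor) columns as before.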
Still, I cannot understand this result:
> is.numeric(as.matrix(KUNTADATA[,15]))
[1] TRUE
> is.numeric(as.data.frame(as.matrix(KUNTADATA[,15])))
[1] FALSE
Yes, your code works nicely (as always) and does the trick I wanted, something
I couldn't have written myself. There is a lot for me to learn in R/Splus.
Thank you, I appreciate your help and this list,
Jussi Makinen
Sorry, my previous answer slipped away because of an accidental keyboard
shortcut while I was trying to copy/paste the example.
***
Thank you for your reply. Here is a sample of my data.frame (KUNTADATA):
> KUNTADATA[1:10,c(6:8, 20)]
             Kunta Period Name Asunnot 1100 intarr_14
1  ESPOON KAUPUNKI      1993/1        14164  41336.27
2  ESPOON KAUPUNKI      1993/2        14164        NA
3  ESPOON KAUPUNKI      1994/1        14164      0.00
4  ESPOON KAUPUNKI      1994/2        14164    330.29
5  ESPOON KAUPUNKI      1995/1        14164      0.00
6  ESPOON KAUPUNKI      1995/2        14164      0.00
7  ESPOON KAUPUNKI      1996/1        14164  67277.18
8  ESPOON KAUPUNKI      1996/2        14164   7860.26
9  ESPOON KAUPUNKI      1997/1        14164        NA
10 ESPOON KAUPUNKI      1997/2        14164 231701.05
So there are both types of vectors: character and numeric. But truly, I
could just use na.to.0 on the numeric columns when necessary. But because I
have faced this "problem" with conversions quite often, I thought it
might be of common interest to ask my question.
I still cannot understand the result:
> is.numeric(as.matrix(KUNTADATA[,15]))
[1] TRUE
> is.numeric(as.data.frame(as.matrix(KUNTADATA[,15])))
[1] FALSE
> mode(as.data.frame(as.matrix(KUNTADATA[1,15])))
[1] "list"
I'm sure that I just do not understand a basic feature of the data.frame()
function; an explanation would be valuable for me and perhaps for somebody
else as well.
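A short sketch of what is probably going on, using a made-up vector `v` as a stand-in for `KUNTADATA[,15]`: a data frame is a list of columns, and `is.numeric()` applied to the list as a whole is FALSE even when every column in it is numeric. The FALSE above therefore does not by itself show that the column was converted to character.

```r
# Hypothetical numeric vector standing in for KUNTADATA[,15].
v <- c(41336.27, NA, 0)

m <- as.matrix(v)        # a numeric matrix with one column
d <- as.data.frame(m)    # a data frame, i.e. a list with one numeric column

is.numeric(m)            # the matrix itself is numeric
is.numeric(d)            # the data frame is a list, so this is not numeric
mode(d)                  # "list", as in the output above
sapply(d, is.numeric)    # test the columns instead of the data frame
```

Testing column by column with `sapply()` (or inspecting the result with `str()`) shows the actual storage type of each column.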
Yes, your code works nicely (as always) and does the trick I wanted, something
I couldn't have written myself. There is a lot for me to learn in R/Splus.
Thank you, I appreciate your help and everything I have learned
through the list,
Jussi Makinen