Hi,

I've got a rather large matrix of about 800 rows and 600000 columns. Each column is a time series of length 800.

Out of these 600000 time series, some have missing values (NA). I want to strip out all columns that have one or more NA values, i.e., I only want complete time series.

This should do the trick:

    data_no_NA <- data[, !apply(is.na(data), 2, any)]

I now use data_no_NA as input to a function, which returns its output as a matrix of the same size as data_no_NA.

The trick is that I now need to put these columns back into a new, empty 800 by 600000 matrix, at their original locations. Any suggestions on how to do that, hopefully without having to use loops? I'm using R/3.0.3.

Cheers,
Jatin.
Just reverse what you did before.

    newdata <- data
    newdata[] <- NA
    newdata[, !apply(is.na(data), 2, any)] <- myfunction(data_no_NA)

On Fri, Mar 27, 2015 at 1:13 AM, Jatin Kala <jatin.kala.jk at gmail.com> wrote:
> The trick is that i now need to put these columns back into a new 800 by
> 600000 empty matrix, at their original locations.
> Any suggestions on how to do that? hopefully without having to use loops.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
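For anyone following along, here is a minimal sketch of that round trip on a toy matrix. The 4 by 5 matrix and the body of myfunction are stand-ins invented for illustration (the real function was never posted); the only assumption that matters is that it returns a matrix with the same dimensions as its input. The column mask is computed once and stored in a hypothetical variable keep, so the same mask is used for extraction and for putting the columns back.

```r
# Toy stand-in for the 800 x 600000 matrix: 4 rows, 5 columns.
data <- matrix(as.numeric(1:20), nrow = 4, ncol = 5)
data[2, 2] <- NA   # column 2 is incomplete
data[4, 5] <- NA   # column 5 is incomplete

# Logical mask of complete columns, computed once and reused below.
keep <- !apply(is.na(data), 2, any)

# drop = FALSE keeps a matrix even if only one column survives.
data_no_NA <- data[, keep, drop = FALSE]

# Hypothetical placeholder for the real function; it only has to
# return a matrix with the same dimensions as its input.
myfunction <- function(x) x * 10

# All-NA matrix of the original shape, then fill in the kept columns.
newdata <- matrix(NA_real_, nrow = nrow(data), ncol = ncol(data))
newdata[, keep] <- myfunction(data_no_NA)

print(newdata)
```

Columns 2 and 5 of newdata stay NA throughout, and the processed columns sit at their original positions, with no loop needed.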
Why not use complete.cases()?

    data_no_NA <- data[, complete.cases(t(data))==T]

On 27 March 2015, at 06:13, Jatin Kala <jatin.kala.jk at gmail.com> wrote:
> This should do the trick:
> data_no_NA <- data[,!apply(is.na(data), 2, any)]
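As a quick sanity check, the sketch below compares the apply() mask from the original post with the complete.cases() mask on a small random matrix. The toy data and variable names are made up for illustration, and the colSums() variant is an extra alternative not mentioned in the thread. One thing to keep in mind on a matrix of the size described: t(data) builds a full transposed copy, so the complete.cases() route needs enough memory for that.

```r
set.seed(1)

# Toy matrix: 6 rows, 10 columns, with 5 cells set to NA at random.
data <- matrix(rnorm(60), nrow = 6, ncol = 10)
data[sample(length(data), 5)] <- NA

# Mask from the original post: TRUE for columns without any NA.
keep_apply <- !apply(is.na(data), 2, any)

# complete.cases() works on rows, hence the transpose.
keep_complete <- complete.cases(t(data))

# Extra alternative (not from the thread): count NAs per column.
keep_colsums <- colSums(is.na(data)) == 0

# All three masks should agree element by element.
print(all(keep_apply == keep_complete))
print(all(keep_apply == keep_colsums))
```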
Thanks Richard,

This works, rather obvious now that I think of it! =)

On 27/03/2015 4:30 pm, Richard M. Heiberger wrote:
> just reverse what you did before.
>
> newdata <- data
> newdata[] <- NA
> newdata[,!apply(is.na(data), 2, any)] <- myfunction(data_no_NA)
On 27 Mar 2015, at 09:58, Stéphane Adamowicz <stephane.adamowicz at avignon.inra.fr> wrote:
> data_no_NA <- data[, complete.cases(t(data))==T]

Ouch! logical == TRUE is bad, logical == T is worse:

    data[, complete.cases(t(data))]

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
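To spell out the point for later readers, here is a small sketch with made-up variable names. Comparing a logical vector with TRUE just returns the same vector, so it is redundant. T is worse because, unlike the reserved word TRUE, it is an ordinary variable that happens to default to TRUE and can be overwritten, which silently changes the result.

```r
keep <- c(TRUE, FALSE, TRUE)

# Redundant: the comparison returns the same logical vector.
same <- identical(keep == TRUE, keep)
print(same)

# T is only a variable bound to TRUE by default; it can be reassigned.
T <- 0
flipped <- keep == T   # now compares against 0, so the mask is inverted
print(flipped)

# Remove the masking variable so the default T is visible again.
rm(T)
```

With T overwritten to 0, the mask comes out inverted and the wrong columns would be selected without any warning; using the logical vector directly avoids the problem entirely.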