Hi R-users:

I have a data formatting question. I have a data set that looks something like this:

foo.dat <- cbind(c(NA, 1, 2, 3, 4, 5), c(NA, NA, 0, 10, 20, 30))

What I have:

     [,1] [,2]
[1,]   NA   NA
[2,]    1   NA
[3,]    2    0
[4,]    3   10
[5,]    4   20
[6,]    5   30

I want to line up the columns by the first value that is not NA. Like so:

     [,1] [,2]
[1,]    1    0
[2,]    2   10
[3,]    3   20
[4,]    4   30
[5,]    5   NA
[6,]   NA   NA

Question is: Is there an elegant way to do this without a for loop?

I tried doing this with na.omit and na.exclude without success.

The real data is many hundreds of columns and many thousands of rows.

Thanks in advance,
Tim
Would this do what you want?

Cheers,
Jerome

> foo.dat <- cbind(c(NA, 1, 2, 3, 4, 5), c(NA, NA, 0, 10, 20, 30))
> apply(foo.dat,2,function(x) x[order(as.logical(x))])
     [,1] [,2]
[1,]    1    0
[2,]    2   10
[3,]    3   20
[4,]    4   30
[5,]    5   NA
[6,]   NA   NA

On Wednesday 12 February 2003 12:42, Tim Sharac wrote:
> I want to line up the columns by the first value that is not NA.
>
> Question is: Is there an elegant way to do this without a for loop?

______________________________________________
R-help at stat.math.ethz.ch mailing list
http://www.stat.math.ethz.ch/mailman/listinfo/r-help
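One caveat on the as.logical() approach: as.logical(0) is FALSE, so a zero that comes after non-zero values in a column is moved ahead of them. The sketch below shows one possible variant that orders on is.na(x) instead and relies on order() leaving ties in their original order. The helper name na_to_bottom and the test vector y are made up for illustration; they are not from the thread.

```r
foo.dat <- cbind(c(NA, 1, 2, 3, 4, 5), c(NA, NA, 0, 10, 20, 30))

# Hypothetical helper (not from the thread): push NAs to the end of a
# vector while keeping the non-NA values in their original order.
# is.na(x) is FALSE for real values and TRUE for NA; order() sorts FALSE
# before TRUE and leaves ties in their original positions.
na_to_bottom <- function(x) x[order(is.na(x))]

aligned <- apply(foo.dat, 2, na_to_bottom)
aligned

# Made-up column with a zero after a non-zero value:
y <- c(NA, 5, 0, 3)
y[order(as.logical(y))]  # 0 5 3 NA  (the zero jumps to the front)
na_to_bottom(y)          # 5 0 3 NA  (original order kept)
```

On the example data both versions give the same matrix, because the only zero is already the first non-NA value of its column.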
Is

apply(foo.dat, 2, sort, na.last = TRUE)

what you want?

-roger
_______________________________
UCLA Department of Statistics
rpeng at stat.ucla.edu
http://www.stat.ucla.edu/~rpeng

On 12 Feb 2003, Tim Sharac wrote:
> I want to line up the columns by the first value that is not NA.
>
> Question is: Is there an elegant way to do this without a for loop?
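Note that sort() reorders the values themselves, so it reproduces the requested layout only when each column is already increasing, as in the example data. A small sketch of the difference, using a made-up column x and a hypothetical helper shift_up (neither is from the thread):

```r
# Made-up column whose non-NA values are not in increasing order
x <- c(NA, 3, 1, 2)

sorted <- sort(x, na.last = TRUE)
sorted            # 1 2 3 NA  (values have been rearranged)

# Hypothetical helper: keep the non-NA values in place order and pad
# the tail with as many NAs as were dropped.
shift_up <- function(x) {
  keep <- !is.na(x)
  c(x[keep], rep(NA, sum(!keep)))
}

shifted <- shift_up(x)
shifted           # 3 1 2 NA  (original order kept)
```

If the real columns are time series or otherwise unsorted, the padding version is probably closer to what was asked for.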
I do something like this as part of a missing value plot where I 'flush' the rows with the most missing spots to the bottom and the columns with the most missing values to the right. That way I can see visually how much data I will retain (by looking at the area) if I decide to truncate rows/columns by a missing-value criterion.

You might find this code useful in rearranging your matrix:

na.mat <- 1*is.na(data)
spots.na.per.row <- rowSums(na.mat)/ncol(data)     # fraction missing in each row
spots.na.per.column <- colSums(na.mat)/nrow(data)  # fraction missing in each column
data.re <- data[ order(spots.na.per.row), order(spots.na.per.column) ]

-----Original Message-----
From: Jerome Asselin [mailto:jerome at hivnet.ubc.ca]
Sent: Thursday, February 13, 2003 5:06 AM
To: Tim Sharac; r-help at stat.math.ethz.ch
Subject: Re: [R] Matrix formatting

Would this do what you want?

> apply(foo.dat,2,function(x) x[order(as.logical(x))])
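To make the row/column flushing concrete, here is a minimal runnable sketch on a made-up 4 x 3 matrix. The matrix dat and the names frac.na.row and frac.na.col are illustrative assumptions; rowMeans() and colMeans() on the logical is.na() matrix give the same fractions as the rowSums()/ncol() form.

```r
# Made-up matrix with scattered missing values
dat <- cbind(a = c(1, NA, 3, 4),
             b = c(NA, NA, 3, 4),
             c = c(1, 2, 3, 4))

na.mat <- is.na(dat)
frac.na.row <- rowMeans(na.mat)  # fraction of missing values in each row
frac.na.col <- colMeans(na.mat)  # fraction of missing values in each column

# Rows with the most NAs move to the bottom, columns with the most NAs
# move to the right; drop = FALSE keeps the result a matrix even if
# only one row or column remains.
dat.re <- dat[order(frac.na.row), order(frac.na.col), drop = FALSE]
dat.re
```

This reorders whole rows and whole columns, so values stay paired with their original row partners. It does not shift each column independently the way the original question asks.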
Given all these NA formatting replies, I have a question of my own. I too have an object like foo.dat from the previous posts:

foo.dat <- cbind(c(NA, 1, 2, 3, 4, 5), c(NA, NA, 0, 10, 20, 30))

> foo.dat
     [,1] [,2]
[1,]   NA   NA
[2,]    1   NA
[3,]    2    0
[4,]    3   10
[5,]    4   20
[6,]    5   30

and I have a vector I want to subtract from each column of foo.dat, say:

foo.subtract <- c(0.1, 0.2, 0.3, 0.4, 0.5)

I want to perform the subtraction from foo.dat but preserve the structure of the data, i.e.,

     [,1] [,2]
[1,]   NA   NA
[2,]  0.9   NA
[3,]  1.8 -0.1
[4,]  2.7  9.8
[5,]  3.6 19.7
[6,]  4.5 29.6

The final formatting of foo.dat has to be intact; I can't just shove the NAs to the bottom. I tried to order the data in an apply function but couldn't make it work.

Thanks,
Andy

-----Original Message-----
From: r-help-admin at stat.math.ethz.ch [mailto:r-help-admin at stat.math.ethz.ch] On Behalf Of Tim Sharac
Sent: Wednesday, February 12, 2003 1:42 PM
To: r-help at stat.math.ethz.ch
Subject: [R] Matrix formatting

I want to line up the columns by the first value that is not NA.

Question is: Is there an elegant way to do this without a for loop?
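One possible way to get the layout described above is to index only the non-NA positions of each column and subtract the leading elements of foo.subtract from them. This is a sketch, not an answer given in the thread. The helper name subtract_in_place and its argument offsets are made up for illustration.

```r
foo.dat <- cbind(c(NA, 1, 2, 3, 4, 5), c(NA, NA, 0, 10, 20, 30))
foo.subtract <- c(0.1, 0.2, 0.3, 0.4, 0.5)

# Hypothetical helper: subtract offsets[1], offsets[2], ... from the
# 1st, 2nd, ... non-NA value of x, leaving NA positions untouched.
subtract_in_place <- function(x, offsets) {
  ok <- !is.na(x)
  x[ok] <- x[ok] - offsets[seq_len(sum(ok))]
  x
}

result <- apply(foo.dat, 2, subtract_in_place, offsets = foo.subtract)
result
```

On the example data this gives 0.9, 1.8, 2.7, 3.6, 4.5 under the leading NA in column 1, and -0.1, 9.8, 19.7, 29.6 under the two NAs in column 2. If a column has more non-NA values than foo.subtract has elements, the extra positions come out as NA, so the lengths should be checked first on real data.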