Michael Chirico
2019-Jul-08 09:40 UTC
[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
I think of head() as a standard helper for "glancing" at objects, so I'm sometimes surprised that head() produces massive output:

    M = matrix(nrow = 10L, ncol = 100000L)
    print(head(M)) # <- beware, could be a huge print

I assume there are lots of backwards-compatibility issues as well as valid use cases for this behavior, so I guess defaulting to M[1:6, 1:6] is out of the question.

Is there any scope for adding a new argument to head.matrix that would allow this flexibility? IINM it should essentially be as simple to do head.array as:

    do.call(`[`, c(list(x, drop = FALSE), lapply(pmin(dim(x), n), seq_len)))

(with extra decoration to handle -n, etc)
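For readers who want to try the idea, here is a minimal sketch of the one-liner above wrapped in a function. The name head_all_dims is hypothetical (it is not part of base R), and the sketch assumes a single non-negative n and an object that has a dim attribute; negative n is deliberately left out.

```r
# Hypothetical helper, not part of base R: keep at most n entries
# along every dimension of a matrix or array.
head_all_dims <- function(x, n = 6L) {
  stopifnot(!is.null(dim(x)), length(n) == 1L, n >= 0L)
  # one index vector per dimension, clamped to that dimension's extent
  idx <- lapply(pmin(dim(x), n), seq_len)
  # drop = FALSE keeps the result a matrix/array even when an extent is 1
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

M <- matrix(nrow = 10L, ncol = 100000L)
dim(head_all_dims(M))  # 6 6
```

Because the index vectors are clamped with pmin(), the same call also works when some dimension is shorter than n.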
Abby Spurdle
2019-Jul-12 23:03 UTC
[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
> I assume there are lots of backwards-compatibility issues as well as valid
> use cases for this behavior, so I guess defaulting to M[1:6, 1:6] is out of
> the question.

Agree.

> Is there any scope for adding a new argument to head.matrix that would
> allow this flexibility?

I agree with what you're trying to achieve. However, I'm not sure this is as simple as you're suggesting.

What if the user wants "head" in rows but "tail" in columns? Or "head" in rows, and both "head" and "tail" in columns? With head and tail alone, there's a combinatorial explosion.

Also, when using tail on an unnamed matrix, it may be desirable to name rows and columns.

And all of this assumes standard matrix objects. Add in matrix subclasses and related objects, and things get more complex still.

As I suggested in another thread a few days ago, I'm planning to write an R package for matrices and matrix-like objects (possibly extending the Matrix package), with an initial emphasis on subsetting, printing and formatting. So, I'm interested to hear more suggestions on this topic.
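As a point of comparison for the "head in rows, tail in columns" case, the mixed selection can already be spelled out by applying head() and tail() to the index vectors and subsetting once. This is only an illustrative sketch; the names rows, cols and corner are made up for the example.

```r
M <- matrix(1:200, nrow = 10L, ncol = 20L)

rows <- head(seq_len(nrow(M)), 3L)  # first 3 row indices
cols <- tail(seq_len(ncol(M)), 4L)  # last 4 column indices

# drop = FALSE keeps the result a matrix even for a single row or column
corner <- M[rows, cols, drop = FALSE]
dim(corner)  # 3 4
```

This does not address the naming of rows and columns for unnamed matrices, nor matrix subclasses.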
Gabriel Becker
2019-Jul-13 00:35 UTC
[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?
Hi Michael and Abby,

So one thing that could happen that would be backwards compatible (with the exception of something that was an error no longer being an error) is that head and tail could take vectors of length length(dim(x)) for n, rather than only integers of length 1, with the default n = 6 being equivalent to n = c(6, dim(x)[2], <...>, dim(x)[k]), at least for the deprecation cycle, if not permanently. Not recycling n would be unexpected based on the behavior of many R functions, but it would preserve the current behavior while granting more fine-grained control to users that feel they need it.

A rapidly thrown-together prototype of such a method for the head of a matrix case is as follows:

    head2 = function(x, n = 6L, ...) {
        # build one index vector per dimension of x
        indvecs = lapply(seq_along(dim(x)), function(i) {
            # dimensions beyond length(n) are kept whole (n is not recycled)
            if(length(n) >= i) {
                ni = n[i]
            } else {
                ni = dim(x)[i]
            }
            # a negative ni drops the last |ni| entries of dimension i
            if(ni < 0L)
                ni = max(dim(x)[i] + ni, 0L)
            else
                ni = min(ni, dim(x)[i])
            seq_len(ni)
        })
        lstargs = c(list(x), indvecs, drop = FALSE)
        do.call("[", lstargs)
    }

    > mat = matrix(1:100, 10, 10)
    > head(mat)
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    [1,]    1   11   21   31   41   51   61   71   81    91
    [2,]    2   12   22   32   42   52   62   72   82    92
    [3,]    3   13   23   33   43   53   63   73   83    93
    [4,]    4   14   24   34   44   54   64   74   84    94
    [5,]    5   15   25   35   45   55   65   75   85    95
    [6,]    6   16   26   36   46   56   66   76   86    96
    > head2(mat)
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    [1,]    1   11   21   31   41   51   61   71   81    91
    [2,]    2   12   22   32   42   52   62   72   82    92
    [3,]    3   13   23   33   43   53   63   73   83    93
    [4,]    4   14   24   34   44   54   64   74   84    94
    [5,]    5   15   25   35   45   55   65   75   85    95
    [6,]    6   16   26   36   46   56   66   76   86    96
    > head2(mat, c(2, 3))
         [,1] [,2] [,3]
    [1,]    1   11   21
    [2,]    2   12   22
    > head2(mat, c(2, -9))
         [,1]
    [1,]    1
    [2,]    2

Now one thing to keep in mind here is that I think we'd either a) have to make the non-recycling behavior permanent, or b) have head treat data.frames and matrices differently with respect to the subsets they grab (which strikes me as a *Bad Plan* (tm)).
So I don't think the default behavior would ever be mat[1:6, 1:6], not because of backwards compatibility, but because, at least in my intuition, that is just not what head on a data.frame should do by default, and I think the behaviors for the basic rectangular datatypes should "stick together". I mean, also because of backwards compatibility, but that could *in theory* change across a long enough deprecation cycle, whereas the conceptually right thing to do with a data.frame probably won't.

All of that said, is head(mat, c(6, 6)) really that much easier to type/better than just mat[1:6, 1:6, drop=FALSE] (I know this will behave differently if any of the dims of mat are less than 6, but if so why are you heading it in the first place ;) )? I don't really have a strong feeling on the answer to that.

I'm happy to put together a patch for head.matrix, head.data.frame, tail.matrix and tail.data.frame, plus documentation, if people on R-core are interested in this.

Note, as most here probably know, and as alluded to above, length(n) > 1 for head or tail currently gives an error, so this would be an extension of the existing functionality in the mathematical extension sense, where all existing behavior would remain identical, but the support/valid parameter space would grow.

Best,
~G
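The difference mentioned in the parenthetical about dims smaller than 6 can be seen in a small sketch. The object names (small, res, clamped) are made up for illustration, and the clamped version is just one way to write what a head-style method would do.

```r
small <- matrix(1:12, nrow = 3L, ncol = 4L)

# Plain indexing past the extent of a matrix is an error ...
res <- tryCatch(small[1:6, 1:6, drop = FALSE],
                error = function(e) conditionMessage(e))
res  # "subscript out of bounds"

# ... whereas clamping each index vector to the extent just returns
# whatever is there
clamped <- small[seq_len(min(6L, nrow(small))),
                 seq_len(min(6L, ncol(small))),
                 drop = FALSE]
dim(clamped)  # 3 4
```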