Philip Beevers
2005-Sep-15 06:34 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
Hi Bryan,

> Does that sit well with everyone?

Seems fine to me.

Just revisiting one of Dragan's points, though (sorry if I missed the answer) - is there a reason for making this global (via a #pragma) rather than, say, simply providing two functions which print in the different orders? E.g. printa() for sort by sample, printak() for sort by key.

My reason for wanting to do both in the same script is that the current printa() ordering is nearly always right; it's relatively rare (although very useful in those cases) to order by key.

--
Philip Beevers
Fidessa Infrastructure Development
mailto:philip.beevers at fidessa.com
phone: +44 1483 206571

> -----Original Message-----
> From: dtrace-discuss-bounces at opensolaris.org [mailto:dtrace-discuss-bounces at opensolaris.org] On Behalf Of Bryan Cantrill
> Sent: 15 September 2005 07:29
> To: Jonathan Haslam
> Cc: dtrace-discuss at opensolaris.org; Dragan Cvetkovic
> Subject: Re: [dtrace-discuss] Can I use printa() for printing multiple aggregations?
>
> On Tue, Sep 13, 2005 at 11:00:21PM +0100, Jonathan Haslam wrote:
> > >Okay. Perhaps all of the cases where I want to sum across them will be handled when we support multiple aggregating actions (which won't be a part of this work, but which we plan to ultimately do). An example of this might be aggregating by both sum of bytes and a quantization of the size -- to save me from having to do the mental calculation when looking at quantize output. But I suppose even that case could just be handled by position-based ordering. Does position-based ordering _not_ violate the principle of least-surprise? (And we could always default to position-based and then provide an option to have some other ordering.)
> >
> > I'll have a further think about it but, at the moment, I think that position-based provides good functionality and also a least-surprise default behaviour.
> >
> > I guess you could always have 3 options to the aggsort #pragma; position(default), value and key?
>
> I think I'm going to introduce "#pragma D option aggsort=key", but I would _really_ rather not have an additional position sorting argument. Not only would it be arcane, it wouldn't even be complete: presumably one would want to additionally define an arbitrary positioning order for breaking ties. It's doable, but it's a mess (mainly, it's a mess to test). So I'm just going to leave it as strict positioning, with ties resolved in positioning-order. (That is, if the first two aggregation values for two keys are equal, the second two aggregation values are checked and so on. If (and only if) all aggregation values are equal, the keys will be checked as the final arbiter.)
>
> Does that sit well with everyone?
>
>     - Bryan
>
> --------------------------------------------------------------------------
> Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
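The tie-breaking rule in the quoted message above (compare aggregation values in position order, and fall back to the key only when every value is equal) can be modelled in a few lines of Python. This is only an illustrative sketch of the ordering as described, not DTrace's implementation; the row layout, the `sort_rows` helper and the sample data are assumptions made for the example.

```python
# Illustrative model only; this is not DTrace's implementation.
def sort_rows(rows):
    """Order (key, values) rows by aggregation values in position order, then by key."""
    # Tuples compare element by element, so the second aggregation value is
    # consulted only when the first values are equal, and so on; the key is
    # the final arbiter when all values match.
    return sorted(rows, key=lambda row: (row[1], row[0]))


# Hypothetical data: key -> (value of first aggregation, value of second aggregation)
rows = [
    ("write", (10, 300)),
    ("read", (10, 200)),
    ("close", (10, 200)),
    ("open", (5, 900)),
]

ordered = sort_rows(rows)
for key, values in ordered:
    print(f"{key:>10} {values[0]:>6} {values[1]:>6}")
```

Here "close" and "read" tie on both values, so only their keys decide which one prints first.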
Bryan Cantrill
2005-Sep-15 16:44 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
On Thu, Sep 15, 2005 at 07:34:10AM +0100, Philip Beevers wrote:
> Hi Bryan,
>
> > Does that sit well with everyone?
>
> Seems fine to me.
>
> Just revisiting one of Dragan's points, though (sorry if I missed the answer) - is there a reason for making this global (via a #pragma) rather than, say, simply providing two functions which print in the different orders? E.g. printa() for sort by sample, printak() for sort by key.
>
> My reason for wanting to do both in the same script is that the current printa() ordering is nearly always right; it's relatively rare (although very useful in those cases) to order by key.

You'll be able to set the option dynamically via the new setopt() action that I introduced in my last large wad. (This hasn't yet hit Solaris Express -- and yes, we have a ton of doc work to do.)

So you'll be able to print an aggregation two different ways by doing this:

    printf("By value:\n");
    printa("%20s %@d\n", @foo);

    setopt("aggsort", "key");

    printf("By key:\n");
    printa("%20s %@d\n", @foo);

I don't want to introduce an additional entry point, because the matrix blows out as other options are introduced. (There is precedent for this in the way we do aggregation normalization, which doesn't require an additional printa() entry point.) As a concrete example, I think I'll also introduce reverse-sorting as an option (it seems to have come up, and it's easy to add); it's much cleaner to add an "aggorder" than it is to have printakr() (or whatever).

Interface question: to me, it seems like it might be slightly cleaner to introduce these options as boolean options. That way, you won't have to go to the docs (or the help message or whatever) to remember how to set the option. So instead of "aggsort" being set to "key", it would be "aggsortkey" and you would set it to "true" (or "1" or "on" or whatever -- as of my latest wad, we accept just about any synonym for "true" for boolean options). And instead of "aggorder" it would be "aggrevsort" or something. Thoughts?

    - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
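The effect of changing a sort option between two print calls, as in the example above, can be mimicked with a small Python model. It is a sketch only: the `options` dictionary, the Python `setopt()` and `format_agg()` helpers, and the counts are all hypothetical stand-ins, and the by-value default simply follows the "By value" / "By key" labels in the message.

```python
# Illustrative model only: the names below are hypothetical, not DTrace internals.
options = {}


def setopt(name, value="true"):
    """Record an option, loosely mimicking the setopt() action described above."""
    options[name] = value


def format_agg(agg):
    """Return the rows of a key -> count mapping, ordered per the current options."""
    if options.get("aggsort") == "key":
        items = sorted(agg.items())  # order by key
    else:
        # Order by value (the assumed default); the key breaks ties.
        items = sorted(agg.items(), key=lambda kv: (kv[1], kv[0]))
    return [f"{key:>20} {value}" for key, value in items]


foo = {"write": 3, "read": 12, "close": 7}  # made-up counts

by_value = format_agg(foo)
setopt("aggsort", "key")
by_key = format_agg(foo)

print("By value:", *by_value, sep="\n")
print("By key:", *by_key, sep="\n")
```

The same print routine serves both orderings, which is the point made above about not needing an extra entry point per ordering.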
Jim Mauro
2005-Sep-15 17:43 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
Will the setopt() interface be constrained to setting options related only to aggregations? If so, would setaggopt() be more intuitive?

    setaggopt("sort","key");

(just rambling here...).

I personally like the idea of a simple boolean model for setting these options. I find:

    setopt("revsort","true");

or

    setaggopt("revsort","true");

much more palatable than something like

    setopt("aggsort","rev");

...or whatever.

Chiming in...

/jim
Adam Leventhal
2005-Sep-15 19:14 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
On Thu, Sep 15, 2005 at 09:44:09AM -0700, Bryan Cantrill wrote:
> Interface question: to me, it seems like it might be slightly cleaner to introduce these options as boolean options. That way, you won't have to go to the docs (or the help message or whatever) to remember how to set the option. So instead of "aggsort" being set to "key", it would be "aggsortkey" and you would set it to "true" (or "1" or "on" or whatever -- as of my latest wad, we accept just about any synonym for "true" for boolean options). And instead of "aggorder" it would be "aggrevsort" or something. Thoughts?

I have a slight preference for having "aggsort" be the option with "key", "revkey", etc. parameters as I think it provides some additional flexibility if we want to add some new ways of sorting aggregations.

Is your argument that it's easier to remember "aggsortkey=true" than "aggsort=key"? I think in either case we're going to need to reserve space for this on the DTrace quick reference mug.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
Bryan Cantrill
2005-Sep-15 22:48 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
On Thu, Sep 15, 2005 at 12:14:28PM -0700, Adam Leventhal wrote:
> On Thu, Sep 15, 2005 at 09:44:09AM -0700, Bryan Cantrill wrote:
> > Interface question: to me, it seems like it might be slightly cleaner to introduce these options as boolean options. That way, you won't have to go to the docs (or the help message or whatever) to remember how to set the option. So instead of "aggsort" being set to "key", it would be "aggsortkey" and you would set it to "true" (or "1" or "on" or whatever -- as of my latest wad, we accept just about any synonym for "true" for boolean options). And instead of "aggorder" it would be "aggrevsort" or something. Thoughts?
>
> I have a slight preference for having "aggsort" be the option with "key", "revkey", etc. parameters as I think it provides some additional flexibility if we want to add some new ways of sorting aggregations.

The problem is that the number of strings increases exponentially with the number of sorting options. If we want to (say) add equivalents to the "-n" or "-d" options to sort(1) (neither of which would be unreasonable), it would induce absurd option values ("#pragma D option aggsort=revkeynumdict"?) that become even more absurd if you don't want to rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).

> Is your argument that it's easier to remember "aggsortkey=true" than "aggsort=key"? I think in either case we're going to need to reserve space for this on the DTrace quick reference mug.

Your point is well-made. While you wouldn't have to set "aggsortkey" to explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or "#pragma D option aggsortkey"), I agree that it's a bit of a mouthful. Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on? There's no sorting elsewhere in DTrace, so the "agg" qualifier seems a bit redundant. Does removing it help the mnemonic properties?

Also, I should add that I'm going to fix a long-standing annoyance: there will be a way to get dtrace(1M) to spout out all possible options and a short description of them. Hopefully that will save someone a trip to the (mythical) reference mug, anyway...

    - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
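The growth argument above can be checked with a short Python sketch: if a single option has to spell out every on/off combination of n independent sorting axes, it needs 2^n distinct values, whereas separate boolean options need only n names. The axis names are borrowed loosely from the strings in the message, and the "no" prefix for the off state is a simplification assumed for the example (the message uses "val" as the opposite of "key").

```python
from itertools import product

# Hypothetical on/off sorting axes, loosely following the names floated in the thread.
axes = ["rev", "key", "num", "dict"]


def combined_values(axes):
    """Every value a single combined option would have to accept: each axis on or off."""
    values = []
    for flags in product([False, True], repeat=len(axes)):
        values.append("".join(a if on else "no" + a for a, on in zip(axes, flags)))
    return values


values = combined_values(axes)
print(len(values), "combined values versus", len(axes), "boolean options")
print(values[0], values[-1])
```

Adding a fifth axis doubles the list of combined values again, while the boolean scheme gains a single option name.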
Adam Leventhal
2005-Sep-15 23:41 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
On Thu, Sep 15, 2005 at 03:48:54PM -0700, Bryan Cantrill wrote:
> > I have a slight preference for having "aggsort" be the option with "key", "revkey", etc. parameters as I think it provides some additional flexibility if we want to add some new ways of sorting aggregations.
>
> The problem is that the number of strings increases exponentially with the number of sorting options. If we want to (say) add equivalents to the "-n" or "-d" options to sort(1) (neither of which would be unreasonable), it would induce absurd option values ("#pragma D option aggsort=revkeynumdict"?) that become even more absurd if you don't want to rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).

Of course. If all the ways of sorting are orthogonal we would end up with the outer product as you describe, in which case splitting each axis into its own option would be preferable. But if options are such that they don't make sense when combined we could end up with some very strange syntax or error messages.

How about something like "aggsort=key,rev" or "aggsort=numkey" where the parameter would be a comma-delimited list of options? In that case, something like "aggsort=key,numkey" would generate an error. Or is that adding too much complexity with only a hypothetical benefit?

> > Is your argument that it's easier to remember "aggsortkey=true" than "aggsort=key"? I think in either case we're going to need to reserve space for this on the DTrace quick reference mug.
>
> Your point is well-made. While you wouldn't have to set "aggsortkey" to explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or "#pragma D option aggsortkey"), I agree that it's a bit of a mouthful. Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on? There's no sorting elsewhere in DTrace, so the "agg" qualifier seems a bit redundant. Does removing it help the mnemonic properties?

Recall that we've talked about adding a feature to sort output based on timestamp, but I agree that nothing _currently_ has any kind of sorting semantics.

> Also, I should add that I'm going to fix a long-standing annoyance: there will be a way to get dtrace(1M) to spout out all possible options and a short description of them. Hopefully that will save someone a trip to the (mythical) reference mug, anyway...

Cool.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
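The comma-delimited proposal above, including the error for a combination such as "key,numkey", could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the set of known tokens, the conflict table and the `parse_aggsort()` helper are hypothetical and do not describe how DTrace parses options.

```python
# Illustrative sketch of the comma-delimited idea; the token names and the
# conflict table are assumptions made for this example, not DTrace behaviour.
KNOWN = {"key", "numkey", "rev"}
CONFLICTS = [{"key", "numkey"}]  # token pairs that may not be combined


def parse_aggsort(value):
    """Split 'a,b,c' into a set of tokens, rejecting unknown or conflicting ones."""
    tokens = [t.strip() for t in value.split(",") if t.strip()]
    chosen = set()
    for token in tokens:
        if token not in KNOWN:
            raise ValueError(f"unknown aggsort token: {token!r}")
        chosen.add(token)
    for pair in CONFLICTS:
        if pair <= chosen:  # both members of a conflicting pair were requested
            raise ValueError("conflicting aggsort tokens: " + ",".join(sorted(pair)))
    return chosen


print(sorted(parse_aggsort("key,rev")))
try:
    parse_aggsort("key,numkey")
except ValueError as err:
    print("error:", err)
```

Note that the conflict is only detected once the whole list has been read, which is the kind of error reporting weighed against per-option booleans in this thread.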
Bryan Cantrill
2005-Sep-16 00:01 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
On Thu, Sep 15, 2005 at 04:41:19PM -0700, Adam Leventhal wrote:
> On Thu, Sep 15, 2005 at 03:48:54PM -0700, Bryan Cantrill wrote:
> > > I have a slight preference for having "aggsort" be the option with "key", "revkey", etc. parameters as I think it provides some additional flexibility if we want to add some new ways of sorting aggregations.
> >
> > The problem is that the number of strings increases exponentially with the number of sorting options. If we want to (say) add equivalents to the "-n" or "-d" options to sort(1) (neither of which would be unreasonable), it would induce absurd option values ("#pragma D option aggsort=revkeynumdict"?) that become even more absurd if you don't want to rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).
>
> Of course. If all the ways of sorting are orthogonal we would end up with the outer product as you describe, in which case splitting each axis into its own option would be preferable. But if options are such that they don't make sense when combined we could end up with some very strange syntax or error messages.

I think most of these options _are_ orthogonal. And even where they aren't, you actually get crisper error handling by forcing each one to be its own boolean option: you get an error when setting the option that has created the inconsistency as opposed to an error when setting the inconsistent set of options.

> How about something like "aggsort=key,rev" or "aggsort=numkey" where the parameter would be a comma-delimited list of options? In that case, something like "aggsort=key,numkey" would generate an error. Or is that adding too much complexity with only a hypothetical benefit?

You still have the problem that you must support either antonyms or explicit negatives for each sorting option. Is it obvious that "val" is the opposite of "key"? Perhaps, perhaps not -- but it's certainly obvious that setting the "aggsortkey" option is the opposite of unsetting it or explicitly setting it to false.

> > > Is your argument that it's easier to remember "aggsortkey=true" than "aggsort=key"? I think in either case we're going to need to reserve space for this on the DTrace quick reference mug.
> >
> > Your point is well-made. While you wouldn't have to set "aggsortkey" to explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or "#pragma D option aggsortkey"), I agree that it's a bit of a mouthful. Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on? There's no sorting elsewhere in DTrace, so the "agg" qualifier seems a bit redundant. Does removing it help the mnemonic properties?
>
> Recall that we've talked about adding a feature to sort output based on timestamp, but I agree that nothing _currently_ has any kind of sorting semantics.

The more I think about it, the more I want the "agg" prefix anyway, if only to keep with the "size" precedent ("aggsize" being the option to set the aggregation buffer size). So it's looking like "aggsortkey", "aggsortrev", etc. Does this rub anyone the wrong way?

    - Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc

P.S. Also, for those who are new to DTrace who might be wondering: yes, there is this much discussion and debate for just about every new feature in DTrace. Just be glad that many of the extended debates over minutiae have been spared the public eye...
Adam Leventhal
2005-Sep-16 00:36 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
On Thu, Sep 15, 2005 at 05:01:23PM -0700, Bryan Cantrill wrote:
> > > Your point is well-made. While you wouldn't have to set "aggsortkey" to explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or "#pragma D option aggsortkey"), I agree that it's a bit of a mouthful. Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on? There's no sorting elsewhere in DTrace, so the "agg" qualifier seems a bit redundant. Does removing it help the mnemonic properties?
> >
> > Recall that we've talked about adding a feature to sort output based on timestamp, but I agree that nothing _currently_ has any kind of sorting semantics.
>
> The more I think about it, the more I want the "agg" prefix anyway, if only to keep with the "size" precedent ("aggsize" being the option to set the aggregation buffer size). So it's looking like "aggsortkey", "aggsortrev", etc. Does this rub anyone the wrong way?

Sounds good.

> P.S. Also, for those who are new to DTrace who might be wondering: yes, there is this much discussion and debate for just about every new feature in DTrace. Just be glad that many of the extended debates over minutiae have been spared the public eye...

... though there's usually considerably more swearing.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
Alexander Kolbasov
2005-Sep-16 01:16 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
> > The more I think about it, the more I want the "agg" prefix anyway, if only to keep with the "size" precedent ("aggsize" being the option to set the aggregation buffer size). So it's looking like "aggsortkey", "aggsortrev", etc. Does this rub anyone the wrong way?
>
> Sounds good.

There will be the corresponding command-line option, right?
Adam Leventhal
2005-Sep-16 01:35 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
On Thu, Sep 15, 2005 at 06:16:37PM -0700, Alexander Kolbasov wrote:
> > > The more I think about it, the more I want the "agg" prefix anyway, if only to keep with the "size" precedent ("aggsize" being the option to set the aggregation buffer size). So it's looking like "aggsortkey", "aggsortrev", etc. Does this rub anyone the wrong way?
> >
> > Sounds good.
>
> There will be the corresponding command-line option, right?

Everything you can do with "#pragma D option option[=value]" can also be set with "dtrace -x option[=value]".

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
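The pragma-to-command-line correspondence stated above is mechanical enough to sketch in Python. The `pragma_to_args()` helper below is hypothetical and only rewrites text; it does not run dtrace(1M) or validate option names, and the "4m" value is just an example.

```python
import re

# Matches lines of the form '#pragma D option name' or '#pragma D option name=value'.
PRAGMA = re.compile(r"^\s*#pragma\s+D\s+option\s+(\w+)(?:=(\S+))?\s*$")


def pragma_to_args(line):
    """Translate '#pragma D option name[=value]' into the matching dtrace -x arguments."""
    match = PRAGMA.match(line)
    if match is None:
        raise ValueError(f"not a D option pragma: {line!r}")
    name, value = match.groups()
    return ["-x", name if value is None else f"{name}={value}"]


print(pragma_to_args("#pragma D option aggsortkey"))
print(pragma_to_args("#pragma D option aggsize=4m"))  # "4m" is an example value
```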
Michael Schuster - Sun Germany
2005-Sep-16 06:30 UTC
[dtrace-discuss] Can I use printa() for printing multiple aggregations?
Bryan Cantrill wrote:
> On Thu, Sep 15, 2005 at 04:41:19PM -0700, Adam Leventhal wrote:
> > On Thu, Sep 15, 2005 at 03:48:54PM -0700, Bryan Cantrill wrote:
> > > > I have a slight preference for having "aggsort" be the option with "key", "revkey", etc. parameters as I think it provides some additional flexibility if we want to add some new ways of sorting aggregations.
> > >
> > > The problem is that the number of strings increases exponentially with the number of sorting options. If we want to (say) add equivalents to the "-n" or "-d" options to sort(1) (neither of which would be unreasonable), it would induce absurd option values ("#pragma D option aggsort=revkeynumdict"?) that become even more absurd if you don't want to rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).
> >
> > Of course. If all the ways of sorting are orthogonal we would end up with the outer product as you describe, in which case splitting each axis into its own option would be preferable. But if options are such that they don't make sense when combined we could end up with some very strange syntax or error messages.
>
> I think most of these options _are_ orthogonal. And even where they aren't, you actually get crisper error handling by forcing each one to be its own boolean option: you get an error when setting the option that has created the inconsistency as opposed to an error when setting the inconsistent set of options.

I was tending towards advocating something similar to the way you can pass options on the command line, e.g. "ls -las" in one long string ... but on second thoughts, I feel that Bryan's suggestion makes for much better readability of a D script.

cheers
--
Michael Schuster (+49 89) 46008-2974 / x62974
visit the online support center: http://www.sun.com/osc/

Recursion, n.: see 'Recursion'