thr3ads.net - dtrace discuss - [dtrace-discuss] Can I use printa() for printing multiple aggregations? [Sep 2005]

Philip Beevers

2005-Sep-15 06:34 UTC

[dtrace-discuss] Can I use printa() for printing multiple aggregations?

Hi Bryan,
> Does that sit well with everyone?
Seems fine to me.

Just revisiting one of Dragan's points, though (sorry if I missed the
answer) - is there a reason for making this global (via a #pragma) rather
than, say, simply providing two functions which print in the different
orders? e.g. printa() for sort by sample, printak() for sort by key.

My reason for wanting to do both in the same script is that the current
printa() ordering is nearly always right; it's relatively rare (although
very useful in those cases) to order by key.

-- 

Philip Beevers
Fidessa Infrastructure Development

mailto:philip.beevers at fidessa.com
phone: +44 1483 206571

> -----Original Message-----
> From: dtrace-discuss-bounces at opensolaris.org
> [mailto:dtrace-discuss-bounces at opensolaris.org]On Behalf Of Bryan
> Cantrill
> Sent: 15 September 2005 07:29
> To: Jonathan Haslam
> Cc: dtrace-discuss at opensolaris.org; Dragan Cvetkovic
> Subject: Re: [dtrace-discuss] Can I use printa() for printing multiple
> aggregations?
> 
> 
> On Tue, Sep 13, 2005 at 11:00:21PM +0100, Jonathan Haslam wrote:
> > 
> > >Okay.  Perhaps all of the cases where I want to sum across them will
> > >be handled when we support multiple aggregating actions (which won't
> > >be a part of this work, but which we plan to ultimately do).  An example
> > >of this might be aggregating by both sum of bytes and a quantization
> > >of the size -- to save me from having to do the mental calculation when
> > >looking at quantize output.  But I suppose even that case could just
> > >be handled by position-based ordering.  Does position-based ordering
> > >_not_ violate the principle of least-surprise?  (And we could always
> > >default to position-based and then provide an option to have some other
> > >ordering.)
> > >
> > 
> > I'll have a further think about it but, at the moment, I think that
> > position-based provides good functionality and also a least-surprise
> > default behaviour.
> > 
> > I guess you could always have 3 options to the aggsort #pragma;
> > position (default), value and key?
> 
> I think I'm going to introduce "#pragma D option aggsort=key", but I
> would _really_ rather not have an additional position sorting argument.
> Not only would it be arcane, it wouldn't even be complete: presumably
> one would want to additionally define an arbitrary positioning order
> for breaking ties.  It's doable, but it's a mess (mainly, it's a mess
> to test).  So I'm just going to leave it as strict positioning, with
> ties resolved in positioning-order.  (That is, if the first two
> aggregation values for two keys are equal, the second two aggregation
> values are checked and so on.  If (and only if) all aggregation values
> are equal, the keys will be checked as the final arbiter.)
> 
> Does that sit well with everyone?
> 
> 	- Bryan
> 
> --------------------------------------------------------------------------
> Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org
> 
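
The tie-breaking rule in the quoted message can be modelled outside DTrace. The following is only a Python sketch of that rule, not DTrace code: it assumes ascending order, string keys, and invented sample data, and the function names position_sorted and key_sorted are hypothetical.

```python
# Model of the ordering described above: rows are compared by their
# aggregation values in position order, and the key is consulted only
# when every value ties. The data below is invented for illustration.

def position_sorted(rows):
    """rows maps a key to a tuple of aggregation values, one per aggregation."""
    # Python compares tuples element by element, so (values, key) gives
    # "first value, then second value, ..., then key" as the sort order.
    return sorted(rows.items(), key=lambda item: (item[1], item[0]))


def key_sorted(rows):
    """The alternative discussed in the thread: order by key alone."""
    return sorted(rows.items(), key=lambda item: item[0])


rows = {
    "close": (3, 40),
    "read": (3, 40),   # ties with "close" on both values, so the key decides
    "ioctl": (3, 12),  # ties on the first value only
    "write": (1, 99),
}

for key, values in position_sorted(rows):
    print(f"{key:>8} {values}")
```

Swapping position_sorted for key_sorted in the loop prints the same rows in key order, which is the behaviour the proposed key-sorting option is meant to select.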


Bryan Cantrill

2005-Sep-15 16:44 UTC

On Thu, Sep 15, 2005 at 07:34:10AM +0100, Philip Beevers wrote:
> Hi Bryan,
> 
> > Does that sit well with everyone?
> 
> Seems fine to me.
> 
> Just revisiting one of Dragan's points, though (sorry if I missed the
> answer) - is there a reason for making this global (via a #pragma) rather
> than, say, simply providing two functions which print in the different
> orders? e.g. printa() for sort by sample, printak() for sort by key.
> 
> My reason for wanting to do both in the same script is that the current
> printa() ordering is nearly always right; it's relatively rare (although
> very useful in those cases) to order by key.

You'll be able to set the option dynamically via the new setopt() action
that I introduced in my last large wad.  (This hasn't yet hit Solaris
Express -- and yes, we have a ton of doc work to do.)

So you'll be able to print an aggregation two different ways by doing
this:

	printf("By value:\n");
	printa("%20s %@d\n", @foo);

	setopt("aggsort", "key");

	printf("By key:\n");
	printa("%20s %@d\n", @foo);

I don't want to introduce an additional entry point, because the
matrix blows out as other options are introduced.  (There is precedent for
this in the way we do aggregation normalization, which doesn't require an
additional printa() entry point.)  As a concrete example, I think I'll
also introduce reverse-sorting as an option (it seems to have come up, and
it's easy to add); it's much cleaner to add an "aggorder" than it is to
have printakr() (or whatever).

Interface question:  to me, it seems like it might be slightly cleaner
to introduce these options as boolean options.  That way, you won't have
to go to the docs (or the help message or whatever) to remember how to
set the option.  So instead of "aggsort" being set to "key", it would
be "aggsortkey" and you would set it to "true" (or "1" or "on" or
whatever -- as of my latest wad, we accept just about any synonym for
"true" for boolean options).  And instead of "aggorder" it would be
"aggrevsort" or something.  Thoughts?

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc

Jim Mauro

2005-Sep-15 17:43 UTC

Will the setopt() interface be constrained to setting options related
only to aggregations? If so, would setaggopt() be more intuitive?

setaggopt("sort","key");

(just rambling here...).

I personally like the idea of a simple boolean model for setting these
options. I find:

setopt("revsort","true");

or

setaggopt("revsort","true");

much more palatable than something like

setopt("aggsort","rev");

...or whatever.

Chiming in...

/jim






Adam Leventhal

2005-Sep-15 19:14 UTC

On Thu, Sep 15, 2005 at 09:44:09AM -0700, Bryan Cantrill wrote:
> Interface question:  to me, it seems like it might be slightly cleaner
> to introduce these options as boolean options.  That way, you won't have
> to go to the docs (or the help message or whatever) to remember how to
> set the option.  So instead of "aggsort" being set to "key", it would
> be "aggsortkey" and you would set it to "true" (or "1" or "on" or
> whatever -- as of my latest wad, we accept just about any synonym for
> "true" for boolean options).  And instead of "aggorder" it would be
> "aggrevsort" or something.  Thoughts?

I have a slight preference for having "aggsort" be the option with
"key", "revkey", etc. parameters as I think it provides some additional
flexibility if we want to add some new ways of sorting aggregations.

Is your argument that it's easier to remember "aggsortkey=true" than
"aggsort=key"? I think in either case we're going to need to reserve
space for this on the DTrace quick reference mug.

Adam

-- 
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl

Bryan Cantrill

2005-Sep-15 22:48 UTC

On Thu, Sep 15, 2005 at 12:14:28PM -0700, Adam Leventhal wrote:
> On Thu, Sep 15, 2005 at 09:44:09AM -0700, Bryan Cantrill wrote:
> > Interface question:  to me, it seems like it might be slightly cleaner
> > to introduce these options as boolean options.  That way, you won't have
> > to go to the docs (or the help message or whatever) to remember how to
> > set the option.  So instead of "aggsort" being set to "key", it would
> > be "aggsortkey" and you would set it to "true" (or "1" or "on" or
> > whatever -- as of my latest wad, we accept just about any synonym for
> > "true" for boolean options).  And instead of "aggorder" it would be
> > "aggrevsort" or something.  Thoughts?
> 
> I have a slight preference for having "aggsort" be the option with
> "key", "revkey", etc. parameters as I think it provides some additional
> flexibility if we want to add some new ways of sorting aggregations.

The problem is that the number of strings increases exponentially with
the number of sorting options.  If we want to (say) add equivalents to
the "-n" or "-d" options to sort(1) (neither of which would be
unreasonable), it would induce absurd option values ("#pragma D option
aggsort=revkeynumdict"?) that become even more absurd if you don't want to
rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).

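
The growth argument above can be made concrete with a small count. This is a Python sketch under stated assumptions: the four axes and their spellings are lifted loosely from the two example strings in the paragraph, "aggsortnum" and "aggsortdict" are hypothetical names used only for the comparison, and nothing here reflects how DTrace actually parses options.

```python
from itertools import product

# Hypothetical sort axes, each as (default spelling, non-default spelling).
# The spellings follow the example strings in the message above; they are
# not real DTrace option values.
AXES = [
    ("norev", "rev"),
    ("val", "key"),
    ("nonum", "num"),
    ("nodict", "dict"),
]


def combined_values(axes):
    """Every string a single combined 'aggsort=' option would have to accept."""
    return ["".join(choice) for choice in product(*axes)]


def boolean_options(axes):
    """One independent boolean option per axis, named after its 'on' spelling."""
    return ["aggsort" + on for _off, on in axes]


values = combined_values(AXES)
print(len(values), "combined values, e.g.", values[0])
print(len(boolean_options(AXES)), "boolean options:", boolean_options(AXES))
```

Each extra axis doubles the list of combined values but adds only one entry to the list of boolean options, which is the trade-off being argued here.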

> Is your argument that it's easier to remember "aggsortkey=true" than
> "aggsort=key"? I think in either case we're going to need to reserve
> space for this on the DTrace quick reference mug.

Your point is well-made.  While you wouldn't have to set "aggsortkey" to
explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or
"#pragma D option aggsortkey"), I agree that it's a bit of a mouthful.
Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on?
There's not sorting elsewhere in DTrace, so the "agg" qualifier seems
a bit redundant.  Does removing it help the mnemonic properties?

Also, I should add that I'm going to fix a long-standing annoyance:
there will be a way to get dtrace(1M) to spout out all possible options
and a short description of them.  Hopefully will save someone a trip
to the (mythical) reference mug, anyway...

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc

Adam Leventhal

2005-Sep-15 23:41 UTC

On Thu, Sep 15, 2005 at 03:48:54PM -0700, Bryan Cantrill wrote:
> > I have a slight preference for having "aggsort" be the option with
> > "key", "revkey", etc. parameters as I think it provides some additional
> > flexibility if we want to add some new ways of sorting aggregations.
> 
> The problem is that the number of strings increases exponentially with
> the number of sorting options.  If we want to (say) add equivalents to
> the "-n" or "-d" options to sort(1) (neither of which would be
> unreasonable), it would induce absurd option values ("#pragma D option
> aggsort=revkeynumdict"?) that become even more absurd if you don't want to
> rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).

Of course. If all the ways of sorting are orthogonal we would end up with
the outer product as you describe in which case splitting each axis into
its own option would be preferable. But if options are such that they
don't make sense when combined we could end up with some very strange
syntax or error messages.

How about something like "aggsort=key,rev" or "aggsort=numkey" where
the parameter would be a comma-delimited list of options? In that case,
something like "aggsort=key,numkey" would generate an error. Or is that
adding too much complexity with only a hypothetical benefit?

> > Is your argument that it's easier to remember "aggsortkey=true" than
> > "aggsort=key"? I think in either case we're going to need to reserve
> > space for this on the DTrace quick reference mug.
> 
> Your point is well-made.  While you wouldn't have to set "aggsortkey" to
> explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or
> "#pragma D option aggsortkey"), I agree that it's a bit of a mouthful.
> Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on?
> There's not sorting elsewhere in DTrace, so the "agg" qualifier seems
> a bit redundant.  Does removing it help the mnemonic properties?

Recall that we've talked about adding a feature to sort output based on
timestamp, but I agree that nothing _currently_ has any kind of sorting
semantics.

> Also, I should add that I'm going to fix a long-standing annoyance:
> there will be a way to get dtrace(1M) to spout out all possible options
> and a short description of them.  Hopefully will save someone a trip
> to the (mythical) reference mug, anyway...

Cool.

Adam

-- 
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl

Bryan Cantrill

2005-Sep-16 00:01 UTC

On Thu, Sep 15, 2005 at 04:41:19PM -0700, Adam Leventhal wrote:
> On Thu, Sep 15, 2005 at 03:48:54PM -0700, Bryan Cantrill wrote:
> > > I have a slight preference for having "aggsort" be the option with
> > > "key", "revkey", etc. parameters as I think it provides some additional
> > > flexibility if we want to add some new ways of sorting aggregations.
> > 
> > The problem is that the number of strings increases exponentially with
> > the number of sorting options.  If we want to (say) add equivalents to
> > the "-n" or "-d" options to sort(1) (neither of which would be
> > unreasonable), it would induce absurd option values ("#pragma D option
> > aggsort=revkeynumdict"?) that become even more absurd if you don't want to
> > rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).
> 
> Of course. If all the ways of sorting are orthogonal we would end up with
> the outer product as you describe in which case splitting each axis into
> its own option would be preferable. But if options are such that they
> don't make sense when combined we could end up with some very strange
> syntax or error messages.

I think most of these options _are_ orthogonal.  And even where they
aren't, you actually get crisper error handling by forcing each one to be
its own boolean option:  you get an error when setting the option that
has created the inconsistency as opposed to an error when setting the
inconsistent set of options.

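
The difference in error reporting can be illustrated with a toy model. This is a Python sketch, not DTrace behaviour: the conflict between "aggsortnum" and "aggsortdict" is an invented rule, both of those option names are hypothetical, and the class and function names are made up for the example.

```python
# Invented rule for illustration only: numeric and dictionary ordering are
# treated as mutually exclusive. Neither option exists in DTrace.
CONFLICTS = {frozenset({"aggsortnum", "aggsortdict"})}


class BooleanOptions:
    """One boolean per sort axis; each set() call is validated on its own."""

    def __init__(self):
        self.enabled = set()

    def set(self, name):
        for other in self.enabled:
            if frozenset({name, other}) in CONFLICTS:
                # The error names the option that introduced the conflict.
                raise ValueError(f"{name} conflicts with {other}")
        self.enabled.add(name)


def set_combined(value):
    """A single comma-delimited value can only be rejected as a whole."""
    names = {"aggsort" + part for part in value.split(",")}
    for pair in CONFLICTS:
        if pair <= names:
            raise ValueError(f"inconsistent aggsort value: {value}")
    return names


opts = BooleanOptions()
opts.set("aggsortkey")
opts.set("aggsortnum")
try:
    opts.set("aggsortdict")
except ValueError as err:
    print(err)
```

In the boolean model the failure is attributed to the one call that created the inconsistency; in the combined model the whole value is rejected and the user has to work out which part is at fault.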

> How about something like "aggsort=key,rev" or "aggsort=numkey" where
> the parameter would be a comma-delimited list of options? In that case,
> something like "aggsort=key,numkey" would generate an error. Or is that
> adding too much complexity with only a hypothetical benefit?

You still have the problem that you must support either antonyms or
explicit negatives for each sorting option.  Is it obvious that "val"
is the opposite of "key"?  Perhaps, perhaps not -- but it's certainly
obvious that setting the "aggsortkey" option is the opposite of unsetting
it or explicitly setting it to false.

> > > Is your argument that it's easier to remember "aggsortkey=true" than
> > > "aggsort=key"? I think in either case we're going to need to reserve
> > > space for this on the DTrace quick reference mug.
> > 
> > Your point is well-made.  While you wouldn't have to set "aggsortkey" to
> > explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or
> > "#pragma D option aggsortkey"), I agree that it's a bit of a mouthful.
> > Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on?
> > There's not sorting elsewhere in DTrace, so the "agg" qualifier seems
> > a bit redundant.  Does removing it help the mnemonic properties?
> 
> Recall that we've talked about adding a feature to sort output based on
> timestamp, but I agree that nothing _currently_ has any kind of sorting
> semantics.

The more I think about it, the more I want the "agg" prefix anyway, if
only to keep with the "size" precedent ("aggsize" being the option to set
the aggregation buffer size).  So it's looking like "aggsortkey",
"aggsortrev", etc.  Does this rub anyone the wrong way?

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc

P.S.  Also, for those who are new to DTrace who might be wondering:  yes,
there is this much discussion and debate for just about every new feature
in DTrace.  Just be glad that many of the extended debates over minutiae
have been spared the public eye...

Adam Leventhal

2005-Sep-16 00:36 UTC

On Thu, Sep 15, 2005 at 05:01:23PM -0700, Bryan Cantrill wrote:
> > > Your point is well-made.  While you wouldn't have to set "aggsortkey" to
> > > explicitly be "true" (you can just set it -- 'setopt("aggsortkey")' or
> > > "#pragma D option aggsortkey"), I agree that it's a bit of a mouthful.
> > > Maybe drop the "agg" and name the options "sortkey", "sortrev" and so on?
> > > There's not sorting elsewhere in DTrace, so the "agg" qualifier seems
> > > a bit redundant.  Does removing it help the mnemonic properties?
> > 
> > Recall that we've talked about adding a feature to sort output based on
> > timestamp, but I agree that nothing _currently_ has any kind of sorting
> > semantics.
> 
> The more I think about it, the more I want the "agg" prefix anyway, if
> only to keep with the "size" precedent ("aggsize" being the option to set
> the aggregation buffer size).  So it's looking like "aggsortkey",
> "aggsortrev", etc.  Does this rub anyone the wrong way?

Sounds good.

> P.S.  Also, for those who are new to DTrace who might be wondering:  yes,
> there is this much discussion and debate for just about every new feature
> in DTrace.  Just be glad that many of the extended debates over minutiae
> have been spared the public eye...

... though there's usually considerably more swearing.

Adam

-- 
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl

Alexander Kolbasov

2005-Sep-16 01:16 UTC

> > The more I think about it, the more I want the "agg" prefix anyway, if
> > only to keep with the "size" precedent ("aggsize" being the option to set
> > the aggregation buffer size).  So it's looking like "aggsortkey",
> > "aggsortrev", etc.  Does this rub anyone the wrong way?
> 
> Sounds good.

There will be the corresponding command-line option, right?

Adam Leventhal

2005-Sep-16 01:35 UTC

On Thu, Sep 15, 2005 at 06:16:37PM -0700, Alexander Kolbasov wrote:
> > > The more I think about it, the more I want the "agg" prefix anyway, if
> > > only to keep with the "size" precedent ("aggsize" being the option to set
> > > the aggregation buffer size).  So it's looking like "aggsortkey",
> > > "aggsortrev", etc.  Does this rub anyone the wrong way?
> > 
> > Sounds good.
> 
> There will be the corresponding command-line option, right?

Everything you can do with #pragma D option option[=value] can be set
with dtrace -x option[=value].
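
The correspondence between the two forms is mechanical enough to sketch. The Python helper below is hypothetical (pragma_to_args is not part of any DTrace tooling); it only rewrites "#pragma D option name[=value]" lines into the matching "-x name[=value]" arguments, and it assumes "aggsortkey" ends up with the name proposed in this thread.

```python
import re

# Matches the "#pragma D option name[=value]" form quoted above.
PRAGMA = re.compile(r"^\s*#pragma\s+D\s+option\s+(\S+)\s*$")


def pragma_to_args(script_text):
    """Return the -x arguments equivalent to a script's option pragmas."""
    args = []
    for line in script_text.splitlines():
        match = PRAGMA.match(line)
        if match:
            args += ["-x", match.group(1)]
    return args


script = """\
#pragma D option quiet
#pragma D option aggsize=4m
#pragma D option aggsortkey
"""
print(pragma_to_args(script))
```

The resulting list could be spliced into a dtrace command line in place of the pragmas; lines that are not option pragmas are ignored.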

Adam

-- 
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl

Michael Schuster - Sun Germany

2005-Sep-16 06:30 UTC

Bryan Cantrill wrote:
> 
> On Thu, Sep 15, 2005 at 04:41:19PM -0700, Adam Leventhal wrote:
> > On Thu, Sep 15, 2005 at 03:48:54PM -0700, Bryan Cantrill wrote:
> > > > I have a slight preference for having "aggsort" be the option with
> > > > "key", "revkey", etc. parameters as I think it provides some additional
> > > > flexibility if we want to add some new ways of sorting aggregations.
> > >
> > > The problem is that the number of strings increases exponentially with
> > > the number of sorting options.  If we want to (say) add equivalents to
> > > the "-n" or "-d" options to sort(1) (neither of which would be
> > > unreasonable), it would induce absurd option values ("#pragma D option
> > > aggsort=revkeynumdict"?) that become even more absurd if you don't want to
> > > rely on our defaults ("#pragma D option aggsort=norevvalnonumnodict"?!).
> >
> > Of course. If all the ways of sorting are orthogonal we would end up with
> > the outer product as you describe in which case splitting each axis into
> > its own option would be preferable. But if options are such that they
> > don't make sense when combined we could end up with some very strange
> > syntax or error messages.
> 
> I think most of these options _are_ orthogonal.  And even where they
> aren't, you actually get crisper error handling by forcing each one to be
> its own boolean option:  you get an error when setting the option that
> has created the inconsistency as opposed to an error when setting the
> inconsistent set of options.

I was tending towards advocating something similar to the way you can pass
options on the command line, e.g. "ls -las" in one long string ... but on
second thoughts, I feel that Bryan's suggestion makes for much better
readability of a D script.

cheers
-- 
Michael Schuster                  (+49 89) 46008-2974 / x62974
visit the online support center:  http://www.sun.com/osc/

Recursion, n.: see 'Recursion'
