Andreas.Haas at Sun.COM
2006-Oct-05 15:02 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
Hi all,

I recently started working on a dtrace script that shall help to understand
bottlenecks in two processes that together constitute a crucial daemon
component in a distributed software system. What I have so far solely
utilizes the "pid" provider for collecting statistics about relevant
function calls and the "profile" provider for printing these statistics
periodically

   http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=17635

All in all I'm quite happy with dtrace as it facilitates monitoring our two
daemon processes and deriving a joint view from the data -- which is really
great!

Unfortunately I'm now somewhat stuck due to problems with a "pid" provider
probe when I use my script to monitor the same daemon component of a later
release of our software

   # ./monitor.sh -interval 1sec
   dtrace: failed to compile script ./monitor.d: line 122: probe
   description pid4518::do_gdi_request:return does not match any probes

At first I thought I got this error because the C function do_gdi_request()
is defined as 'static' in that later release. But then I realized that in
the first version this function was already 'static' and it worked all the
same. On the other hand, when I remove the 'static' in the later release
I am able to launch dtrace, yet the probe doesn't fire, even though I'm
absolutely sure that do_gdi_request() is called -- I double- and
triple-checked this.

Now I'm fairly confused and thus my questions are:

(1) Does the "pid" provider care at all about a function
    being defined 'static' or not? I searched
    http://docs.sun.com/app/docs/coll/45.20?q=dtrace
    but I couldn't find an indication.

(2) And, if there is no dependency on 'static'/non-'static', what could be
    the cause of such a symptom?

Thanks and best regards,
Andreas Haas

Sun Microsystems GmbH         | ++49 +941 3075-131
Dr.-Leo-Ritter-Str. 7         | N1 Grid Engine
D-93049 Regensburg/Germany    | System Engineering Group Lead
Adam Leventhal
2006-Oct-06 21:56 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
On Thu, Oct 05, 2006 at 05:02:52PM +0200, Andreas.Haas at Sun.COM wrote:

> (1) Does the "pid" provider anyhow care about a function
>     being defined 'static' or not? I searched
>     http://docs.sun.com/app/docs/coll/45.20?q=dtrace
>     but I couldn't find an indication.

The only thing that matters for the pid provider is that the symbol is
present in the symbol table. If the compiler discards static symbols or if
you strip the binary, the symbol for the function won't be in the symbol
table, so the pid provider won't be able to find it by name.

> (2) And, if there is no dependency to 'static'/non-'static' what could be
>     the cause for such a symptom?

Make sure the symbol is in the binary at all (use nm(1)). If it's not, your
problem could be with the way the binary is built. If it is, there's
potentially a problem with DTrace and we'd need some more information.

Adam

> Thanks and best regards,
> Andreas Haas

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
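Adam's nm(1) suggestion can be turned into a small check. The following is only a sketch: the function name has_func_symbol is made up for illustration, it assumes the pipe-separated output format that Solaris nm(1) prints (as seen later in this thread), and it assumes a POSIX awk (on Solaris 10 that would be nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk). It matches the symbol name exactly, whereas a plain grep would also match substrings.

```shell
#!/bin/sh
# has_func_symbol NAME   (hypothetical helper, not part of any Sun tool)
# Reads Solaris-style nm(1) output on stdin, e.g.
#   [2913] | 4559440| 990|FUNC |GLOB |0 |12 |sge_c_gdi_permcheck
# and succeeds only if NAME is listed with type FUNC.
has_func_symbol() {
    awk -F'|' -v sym="$1" '
        {
            type = $4          # 4th column: symbol type
            name = $NF         # last column: symbol name
            gsub(/[ \t]/, "", type)
            gsub(/[ \t]/, "", name)
            if (type == "FUNC" && name == sym) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

It could be used as `nm SOLARISAMD64/sge_qmaster | has_func_symbol do_gdi_request && echo present`. A hit only shows that the symbol exists; as discussed further down the thread, it does not prove that every call actually goes through that symbol.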
Andreas.Haas at Sun.COM
2006-Oct-09 13:17 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
Hi Adam,

On Fri, 6 Oct 2006, Adam Leventhal wrote:

>> (1) Does the "pid" provider anyhow care about a function
>>     being defined 'static' or not? I searched
>>     http://docs.sun.com/app/docs/coll/45.20?q=dtrace
>>     but I couldn't find an indication.
>
> The only thing that matters for the pid provider is that the symbol is
> present in the symbol table. If the compiler discards static symbols or
> if you strip the binary, the symbol for the function won't be in the
> symbol table so the pid provider won't be able to find it by name.

Ok.

>> (2) And, if there is no dependency to 'static'/non-'static' what could be
>>     the cause for such a symptom?
>
> Make sure the symbol is in the binary at all (use nm(1)), if it's not your
> problem could be with the way the binary is built.

I see. Makes sense.

> If it is, there's
> potentially a problem with DTrace and we'd need some more information.

I have a case here where a dtrace probe does not fire on

   > cat /etc/release
                        Solaris 10 3/05 s10_74L2a X86
         Copyright 2005 Sun Microsystems, Inc.  All Rights Reserved.
                      Use is subject to license terms.
                         Assembled 22 January 2005

for a non-static function that is known to nm(1)

   > nm SOLARISAMD64/sge_qmaster | grep sge_c_gdi_permcheck
   [2913] | 4559440| 990|FUNC |GLOB |0 |12 |sge_c_gdi_permcheck

From our home-grown tracing utility I know the function is being called:

   :
   2342 21128 9 uid/username = 115088/ah114088, gid/groupname = 10/staff
   2343 21128 9 GDI PERMCHECK general request (es-ergb01-01/qconf/5) (ah114088/115088/staff/10)
   :

Could it be that nm(1) reports a symbol even though the compiler (Sun
Studio 10) actually inlined the function?

Thanks,
Andreas
Adam Leventhal
2006-Oct-09 15:06 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
Hi Andreas,

Can you send the script that you're using?

I admit that I'm not sure I understand the problem you're seeing, but in
your previous mail you mentioned that a return probe for a function was
failing to match (despite the fact that the function existed). One possible
cause of that is that the pid provider becomes very conservative if it
detects a jump table anywhere in a function, and refuses to allow the
return point to be enabled unless it matches a very specific instruction
sequence.

Can you also send the disassembly for the function in question?

Adam

On Mon, Oct 09, 2006 at 03:17:35PM +0200, Andreas.Haas at Sun.COM wrote:
> could it be nm(1) does report a symbol despite the compiler (Sun
> Studio 10) actually inlined the function?

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
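Since entry and return probes can behave differently, as Adam describes, it may help to ask dtrace(1M) which of the two it is willing to offer before running the full script. The sketch below rests on assumptions: the helper names pid_probe_spec and check_pid_probes are invented for illustration, and it assumes that `dtrace -l -n <description>` exits non-zero when the description matches no probe.

```shell
#!/bin/sh
# pid_probe_spec PID FUNC NAME   (hypothetical helper)
# Builds a pid-provider probe description with an empty module field,
# e.g. pid4518::do_gdi_request:return
pid_probe_spec() {
    printf 'pid%s::%s:%s\n' "$1" "$2" "$3"
}

# check_pid_probes PID FUNC   (hypothetical helper)
# Lists (without enabling) the entry and return probes of FUNC in process
# PID and reports for each whether dtrace found a match.
# Assumption: dtrace -l returns a non-zero status when nothing matches.
check_pid_probes() {
    for name in entry return; do
        spec=`pid_probe_spec "$1" "$2" "$name"`
        if /usr/sbin/dtrace -l -n "$spec" >/dev/null 2>&1; then
            echo "$spec: available"
        else
            echo "$spec: no match"
        fi
    done
}
```

For the process in the original report this would be called as `check_pid_probes 4518 do_gdi_request` (as root). If only the return probe is reported as missing while the entry probe is available, that would point toward the conservative return-probe handling rather than a missing symbol.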
Rayson Ho
2006-Oct-09 15:13 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
Andreas.Haas at Sun.COM wrote:
> could it be nm(1) does report a symbol despite the compiler (Sun
> Studio 10) actually inlined the function?

How about attaching a debugger (set SGE_ND first?) and placing a breakpoint
inside sge_c_gdi_permcheck()?

Since the debugger doesn't know about the inlined function, it won't be
able to touch the inlined code... and it can only place the breakpoint in
the function displayed by nm.

(IIRC, placing a user application "probe" in DTrace works somewhat
similarly to placing a breakpoint in a debugger...)

Rayson
Roch
2006-Oct-09 16:09 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
> could it be nm(1) does report a symbol despite the compiler (Sun
> Studio 10) actually inlined the function?
>
> Thanks,
> Andreas

I think it would. Note that a function A can be inlined for some subset of
call sites (say, inlined when B calls A but not when C calls A). When
running inline, of course, the probes don't fire.

-r
Andreas.Haas at Sun.COM
2006-Oct-09 16:16 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
Hi Adam,

On Mon, 9 Oct 2006, Adam Leventhal wrote:

> Hi Andreas,
>
> Can you send the script that you're using?

Find it attached. Watch out for "sge_c_gdi_permcheck". For starting dtrace
I use monitor.sh (attached), e.g. like this:

   # /cod_home/ah114088/SGE60/util/dtrace/monitor.sh -interval 3sec -request
   dtrace: script '/cod_home/ah114088/SGE60/util/dtrace/monitor2.d' matched 48 probes
   CPU ID FUNCTION:NAME
   0 1 :BEGIN Time | #wrt wrt/ms|#rep #gdi #ack| #dsp dsp/ms #sad| #snd #rcv| #lck0 #ulck0 #lck1 #ulck1
   0 68105 :tick-3sec 2006 Oct 9 17:59:26 | 0 0| 0 0 0| 0 0 0| 0 0| 0 0 7 7
   0 39929 sge_c_report:entry sge_c_report(??r, bilbo) tid 10
   0 68105 :tick-3sec 2006 Oct 9 17:59:29 | 0 0| 1 0 0| 0 0 0| 0 0| 2 2 59 59
   0 39931 sge_c_gdi_get:entry sge_c_gdi_get(es-ergb01-01) tid 9
   0 39931 sge_c_gdi_get:entry sge_c_gdi_get(es-ergb01-01) tid 9
   0 68105 :tick-3sec 2006 Oct 9 17:59:32 | 0 0| 0 3 0| 0 0 0| 0 0| 5 5 11 11
   0 39929 sge_c_report:entry sge_c_report(`ju, elendil) tid 9
   0 68105 :tick-3sec 2006 Oct 9 17:59:35 | 0 0| 1 0 0| 0 0 0| 0 0| 1 1 13 13
   0 39929 sge_c_report:entry sge_c_report(`ju, gluck) tid 9
   0 68105 :tick-3sec 2006 Oct 9 17:59:38 | 0 0| 1 0 0| 0 0 0| 0 0| 1 1 12 12

As the third file, find attached sge_c_gdi.c, where sge_c_gdi_permcheck()
is defined as a non-static function:

   # nm $SGE_ROOT/bin/sol-amd64/sge_qmaster | grep sge_c_gdi_permcheck
   [2913] | 4559440| 990|FUNC |GLOB |0 |12 |sge_c_gdi_permcheck

> I admit that I'm not sure I understand that problem you're seeing, but in
> your previous mail you mentioned that a return probe for a function was
> failing to match (despite the fact that the function existed). One
> possible cause of that is that the pid provider will become very
> conservative if it detects that there's a jump table anywhere in a
> function and refuse to allow the return point to be enabled unless it
> matches a very specific instruction sequence.

In this case it's the entry probe of the function that does not fire.

> Can you also send the disassembly for the function in question?

The assembly code? It's attached, as generated by the -S option when our
normal compilation options are in effect.

Regards,
Andreas

-------------- next part --------------
/*************************************************************************
 *
 *  The Contents of this file are made available subject to the terms of
 *  the Sun Industry Standards Source License Version 1.2
 *
 *  Sun Microsystems Inc., March, 2001
 *
 *
 *  Sun Industry Standards Source License Version 1.2
 *  ================================================
 *  The contents of this file are subject to the Sun Industry Standards
 *  Source License Version 1.2 (the "License"); You may not use this file
 *  except in compliance with the License. You may obtain a copy of the
 *  License at http://gridengine.sunsource.net/Gridengine_SISSL_license.html
 *
 *  Software provided under this License is provided on an "AS IS" basis,
 *  WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
 *  WITHOUT LIMITATION, WARRANTIES THAT THE SOFTWARE IS FREE OF DEFECTS,
 *  MERCHANTABLE, FIT FOR A PARTICULAR PURPOSE, OR NON-INFRINGING.
 *  See the License for the specific provisions governing your rights and
 *  obligations concerning the Software.
 *
 *  The Initial Developer of the Original Code is: Sun Microsystems, Inc.
 *
 *  Copyright: 2001 by Sun Microsystems, Inc.
 *
 *  All Rights Reserved.
 *
 ************************************************************************/

/*
 * Parameters:
 *    $1 = qmaster_pid
 *    $2 = scheduler_pid
 *    $3 = interval
 *    $4 = show qmaster spooling probes
 *    $5 = show incoming qmaster request probes
 */

BEGIN
{
   printf("%20s |%7s %7s|%4s %4s %4s|%7s %7s %7s|%7s %7s|%7s %7s %7s %7s",
          "Time", "#wrt", "wrt/ms", "#rep", "#gdi", "#ack", "#dsp", "dsp/ms",
          "#sad", "#snd", "#rcv", "#lck0", "#ulck0", "#lck1", "#ulck1");
   snd_schedd = 0;
   rcv = 0;
   rep = 0;
   ack = 0;
   gdi = 0;
   wrt = 0;
   wrt_total = 0;
   dsp = 0;
   sad = 0;
   dsp_total = 0;
   lck0 = 0;
   lck1 = 0;
   ulck0 = 0;
   ulck1 = 0;
}

/* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
/*                       qmaster/scheduler logging                          */
/* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */

/* errors, warnings and criticals only */
pid$1::sge_log:entry, pid$2::sge_log:entry
/arg0 == 2 || arg0 == 3 || arg0 == 4/
{
   printf("%20Y | %s(%d, %s)", walltimestamp, probefunc, arg0, copyinstr(arg1));
}

/* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
/*                               statistics                                 */
/* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */

/* print the counters once per interval, then reset them */
profile:::tick-$3
{
   printf("%20Y | %7d %7d|%4d %4d %4d|%7d %7d %7d|%7d %7d|%7d %7d %7d %7d",
          walltimestamp, wrt, wrt_total, rep, gdi, ack, dsp, dsp_total, sad,
          snd_schedd, rcv, lck0, ulck0, lck1, ulck1);
   snd_schedd = 0;
   rcv = 0;
   rep = 0;
   ack = 0;
   gdi = 0;
   wrt = 0;
   wrt_total = 0;
   dsp = 0;
   dsp_total = 0;
   sad = 0;
   lck0 = 0;
   ulck0 = 0;
   lck1 = 0;
   ulck1 = 0;
}

/* ------------------------------ [spooling] ------------------------------ */
pid$1::spool_write_object:entry, pid$1::spool_delete_object:entry
{
   self->spool_start = timestamp;
}
pid$1::spool_write_object:return, pid$1::spool_delete_object:return
{
   wrt_total += (timestamp - self->spool_start)/1000000;
   /* printf("\t%s() tid %d %d", probefunc, tid, (timestamp - self->spool_start)/1000000); */
   @q[ probefunc ] = quantize((timestamp - self->spool_start)/1000000);
   wrt++;
}

/* ------------------------------ [requests] ------------------------------ */
pid$1::sge_c_report:return
{
   rep++;
}
pid$1::do_c_ack:return
{
   ack++;
}
pid$1::do_gdi_request:return
{
   gdi++;
}

/* ----------------------------- [scheduling] ----------------------------- */
pid$2::dispatch_jobs:entry
{
   self->dispatch_start = timestamp;
}
pid$2::dispatch_jobs:return
{
   dsp_total += (timestamp - self->dispatch_start)/1000000;
   /* printf("\t%s() tid %d %d %d", probefunc, tid, (timestamp - self->dispatch_start)/1000000, dsp_total); */
   @q[ probefunc ] = quantize((timestamp - self->dispatch_start)/1000000);
   dsp++;
}
pid$2::select_assign_debit:return
{
   sad++;
}

/* -------------------------- [synchronization] --------------------------- */
pid$1::report_list_send:entry
{
   self->event_target = copyinstr(arg2);
}
pid$1::report_list_send:return
/self->event_target == "schedd" /
{
   snd_schedd++;
}
pid$2::sge_mirror_process_events:return
{
   rcv++;
}

/* ------------------------------- [locks] -------------------------------- */
pid$1::sge_lock:entry
{
   self->lnum = arg0;
   /* printf("\t%s(%d, %s) tid %d %d", probefunc, arg0, copyinstr(arg2), tid, timestamp); */
}
pid$1::sge_lock:return
/self->lnum == 0/
{
   lck0++;
}
pid$1::sge_lock:return
/self->lnum == 1/
{
   lck1++;
}
pid$1::sge_unlock:entry
{
   self->lnum = arg0;
}
pid$1::sge_unlock:return
/self->lnum == 0/
{
   ulck0++;
}
pid$1::sge_unlock:return
/self->lnum == 1/
{
   ulck1++;
}

/* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */
/*                        showing probes in detail                          */
/* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ */

/* ------------------------------ [spooling] ------------------------------ */
pid$1::spool_write_object:entry
/$4 == 1/
{
   printf("\t%s(%d, %s) tid %d", probefunc, arg4, copyinstr(arg3), tid);
}
pid$1::spool_delete_object:entry
/$4 == 1/
{
   printf("\t%s(%d, %s) tid %d", probefunc, arg2, copyinstr(arg3), tid);
}

/* ------------------------------ [requests] ------------------------------ */
pid$1::sge_c_report:entry
/$5 == 1/
{
   printf("\t%s(%s, %s) tid %d", probefunc, copyinstr(arg0), copyinstr(arg1), tid);
}
pid$1::do_c_ack:entry
/$5 == 1/
{
   printf("\t%s() tid %d", probefunc, tid);
}
pid$1::sge_c_gdi_get:entry, pid$1::sge_c_gdi_add:entry, pid$1::sge_c_gdi_del:entry,
pid$1::sge_c_gdi_mod:entry, pid$1::sge_c_gdi_copy:entry
/$5 == 1/
{
   printf("\t%s(%s) tid %d", probefunc, copyinstr(arg1), tid);
}
pid$1::sge_c_gdi_trigger:entry, pid$1::sge_c_gdi_permcheck:entry
/$5 == 1/
{
   printf("\t%s(%s) tid %d", probefunc, copyinstr(arg0), tid);
}

/* ----------------------------- [scheduling] ----------------------------- */
/*
pid$2::select_assign_debit:entry
{
}
*/

/* -------------------------- [synchronization] --------------------------- */
pid$2::sge_mirror_process_events:entry
/$5 == 1/
{
   printf("\t%s() tid %d %d", probefunc, tid, timestamp);
}

/* ------------------------------- [locks] -------------------------------- */
-------------- next part --------------
#!/bin/sh

Usage()
{
   echo "monitor.sh [options]"
   echo "options:"
   echo "   -cell     <cell>   use \$SGE_CELL other than \"default\""
   echo "   -interval <time>   use statistics interval other than \"15sec\""
   echo "   -spooling          show qmaster spooling probes"
   # option name matches the "-request" case branch below
   echo "   -request           show incoming qmaster request probes"
}

# monitor.sh defaults
cell=default
interval=15sec
spooling_probes=0
request_probes=0

while [ $# -gt 0 ]; do
   case "$1" in
      -spooling)
         spooling_probes=1
         shift
         ;;
      -request)
         request_probes=1
         shift
         ;;
      -interval)
         shift
         interval="$1"
         shift
         ;;
      -cell)
         shift
         cell="$1"
         shift
         ;;
      -help)
         Usage
         exit 0
         ;;
      *)
         Usage
         exit 1
         ;;
   esac
done

# $SGE_ROOT is quoted so the test also works when it is unset or empty
if [ "$SGE_ROOT" = "" ]; then
   echo "Please run with \$SGE_ROOT set on master machine"
   exit 1
fi

# pids of the two daemons to be traced
master=`cat $SGE_ROOT/$cell/spool/qmaster/qmaster.pid`
if [ $? -ne 0 ]; then
   echo "Couldn't read sge_qmaster pid from \$SGE_ROOT/$cell/spool/qmaster/qmaster.pid"
   exit 1
fi

schedd=`cat $SGE_ROOT/$cell/spool/qmaster/schedd/schedd.pid`
if [ $? -ne 0 ]; then
   echo "Couldn't read sge_schedd pid from \$SGE_ROOT/$cell/spool/qmaster/schedd/schedd.pid"
   exit 1
fi

# /usr/sbin/dtrace -s $SGE_ROOT/util/dtrace/monitor.d $master $schedd $interval $spooling_probes $request_probes
/usr/sbin/dtrace -s $SGE_ROOT/util/dtrace/monitor2.d $master $schedd $interval $spooling_probes $request_probes
-------------- next part --------------
/*___INFO__MARK_BEGIN__*/
/*************************************************************************
 *
 *  The Contents of this file are made available subject to the terms of
 *  the Sun Industry Standards Source License Version 1.2
 *
 *  Sun Microsystems Inc., March, 2001
 *
 *
 *  Sun Industry Standards Source License Version 1.2
 *  ================================================
 *  The contents of this file are subject to the Sun Industry Standards
 *  Source License Version 1.2 (the "License"); You may not use this file
 *  except in compliance with the License. You may obtain a copy of the
 *  License at http://gridengine.sunsource.net/Gridengine_SISSL_license.html
 *
 *  Software provided under this License is provided on an "AS IS" basis,
 *  WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
 *  WITHOUT LIMITATION, WARRANTIES THAT THE SOFTWARE IS FREE OF DEFECTS,
 *  MERCHANTABLE, FIT FOR A PARTICULAR PURPOSE, OR NON-INFRINGING.
 *  See the License for the specific provisions governing your rights and
 *  obligations concerning the Software.
 *
 *  The Initial Developer of the Original Code is: Sun Microsystems, Inc.
 *
 *  Copyright: 2001 by Sun Microsystems, Inc.
 *
 *  All Rights Reserved.
* ************************************************************************/ /*___INFO__MARK_END__*/ #include <string.h> #include <stdlib.h> #include <errno.h> #include "sge_all_listsL.h" #include "cull.h" #include "sge.h" #include "sge_order.h" #include "sge_follow.h" #include "sge_gdi_request.h" #include "sge_c_gdi.h" #include "sge_host.h" #include "sge_host_qmaster.h" #include "sge_job_qmaster.h" #include "sge_userset_qmaster.h" #include "sge_calendar_qmaster.h" #include "sge_manop_qmaster.h" #include "sge_centry_qmaster.h" #include "sge_cqueue_qmaster.h" #include "sge_pe_qmaster.h" #include "sge_limit_rule_qmaster.h" #include "sge_limit_rule.h" #include "sge_conf.h" #include "configuration_qmaster.h" #include "sge_event_master.h" #include "sched_conf_qmaster.h" #include "sge_userprj_qmaster.h" #include "sge_ckpt_qmaster.h" #include "sge_hgroup_qmaster.h" #include "sge_sharetree_qmaster.h" #include "sge_cuser_qmaster.h" #include "sge_feature.h" #include "sge_qmod_qmaster.h" #include "sge_prog.h" #include "sgermon.h" #include "sge_log.h" #include "sge_qmaster_threads.h" #include "sge_time.h" #include "version.h" #include "sge_security.h" #include "sge_answer.h" #include "sge_pe.h" #include "sge_ckpt.h" #include "sge_qinstance.h" #include "sge_userprj.h" #include "sge_job.h" #include "sge_userset.h" #include "sge_manop.h" #include "sge_calendar.h" #include "sge_sharetree.h" #include "sge_hgroup.h" #include "sge_cuser.h" #include "sge_centry.h" #include "sge_cqueue.h" #include "sge_lock.h" #include "msg_common.h" #include "msg_qmaster.h" #include "sgeobj/sge_event.h" #include "uti/sge_bootstrap.h" #ifdef TEST_QMASTER_GDI2 #include "sge_gdi_ctx.h" #endif static void sge_c_gdi_get(gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, sge_pack_buffer *pb, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor); static void sge_c_gdi_add(void *context, gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request 
*answer, int return_list_flag, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor); static void sge_c_gdi_del(void *context, char *host, sge_gdi_request *request, sge_gdi_request *answer, int sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor); static void sge_c_gdi_mod(void *context, gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, int sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor); void sge_c_gdi_copy(void *context, gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, int sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor); void sge_c_gdi_permcheck(char *host, sge_gdi_request *request, sge_gdi_request *answer, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor); static void sge_c_gdi_replace(void *context, gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, int sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor); static void sge_gdi_do_permcheck(char *host, sge_gdi_request *request, sge_gdi_request *answer, uid_t uid, gid_t gid, char *user, char *group); void sge_c_gdi_trigger(void *context, char *host, sge_gdi_request *request, sge_gdi_request *answer, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor, object_description *object_base); static void sge_gdi_shutdown_event_client(const char*, sge_gdi_request*, sge_gdi_request*, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor, object_description *object_base); static int get_client_id(lListElem*, int*); static void trigger_scheduler_monitoring(char*, sge_gdi_request*, sge_gdi_request*, uid_t uid, gid_t gid, char *user, char *group, monitoring_t*); static int sge_chck_get_perm_host(lList **alpp, sge_gdi_request *request, monitoring_t *monitor, object_description *object_base); static int sge_chck_mod_perm_user(lList **alpp, 
u_long32 target, char *user, monitoring_t *monitor); static int sge_chck_mod_perm_host(lList **alpp, u_long32 target, char *host, char *commproc, int mod, lListElem *ep, bool is_locked, monitoring_t *monitor, object_description *object_base); static int schedd_mod(void *context, lList **alpp, lListElem *modp, lListElem *ep, int add, const char *ruser, const char *rhost, gdi_object_t *object, int sub_command, monitoring_t *monitor ); #if 0 static int do_gdi_get_config_list(sge_gdi_request *aReq, sge_gdi_request *aRes, int *aBeforeCnt, int *anAfterCnt); static int do_gdi_get_sc_config_list(sge_gdi_request *aReq, sge_gdi_request *aRes, int *aBeforeCnt, int *anAfterCnt); #endif /* ------------------------------ generic gdi objects --------------------- */ /* *INDENT-OFF* */ static gdi_object_t gdi_object[] = { { SGE_CALENDAR_LIST, CAL_name, CAL_Type, "calendar", SGE_TYPE_CALENDAR, calendar_mod, calendar_spool, calendar_update_queue_states }, { SGE_EVENT_LIST, 0, NULL, "event", SGE_TYPE_NONE, NULL, NULL, NULL }, { SGE_ADMINHOST_LIST, AH_name, AH_Type, "adminhost", SGE_TYPE_ADMINHOST, host_mod, host_spool, host_success }, { SGE_SUBMITHOST_LIST, SH_name, SH_Type, "submithost", SGE_TYPE_SUBMITHOST, host_mod, host_spool, host_success }, { SGE_EXECHOST_LIST, EH_name, EH_Type, "exechost", SGE_TYPE_EXECHOST, host_mod, host_spool, host_success }, { SGE_CQUEUE_LIST, CQ_name, CQ_Type, "cluster queue", SGE_TYPE_CQUEUE, cqueue_mod, cqueue_spool, cqueue_success }, { SGE_JOB_LIST, 0, NULL, "job", SGE_TYPE_JOB, NULL, NULL, NULL }, { SGE_CENTRY_LIST, CE_name, CE_Type, "complex entry", SGE_TYPE_CENTRY, centry_mod, centry_spool, centry_success }, { SGE_ORDER_LIST, 0, NULL, "order", SGE_TYPE_NONE, NULL, NULL, NULL }, { SGE_MASTER_EVENT, 0, NULL, "master event", SGE_TYPE_NONE, NULL, NULL, NULL }, { SGE_MANAGER_LIST, 0, NULL, "manager", SGE_TYPE_MANAGER, NULL, NULL, NULL }, { SGE_OPERATOR_LIST, 0, NULL, "operator", SGE_TYPE_OPERATOR, NULL, NULL, NULL }, { SGE_PE_LIST, PE_name, PE_Type, 
"parallel environment", SGE_TYPE_PE, pe_mod, pe_spool, pe_success }, { SGE_CONFIG_LIST, 0, NULL, "configuration", SGE_TYPE_NONE, NULL, NULL, NULL }, { SGE_SC_LIST, 0, NULL, "scheduler configuration", SGE_TYPE_NONE, schedd_mod, NULL, NULL }, { SGE_USER_LIST, UP_name, UP_Type, "user", SGE_TYPE_USER, userprj_mod, userprj_spool, userprj_success }, { SGE_USERSET_LIST, 0, NULL, "userset", SGE_TYPE_USERSET, NULL, NULL, NULL }, { SGE_PROJECT_LIST, UP_name, UP_Type, "project", SGE_TYPE_PROJECT, userprj_mod, userprj_spool, userprj_success }, { SGE_SHARETREE_LIST, 0, NULL, "sharetree", SGE_TYPE_SHARETREE, NULL, NULL, NULL }, { SGE_CKPT_LIST, CK_name, CK_Type, "checkpoint interface", SGE_TYPE_CKPT, ckpt_mod, ckpt_spool, ckpt_success }, { SGE_JOB_SCHEDD_INFO_LIST, 0, NULL, "schedd info", SGE_TYPE_JOB_SCHEDD_INFO, NULL, NULL, NULL }, { SGE_ZOMBIE_LIST, 0, NULL, "job zombie list", SGE_TYPE_ZOMBIE, NULL, NULL, NULL }, { SGE_LIRS_LIST, LIRS_name, LIRS_Type, "limitation rule", SGE_TYPE_LIRS, lirs_mod, lirs_spool, lirs_success }, #ifndef __SGE_NO_USERMAPPING__ { SGE_USER_MAPPING_LIST, CU_name, CU_Type, "user mapping entry", SGE_TYPE_CUSER, cuser_mod, cuser_spool, cuser_success }, #endif { SGE_HGROUP_LIST, HGRP_name, HGRP_Type, "host group", SGE_TYPE_HGROUP, hgroup_mod, hgroup_spool, hgroup_success }, { SGE_DUMMY_LIST, 0, NULL, "general request", SGE_TYPE_NONE, NULL, NULL, NULL }, { 0, 0, NULL, NULL, SGE_TYPE_NONE, NULL, NULL, NULL } }; /* *INDENT-ON* */ void sge_clean_lists(void) { int i = 0; for(;gdi_object[i].target != 0 ; i++) { if (gdi_object[i].list_type != SGE_TYPE_NONE) { lList **master_list = object_type_get_master_list(gdi_object[i].list_type); /* fprintf(stderr, "---> freeing list %s, it has %d elems\n", gdi_object[i].object_name, lGetNumberOfElem(*master_list)); */ lFreeList(master_list); } } } /* * MT-NOTE: verify_request_version() is MT safe */ int verify_request_version( lList **alpp, u_long32 version, char *host, char *commproc, int id ) { char *client_version = NULL; 
dstring ds; char buffer[256]; const vdict_t *vp, *vdict = GRM_GDI_VERSION_ARRAY; DENTER(TOP_LAYER, "verify_request_version"); sge_dstring_init(&ds, buffer, sizeof(buffer)); if (version == GRM_GDI_VERSION) { DEXIT; return 0; } for (vp = &vdict[0]; vp->version; vp++) { if (version == vp->version) { client_version = vp->release; } } if (client_version) { WARNING((SGE_EVENT, MSG_GDI_WRONG_GDI_SSISS, host, commproc, id, client_version, feature_get_product_name(FS_VERSION, &ds))); } else { WARNING((SGE_EVENT, MSG_GDI_WRONG_GDI_SSIUS, host, commproc, id, sge_u32c(version), feature_get_product_name(FS_VERSION, &ds))); } answer_list_add(alpp, SGE_EVENT, STATUS_EVERSION, ANSWER_QUALITY_ERROR); DEXIT; return 1; } /* ------------------------------------------------------------ */ void sge_c_gdi(void *context, char *host, sge_gdi_request *request, sge_gdi_request *response, sge_pack_buffer *pb, monitoring_t *monitor) { const char *target_name = NULL; char *operation_name = NULL; int sub_command = 0; gdi_object_t *ao; uid_t uid; gid_t gid; char user[128] = ""; char group[128] = ""; lList *local_answer_list = NULL; object_description *object_base = object_type_get_object_description(); #ifdef TEST_QMASTER_GDI2 sge_gdi_ctx_class_t *ctx = (sge_gdi_ctx_class_t*)context; const char *admin_user = ctx->get_admin_user(ctx); const char *progname = ctx->get_progname(ctx); #else const char *admin_user = bootstrap_get_admin_user(); const char *progname = uti_state_get_sge_formal_prog_name(); #endif DENTER(TOP_LAYER, "sge_c_gdi"); response->op = request->op; response->target = request->target; response->sequence_id = request->sequence_id; response->request_id = request->request_id; if (verify_request_version(&(response->alp), request->version, request->host, request->commproc, request->id)) { DEXIT; return; } if (sge_get_auth_info(request, &uid, user, sizeof(user), &gid, group, sizeof(group)) == -1) { ERROR((SGE_EVENT, MSG_GDI_FAILEDTOEXTRACTAUTHINFO)); answer_list_add(&(response->alp), 
SGE_EVENT, STATUS_ENOMGR, ANSWER_QUALITY_ERROR); DEXIT; return; } if ((strlen(user) == 0) || (strlen(group) == 0)) { CRITICAL((SGE_EVENT, MSG_GDI_NULL_IN_GDI_SSS, (strlen(user)==0)?MSG_OBJ_USER:"", (strlen(group)==0)?MSG_OBJ_GROUP:"", host)); answer_list_add(&(response->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); DEXIT; return; } DPRINTF(("uid/username = %d/%s, gid/groupname = %d/%s\n", (int) uid, user, (int) gid, group)); if (!sge_security_verify_user(request->host, request->commproc, request->id, admin_user, user, progname)) { CRITICAL((SGE_EVENT, MSG_SEC_CRED_SSSI, user, request->host, request->commproc, request->id)); answer_list_add(&(response->alp), SGE_EVENT, STATUS_ENOSUCHUSER, ANSWER_QUALITY_ERROR); DEXIT; return; } if ((ao = get_gdi_object(request->target))) { target_name = ao->object_name; } if (!ao || !target_name) { target_name = MSG_UNKNOWN_OBJECT; } /* ** we take request->op % SGE_GDI_RETURN_NEW_VERSION to get the ** real operation and request->op / SGE_GDI_RETURN_NEW_VERSION ** to get the changed list back in the answer sge_gdi_request ** struct for add/modify operations ** If request->op / SGE_GDI_RETURN_NEW_VERSION is 1 we create ** a list response->lp this list is handed over to the corresponding ** add/modify routine. ** Now only for job add available. 
*/ #if 0 all_users_flag = request->op / SGE_GDI_ALL_USERS; request->op %= SGE_GDI_ALL_USERS; all_jobs_flag = request->op / SGE_GDI_ALL_JOBS; request->op %= SGE_GDI_ALL_JOBS; request->op %= SGE_GDI_RETURN_NEW_VERSION; #endif sub_command = SGE_GDI_GET_SUBCOMMAND(request->op); request->op = SGE_GDI_GET_OPERATION(request->op); switch (request->op) { case SGE_GDI_GET: operation_name = "GET"; MONITOR_GDI_GET(monitor); break; case SGE_GDI_ADD: operation_name = "ADD"; MONITOR_GDI_ADD(monitor); break; case SGE_GDI_DEL: operation_name = "DEL"; MONITOR_GDI_DEL(monitor); break; case SGE_GDI_MOD: operation_name = "MOD"; MONITOR_GDI_MOD(monitor); break; case SGE_GDI_COPY: operation_name = "COPY"; MONITOR_GDI_CP(monitor); break; case SGE_GDI_TRIGGER: operation_name = "TRIGGER"; MONITOR_GDI_TRIG(monitor); break; case SGE_GDI_PERMCHECK: operation_name = "PERMCHECK"; MONITOR_GDI_PERM(monitor); break; case SGE_GDI_REPLACE: operation_name = "REPLACE"; MONITOR_GDI_REPLACE(monitor); break; default: operation_name = "???"; break; } /* different report types */ switch (request->op) { case SGE_GDI_GET: break; case SGE_GDI_ADD: case SGE_GDI_DEL: case SGE_GDI_MOD: case SGE_GDI_COPY: case SGE_GDI_TRIGGER: case SGE_GDI_REPLACE: default: DPRINTF(("GDI %s %s (%s/%s/%d) (%s/%d/%s/%d)\n", operation_name, target_name, request->host, request->commproc, (int)request->id, user, (int)uid, group, (int)gid)); break; } switch (request->op) { case SGE_GDI_GET: sge_c_gdi_get(ao, host, request, response, pb, uid, gid, user, group, monitor); break; case SGE_GDI_ADD: sge_c_gdi_add(context, ao, host, request, response, sub_command, uid, gid, user, group, monitor); break; case SGE_GDI_DEL: sge_c_gdi_del(context, host, request, response, sub_command, uid, gid, user, group, monitor); break; case SGE_GDI_MOD: sge_c_gdi_mod(context, ao, host, request, response, sub_command, uid, gid, user, group, monitor); break; case SGE_GDI_COPY: sge_c_gdi_copy(context, ao, host, request, response, sub_command, uid, gid, user, 
group, monitor); break; case SGE_GDI_TRIGGER: sge_c_gdi_trigger(context, host, request, response, uid, gid, user, group, monitor, object_base); break; case SGE_GDI_PERMCHECK: sge_c_gdi_permcheck(host, request, response, uid, gid, user, group, monitor); break; case SGE_GDI_REPLACE: sge_c_gdi_replace(context, ao, host, request, response, sub_command, uid, gid, user, group, monitor); break; default: SGE_ADD_MSG_ID(sprintf(SGE_EVENT, MSG_SGETEXT_UNKNOWNOP)); answer_list_add(&(response->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); break; } /* GDI_GET fills the pack-buffer by itself */ if (request->op != SGE_GDI_GET) { gdi_request_pack_result(response, &local_answer_list, pb); } /* different report types */ switch (request->op) { case SGE_GDI_GET: DPRINTF(("GDI %s %s (%s/%s/%d) (%s/%d/%s/%d)\n", operation_name, target_name, request->host, request->commproc, (int)request->id, user, (int)uid, group, (int)gid)); break; case SGE_GDI_ADD: case SGE_GDI_DEL: case SGE_GDI_MOD: case SGE_GDI_COPY: case SGE_GDI_TRIGGER: case SGE_GDI_REPLACE: default: break; } DEXIT; return; } /* * MT-NOTE: sge_c_gdi_get() is MT safe */ static void sge_c_gdi_get(gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, sge_pack_buffer *pb, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor) { lList *local_answer_list = NULL; #define USE_OLD_IMPL 0 #if !USE_OLD_IMPL bool local_ret = true; #endif lList *lp = NULL; dstring ds; char buffer[256]; object_description *object_base = object_type_get_object_description(); DENTER(TOP_LAYER, "sge_c_gdi_get"); sge_dstring_init(&ds, buffer, sizeof(buffer)); if (sge_chck_get_perm_host(&(answer->alp), request, monitor, object_base)) { gdi_request_pack_result(answer, &local_answer_list, pb); DEXIT; return; } switch (request->target) { #ifdef QHOST_TEST case SGE_QHOST: sprintf(SGE_EVENT, "SGE_QHOST\n"); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); gdi_request_pack_result(answer, 
&local_answer_list, pb); DEXIT; return; #endif case SGE_EVENT_LIST: answer->lp = sge_select_event_clients("qmaster_response", request->cp, request->enp); sprintf(SGE_EVENT, MSG_GDI_OKNL); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); gdi_request_pack_result(answer, &local_answer_list, pb); DEXIT; return; case SGE_CONFIG_LIST: { /* TODO EB: move this into the master configuration, and pack the list right away */ #if 0 /* EB: TODO PACKING */ do_gdi_get_config_list(request, answer, before, after); #else lList *conf = NULL; conf = sge_get_configuration(); answer->lp = lSelectHashPack("qmaster_response", conf, request->cp, request->enp, false, NULL); sprintf(SGE_EVENT, MSG_GDI_OKNL); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); lFreeList(&conf); } gdi_request_pack_result(answer, &local_answer_list, pb); #endif DEXIT; return; case SGE_SC_LIST: /* TODO EB: move this into the scheduler configuration, and pack the list right away */ { lList *conf = NULL; conf = sconf_get_config_list(); answer->lp = lSelectHashPack("qmaster_response", conf, request->cp, request->enp, false, NULL); sprintf(SGE_EVENT, MSG_GDI_OKNL); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); lFreeList(&conf); } gdi_request_pack_result(answer, &local_answer_list, pb); DEXIT; return; default: /* EB: TODO PACKING */ MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_READ), monitor); /* * Issue 1365 * If the scheduler is not available the information in the job info * messages are outdated. In this case we have to reject the request. 
*/ if (request->target == SGE_JOB_SCHEDD_INFO_LIST && !sge_has_event_client(EV_ID_SCHEDD) ) { answer_list_add(&(answer->alp),MSG_SGETEXT_JOBINFOMESSAGESOUTDATED, STATUS_ESEMANTIC, ANSWER_QUALITY_ERROR); } else if (ao == NULL || ao->list_type == SGE_TYPE_NONE) { SGE_ADD_MSG_ID(sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); } else { lp = *object_type_get_master_list(ao->list_type); #if !USE_OLD_IMPL /* * start with first part of packing */ #if 0 fprintf(stderr, "### before gdi_request_pack_prefix {\n"); pb_print_to(pb, false, stderr); fprintf(stderr, "\n"); #endif local_ret &= gdi_request_pack_prefix(answer, &local_answer_list, pb); #if 0 fprintf(stderr, "### after gdi_request_pack_prefix {\n"); pb_print_to(pb, false, stderr); fprintf(stderr, "\n"); #endif #endif #if !USE_OLD_IMPL lSelectHashPack("qmaster_response", lp, request->cp, request->enp, false, pb); #if 0 { sge_pack_buffer pb2; lList *lpr = NULL; init_packbuffer(&pb2, 0, 0); lpr = lSelectHashPack("qmaster_response", lp, request->cp, request->enp, false, NULL); cull_pack_list(&pb2, lpr); lFreeList(lpr); fprintf(stderr, "************* lSelectHashPack with pb\n"); pb_print_to(pb, false, stderr); fprintf(stderr, "************* lSelectHashPack without pb\n"); pb_print_to(&pb2, false, stderr); clear_packbuffer(&pb2); } #endif #else answer->lp = lSelectHashPack("qmaster_response", lp, request->cp, request->enp, false, NULL); #endif sprintf(SGE_EVENT, MSG_GDI_OKNL); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); #if !USE_OLD_IMPL /* * finish packing */ local_ret &= gdi_request_pack_suffix(answer, &local_answer_list, pb); #else gdi_request_pack_result(answer, &local_answer_list, pb); #endif #if 0 fprintf(stderr, "*** pb\n"); pb_print_to(pb, false, stderr); #endif lFreeList(&local_answer_list); } SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); } DEXIT; return; } #if 0 /* EB: TODO PACKING */ /* * Implement 
'SGE_GDI_GET' for request target 'SGE_CONFIG_LIST'.
 *
 * MT-NOTE: do_gdi_get_config_list() is MT safe
 */
static int do_gdi_get_config_list(sge_gdi_request *aReq, sge_gdi_request *aRes,
                                  int *aBeforeCnt, int *anAfterCnt)
{
   lList *conf = NULL;

   DENTER(TOP_LAYER, "do_gdi_get_config_list");

   conf = sge_get_configuration();

   *aBeforeCnt = lGetNumberOfElem(conf);

   aRes->lp = lSelectHashPack("qmaster_response", conf, aReq->cp, aReq->enp, false, NULL);

   conf = lFreeList(conf);

   *anAfterCnt = lGetNumberOfElem(aRes->lp);

   sprintf(SGE_EVENT, MSG_GDI_OKNL);

   answer_list_add(&(aRes->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO);

   DEXIT;
   return 0;
}
#endif

#if 0
/*
 * Implement 'SGE_GDI_GET' for request target 'SGE_CONFIG_LIST'.
 *
 * MT-NOTE: do_gdi_get_config_list() is MT safe
 */
static int do_gdi_get_config_list(sge_gdi_request *aReq, sge_gdi_request *aRes,
                                  int *aBeforeCnt, int *anAfterCnt)
{
   lList *conf = NULL;

   DENTER(TOP_LAYER, "do_gdi_get_config_list");

   conf = sge_get_configuration();

   *aBeforeCnt = lGetNumberOfElem(conf);

   aRes->lp = lSelectHashPack("qmaster_response", conf, aReq->cp, aReq->enp, false, NULL);

   lFreeList(&conf);

   *anAfterCnt = lGetNumberOfElem(aRes->lp);

   sprintf(SGE_EVENT, MSG_GDI_OKNL);

   answer_list_add(&(aRes->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO);

   DEXIT;
   return 0;
}

static int do_gdi_get_sc_config_list(sge_gdi_request *aReq, sge_gdi_request *aRes,
                                     int *aBeforeCnt, int *anAfterCnt)
{
   lList *conf = NULL;

   DENTER(TOP_LAYER, "do_gdi_get_sc_config_list");

   conf = sconf_get_config_list();

   *aBeforeCnt = lGetNumberOfElem(conf);

   aRes->lp = lSelectHashPack("qmaster_response", conf, aReq->cp, aReq->enp, false, NULL);

   *anAfterCnt = lGetNumberOfElem(aRes->lp);

   sprintf(SGE_EVENT, MSG_GDI_OKNL);

   answer_list_add(&(aRes->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO);

   lFreeList(&conf);

   DEXIT;
   return 0;
}
#endif

/*
 * MT-NOTE: sge_c_gdi_add() is MT safe
 */
static void sge_c_gdi_add(void *context, gdi_object_t *ao, char *host,
                          sge_gdi_request *request, sge_gdi_request *answer,
int sub_command, uid_t uid, gid_t gid, char *user, char *group,
                          monitoring_t *monitor)
{
   lListElem *ep;
   lList *ticket_orders = NULL;
   dstring ds;
   char buffer[256];
   object_description *object_base = object_type_get_object_description();

   DENTER(TOP_LAYER, "sge_c_gdi_add");

   sge_dstring_init(&ds, buffer, sizeof(buffer));

   if (!request->host || !user || !request->commproc) {
      CRITICAL((SGE_EVENT, MSG_SGETEXT_NULLPTRPASSED_S, SGE_FUNC));
      answer_list_add(&(answer->alp), SGE_EVENT, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR);
      DEXIT;
      return;
   }

   /* check permissions of host and user */
   if ((!sge_chck_mod_perm_user(&(answer->alp), request->target, user, monitor)) &&
       (!sge_chck_mod_perm_host(&(answer->alp), request->target, request->host,
                                request->commproc, 0, NULL, false, monitor, object_base))) {

      if (request->target == SGE_EVENT_LIST) {
         for_each (ep, request->lp) { /* is thread safe. the global lock is used, when needed */
            /* fill address infos from request into event client that must be added */
            lSetHost(ep, EV_host, request->host);
            lSetString(ep, EV_commproc, request->commproc);
            lSetUlong(ep, EV_commid, request->id);

            /* fill in authentication infos from request */
            lSetUlong(ep, EV_uid, uid);
            if (!event_client_verify(ep, &(answer->alp), true)) {
               ERROR((SGE_EVENT, MSG_QMASTER_INVALIDEVENTCLIENT_SSS,
                      user, request->commproc, request->host));
            } else {
               mconf_set_max_dynamic_event_clients(sge_set_max_dynamic_event_clients(mconf_get_max_dynamic_event_clients()));

               sge_add_event_client(ep,&(answer->alp),
                                    (sub_command & SGE_GDI_RETURN_NEW_VERSION) ? &(answer->lp) : NULL,
                                    user, host, monitor);
            }
         }
      } else if (request->target == SGE_JOB_LIST) {
         for_each(ep, request->lp) { /* is thread safe.
the global lock is used, when needed */ /* fill address infos from request into event client that must be added */ if (!job_verify_submitted_job(ep, &(answer->alp))) { ERROR((SGE_EVENT, MSG_QMASTER_INVALIDJOBSUBMISSION_SSS, user, request->commproc, request->host)); } else { if (mconf_get_simulate_hosts()) { int multi_job = 1; int i; lList *context = lGetList(ep, JB_context); if(context != NULL) { lListElem *multi = lGetElemStr(context, VA_variable, "SGE_MULTI_SUBMIT"); if(multi != NULL) { multi_job = atoi(lGetString(multi, VA_value)); DPRINTF(("Cloning job %d times in simulation mode\n", multi_job)); } } for(i = 0; i < multi_job; i++) { lListElem *clone = lCopyElem(ep); sge_gdi_add_job(context, clone, &(answer->alp), (sub_command & SGE_GDI_RETURN_NEW_VERSION) ? &(answer->lp) : NULL, user, host, uid, gid, group, request, monitor); lFreeElem(&clone); } } else { /* submit needs to know user and group */ sge_gdi_add_job(context, ep, &(answer->alp), (sub_command & SGE_GDI_RETURN_NEW_VERSION) ? &(answer->lp) : NULL, user, host, uid, gid, group, request, monitor); } } } } else if (request->target == SGE_SC_LIST ) { for_each (ep, request->lp) { sge_mod_sched_configuration(context, ep, &(answer->alp), user, host); } } else { bool is_scheduler_resync = false; lList *ppList = NULL; MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor); if (request->target == SGE_ORDER_LIST) { sge_set_commit_required(); } for_each (ep, request->lp) { /* add each element */ switch (request->target) { case SGE_ORDER_LIST: switch (sge_follow_order(context, ep, &(answer->alp), user, host, &ticket_orders, monitor, object_base)) { case STATUS_OK : case 0 : /* everything went fine */ break; case -2 : is_scheduler_resync = true; case -1 : case -3 : /* stop the order processing */ DPRINTF(("Failed to follow order . 
Remaining %d orders unprocessed.\n",
                                 lGetNumberOfRemainingElem(ep)));
                        ep = lLast(request->lp);
                        break;

                     default :
                        DPRINTF(("--> FAILED: unexpected state from in the order processing <--\n"));
                        break;
                  }
                  break;

               case SGE_MANAGER_LIST:
               case SGE_OPERATOR_LIST:
                  sge_add_manop(context, ep, &(answer->alp), user, host, request->target);
                  break;

               case SGE_USERSET_LIST:
                  sge_add_userset(context, ep, &(answer->alp),
                                  object_base[SGE_TYPE_USERSET].list, user, host);
                  break;

               case SGE_SHARETREE_LIST:
                  sge_add_sharetree(context, ep, object_base[SGE_TYPE_SHARETREE].list,
                                    &(answer->alp), user, host);
                  break;

               default:
                  if (!ao) {
                     SGE_ADD_MSG_ID( sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET));
                     answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR);
                     break;
                  }

                  if (request->target==SGE_EXECHOST_LIST &&
                      !strcmp(prognames[EXECD], request->commproc)) {
                     sge_execd_startedup(context, ep, &(answer->alp), user, host,
                                         request->target, monitor);
                  } else {
                     sge_gdi_add_mod_generic(context, &(answer->alp), ep, 1, ao, user, host,
                                             sub_command, &ppList, monitor);
                  }
                  break;
            }
         } /* for_each request */

         if (request->target == SGE_ORDER_LIST) {
            sge_commit();
            sge_set_next_spooling_time();
            answer_list_add(&(answer->alp), "OK\n", STATUS_OK, ANSWER_QUALITY_INFO);
         }

         SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE);

         if (is_scheduler_resync) {
            sge_resync_schedd(monitor); /* ask for a total update */
         }

         /* we could do postprocessing based on ppList here */
         lFreeList(&ppList);
      }
   }

   if (ticket_orders != NULL) {
      if (sge_conf_is_reprioritize()) {
         MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor);
         distribute_ticket_orders(context, ticket_orders, monitor, object_base);
         SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE);
      } else {
         /* tickets not needed at execd's if no reprioritization is done */
         DPRINTF(("NO TICKET DELIVERY\n"));
      }

      lFreeList(&ticket_orders);
   }

   DEXIT;
   return;
}

/*
 * MT-NOTE: sge_c_gdi_del() is MT safe
 */
static void sge_c_gdi_del(void *context, char *host, sge_gdi_request *request,
                          sge_gdi_request *answer, int
sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor) { lListElem *ep; dstring ds; char buffer[256]; object_description *object_base = object_type_get_object_description(); DENTER(GDI_LAYER, "sge_c_gdi_del"); sge_dstring_init(&ds, buffer, sizeof(buffer)); if (!request->lp) /* delete whole list */ { if (sge_chck_mod_perm_user(&(answer->alp), request->target, user, monitor)) { DEXIT; return; } if (sge_chck_mod_perm_host(&(answer->alp), request->target, request->host, request->commproc, 0, NULL, false, monitor, object_base)) { DEXIT; return; } switch (request->target) { case SGE_SHARETREE_LIST: MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor); sge_del_sharetree(context, object_base[SGE_TYPE_SHARETREE].list, &(answer->alp), user,host); SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE); break; default: SGE_ADD_MSG_ID( sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); break; } } else { if (sge_chck_mod_perm_user(&(answer->alp), request->target, user, monitor)) { DEXIT; return; } if (sge_chck_mod_perm_host(&(answer->alp), request->target, request->host, request->commproc, 0, NULL, false, monitor, object_base)) { DEXIT; return; } for_each (ep, request->lp) { MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor); /* try to remove the element */ switch (request->target) { case SGE_ADMINHOST_LIST: case SGE_SUBMITHOST_LIST: case SGE_EXECHOST_LIST: sge_del_host(context, ep, &(answer->alp), user, host, request->target, *object_base[SGE_TYPE_HGROUP].list); break; case SGE_CQUEUE_LIST: cqueue_del(context, ep, &(answer->alp), user, host); break; case SGE_JOB_LIST: sge_set_commit_required(); sge_gdi_del_job(context, ep, &(answer->alp), user, host, sub_command, monitor); sge_commit(); break; case SGE_CENTRY_LIST: sge_del_centry(context, ep, &(answer->alp), user, host); break; case SGE_PE_LIST: sge_del_pe(context, ep, &(answer->alp), user, host); break; case 
SGE_MANAGER_LIST: case SGE_OPERATOR_LIST: sge_del_manop(context, ep, &(answer->alp), user, host, request->target); break; case SGE_CONFIG_LIST: sge_del_configuration(context, ep, &(answer->alp), user, host); break; case SGE_USER_LIST: sge_del_userprj(context, ep, &(answer->alp), object_base[SGE_TYPE_USER].list, user, host, 1); break; case SGE_USERSET_LIST: sge_del_userset(context, ep, &(answer->alp), object_base[SGE_TYPE_USERSET].list, user, host); break; case SGE_PROJECT_LIST: sge_del_userprj(context, ep, &(answer->alp), object_base[SGE_TYPE_PROJECT].list, user, host, 0); break; case SGE_LIRS_LIST: sge_del_limit_rule_set(context, ep, &(answer->alp), object_base[SGE_TYPE_LIRS].list, user, host); break; case SGE_CKPT_LIST: sge_del_ckpt(context, ep, &(answer->alp), user, host); break; case SGE_CALENDAR_LIST: sge_del_calendar(context, ep, &(answer->alp), user, host); break; #ifndef __SGE_NO_USERMAPPING__ case SGE_USER_MAPPING_LIST: cuser_del(ep, &(answer->alp), user, host); break; #endif case SGE_HGROUP_LIST: hgroup_del(context, ep, &(answer->alp), user, host); break; default: SGE_ADD_MSG_ID( sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); break; } /* switch target */ SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE); } /* for_each element */ } DEXIT; return; } /* * MT-NOTE: sge_c_gdi_copy() is MT safe */ void sge_c_gdi_copy(void *context, gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, int sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor) { lListElem *ep = NULL; object_description *object_base = object_type_get_object_description(); DENTER(TOP_LAYER, "sge_c_gdi_copy"); if (!request->host || !user || !request->commproc) { CRITICAL((SGE_EVENT, MSG_SGETEXT_NULLPTRPASSED_S, SGE_FUNC)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR); DEXIT; return; } if (sge_chck_mod_perm_user(&(answer->alp), 
request->target, user, monitor)) { DEXIT; return; } if (sge_chck_mod_perm_host(&(answer->alp), request->target, request->host, request->commproc, 0, NULL, false, monitor, object_base)) { DEXIT; return; } for_each (ep, request->lp) { switch (request->target) { case SGE_JOB_LIST: /* gdi_copy_job uses the global lock internal */ sge_gdi_copy_job(context, ep, &(answer->alp), (sub_command & SGE_GDI_RETURN_NEW_VERSION) ? &(answer->lp) : NULL, user, host, uid, gid, group, request, monitor); break; default: SGE_ADD_MSG_ID( sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); break; } } DEXIT; return; } /* ------------------------------------------------------------ */ static void sge_gdi_do_permcheck(char *host, sge_gdi_request *request, sge_gdi_request *answer, uid_t uid, gid_t gid, char *user, char *group) { lList *lp = NULL; lListElem *ep = NULL; DENTER(GDI_LAYER, "sge_gdi_do_permcheck"); DPRINTF(("User: %s\n", user )); if (answer->lp == NULL) { const char *mapped_user = NULL; const char* requested_host = NULL; bool did_mapping = false; lUlong value; /* create PERM_Type list for answer structure*/ lp = lCreateList("permissions", PERM_Type); ep = lCreateElem(PERM_Type); lAppendElem(lp,ep); /* set sge username */ lSetString(ep, PERM_sge_username, user ); /* set requested host name */ if (request->lp == NULL) { requested_host = host; } else { lList* tmp_lp = NULL; lListElem* tmp_ep = NULL; tmp_lp = request->lp; tmp_ep = tmp_lp->first; requested_host = lGetHost(tmp_ep, PERM_req_host); #ifndef __SGE_NO_USERMAPPING__ cuser_list_map_user(*(cuser_list_get_master_list()), NULL, user, requested_host, &mapped_user); did_mapping = true; #endif } if (requested_host != NULL) { lSetHost(ep, PERM_req_host, requested_host); } if (did_mapping && strcmp(mapped_user, user)) { DPRINTF(("execution mapping: user %s mapped to %s on host %s\n", user, mapped_user, requested_host)); lSetString(ep, PERM_req_username, 
mapped_user); } else { lSetString(ep, PERM_req_username, ""); } /* check for manager permission */ value = 0; if (manop_is_manager(user)) { value = 1; } lSetUlong(ep, PERM_manager, value); /* check for operator permission */ value = 0; if (manop_is_operator(user)) { value = 1; } lSetUlong(ep, PERM_operator, value); if ((request->cp != NULL) && (request->enp != NULL)) { answer->lp = lSelect("permissions", lp, request->cp, request->enp); lFreeList(&lp); } else { answer->lp = lp; } } sprintf(SGE_EVENT, MSG_GDI_OKNL); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); DEXIT; return; } /* * MT-NOTE: sge_c_gdi_permcheck() is MT safe */ void sge_c_gdi_permcheck(char *host, sge_gdi_request *request, sge_gdi_request *answer, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor) { DENTER(GDI_LAYER, "sge_c_gdi_permcheck"); MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_READ), monitor); switch (request->target) { case SGE_DUMMY_LIST: sge_gdi_do_permcheck(host, request, answer, uid, gid, user, group); break; default: WARNING((SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); break; } SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return; } static void sge_c_gdi_replace(void *context, gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, int sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor) { lList *ppList = NULL; lListElem *ep = NULL; object_description *object_base = object_type_get_object_description(); DENTER(GDI_LAYER, "sge_c_gdi_replace"); if (sge_chck_mod_perm_user(&(answer->alp), request->target, user, monitor)) { DEXIT; return; } if (sge_chck_mod_perm_host(&(answer->alp), request->target, host, request->commproc, 0, NULL, false, monitor, object_base)) { DEXIT; return; } switch (request->target) { case SGE_LIRS_LIST: { MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor); /* delete all currently 
defined rule sets */ ep = lFirst(*object_base[SGE_TYPE_LIRS].list); while (ep != NULL) { sge_del_limit_rule_set(context, ep, &(answer->alp), object_base[SGE_TYPE_LIRS].list, user, host); ep = lFirst(*object_base[SGE_TYPE_LIRS].list); } for_each(ep, request->lp) { sge_gdi_add_mod_generic(context, &(answer->alp), ep, 1, ao, user, host, SGE_GDI_SET_ALL, &ppList, monitor); } lFreeList(&ppList); SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE); } break; default: SGE_ADD_MSG_ID(sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); break; } DEXIT; return; } /* * MT-NOTE: sge_c_gdi_trigger() is MT safe */ void sge_c_gdi_trigger(void *context, char *host, sge_gdi_request *request, sge_gdi_request *answer, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor, object_description *object_base) { u_long32 target = request->target; DENTER(GDI_LAYER, "sge_c_gdi_trigger"); switch (target) { case SGE_EXECHOST_LIST: /* kill execd */ case SGE_MASTER_EVENT: /* kill master */ case SGE_SC_LIST: /* trigger scheduler monitoring */ MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor); if (!host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, host)) { ERROR((SGE_EVENT, MSG_SGETEXT_NOADMINHOST_S, host)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE); DEXIT; return; } if (SGE_EXECHOST_LIST == target) { sge_gdi_kill_exechost(context, host, request, answer, uid, gid, user, group); } SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE); if (SGE_SC_LIST == target) { trigger_scheduler_monitoring(host, request, answer, uid, gid, user, group, monitor); } else if (target == SGE_MASTER_EVENT) { /* shutdown qmaster. Do NOT hold the global lock, while doing this !! 
*/
            sge_gdi_kill_master(host, request, answer);
         }
         break;

      case SGE_CQUEUE_LIST:
      case SGE_JOB_LIST:
         MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor);
         sge_gdi_qmod(context, host, request, answer, uid, gid, user, group, monitor);
         SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE);
         break;

      case SGE_EVENT_LIST:
         /* kill scheduler or event client */
         sge_gdi_shutdown_event_client(host, request, answer, uid, gid, user, group,
                                       monitor, object_base);
         answer_list_log(&answer->alp, false);
         break;

      default:
         /* permissions should be checked in the functions. Here we don't
            know what is to be done, so we don't know what permissions we need */
         WARNING((SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET));
         answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR);
         break;
   }

   DEXIT;
   return;
}

/****** qmaster/sge_c_gdi/sge_gdi_shutdown_event_client() **********************
*  NAME
*     sge_gdi_shutdown_event_client() -- shutdown event client
*
*  SYNOPSIS
*     static void sge_gdi_shutdown_event_client(const char *aHost,
*        sge_gdi_request *aRequest, sge_gdi_request *anAnswer)
*
*  FUNCTION
*     Shutdown event clients by client id. 'aRequest' does contain a list of
*     client ids. This is a list of 'ID_Type' elements.
*
*  INPUTS
*     const char *aHost         - sender
*     sge_gdi_request *aRequest - request
*     sge_gdi_request *anAnswer - answer
*     monitoring_t *monitor     - the monitoring structure
*
*  RESULT
*     void - none
*
*  NOTES
*     MT-NOTE: sge_gdi_shutdown_event_client() is NOT MT safe.
* *******************************************************************************/ static void sge_gdi_shutdown_event_client(const char *aHost, sge_gdi_request *aRequest, sge_gdi_request *anAnswer, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor, object_description *object_base) { lListElem *elem = NULL; /* ID_Type */ DENTER(TOP_LAYER, "sge_gdi_shutdown_event_client"); for_each (elem, aRequest->lp) { lList *local_alp = NULL; int client_id = EV_ID_ANY; int res = -1; if (get_client_id(elem, &client_id) != 0) { answer_list_add(&(anAnswer->alp), SGE_EVENT, STATUS_EEXIST, ANSWER_QUALITY_ERROR); continue; } if (client_id == EV_ID_SCHEDD && !host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, aHost)) { ERROR((SGE_EVENT, MSG_SGETEXT_NOADMINHOST_S, aHost)); answer_list_add(&(anAnswer->alp), SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); continue; } else if (!host_list_locate(*object_base[SGE_TYPE_SUBMITHOST].list, aHost) && !host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, aHost)) { ERROR((SGE_EVENT, MSG_SGETEXT_NOSUBMITORADMINHOST_S, aHost)); answer_list_add(&(anAnswer->alp), SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); continue; } if (client_id == EV_ID_ANY) { res = sge_shutdown_dynamic_event_clients(user, &(local_alp), monitor); } else { res = sge_shutdown_event_client(client_id, user, uid, &(local_alp), monitor); } if ((res == EINVAL) && (client_id == EV_ID_SCHEDD)) { lFreeList(&local_alp); answer_list_add(&(anAnswer->alp), MSG_COM_NOSCHEDDREGMASTER, STATUS_EEXIST, ANSWER_QUALITY_WARNING); } else { answer_list_append_list(&(anAnswer->alp), &local_alp); } } DEXIT; return; } /* sge_gdi_shutdown_event_client() */ /****** qmaster/sge_c_gdi/get_client_id() ************************************** * NAME * get_client_id() -- get client id from ID_Type element. * * SYNOPSIS * static int get_client_id(lListElem *anElem, int *anID) * * FUNCTION * Get client id from ID_Type element. 
The client id is converted to an
*     integer and stored in 'anID'.
*
*  INPUTS
*     lListElem *anElem - ID_Type element
*     int *anID         - will contain client id on return
*
*  RESULT
*     EINVAL - failed to extract client id.
*     0      - otherwise
*
*  NOTES
*     MT-NOTE: get_client_id() is MT safe.
*
*     Using 'errno' to check for 'strtol' error situations is recommended
*     by POSIX.
*
*******************************************************************************/
static int get_client_id(lListElem *anElem, int *anID)
{
   const char *id = NULL;

   DENTER(TOP_LAYER, "get_client_id");

   if ((id = lGetString(anElem, ID_str)) == NULL) {
      DEXIT;
      return EINVAL;
   }

   errno = 0; /* errno is thread local */
   *anID = strtol(id, NULL, 0);

   if (errno != 0) {
      ERROR((SGE_EVENT, MSG_GDI_EVENTCLIENTIDFORMAT_S, id));
      DEXIT;
      return EINVAL;
   }

   DEXIT;
   return 0;
} /* get_client_id() */

/****** qmaster/sge_c_gdi/trigger_scheduler_monitoring() ***********************
*  NAME
*     trigger_scheduler_monitoring() -- trigger scheduler monitoring
*
*  SYNOPSIS
*     void trigger_scheduler_monitoring(char *aHost, sge_gdi_request *aRequest,
*        sge_gdi_request *anAnswer)
*
*  FUNCTION
*     Trigger scheduler monitoring.
* * INPUTS * char *aHost - sender * sge_gdi_request *aRequest - request * sge_gdi_request *anAnswer - response * * RESULT * void - none * * NOTES * MT-NOTE: trigger_scheduler_monitoring() is MT safe, using global lock * * SEE ALSO * qconf -tsm * *******************************************************************************/ static void trigger_scheduler_monitoring(char *aHost, sge_gdi_request *aRequest, sge_gdi_request *anAnswer, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor) { DENTER(GDI_LAYER, "trigger_scheduler_monitoring"); MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_READ), monitor); if (!manop_is_manager(user)) { SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); WARNING((SGE_EVENT, MSG_COM_NOSCHEDMONPERMS)); answer_list_add(&(anAnswer->alp), SGE_EVENT, STATUS_ENOMGR, ANSWER_QUALITY_WARNING); DEXIT; return; } SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); if (!sge_add_event_for_client(EV_ID_SCHEDD, 0, sgeE_SCHEDDMONITOR, 0, 0, NULL, NULL, NULL, NULL)) { WARNING((SGE_EVENT, MSG_COM_NOSCHEDDREGMASTER)); answer_list_add(&(anAnswer->alp), SGE_EVENT, STATUS_EEXIST, ANSWER_QUALITY_WARNING); DEXIT; return; } INFO((SGE_EVENT, MSG_COM_SCHEDMON_SS, user, aHost)); answer_list_add(&(anAnswer->alp), SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); DEXIT; return; } /* trigger_scheduler_monitoring() */ /* * MT-NOTE: sge_c_gdi_mod() is MT safe */ static void sge_c_gdi_mod(void *context, gdi_object_t *ao, char *host, sge_gdi_request *request, sge_gdi_request *answer, int sub_command, uid_t uid, gid_t gid, char *user, char *group, monitoring_t *monitor) { lListElem *ep; dstring ds; char buffer[256]; lList *ppList = NULL; /* for postprocessing, after the lists of requests has been processed */ bool is_locked = false; object_description *object_base = object_type_get_object_description(); DENTER(TOP_LAYER, "sge_c_gdi_mod"); sge_dstring_init(&ds, buffer, sizeof(buffer)); if (sge_chck_mod_perm_user(&(answer->alp), request->target, user, monitor)) { DEXIT; return; } for_each (ep, 
request->lp) { if (sge_chck_mod_perm_host(&(answer->alp), request->target, request->host, request->commproc, 1, ep, is_locked, monitor, object_base)) { continue; } if (request->target == SGE_CONFIG_LIST) { sge_mod_configuration(context, ep, &(answer->alp), user, host); } else if (request->target == SGE_EVENT_LIST) { /* fill address infos from request into event client that must be added */ lSetHost(ep, EV_host, request->host); lSetString(ep, EV_commproc, request->commproc); lSetUlong(ep, EV_commid, request->id); /* fill in authentication infos from request */ lSetUlong(ep, EV_uid, uid); if (!event_client_verify(ep, &(answer->alp), false)) { ERROR((SGE_EVENT, MSG_QMASTER_INVALIDEVENTCLIENT_SSS, user, request->commproc, request->host)); } else { sge_mod_event_client(ep, &(answer->alp), user, host); } } else if (request->target == SGE_SC_LIST) { sge_mod_sched_configuration(context, ep, &(answer->alp), user, host); } else { if (!is_locked) { MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_WRITE), monitor); sge_set_commit_required(); is_locked = true; } switch (request->target) { case SGE_JOB_LIST: sge_gdi_mod_job(context, ep, &(answer->alp), user, host, sub_command); break; case SGE_USERSET_LIST: sge_mod_userset(context, ep, &(answer->alp), object_base[SGE_TYPE_USERSET].list, user, host); break; case SGE_SHARETREE_LIST: sge_mod_sharetree(context, ep, object_base[SGE_TYPE_SHARETREE].list, &(answer->alp), user, host); break; default: if (ao == NULL) { SGE_ADD_MSG_ID( sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(&(answer->alp), SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); break; } sge_gdi_add_mod_generic(context, &(answer->alp), ep, 0, ao, user, host, sub_command, &ppList, monitor); break; } } } /* for_each */ if (is_locked) { sge_commit(); SGE_UNLOCK(LOCK_GLOBAL, LOCK_WRITE); } /* postprocessing for the list of requests */ if (lGetNumberOfElem(ppList) != 0) { switch (request->target) { case SGE_CENTRY_LIST: DPRINTF(("rebuilding consumable 
debitation\n")); centry_redebit_consumables(context, ppList); break; } } lFreeList(&ppList); DEXIT; return; } /* * MT-NOTE: sge_chck_mod_perm_user() is MT safe */ static int sge_chck_mod_perm_user(lList **alpp, u_long32 target, char *user, monitoring_t *monitor) { DENTER(TOP_LAYER, "sge_chck_mod_perm_user"); /* check permissions of user */ switch (target) { case SGE_ORDER_LIST: case SGE_ADMINHOST_LIST: case SGE_SUBMITHOST_LIST: case SGE_EXECHOST_LIST: case SGE_CQUEUE_LIST: case SGE_CENTRY_LIST: case SGE_OPERATOR_LIST: case SGE_MANAGER_LIST: case SGE_PE_LIST: case SGE_CONFIG_LIST: case SGE_SC_LIST: case SGE_USER_LIST: case SGE_PROJECT_LIST: case SGE_SHARETREE_LIST: case SGE_CKPT_LIST: case SGE_CALENDAR_LIST: case SGE_USER_MAPPING_LIST: case SGE_HGROUP_LIST: case SGE_LIRS_LIST: /* user must be a manager */ MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_READ), monitor); if (!manop_is_manager(user)) { ERROR((SGE_EVENT, MSG_SGETEXT_MUSTBEMANAGER_S, user)); answer_list_add(alpp, SGE_EVENT, STATUS_ENOMGR, ANSWER_QUALITY_ERROR); SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return 1; } SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); break; case SGE_USERSET_LIST: /* user must be a operator */ MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_READ), monitor); if (!manop_is_operator(user)) { ERROR((SGE_EVENT, MSG_SGETEXT_MUSTBEOPERATOR_S, user)); answer_list_add(alpp, SGE_EVENT, STATUS_ENOMGR, ANSWER_QUALITY_ERROR); SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return 1; } SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); break; case SGE_JOB_LIST: /* what checking could we do here ? we had to check if there is a queue configured for scheduling of jobs of this group/user. If there is no such queue we had to deny submitting. Other checkings need to be done in stub functions. 
*/ break; case SGE_EVENT_LIST: /* an event client can be started by any user - it can only read objects like SGE_GDI_GET delete requires more info - is done in sge_gdi_kill_eventclient */ break; default: SGE_ADD_MSG_ID( sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(alpp, SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); DEXIT; return 1; } DEXIT; return 0; } /* * MT-NOTE: sge_chck_mod_perm_host() is MT safe */ static int sge_chck_mod_perm_host(lList **alpp, u_long32 target, char *host, char *commproc, int mod, lListElem *ep, bool is_locked, monitoring_t *monitor, object_description *object_base) { DENTER(TOP_LAYER, "sge_chck_mod_perm_host"); if (!is_locked) { MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_READ), monitor); } /* check permissions of host */ switch (target) { case SGE_ORDER_LIST: case SGE_ADMINHOST_LIST: case SGE_OPERATOR_LIST: case SGE_MANAGER_LIST: case SGE_SUBMITHOST_LIST: case SGE_CQUEUE_LIST: case SGE_CENTRY_LIST: case SGE_PE_LIST: case SGE_CONFIG_LIST: case SGE_SC_LIST: case SGE_USER_LIST: case SGE_USERSET_LIST: case SGE_PROJECT_LIST: case SGE_SHARETREE_LIST: case SGE_CKPT_LIST: case SGE_CALENDAR_LIST: case SGE_USER_MAPPING_LIST: case SGE_HGROUP_LIST: case SGE_LIRS_LIST: /* host must be SGE_ADMINHOST_LIST */ if (!host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, host)) { ERROR((SGE_EVENT, MSG_SGETEXT_NOADMINHOST_S, host)); answer_list_add(alpp, SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); if (!is_locked) { SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); } DEXIT; return 1; } break; case SGE_EXECHOST_LIST: /* host must be either admin host or exec host and execd */ if (!(host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, host) || (host_list_locate(*object_base[SGE_TYPE_EXECHOST].list, host) && !strcmp(commproc, prognames[EXECD])))) { ERROR((SGE_EVENT, MSG_SGETEXT_NOADMINHOST_S, host)); answer_list_add(alpp, SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); if (!is_locked) { SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); } 
DEXIT; return 1; } break; case SGE_JOB_LIST: /* ** check if override ticket change request, if yes we need ** to be on an admin host and must not be on a submit host */ if (mod && (lGetPosViaElem(ep, JB_override_tickets, SGE_NO_ABORT) >= 0)) { /* host must be SGE_ADMINHOST_LIST */ if (!host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, host)) { ERROR((SGE_EVENT, MSG_SGETEXT_NOADMINHOST_S, host)); answer_list_add(alpp, SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); if (!is_locked) { SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); } DEXIT; return 1; } break; } /* host must be SGE_SUBMITHOST_LIST */ if (!host_list_locate(*object_base[SGE_TYPE_SUBMITHOST].list, host)) { ERROR((SGE_EVENT, MSG_SGETEXT_NOSUBMITHOST_S, host)); answer_list_add(alpp, SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); if (!is_locked) { SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); } DEXIT; return 1; } break; case SGE_EVENT_LIST: /* to start an event client or if an event client performs modify requests on itself it must be on a submit or an admin host */ if ( (!host_list_locate(*object_base[SGE_TYPE_SUBMITHOST].list, host)) && (!host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, host))) { ERROR((SGE_EVENT, MSG_SGETEXT_NOSUBMITORADMINHOST_S, host)); answer_list_add(alpp, SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); if (!is_locked) { SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); } DEXIT; return 1; } break; default: SGE_ADD_MSG_ID(sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(alpp, SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return 1; } if (!is_locked) { SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); } DEXIT; return 0; } /* * MT-NOTE: sge_chck_get_perm_host() is MT safe */ static int sge_chck_get_perm_host(lList **alpp, sge_gdi_request *request, monitoring_t *monitor, object_description *object_base) { u_long32 target; char *host = NULL; static int last_id = -1; DENTER(TOP_LAYER, "sge_chck_get_perm_host"); 
MONITOR_WAIT_TIME(SGE_LOCK(LOCK_GLOBAL, LOCK_READ), monitor); /* reset the last_id counter on first sequence number we won''t log the same error message twice in an api multi request */ if (request->sequence_id == 1) { last_id = -1; } target = request->target; host = request->host; /* check permissions of host */ switch (target) { case SGE_ORDER_LIST: case SGE_EVENT_LIST: case SGE_ADMINHOST_LIST: case SGE_OPERATOR_LIST: case SGE_MANAGER_LIST: case SGE_SUBMITHOST_LIST: case SGE_CQUEUE_LIST: case SGE_CENTRY_LIST: case SGE_PE_LIST: case SGE_SC_LIST: case SGE_USER_LIST: case SGE_USERSET_LIST: case SGE_PROJECT_LIST: case SGE_SHARETREE_LIST: case SGE_CKPT_LIST: case SGE_CALENDAR_LIST: case SGE_USER_MAPPING_LIST: case SGE_HGROUP_LIST: case SGE_EXECHOST_LIST: case SGE_JOB_LIST: case SGE_ZOMBIE_LIST: case SGE_JOB_SCHEDD_INFO_LIST: case SGE_LIRS_LIST: /* host must be admin or submit host */ if ( !host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, host) && !host_list_locate(*object_base[SGE_TYPE_SUBMITHOST].list, host)) { if (last_id != request->id) { /* only log the first error in an api multi request */ ERROR((SGE_EVENT, MSG_SGETEXT_NOSUBMITORADMINHOST_S, host)); } else { SGE_ADD_MSG_ID( sprintf(SGE_EVENT, MSG_SGETEXT_NOSUBMITORADMINHOST_S, host)); } answer_list_add(alpp, SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); last_id = request->id; /* this indicates that the error is already locked */ SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return 1; } break; case SGE_CONFIG_LIST: /* host must be admin or submit host or exec host */ if ( !host_list_locate(*object_base[SGE_TYPE_ADMINHOST].list, host) && !host_list_locate(*object_base[SGE_TYPE_SUBMITHOST].list, host) && !host_list_locate(*object_base[SGE_TYPE_EXECHOST].list, host)) { if (last_id != request->id) { /* only log the first error in an api multi request */ ERROR((SGE_EVENT, MSG_SGETEXT_NOSUBMITORADMINHOST_S, host)); } else { SGE_ADD_MSG_ID(sprintf(SGE_EVENT, MSG_SGETEXT_NOSUBMITORADMINHOST_S, host)); } 
answer_list_add(alpp, SGE_EVENT, STATUS_EDENIED2HOST, ANSWER_QUALITY_ERROR); last_id = request->id; /* this indicates that the error is already locked */ SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return 1; } break; default: SGE_ADD_MSG_ID(sprintf(SGE_EVENT, MSG_SGETEXT_OPNOIMPFORTARGET)); answer_list_add(alpp, SGE_EVENT, STATUS_ENOIMP, ANSWER_QUALITY_ERROR); SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return 1; } SGE_UNLOCK(LOCK_GLOBAL, LOCK_READ); DEXIT; return 0; } /* this is our strategy: do common checks and search old object make a copy of the old object (this will become the new object) modify new object using reduced object as instruction on error: dispose new object store new object to disc on error: dispose new object on success create events replace old object by new queue */ int sge_gdi_add_mod_generic( void *context, lList **alpp, lListElem *instructions, /* our instructions - a reduced object */ int add, /* true in case of add */ gdi_object_t *object, const char *ruser, const char *rhost, int sub_command, lList **ppList, monitoring_t *monitor ) { int pos; int dataType; const char *name; lList *tmp_alp = NULL; lListElem *new_obj = NULL, *old_obj; lListElem *tmp_ep = NULL; DENTER(TOP_LAYER, "sge_gdi_add_mod_generic"); /* DO COMMON CHECKS AND SEARCH OLD OBJECT */ if (!instructions || !object) { CRITICAL((SGE_EVENT, MSG_SGETEXT_NULLPTRPASSED_S, SGE_FUNC)); answer_list_add(alpp, SGE_EVENT, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR); DEXIT; return STATUS_EUNKNOWN; } /* ep is no element of this type, if ep has no QU_qname */ if (lGetPosViaElem(instructions, object->key_nm, SGE_NO_ABORT) < 0) { CRITICAL((SGE_EVENT, MSG_SGETEXT_MISSINGCULLFIELD_SS, lNm2Str(object->key_nm), SGE_FUNC)); answer_list_add(alpp, SGE_EVENT, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR); DEXIT; return STATUS_EUNKNOWN; } /* resolve host name in case of objects with hostnames as key before searching for the objects */ if ( object->key_nm == EH_name || object->key_nm == AH_name || object->key_nm == 
SH_name ) { if ( sge_resolve_host(instructions, object->key_nm) != CL_RETVAL_OK ) { const char *host = lGetHost(instructions, object->key_nm); ERROR((SGE_EVENT, MSG_SGETEXT_CANTRESOLVEHOST_S, host ? host : "NULL")); answer_list_add(alpp, SGE_EVENT, STATUS_EUNKNOWN, ANSWER_QUALITY_ERROR); DEXIT; return STATUS_EUNKNOWN; } } pos = lGetPosViaElem(instructions, object->key_nm, SGE_NO_ABORT); dataType = lGetPosType(lGetElemDescr(instructions),pos); if (dataType == lHostT) { name = lGetHost(instructions, object->key_nm); old_obj = lGetElemHost(*object_type_get_master_list(object->list_type), object->key_nm, name); } else { name = lGetString(instructions, object->key_nm); old_obj = lGetElemStr(*object_type_get_master_list(object->list_type), object->key_nm, name); } if ((old_obj && add) || (!old_obj && !add)) { ERROR((SGE_EVENT, add? MSG_SGETEXT_ALREADYEXISTS_SS:MSG_SGETEXT_DOESNOTEXIST_SS, object->object_name, name)); answer_list_add(alpp, SGE_EVENT, STATUS_EEXIST, ANSWER_QUALITY_ERROR); DEXIT; return STATUS_EEXIST; } /* MAKE A COPY OF THE OLD QUEUE (THIS WILL BECOME THE NEW QUEUE) */ if (!(new_obj = (add ? 
lCreateElem(object->type) : lCopyElem(old_obj)))) { ERROR((SGE_EVENT, MSG_MEM_MALLOC)); answer_list_add(alpp, SGE_EVENT, STATUS_EEXIST, ANSWER_QUALITY_ERROR); DEXIT; return STATUS_EEXIST; } /* MODIFY NEW QUEUE USING REDUCED QUEUE AS INSTRUCTION */ if (object->modifier(context, &tmp_alp, new_obj, instructions, add, ruser, rhost, object, sub_command, monitor)) { if (alpp) { /* ON ERROR: DISPOSE NEW QUEUE */ /* failure: just append last elem in tmp_alp elements before may contain invalid success messages */ if (tmp_alp) { lListElem *failure; failure = lLast(tmp_alp); lDechainElem(tmp_alp, failure); if (!*alpp) *alpp = lCreateList("answer", AN_Type); lAppendElem(*alpp, failure); } } lFreeList(&tmp_alp); lFreeElem(&new_obj); DEXIT; return STATUS_EUNKNOWN; } /* write on file */ if (object->writer(context, alpp, new_obj, object)) { lFreeElem(&new_obj); lFreeList(&tmp_alp); DEXIT; return STATUS_EUNKNOWN; } if (alpp != NULL) { if (*alpp == NULL) { *alpp = lCreateList("answer", AN_Type); } /* copy every entrie from tmp_alp into alpp */ for_each (tmp_ep, tmp_alp) { lListElem* copy = NULL; copy = lCopyElem(tmp_ep); if (copy != NULL) { lAppendElem(*alpp,copy); } } } lFreeList(&tmp_alp); { lList **master_list = NULL; master_list = object_type_get_master_list(object->list_type); /* chain out the old object */ if (old_obj) { lDechainElem(*master_list, old_obj); } /* ensure our global list exists */ if (*master_list == NULL ) { *master_list = lCreateList(object->object_name, object->type); } /* chain in new object */ lAppendElem(*master_list, new_obj); } if (object->on_success) { object->on_success(context, new_obj, old_obj, object, ppList, monitor); } lFreeElem(&old_obj); INFO((SGE_EVENT, add?MSG_SGETEXT_ADDEDTOLIST_SSSS: MSG_SGETEXT_MODIFIEDINLIST_SSSS, ruser, rhost, name, object->object_name)); answer_list_add(alpp, SGE_EVENT, STATUS_OK, ANSWER_QUALITY_INFO); DEXIT; return STATUS_OK; } /* * MT-NOTE: get_gdi_object() is MT safe */ gdi_object_t *get_gdi_object(u_long32 target) { 
int i; DENTER(TOP_LAYER, "get_gdi_object"); for (i=0; gdi_object[i].target; i++) if (target == gdi_object[i].target) { DEXIT; return &gdi_object[i]; } DEXIT; return NULL; } static int schedd_mod( void *context, lList **alpp, lListElem *modp, lListElem *ep, int add, const char *ruser, const char *rhost, gdi_object_t *object, int sub_command, monitoring_t *monitor ) { int ret; DENTER(TOP_LAYER, "schedd_mod"); ret = sconf_validate_config_(alpp) ? 0 : 1; DEXIT; return ret; } -------------- next part -------------- A non-text attachment was scrubbed... Name: sge_c_gdi.s Type: application/octet-stream Size: 313955 bytes Desc: URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20061009/30756963/attachment.obj>
Andreas.Haas at Sun.COM
2006-Oct-09 17:06 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Mon, 9 Oct 2006, Roch wrote:

> > could it be nm(1) does report a symbol despite the compiler (Sun
> > Studio 10) actually inlined the function?
> >
> > Thanks,
> > Andreas
>
> I think it would. Note that a function A can be inlined for
> some subset of call sites (say inlined when B calls A but
> not when C calls A). When running inline, of
> course, the probes don't fire.

Could be. In my case, however, I know the function is used only from a single call site, so there is no need for nm(1) to report the symbol if it then gets inlined. But anyway: when I look at the assembly file I can identify both the function start and the function end for sge_c_gdi_permcheck(). At least this sequence looks to me like a regular function start

            .align 16
    sge_c_gdi_permcheck:
    .CG149:
    .CG14A: push %rbx
    .CG14B: push %rbp
    .CG14C: push %r12
            :

and this one like a regular function end

            :
    G154:   pop %r13
    .CG155: pop %r12
    .CG156: pop %rbp
    .CG157: pop %rbx
    .CG158: ret
    .CG159: .size sge_c_gdi_permcheck, . - sge_c_gdi_permcheck

Regards,
Andreas
Nicolas Williams
2006-Oct-09 17:21 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Mon, Oct 09, 2006 at 06:09:19PM +0200, Roch wrote:

> > could it be nm(1) does report a symbol despite the compiler (Sun
> > Studio 10) actually inlined the function?
> >
> > Thanks,
> > Andreas
>
> I think it would. Note that a function A can be inlined for
> some subset of call sites (say inlined when B calls A but
> not when C calls A). When running inline, of
> course, the probes don't fire.

Just because a function is inlined by a compiler does not mean that it can't remain available as a normal function. If you pass the address of a function around or if the function is not static, then the compiler has to generate non-inlined function code too.

DTrace couldn't know about inlined functions without help from the compiler. There isn't a protocol for the compiler to tell DTrace about inlined functions, is there?

If you add -xinline= to the compiler command line, does that help? (I.e., set the list of functions to be inlined to the empty list.)

Nico
--
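Nico's point about out-of-line copies can be illustrated with a small C sketch. All names here (add_one, call_direct, get_add_one, call_indirect) are hypothetical and not taken from the SGE sources, and whether add_one() is really inlined into call_direct() depends on the compiler and the optimization flags; the sketch only shows the two situations he describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: static and tiny, so an obvious inlining candidate. */
static int add_one(int x)
{
    return x + 1;
}

/*
 * Direct call. At high optimization levels the compiler may expand
 * add_one() in place here, in which case this call site never enters
 * a separate add_one function.
 */
int call_direct(int x)
{
    return add_one(x);
}

typedef int (*unary_fn)(int);

/*
 * Taking the address of add_one() means the compiler has to keep a
 * callable, out-of-line copy of it, even if other call sites are inlined.
 */
unary_fn get_add_one(void)
{
    return add_one;
}

/* Indirect call through the pointer, which uses the out-of-line copy. */
int call_indirect(unary_fn fn, int x)
{
    assert(fn != NULL);
    return fn(x);
}
```

Both paths compute the same result, for example call_direct(41) and call_indirect(get_add_one(), 41) both give 42, but only calls that actually reach the out-of-line copy pass through the function's own entry and return instructions.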
Rayson Ho
2006-Oct-09 17:44 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
In that case, just place a debugger breakpoint inside the function and run SGE; that should easily show whether it is a problem with DTrace or with compiler inlining...

If the function is not inlined, then the DTrace probe should fire and transfer control to the pid provider framework. Similarly, a debugger breakpoint will be hit and will transfer control to dbx/gdb via a trap (or something similar). Either case requires the function not to be inlined...

BTW, looking at the default compiler flags used for compiling SGE, we use "-fast -xchip=generic -xcache=generic", and -fast turns on -xO5, and thus function inlining is enabled...

http://www.spec.org/omp/results/flags/SUN-20051104-Studio-Solaris-opteron.txt

Rayson

On 10/9/06, Andreas.Haas at sun.com wrote:
> Could be. In my case however I know the function is used
> only from a single call site, so there is no need for nm(1)
> to report the symbol, if it then gets inlined.

http://gridengine.sunsource.net
Adam Leventhal
2006-Oct-10 04:06 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Mon, Oct 09, 2006 at 12:21:47PM -0500, Nicolas Williams wrote:

> DTrace couldn't know about inlined functions without help from the
> compiler. There isn't a protocol for the compiler to tell DTrace about
> inlined functions, is there?

There isn't. Is there some generic way the compiler publishes information about inlines to debuggers? It seems that they would have the same problem: setting a breakpoint in a given function could miss the inlined copies.

Adam

--
Adam Leventhal, Solaris Kernel Development
http://blogs.sun.com/ahl
John Levon
2006-Oct-10 13:21 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Mon, Oct 09, 2006 at 09:06:08PM -0700, Adam Leventhal wrote:

> On Mon, Oct 09, 2006 at 12:21:47PM -0500, Nicolas Williams wrote:
> > DTrace couldn't know about inlined functions without help from the
> > compiler. There isn't a protocol for the compiler to tell DTrace about
> > inlined functions, is there?
>
> There isn't. Is there some generic way the compiler publishes information
> about inlines to debuggers? It seems that they would have the same problem
> that setting a breakpoint in a given function could miss the inlined
> copies.

There's a DWARF tag, DW_TAG_inlined_subroutine, which is supposed to be a child of the containing scope. I have no idea how well this is implemented though.

regards
john
Andreas.Haas at Sun.COM
2006-Oct-10 13:32 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Mon, 9 Oct 2006, Rayson Ho wrote:

> In that case, just place a debugger breakpoint inside the function and
> run SGE can easily find out if it is a problem of DTrace or compiler
> inlining...
>
> If it is not inlined, then the DTrace probe should be fired and
> transfer control to the pid provider framework. Similarily, with a
> debugger breakpoint, it will be stepped on and will transfer control
> to dbx/gdb via a trap (or something similar). Either case would
> require the function to be not inlined...

Rayson, I checked as you suggested. The breakpoint doesn't work, and this indicates sge_c_gdi_permcheck() was inlined. Thanks for the hint!

> BTW, looking at the default compiler flags used for compiling SGE, we
> use "-fast -xchip=generic -xcache=generic", and -fast turns on -xO5,
> and thus function inlining is enabled...
>
> http://www.spec.org/omp/results/flags/SUN-20051104-Studio-Solaris-opteron.txt

Yep. In contrast, when I compile SGE with

    # aimk -debug

the breakpoint does work and the dtrace probe fires likewise

    1 39937 sge_c_gdi_permcheck:entry sge_c_gdi_permcheck(es-ergb01-01) tid 9

so this explains the behavior.

Conclusions:

(1) I think the inlined function caveat must be mentioned in the "pid Provider" chapter of the "Solaris Dynamic Tracing Guide"

    http://docs.sun.com/app/docs/doc/817-6223/6mlkidlls?q=dtrace&a=view

    Adam, please let me know if I can somehow help to get this fixed in the dtrace docs.

(2) Finally I was able to get the pid provider probe for sge_c_gdi_permcheck() in an optimized SGE build as well, by using

    #pragma no_inline(sge_c_gdi_permcheck)

    as described in

    http://docs.sun.com/source/819-0494/sun.specific.html

    That means we'll be able to use dtrace pid provider probes for SGE bottleneck diagnosis :-)

Thanks, everyone, for helping!

Cheers,
Andreas
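For readers who want to try conclusion (2) outside of SGE, here is a minimal C sketch. The function names (checked_double, handle_request) and the SKETCH_NOINLINE macro are hypothetical. The Sun Studio branch uses the pragma quoted above, placed after the function's declaration, which is my reading of the Sun Studio documentation; the GCC-style attribute is an assumption added only so that the same file also builds with other compilers.

```c
#include <assert.h>

/* GCC-compatible compilers: ask for an out-of-line copy via an attribute. */
#if defined(__GNUC__) && !defined(__SUNPRO_C)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

/* Hypothetical static function that should stay visible to the pid provider. */
static SKETCH_NOINLINE int checked_double(int x);

#if defined(__SUNPRO_C)
/* Sun Studio C: the pragma from the thread, after the declaration. */
#pragma no_inline(checked_double)
#endif

static int checked_double(int x)
{
    assert(x < 1000000);   /* keep the doubled value well inside int range */
    return 2 * x;
}

/* Single call site, the situation in which the optimizer likes to inline. */
int handle_request(int x)
{
    return checked_double(x) + 1;
}
```

With the function kept out of line, a probe description of the form pid<PID>::checked_double:entry has an actual function entry to attach to; without the pragma or attribute, an optimized build may leave nothing to instrument.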
Adam Leventhal
2006-Oct-10 15:06 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Tue, Oct 10, 2006 at 02:21:56PM +0100, John Levon wrote:

> There's a DWARF tag: DW_TAG_inlined_subroutine, which is supposed to be a child
> of the containing scope. I have no idea how well this is implemented though.

And is it emitted for non-debug builds?

Adam

--
Adam Leventhal, Solaris Kernel Development
http://blogs.sun.com/ahl
Adam Leventhal
2006-Oct-10 15:08 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Tue, Oct 10, 2006 at 03:32:33PM +0200, Andreas.Haas at Sun.COM wrote:

> (1) I think the inlined function caveat must be mentioned in the
> "pid Provider" chapter of "Solaris Dynamic Tracing Guide"
>
> http://docs.sun.com/app/docs/doc/817-6223/6mlkidlls?q=dtrace&a=view
>
> Adam, please let me know, if I somehow can help to get this fixed in
> the dtrace docs.

I completely agree. Please file a bug through opensolaris.org in the category doc/dtrace.

Adam

--
Adam Leventhal, Solaris Kernel Development
http://blogs.sun.com/ahl
Adam Leventhal
2006-Oct-10 15:29 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Tue, Oct 10, 2006 at 04:32:45PM +0100, John Levon wrote:

> > And is it emitted for non-debug builds?
>
> Not sure what you mean, since there's not any DWARF sections without -g.

That's exactly what I mean: even if the compiler emits the information, it's useless unless it's present in optimized binaries.

> We'd
> have to extend CTF (or a new section) to parse the DW_AT_ flags into
> <name,low-pc,high-pc> tuples,

Right.

> or make dtrace consume dwarf.

... and have the compiler emit the chunky DWARF for optimized binaries that customers ship.

Adam

--
Adam Leventhal, Solaris Kernel Development
http://blogs.sun.com/ahl
John Levon
2006-Oct-10 15:32 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Tue, Oct 10, 2006 at 08:06:42AM -0700, Adam Leventhal wrote:

> On Tue, Oct 10, 2006 at 02:21:56PM +0100, John Levon wrote:
> > There's a DWARF tag: DW_TAG_inlined_subroutine, which is supposed to be a child
> > of the containing scope. I have no idea how well this is implemented though.
>
> And is it emitted for non-debug builds?

Not sure what you mean, since there's not any DWARF sections without -g. We'd have to extend CTF (or a new section) to parse the DW_AT_ flags into <name,low-pc,high-pc> tuples, or make dtrace consume dwarf.

regards
john
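To make the <name,low-pc,high-pc> idea concrete, here is a small C sketch of how such tuples could be searched once they have been extracted. Nothing like this exists in CTF or DTrace according to the thread; the struct layout, the half-open range convention, and the names inline_range and find_inline_range are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one inlined copy of a function and its code range. */
struct inline_range {
    const char *name;     /* name of the inlined function */
    uintptr_t   low_pc;   /* first address of the inlined copy */
    uintptr_t   high_pc;  /* one past the last address (assumed convention) */
};

/*
 * Return the first entry whose range [low_pc, high_pc) contains pc,
 * or NULL if pc is not inside any recorded inlined copy.
 */
const struct inline_range *
find_inline_range(const struct inline_range *tab, size_t n, uintptr_t pc)
{
    size_t i;

    assert(tab != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (pc >= tab[i].low_pc && pc < tab[i].high_pc)
            return &tab[i];
    }
    return NULL;
}
```

A linear scan is used only to keep the sketch short. Note that a single contiguous range per inlined copy is itself an assumption, and it would not hold if the optimizer scatters the inlined code.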
Andreas.Haas at Sun.COM
2006-Oct-10 15:33 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Tue, 10 Oct 2006, Adam Leventhal wrote:

> On Tue, Oct 10, 2006 at 03:32:33PM +0200, Andreas.Haas at Sun.COM wrote:
>> (1) I think the inlined function caveat must be mentioned in the
>> "pid Provider" chapter of "Solaris Dynamic Tracing Guide"
>>
>> http://docs.sun.com/app/docs/doc/817-6223/6mlkidlls?q=dtrace&a=view
>>
>> Adam, please let me know, if I somehow can help to get this fixed in
>> the dtrace docs.
>
> I completely agree. Please file a bug though opensolaris.org in the category
> doc/dtrace.

Done. It's #6480235.

Cheers,
Andreas
Nicolas Williams
2006-Oct-10 15:51 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
On Tue, Oct 10, 2006 at 04:32:45PM +0100, John Levon wrote:

> On Tue, Oct 10, 2006 at 08:06:42AM -0700, Adam Leventhal wrote:
>
> > On Tue, Oct 10, 2006 at 02:21:56PM +0100, John Levon wrote:
> > > There's a DWARF tag: DW_TAG_inlined_subroutine, which is supposed to be a child
> > > of the containing scope. I have no idea how well this is implemented though.
> >
> > And is it emitted for non-debug builds?
>
> Not sure what you mean, since there's not any DWARF sections without -g. We'd
> have to extend CTF (or a new section) to parse the DW_AT_ flags into
> <name,low-pc,high-pc> tuples, or make dtrace consume dwarf.

But what if other optimizations cause the inlined code not to be compact?

Maybe it should suffice to have the compiler emit {fname, inlined_fname} tuples, so that DTrace could warn that probes on the inlined function may fail to fire.

Nico
--
Rayson Ho
2006-Oct-11 05:25 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
I just looked at (readelf --debug-dump) the DWARF info of an inlined function compiled with gcc -O3 -g... as John mentioned, there is no DW_AT_[low,high]_pc. However, it is hard to determine the pc of an inlined function, as compiler optimizations are free to move things around, remove loads/stores/dead code, etc...

I googled a bit and found something interesting, "reliable markers (hooks/probes/taps/...)": http://lwn.net/Articles/27762/

I then wrote a quick hack:

    #define funcBegin(funcname) \
        __asm__ __volatile__( \
            "929:\n" \
            ".section .funcLoc,\"a\"\n" \
            " .align 4\n" \
            " .long 929b\n" \
            " .asciz \"" #funcname "\"\n" \
            " .previous\n" \
            : : : "memory");

To use it, just call it like a function call, and pass in an identifier:

    func2()
    {
        funcBegin(func2);

        printf("Hello2 : %s\n", __func__);
    }

    main()
    {
        funcBegin(main);

        func2();
        printf("%p %p\n", func2, main);
    }

The macro will save the string (ID) and the label address pair in a special section (.funcLoc). It is extremely low overhead because it has no executable code, and it only adds a memory barrier to stop the compiler from moving stores around.

And to hook it up with DTrace:
- modify the pid provider to read the .funcLoc section for the {function_address, identifier} pair array. Search for the identifier (maybe we can just define the function name as the ID).
- since the address is already there, a probe can be placed.

And the extra section can be dropped easily with GNU strip with the "--remove-section" option.

Rayson

On 10/10/06, Nicolas Williams <Nicolas.Williams at sun.com> wrote:
> On Tue, Oct 10, 2006 at 04:32:45PM +0100, John Levon wrote:
> > On Tue, Oct 10, 2006 at 08:06:42AM -0700, Adam Leventhal wrote:
> >
> > > On Tue, Oct 10, 2006 at 02:21:56PM +0100, John Levon wrote:
> > > > There's a DWARF tag: DW_TAG_inlined_subroutine, which is supposed to be a child
> > > > of the containing scope. I have no idea how well this is implemented though.
> > >
> > > And is it emitted for non-debug builds?
> >
> > Not sure what you mean, since there's not any DWARF sections without -g. We'd
> > have to extend CTF (or a new section) to parse the DW_AT_ flags into
> > <name,low-pc,high-pc> tuples, or make dtrace consume dwarf.
>
> But what if other optimizations cause the inlined code not to be
> compact?
>
> Maybe it should suffice to have the compiler emit {fname, inlined_fname}
> tuples, so that DTrace could warn that probes on the inlined function
> may fail to fire.
>
> Nico
> --
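The lookup step Rayson describes (search the {function_address, identifier} pairs for an identifier) might look roughly like the C sketch below. It is not pid provider code. It assumes the .funcLoc entries have already been parsed into an array of fixed-size records, whereas the raw section written by the macro holds a 4-byte address followed by a variable-length NUL-terminated string per entry. The names func_loc and func_loc_find are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed form of one .funcLoc entry. */
struct func_loc {
    const void *addr;  /* address of the 929: label inside the function */
    const char *id;    /* identifier passed to funcBegin() */
};

/*
 * Return the recorded address for identifier id, or NULL if the
 * identifier does not appear in the table.
 */
const void *
func_loc_find(const struct func_loc *tab, size_t n, const char *id)
{
    size_t i;

    assert(tab != NULL || n == 0);
    assert(id != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(tab[i].id, id) == 0)
            return tab[i].addr;
    }
    return NULL;
}
```

If a function is inlined at several call sites, the same identifier would appear more than once in the section; this sketch returns only the first match, so a real consumer would presumably want to collect all of them.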
Rayson Ho
2006-Oct-11 08:22 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
Hmm, the sdt (statically-defined tracing) provider works similarly to my hack, so I guess everyone can ignore my previous message...

One difference is that my approach does not require support from the dynamic linker, and thus it would work on older releases of Solaris. The downside of course is that it would require changes to the pid provider, and it is sort of reinventing the wheel of the sdt...

http://www.opensolaris.org/os/community/os_user_groups/czosug/czosug2_dtrace_x86.pdf

Rayson

On 10/11/06, Rayson Ho <rayrayson at gmail.com> wrote:
> I just looked at (readelf --debug-dump) the DWARF info of an inlined
> function compiled with gcc -O3 -g... as John mentioned, there is no
> DW_AT_[low,high]_pc. However, it is hard to determine the pc of an
> inlined function, as compiler optimizations are free to move things
> around, remove loads/stores/dead code, etc...
>
> I googled a bit and found something interesting, "reliable markers
> (hooks/probes/taps/...)": http://lwn.net/Articles/27762/
>
> I then wrote a quick hack:
> #define funcBegin(funcname) \
>     __asm__ __volatile__( \
>         "929:\n" \
>         ".section .funcLoc,\"a\"\n" \
>         " .align 4\n" \
>         " .long 929b\n" \
>         " .asciz \"" #funcname "\"\n" \
>         " .previous\n" \
>         : : : "memory");
>
> To use it, just call it like a function call, and pass in an identifer:
>
> func2()
> {
>     funcBegin(func2);
>
>     printf("Hello2 : %s\n", __func__);
> }
>
> main()
> {
>     funcBegin(main);
>
>     func2();
>     printf("%p %p\n", func2, main);
> }
>
>
> The macro will save the string (ID) and the label address pair in a
> special section (.funcLoc). It is extremely low overhead because it
> has no executable code, and it only adds a memory barrier to stop the
> compiler from moving stores around.
>
> And to hook it up with DTrace:
> - modify the pid provider, read the .funcLoc section for the
> {function_address, identifier} pair array. Search for the identifier
> (may be we can just define the function name as the ID).
> - since the address is already there, a probe can be placed.
>
> And the extra section can be dropped easily with GNU strip with the
> "--remove-section" option.
>
> Rayson
>
>
>
> On 10/10/06, Nicolas Williams <Nicolas.Williams at sun.com> wrote:
> > On Tue, Oct 10, 2006 at 04:32:45PM +0100, John Levon wrote:
> > > On Tue, Oct 10, 2006 at 08:06:42AM -0700, Adam Leventhal wrote:
> > >
> > > > On Tue, Oct 10, 2006 at 02:21:56PM +0100, John Levon wrote:
> > > > > There's a DWARF tag: DW_TAG_inlined_subroutine, which is supposed to be a child
> > > > > of the containing scope. I have no idea how well this is implemented though.
> > > >
> > > > And is it emitted for non-debug builds?
> > >
> > > Not sure what you mean, since there's not any DWARF sections without -g. We'd
> > > have to extend CTF (or a new section) to parse the DW_AT_ flags into
> > > <name,low-pc,high-pc> tuples, or make dtrace consume dwarf.
> >
> > But what if other optimizations cause the inlined code not to be
> > compact?
> >
> > Maybe it should suffice to have the compiler emit {fname, inlined_fname}
> > tuples, so that DTrace could warn that probes on the inlined function
> > may fail to fire.
> >
> > Nico
> > --
>
Andreas.Haas at Sun.COM
2007-Jul-05 12:20 UTC
[dtrace-discuss] Question on pid provider for ''static'' C functions
Hi all,

last year in October we discussed in this thread the requirements for pid provider probes to function. The outcome was that the pid provider can be used as long as the no_inline #pragma is used to ensure that C functions will be found by dtrace, as documented in

http://bugs.opensolaris.org/view_bug.do?bug_id=6480235

Now we have an issue in our DTrace monitor for the Sun Grid Engine master

http://wiki.gridengine.info/wiki/index.php/Dtrace#Implementation

The issue is that one particular pid provider probe is no longer found by dtrace when the Sun Studio v12 compiler is used. But when the same C module is compiled with Sun Studio v10, the probe for the function is found and the monitor works!

Regarding compile options, nothing changed ("-Xc -v -fast -xchip=generic -xcache=generic -ftrap=division -KPIC") except that the Sun Studio v10 option "-xarch=amd64" became "-m64" with Sun Studio v12. Curiously, 'dbx' does find the function when 'stop in sge_mirror_process_events' is used, and the 'nm' output shows no difference either:

Sun Studio v10

    nm sge_mirror.o | grep process_events
    [63] | 14800| 4292|FUNC |GLOB |0 |2 |sge_mirror_process_events

Sun Studio v12

    nm sge_mirror.o | grep process_events
    [67] | 10016| 982|FUNC |GLOB |0 |2 |sge_mirror_process_events

Note that the 'nm' output reveals that the function's object became much smaller. We also noticed that our binary distribution size went down from 32MB to 24MB.

Does anyone know of a Sun Studio v12 compiler regression that matches this phenomenon?

Thanks,
Andreas

On Tue, 10 Oct 2006, Andreas.Haas at Sun.COM wrote:
> On Tue, 10 Oct 2006, Adam Leventhal wrote:
>
>> On Tue, Oct 10, 2006 at 03:32:33PM +0200, Andreas.Haas at Sun.COM wrote:
>>> (1) I think the inlined function caveat must be mentioned in the
>>>     "pid Provider" chapter of "Solaris Dynamic Tracing Guide"
>>>
>>>     http://docs.sun.com/app/docs/doc/817-6223/6mlkidlls?q=dtrace&a=view
>>>
>>> Adam, please let me know, if I somehow can help to get this fixed in
>>> the dtrace docs.
>>
>> I completely agree. Please file a bug through opensolaris.org in the
>> category doc/dtrace.
>
> Done. It's #6480235.
>
> Cheers,
> Andreas
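As a rough illustration of the size comparison made in the message above, the following Python sketch parses the two quoted 'nm' lines and reports the size of sge_mirror_process_events under each compiler. It assumes the usual Solaris 'nm' column order (index, value, size, type, bind, other, section index, name); the helper name parse_nm_line is made up for this example and is not part of any tool mentioned in the thread.

```python
# Sketch only: parse_nm_line is a hypothetical helper, and the column
# order (index, value, size, type, bind, other, shndx, name) is assumed
# to be the usual Solaris nm layout.

# The two lines exactly as quoted in the message.
NM_V10 = "[63] | 14800| 4292|FUNC |GLOB |0 |2 |sge_mirror_process_events"
NM_V12 = "[67] | 10016| 982|FUNC |GLOB |0 |2 |sge_mirror_process_events"


def parse_nm_line(line):
    """Split one '|'-separated nm line into a small dict."""
    fields = [field.strip() for field in line.split("|")]
    index, value, size, sym_type, bind, other, shndx, name = fields
    return {
        "name": name,
        "value": int(value),
        "size": int(size),
        "type": sym_type,
        "bind": bind,
    }


old = parse_nm_line(NM_V10)
new = parse_nm_line(NM_V12)

# Fraction of the v10 size that remains with v12.
ratio = new["size"] / old["size"]

if __name__ == "__main__":
    print(f"{old['name']}: {old['size']} bytes (v10) -> "
          f"{new['size']} bytes (v12), ratio {ratio:.2f}")
```

Both entries are still global FUNC symbols, so the symbol table alone does not explain the missing probe; the drop in size only suggests that the generated code for the function changed considerably.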
Adam Leventhal
2007-Jul-05 16:32 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
Hi Andreas,

It may be that the pid provider can no longer verify that the function is safe to instrument due to certain constructs in the generated code. Can you send the disassembly of the old and new versions of the function?

Adam

On Thu, Jul 05, 2007 at 02:20:29PM +0200, Andreas.Haas at Sun.COM wrote:
> Hi all,
>
> last year in October we discussed in this thread the requirements for pid-provider probes
> to function. Outcome was that pid provider can be used as long as no_inline #pragma is used
> to ensure C functions will be found by dtrace as documented in
>
> http://bugs.opensolaris.org/view_bug.do?bug_id=6480235
>
> now we have an issue in our Dtrace monitor for Sun Grid Engine master
>
> http://wiki.gridengine.info/wiki/index.php/Dtrace#Implementation
>
> the issue is that one particular pid-provider is not found anymore by Dtrace
> when Sun Studio compiler v12 is used. But when the same C module is compiled with
> Sun Studio v10 the pid-provider for the function gets found and the monitor works!
>
> Regarding compile options nothing changed ("-Xc -v -fast -xchip=generic
> -xcache=generic -ftrap=division -KPIC") except Sun Studio v10 option
> "-xarch=amd64" became Studio compiler v12 "-m64". Curiously 'dbx' does find
> the function when 'stop in sge_mirror_process_events' is used and in 'nm'
> output shows no difference either:
>
> Sun Studio v10
>
> nm sge_mirror.o | grep process_events
> [63] | 14800| 4292|FUNC |GLOB |0 |2 |sge_mirror_process_events
>
> Sun Studio v12
>
> nm sge_mirror.o | grep process_events
> [67] | 10016| 982|FUNC |GLOB |0 |2 |sge_mirror_process_events
>
> note, the 'nm' output unveils the functions object became much smaller. Also we
> noticed our binary distribution size went down from 32MB to 24MB.
>
> Anyone knowing of a Sun Studio compiler v12 degradation that matches this
> phenomenon?
>
> Thanks,
> Andreas
>
> On Tue, 10 Oct 2006, Andreas.Haas at Sun.COM wrote:
>
> > On Tue, 10 Oct 2006, Adam Leventhal wrote:
> >
> >> On Tue, Oct 10, 2006 at 03:32:33PM +0200, Andreas.Haas at Sun.COM wrote:
> >>> (1) I think the inlined function caveat must be mentioned in the
> >>> "pid Provider" chapter of "Solaris Dynamic Tracing Guide"
> >>>
> >>> http://docs.sun.com/app/docs/doc/817-6223/6mlkidlls?q=dtrace&a=view
> >>>
> >>> Adam, please let me know, if I somehow can help to get this fixed in
> >>> the dtrace docs.
> >>
> >> I completely agree. Please file a bug though opensolaris.org in the
> >> category
> >> doc/dtrace.
> >
> > Done. It's #6480235.
> >
> > Cheers,
> > Andreas

--
Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl
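One way to check which probes the pid provider is willing to create for a function is to list them with 'dtrace -l'. The Python sketch below only builds that command line and, optionally, runs it; the helper names are hypothetical, and the process id 4518 is borrowed from the error message at the start of the thread purely as a placeholder.

```python
import subprocess


def probe_description(pid, function, name):
    """Build a pid provider probe description: pid<PID>:module:function:name.

    The module field is left empty so that any module matches.
    """
    return f"pid{pid}::{function}:{name}"


def list_probes_command(pid, function, names=("entry", "return")):
    """Return the argv for listing the function's entry/return probes."""
    argv = ["dtrace", "-l"]
    for name in names:
        argv += ["-n", probe_description(pid, function, name)]
    return argv


def list_probes(pid, function):
    """Run the listing; needs DTrace privileges and a live process."""
    result = subprocess.run(
        list_probes_command(pid, function),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # 4518 is a placeholder pid; substitute the pid of the running daemon.
    print(" ".join(list_probes_command(4518, "sge_mirror_process_events")))
```

If the listing fails for a binary built with one compiler but succeeds for the other, that narrows the problem down to the generated code of that function, which is where the requested disassembly would help.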
Andreas.Haas at Sun.COM
2007-Jul-05 17:13 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
[This email is either empty or too large to be displayed at this time]
Adam Leventhal
2007-Jul-05 18:01 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
[This email is either empty or too large to be displayed at this time]
Andreas.Haas at Sun.COM
2007-Jul-06 09:33 UTC
[dtrace-discuss] Question on pid provider for 'static' C functions
[This email is either empty or too large to be displayed at this time]