thr3ads.net - zfs code - [zfs-code] Building a debug version of libzpool [Jun 2009]

Steve Gonczi

2009-Jun-11 20:12 UTC

[zfs-code] Building a debug version of libzpool

I am not succeeding in doing this.

opensolaris.sh has options (D and F), and bldenv also has an option (-d).
I tried setting either, neither, and both.
Incidentally, if bldenv is run with -d it prints a blurb mentioning that a
debug build is configured.
If this option is not given on the command line, the blurb says it is a release
build, regardless of the debug flag settings in opensolaris.sh.

I built a full nightly, an incremental nightly, and a subset build (make clean; make)
from .../usr/src/lib.  No cigar: the resulting *.so has no debug information
(not stripped, but no debug info).

I figured I'd ask before I start looking for the needle in the haystack; maybe I
am missing something obvious.

Another thing: ztest will load its libraries from the default library location
(typically /usr/lib), which is probably not what a developer would want,
especially if a random zfs version happens to be already installed on the dev system.

Fortunately, this one has an easy solution:
set the env variable LD_LIBRARY_PATH to the preferred load location (typically
somewhere in the dev workspace).  A key point: the location must end in a
semicolon.

The semicolon causes the location specified to be searched _before_ the default.

Perhaps the build environment should set this, or use "YP," to force library
searches to start inside the build output area where the .so files go.

Also, as per the docs, the opensolaris.sh "t" option is supposed to
"build and use" the .../usr/src/tools.
Well, it just uses the location, but does not build the tools if they are not
already there. No big deal, they can be built from the aforementioned dir.
-- 
This message posted from opensolaris.org

Lori Alt

2009-Jun-11 22:15 UTC

[zfs-code] Building a debug version of libzpool

I've run into this same issue recently and I'm working with someone in
the Sun Studio group to help me figure it out, but so far, no luck.

I thought that the only thing I needed to do was to make sure that the
compiler and the linker were being executed with the -g option, but I've
done that and I still can't get dbx to recognize the dynamic libraries
as having debugger info.

So if I get the answer, I'll post it here.  In the meantime, if anyone
else knows the trick, I'd appreciate learning it.

Lori



"C. Bergström"

2009-Jun-11 22:28 UTC

[zfs-code] Building a debug version of libzpool

Lori Alt wrote:
> I've run into this same issue recently and I'm working with someone in
> the Sun Studio group to help me figure it out, but so far, no luck.
> I thought that the only thing I needed to do was to make sure that the
> compiler and the linker were being executed with the -g option, but
> I've done that and I still can't get dbx to recognize the dynamic
> libraries as having debugger info.
>
> So if I get the answer, I'll post it here.  In the meantime, if anyone
> else knows the trick, I'd appreciate learning it.

Are the binaries still getting stripped? Run file path/to/libzpool. It
should tell you if it's stripped or not.

/* Shameless self promotion
Our project (OSUNIX) is a work-in-progress, but makes all this a lot
easier.  In theory, to do this for osunix you'd just have to change a
simple build configuration file and pmerge -1 libzpool. You would
probably want to make a snapshot and new boot environment before this,
which in the future should be automatic. You could also test any
patches by changing only one line in the build script (this is
convenient for testing webrev etc.). Handling this cleanly was in my
short lightning talk at CommunityOne.

After our next release I'll post a tutorial on this.
*/

./C

Lori Alt

2009-Jun-11 23:41 UTC

[zfs-code] Building a debug version of libzpool

On 06/11/09 16:28, C. Bergström wrote:
> Lori Alt wrote:
>> I've run into this same issue recently and I'm working with someone
>> in the Sun Studio group to help me figure it out, but so far, no luck.
>> I thought that the only thing I needed to do was to make sure that
>> the compiler and the linker were being executed with the -g option,
>> but I've done that and I still can't get dbx to recognize the dynamic
>> libraries as having debugger info.
>>
>> So if I get the answer, I'll post it here.  In the meantime, if
>> anyone else knows the trick, I'd appreciate learning it.
>
> Are the binaries still getting stripped? Run file path/to/libzpool.
> It should tell you if it's stripped or not.

In my case, the binaries aren't stripped.  I had checked on that.

But I did get an answer from the Sun Studio project: the problem I've
been seeing (which might not be the same problem that Steve reports,
but could be) appears to be a known bug (6823053).  I don't know the
prospects for a fix, but there's a workaround, which is to modify the
compiler flags to change this:

-xdebugformat=stabs

to this:

-xdebugformat=dwarf

Also make sure the compiler is called with -g.

This worked for me.  But I have no idea what other effects this change 
may have, so use at your own risk.

Lori

Steve Gonczi

2009-Jun-12 02:37 UTC

[zfs-code] Building a debug version of libzpool

Thanks for the replies.  I tried switching to dwarf; no success.

I see what is happening. The last step in the build is running ctfconvert
on the various *.o files, predictably converting the debug info to CTF format.

The CTF format, AFAIK, is primarily for mdb.
Mdb is a fine tool for assembler/kernel debugging.
Source debugging is preferable for some work, at least for me.
I do not know if there is an easy fix for this issue; my solution is to hack in an
option, probably an env variable, to disable the ctfconvert pass.

CTF does not appear to be useful for any of the source debuggers (dbx or gdb).
BTW, switching to dwarf reveals what appears to be a build anomaly:
apparently, ctfconvert runs twice on at least one of the .o files.

This seems to cause no problem with the stabs format,
but breaks the build when dwarf is used and it tries to convert
lib/libnsl/amd64/pics/rpc_comdata1.o twice.

"C. Bergström"

2009-Jun-12 04:19 UTC

[zfs-code] Building a debug version of libzpool

Steve Gonczi wrote:
> Thanks for the replies.  I tried switching to dwarf.. no success.
>
> I see what is happening. The last step in the scheme of the build system is
> running ctfconvert on the various *.o files, predictably converting the
> debug info to ctf format.

After you bldenv -d ./opensolaris.sh you can manually change a few
things in the env (not normally recommended):

- export CTFCONVERT=/opt/onbld/bin/i386/ctfconvert
- export CTFMERGE=/opt/onbld/bin/i386/ctfmerge
- export CTFSTABS=/opt/onbld/bin/i386/ctfstabs
+ export CTFCONVERT=/bin/true
+ export CTFMERGE=/bin/true
+ export CTFSTABS=/bin/true

This way the build continues, but ctf* doesn't actually do anything.
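
Written out without the diff markers, the override might look like the sketch below in a Bourne-style shell. It assumes, as the listing above implies, that the makefiles invoke the CTF tools through these three variables; the object file name is a made-up placeholder used only to show that the stand-in accepts arbitrary arguments and still reports success.

```shell
# Point the CTF tool variables at a no-op that always succeeds.
# Assumption: the build invokes the tools through these variables.
CTFCONVERT=/bin/true
CTFMERGE=/bin/true
CTFSTABS=/bin/true
export CTFCONVERT CTFMERGE CTFSTABS

# Sanity check: a pretend "conversion" of a hypothetical object file
# does nothing and exits with status 0, so make keeps going.
"$CTFCONVERT" -i -L VERSION hypothetical.o
ctf_status=$?
echo "ctfconvert stand-in exit status: $ctf_status"
```

The settings only last for the current shell, so they have to be made inside the bldenv session that runs the build.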

./C

Steve Gonczi

2009-Jun-12 04:30 UTC

[zfs-code] Building a debug version of libzpool

Awesome, thanks. 

Much easier than what I was going to do.

I am going to go back to using stabs for this, I think.

Steve Gonczi

2009-Jun-12 15:32 UTC

[zfs-code] Building a debug version of libzpool

This proves to be unexpectedly difficult to solve.

I find it hard to fathom that nobody at SUN runs dbx on the various usermode
components of the OS, specifically the ztest usermode exerciser.

I believe this forum is the right place to ask this question:

Is there a working [easy] way to build dbx-debuggable userland components 
without hacking the build environment?

I was hoping to get a definitive answer from someone who does this on a daily
basis.

Lori Alt

2009-Jun-12 16:28 UTC

[zfs-code] Building a debug version of libzpool

Steve Gonczi wrote:
> This proves to be unexpectedly difficult to solve.
>
> I find it hard to fathom, that nobody at SUN runs dbx on the various
> usermode components of the OS, specifically the ztest usermode exerciser.
>
> I believe this forum proposes to be the right place to ask this question:
>
> Is there a working [easy] way to build dbx-debuggable userland components
> without hacking the build environment?
>
> I was hoping to get a definitive answer from someone who does this on a
> daily basis.

I feel your pain.  I've been fighting this battle for a week.  I do
Solaris debugging on a daily basis, but only recently started working on
the zfs userland components.  I too am surprised that this isn't getting
wider attention.  (I should say that I don't think this problem is
limited to the zfs userland components.  I think it affects all userland
code.)

I hope someone comes up with an answer to your question because all I
can do is offer more details about how I hacked the environment to get
this to work.  But I DID get it to work.  If you're interested, I can
give you all the steps I used (I left out the gory details in my earlier
mail because I hoped that just setting the debug format to "dwarf"
would do the trick).  Let me know and I'll write them down (it's not
THAT bad.  It's a kludge, but it's an easy kludge to try.)

In the meantime, I'm going to do what I can to raise the visibility of
this problem.  The bug that appears to be at the root of at least my dbx
problems is 6823053.  I'm going to raise the priority of it and add some
comments to it.

Lori

Steve Gonczi

2009-Jun-12 17:48 UTC

[zfs-code] Building a debug version of libzpool

By all means, please share your hacks.

I am not sure why switching to dwarf would make any difference, given that stabs
is the native Sun format, dbx can surely handle it.   I could see the point if
you are using a recent gdb maybe.

For me, the ctf* utilities are the likely problem, but seems I can not disable
running these because then some of the dynamically generated header files do not
get built.

Lori Alt

2009-Jun-12 20:11 UTC

[zfs-code] Building a debug version of libzpool

On 06/12/09 11:48, Steve Gonczi wrote:
> By all means, please share your hacks.
>
> I am not sure why switching to dwarf would make any difference, given that
> stabs is the native Sun format, dbx can surely handle it.

From what I've learned, the bug isn't in dbx.  Yes, it can read stabs.
But somehow the compilation process isn't generating them correctly.

> I could see the point if you are using a recent gdb maybe.
>
> For me, the ctf* utilities are the likely problem, but seems I can not
> disable running these because then some of the dynamically generated header
> files do not get built.

First, I did a full build of the entire workspace.

Now cd to the <workspace>/usr/src/lib/libzfs directory (you will need to
go to the libzpool directory instead).

%  make clobber
% make install > doit 2>&1

Now edit the file "doit" to
a) remove all lines beginning with "+"
b) modify the lines that simply name a directory to prefix them with 
"cd".  i.e.,

this line:

/builds/lalt/onnv/usr/src/lib/libzfs/i386

becomes:

cd /builds/lalt/onnv/usr/src/lib/libzfs/i386

c) globally replace the string: 

    -xdebugformat=stabs

 with
   -xdebugformat=dwarf

Now, while your current directory is <workspace>/usr/src/lib/libzfs, execute
"doit" as a script.
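
The hand edits in steps a) to c) could also be done with sed. The following is only a sketch: the function name fix_doit is made up for illustration, and it assumes that "a line that simply names a directory" means a line that starts with "/" and contains no spaces, which is a heuristic and may need adjusting for a particular log.

```shell
# fix_doit: turn captured make output into a runnable script.
# Reads the log on stdin and writes the edited script on stdout.
fix_doit() {
    # a) drop lines beginning with "+"
    # b) prefix bare absolute paths (no spaces, assumed to be directories) with "cd"
    # c) switch the debug format from stabs to dwarf everywhere
    sed -e '/^+/d' \
        -e 's|^/[^ ]*$|cd &|' \
        -e 's/-xdebugformat=stabs/-xdebugformat=dwarf/g'
}

# Example use, with "doit" captured as described above:
#   fix_doit < doit > doit.sh
#   sh doit.sh
```

Writing the result to a new file keeps the captured log intact, so the edit can be repeated if the heuristic in step b) picks up a line it should not have.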

At this point, I can run a program under dbx which accesses
libzfs.so.1 (you might have to set LD_LIBRARY_PATH to make sure you're
using the one you just built, not whatever is in /lib) and debug functions
in the libzfs.so.1 library.
That is, I no longer get messages like:

(dbx) stop in zfs_send
dbx: warning: 'zfs_send' has no debugger info -- will trigger on first instruction

I think you may be right that it's the ctf* functions (which run silently
during post processing) that are run by "make" that end up removing the
debugger info.  Doing the compiles and links explicitly with a script
like this suppresses that step.  Yes, I'm sure that there are cleaner
ways to get around this, but I'm tired of experimenting and this works for me.

Lori

Steve Gonczi

2009-Jun-12 20:16 UTC

[zfs-code] Building a debug version of libzpool

One problem with this build system is that perhaps it is not fully implemented.

Its doc states that a goal was to allow building various sub-components by
switching to the sub-component's predictable directory and building
said sub-component from there by typing make/dmake [all].

This fully works for some components (e.g. tools) and "sort of" works for others.
E.g. you can go to usr/src/lib and type "make" or "make libzfs" and something
will run without complaining.  But dependencies do not get picked up; e.g. if
you wipe out any *.o files in ../usr/src/lib/libzpool/amd64/pics, none of these
will get rebuilt.

Yet if you do an incremental nightly build, they will be.

I am experimenting with the previously mentioned permutations, like forcing
dwarf format and disabling the ctf conversion: first do a full nightly build with
the "standard" build settings, then wipe out the desired .o files and
do an incremental nightly build with the various hacked settings.

Steve Gonczi

2009-Jun-12 21:22 UTC

[zfs-code] Building a debug version of libzpool

Thank you for posting this.

Let me recap your key insights:
1) A successful full nightly build is a prerequisite.
2) You can build a specific library from its lib/libxxx directory via
make clobber; make install.
(I did notice make clean did not work but make clobber did.
I was unaware of the "install" target.)
3) dwarf works better than stabs.

Any reason why redefining the preferred debug format to dwarf in Makefile.master
before step 2 would not work?

Steve Gonczi

2009-Jun-12 22:04 UTC

[zfs-code] Building a debug version of libzpool

I did the full nightly, then edited the Makefile.master debug setting so it
generates dwarf debug info, and killed the CTF* utilities as outlined previously
in this thread.
I recompiled libzpool and libzfs from their usr/src/lib/libz* directories (make
clean|clobber, I usually try both, then make or make install, also trying both).
After the successful compile, set LD_LIBRARY_PATH
to where the freshly baked libzpool lives. Do not forget the semicolon after the
search path; it must be prefixed by \ else the shell strips it.

I did not need to capture and edit the make output.

Things are mostly working now.  It is beyond me why this would not work with
stabs.
The only remaining thing is to turn off optimizing, so I can look at local
variables.

Thanks to all of you who helped out.

Cheers, 

Steve

Jonathan Adams

2009-Jun-15 17:58 UTC

[zfs-code] Building a debug version of libzpool

On Fri, Jun 12, 2009 at 03:04:55PM -0700, Steve Gonczi wrote:
> I did  the full nightly, then edited the Makefile.master debug setting so
> it generates dwarf debug info.   Killed the CTF* utilities as outlined
> previously in this thread.
> Recompiled libzpool and libzfs from their usr/src/lib/libz* directories (
> make clean|clobber, usually try both, then make or make install - also try
> both).
> After the successful compile, set  LD_LIBRARY_PATH
> to where the freshly baked libzpool lives. Do not forget the semicolon
> after the search path, must be prefixed by \ else the shell strips it.
>
> I did not need to capture, and edit the make output.
>
> Things are mostly working now.  It is beyond me why this would not
> work with stabs.  The only remaining thing is to turn off optimizing,
> so I can look at local variables.

Just jumping in here after the fact, but I thought I'd give some
background:

	Most of the engineers working in ON got started with kernel
	programming on Solaris, and so have a large amount of background
	with MDB and assembly-level debugging.  The CTF data provides
	all of the structure printing you might need using MDB, and
	the experience with assembly provides the "map stack trace to C
	program locations", along with figuring out which local variables
	are where, etc.

	Since most kernel engineers aren't familiar with DBX, there's not a
	huge community clamoring for the ability to use it.

	The CTF tools process the debugging information (stabs or dwarf)
	in order to generate the C type information MDB uses.  By default,
	they also strip all other debugging information from the binaries,
	since they bloat the binary sizes and aren't wanted in our shipping
	products.  Unfortunately, there is no easy way to add the '-g' flag
	to disable that stripping, except for:

	CTFCVTFLAGS='-i -L VERSION -g' CTFMGRFLAGS='-g' dmake install

I think that a set of simple environment settings which would enable this
for a build or part thereof would be useful, so you could do something like:

	(after a full nightly build)
	cd usr/src/lib/libzpool
	dmake clobber
	KEEP_DEBUG_INFO=yes DEBUG_TYPE=dwarf dmake install

and have both CTF and dwarf info.  That would be a good RFE.

Cheers,
- jonathan

Steve Gonczi

2009-Jun-15 19:10 UTC

[zfs-code] Building a debug version of libzpool

Hi Jonathan,

Thanks for providing us with a SUN developer's perspective, and for the
new info on config tweaks.

I have gained a tiny bit more insight on this:
The debug (-g) option is already set in all the compiles, given that the
requisite bldenv is set up as debug.
It is currently unclear whether the flag is set in opensolaris.sh or as a
bldenv command line option, but setting both of these certainly works.

Because the make option "-e" is set by bldenv, any macro set in
Makefile.master can be overridden simply by setting it in the current build shell.
This means no editing of Makefile.master is necessary. E.g. in my environment
(tcsh) I simply issue:

setenv DEBUGFORMAT -xdebugformat=dwarf; setenv COPTFLAG; setenv COPTFLAG64;

This effectively shuts off all optimizations, and switches to DWARF output for
any
subsequent debug userland builds. 
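
For readers using a Bourne-style shell (sh, ksh, bash) rather than tcsh, the same settings might be written as in the sketch below. The variable names are the ones given above; only the shell syntax differs, and it relies on the same make -e behaviour described in this post.

```shell
# Bourne-style equivalent of the tcsh setenv line above.
DEBUGFORMAT=-xdebugformat=dwarf   # ask the compiler for DWARF instead of stabs
COPTFLAG=                         # empty value: no optimization flags (32-bit)
COPTFLAG64=                       # empty value: no optimization flags (64-bit)
export DEBUGFORMAT COPTFLAG COPTFLAG64
```

The two optimization variables are exported with empty values rather than unset, matching what "setenv COPTFLAG" with no value does in tcsh.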

In addition, run dbx via ddd --dbx, and we have ourselves a neat GUI source
debug environment.

Gdb does not seem to work with the SUN toolchain.
For gdb aficionados, dbx commands can be munged into a gdb-like command set
via .dbxrc macros.

Absolutely no disrespect to mdb, I love the thing for kernel/assembler
debugging.

However, stepping through source code,  walking the stack frames,  seeing local 
variables effortlessly is still a significant plus.

What I would ultimately like to see is a 2 machine kernel source debug
environment
similar to the Linux kgdb, or the IBM AIX kdbx environment. 
 Who knows, maybe an Ethernet gdb kernel stub could be implemented in mdb..

Jonathan Adams

2009-Jun-15 19:36 UTC

[zfs-code] Building a debug version of libzpool

On Mon, Jun 15, 2009 at 12:10:12PM -0700, Steve Gonczi wrote:
> Absolutely no disrespect to mdb, I love the thing for kernel/assembler
> debugging.
>
> However, stepping through source code, walking the stack frames,
> seeing local variables effortlessly is still a significant plus.
>
> What I would ultimately like to see is a 2 machine kernel source
> debug environment similar to the Linux kgdb, or the IBM AIX kdbx
> environment.  Who knows, maybe an Ethernet gdb kernel stub could be
> implemented in mdb..

My experience with the gdb protocol has been very negative; it's
very poorly designed, and prone to breakage.

While some kind of kmdb(1)-over-ethernet might be very cool, you're
not likely to see a lot of excitement from Solaris engineering for a
source-level kernel debugger; in general, it saves you a very minor
amount of your total debugging time, since it's straightforward (if
slightly time consuming) to backtrack from the assembly.  Figuring out
how the datastructures are corrupted and how they got that way is where
you spend most of your time.

Also, given that we've discovered compiler bugs on innumerable occasions,
there's not a lot of trust that the line-number/local variable information
provided by the compiler is accurate enough, and turning off optimization to
get local variables changes the compiler output significantly compared to what
our customers are actually running, and can mask many subtle bugs.

Source-level debugging also adds a dependency from "binaries running on
system" to "corresponding source on other system" which will be hard to
get right in practice.

So in summary, getting good source-level kernel debugging is viewed as a
large time investment for dubious gain.  We'd rather have smarter mdb dcmds
or (for example) a better surface syntax for MDB than something which won't
actually make the bug analysis we do on a day-to-day basis measurably easier.

That's not to say you'd be wasting your time trying to get something like
this up-and-running, or that it wouldn't be useful to get better DBX support
for userland debugging.  Just understand that (especially for kernel
source-level support), the arguments have already happened several times in
the past, and you're not likely to change the consensus.

(I could see DBX/gdb support for device driver writers to be something that
someone might find useful.  But Solaris engineering probably wouldn't find it
so.)

Cheers,
- jonathan

Steve Gonczi

2009-Jun-15 21:35 UTC

[zfs-code] Building a debug version of libzpool

> My experience with the gdb protocol has been very negative; it's
> very poorly designed, and prone to breakage.

And, with a little luck, nobody from the GNU camp is reading this list  :-)

> While some kind of kmdb(1)-over-ethernet might be very cool...

Yes.. given that trying to get a KMDB session to work through LOM is such
a pain and many new x86 hardware platforms do not have a serial port,
that would be wonderful.

> straightforward (if slightly time consuming) to backtrack from the
> assembly.

I have one significant pain point with mdb, perhaps it is just my ignorance:

I find it difficult to reconstruct stack frames, i.e. local variables, on 64-bit
amd.
A bit off topic, but is there an easy way to obtain at least the rbp values for
the prior stack frames?

E.g. the AIX kdb has an option for their stack backtrace to print register state
for each stack frame on the stack.

To be fair, kudos to the folks at SUN for recognizing the need of field
supportability, and giving us mdb.  Also, dtrace is sheer goodness,
and I do not have _much_ to complain about.

Cheers, 

Steve

Jonathan Adams

2009-Jun-15 21:56 UTC

[zfs-code] Building a debug version of libzpool

On Mon, Jun 15, 2009 at 02:35:10PM -0700, Steve Gonczi wrote:
> > My experience with the gdb protocol has been very negative; it's
> > very poorly designed, and prone to breakage.
>
> And, with a little luck, nobody from the GNU camp is reading this list  :-)

I think my experience is all well-known; the lack of versioning or even
verifying that the architectures both sides are using match, etc.  It's
just the way the protocol grew.

> > While some kind of kmdb(1)-over-ethernet might be very cool...
>
> Yes.. given that trying to get a KMDB session to work through LOM is such
> a pain and many new x86 hardware platforms do not have a serial port,
> that would be wonderful.
>
> > straightforward (if slightly time consuming) to backtrack from the
> > assembly.
>
> I have one significant pain point with mdb, perhaps it is just my
> ignorance:
>
> I find it difficult to reconstruct stack frames, ie. local variables
> on 64 bit amd.  A bit off topic, but is there an easy way to obtain at
> least the rbp values for the prior stack frames?

Yeah, for any given line of ::findstack -v or $C, you have:

> ffffff014bdab900::findstack -v
stack pointer for thread ffffff014bdab900: ffffff00041dbcc0
[ ffffff00041dbcc0 _resume_from_idle+0xf1() ]
  ^^^^^^^^^^^^^^^^ %rsp ^^^ PC
  ffffff00041dbcf0 swtch+0x147()
  ^ rbp            ^ PC			(frame 1)
  ffffff00041dbd50 cv_wait_sig_swap_core+0x170(ffffff014bdabade, ffffff014bdabae0, 0)
  ^ rbp            ^ PC			(frame 2)
  ffffff00041dbd70 cv_wait_sig_swap+0x18(ffffff014bdabade, ffffff014bdabae0)
  ^ rbp            ^ PC			(frame 3)
  ffffff00041dbde0 cv_waituntil_sig+0x13c(ffffff014bdabade, ffffff014bdabae0, 0
  ^ rbp            ^ PC			(frame 4)
...

etc.

The main problem in AMD64 is backtracking the callee-saved variables, but at
least the function arguments are saved for kernel calls.

Cheers,
- jonathan
> E.g. the AIX kdb has an option for their stack backtrace to print
> register state for each stack frame on the stack.
>
> To be fair, kudos to the folks at SUN for recognizing the need of
> field supportability, and giving us mdb.  Also, dtrace is sheer
> goodness, and I do not have _much_ to complain about.
Understood.

Cheers,
- jonathan

Steve Gonczi

2009-Jun-15 23:58 UTC

[zfs-code] Building a debug version of libzpool

I see. 

The first number (which I always assumed was just 
the numeric equivalent of the function+offset shown in the next column) 
is the frame pointer for the associated stack frame.  

Presumably, the first line findstack prints is a typo: it calls the value
"stack pointer"
I think it is really "frame pointer" ie it is rbp, not rsp.

Thank you very much for  the info.

Steve

Jonathan Adams

2009-Jun-16 16:14 UTC

[zfs-code] Building a debug version of libzpool

On Mon, Jun 15, 2009 at 04:58:27PM -0700, Steve Gonczi wrote:
> I see.
>
> The first number (which I always assumed was just
> the numeric equivalent of the function+offset shown in the next column)
> is the frame pointer for the associated stack frame.
>
> Presumably, the first line findstack prints is a typo: it calls the
> value "stack pointer" I think it is really "frame pointer" ie it is
> rbp, not rsp.

It's really a matter of terminology; it's called a "stack pointer" because
it's a value you pass to "$C" or "::stack" to get a stack out.  On SPARC,
the stack pointer is *always* pointing to a stack frame, since that's where
the register window is saved (register windows are *FUN*), and the term has
stuck, even though on x86 "frame pointer" might be a better term.

Cheers,
- jonathan

Steve Gonczi

2009-Jun-18 17:57 UTC

[zfs-code] Building a debug version of libzpool

I managed to get the ztest debug compile working, using the technique written up
in this thread.

One thing is still not working: trying to apply the same methodology to a
userland command (e.g. fashioning a debuggable zpool command) does not work.

No matter what I do, my freshly recompiled zpool command insists on loading its
libraries from the system locations (/lib and /usr/lib) and not my workspace
location (where the debug libraries are).

I have LD_LIBRARY_PATH set to where the debug libs are (actually 2 locations
separated by ":" and terminated with a "\;").  Using the same settings for ztest
works like a charm.

If you have managed to debug any of the usermode commands, please share
your experience.

I will continue looking at this; my current theory is that maybe the library
search location is fixed during the build.

Steve

Jonathan Adams

2009-Jun-18 18:09 UTC

[zfs-code] Building a debug version of libzpool

On Thu, Jun 18, 2009 at 10:57:45AM -0700, Steve Gonczi wrote:
> I managed to get the ztest debug compile, using the technique written up in
> this thread.
>
> One thing is still not working:  Trying to apply the same methodology to an
> userland command (e.g. fashioning a debug-able zpool command) does not work.
>
> No matter what I do, my freshly recompiled zpool command insists on loading
> its libraries from the system locations ( /lib and /usr/lib)  and not my
> workspace location (where the debug libraries are).
>
> I have a LD_LIBRARY_PATH set to where the debug lib's are ( actually 2
> locations separated by ":" and terminated with a "\;".  Using the same
> settings I use for ztest, works like a charm there.

What does:
	ldd /path/to/zpool
output?

You should generally use:

	LD_LIBRARY_PATH_32=/path1:/path2

for 32-bit libraries and binaries, and

	LD_LIBRARY_PATH_64=/path1/64:/path2/64

for 64-bit ones (where 64 could also be "amd64" or "sparcv9", depending upon
your ISA).

There should be no ';'s in LD_LIBRARY_PATH.
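
As a rough sketch of the advice above, both variables could be set together so that 32-bit and 64-bit programs each find the freshly built libraries. The WS_LIB variable and its default path are hypothetical placeholders, not a real workspace layout, and "amd64" is just one of the 64-bit directory names mentioned above.

```shell
# Hypothetical location of the freshly built 32-bit libraries;
# substitute the directory your own workspace actually uses.
WS_LIB=${WS_LIB:-/path/to/workspace/lib}

LD_LIBRARY_PATH_32=$WS_LIB           # searched by 32-bit programs
LD_LIBRARY_PATH_64=$WS_LIB/amd64     # searched by 64-bit programs ("64" or "sparcv9" elsewhere)
export LD_LIBRARY_PATH_32 LD_LIBRARY_PATH_64

# Afterwards, check which copies actually get picked up, e.g.:
#   ldd /path/to/zpool
```

Neither value contains a semicolon; several directories would be joined with ":" only.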

> If you have managed to debug any of the usermode commands, please share
> your experience.
>
> I continue looking at this, my current theory is maybe the library
> search location is made fixed during the build.

Unless:

	elfdump -d /path/to/zpool

contains a "RUNPATH" or "RPATH" line, the search location is not fixed.  The
default build environment does not set a runpath.

Is your zpool binary setuid?  That will turn off LD_LIBRARY_PATH* searching.
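
The setuid question can be answered from the shell with the standard -u file test. This is only an illustrative sketch; the helper name is_setuid is made up here.

```shell
# is_setuid FILE: succeed if FILE exists and has its set-user-ID bit set.
is_setuid() {
    [ -u "$1" ]
}

# Example use:
#   if is_setuid /path/to/zpool; then
#       echo "setuid binary: LD_LIBRARY_PATH* settings will not be honoured"
#   fi
```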

Cheers,
- jonathan

Steve Gonczi

2009-Jun-18 20:11 UTC

[zfs-code] Building a debug version of libzpool

The problem turns out to be 32/64 bit. The zpool command uses 32 bit libraries,
and I
was only setting the 64 bit locations.  The minute I did a "file" on
the libraries zpool was
loading, this became obvious.  

Thanks for pointing out the 32/64 LD_LIBRARY_PATH variants, and the elfdump
trick to
see if  a search location was set.

I got the semicolon idea from the ld man pages. (Perhaps I was wrongly assuming
that it
also had some effect on the run-time library path search order)

Paraphrasing the man page:

For ld processing, including a semicolon in LD_LIBRARY_PATH causes the
path(s) preceding the semicolon to be searched first (before the -L paths).
Without the semicolon, the entire LD_LIBRARY_PATH is searched _after_
processing the -L options.
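
The ordering paraphrased above can be modeled with a few lines of shell. This is only a sketch of the link-editor (ld) rule as described here, not of the run-time linker; it does not invoke ld, and the function name ld_search_order is hypothetical.

```shell
# ld_search_order LDPATH LPATHS: print the order in which the link-editor
# would search, one directory list per line. LDPATH is the value of
# LD_LIBRARY_PATH, LPATHS is the colon-joined list of -L directories.
# The linker's default library path, searched last, is not modeled.
ld_search_order() {
    ldpath=$1
    lpaths=$2
    case "$ldpath" in
        *\;*)
            # Part before the first ';' is searched ahead of the -L paths.
            before=${ldpath%%";"*}
            after=${ldpath#*";"}
            ;;
        *)
            # No ';': the whole list is searched after the -L paths.
            before=
            after=$ldpath
            ;;
    esac
    for part in "$before" "$lpaths" "$after"; do
        if [ -n "$part" ]; then
            printf '%s\n' "$part"
        fi
    done
}

# Usage (paths are placeholders); note the quoting that keeps the shell
# from treating ';' as a command separator:
# ld_search_order '/ws/lib;/opt/lib' '/L1:/L2'
```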

Cheers, 
Steve

"C. Bergström"

2009-Jun-21 09:34 UTC

[zfs-code] Building a debug version of libzpool

Jonathan Adams wrote:
> On Fri, Jun 12, 2009 at 03:04:55PM -0700, Steve Gonczi wrote:
>
>> I did the full nightly, then edited the Makefile.master debug setting so
>> it generates dwarf debug info.  Killed the CTF* utilities as outlined
>> previously in this thread.
>> Recompiled libzpool and libzfs from their usr/src/lib/libz* directories
>> (make clean|clobber, usually try both, then make or make install - also
>> try both).
>> After the successful compile, set LD_LIBRARY_PATH
>> to where the freshly baked libzpool lives. Do not forget the semicolon
>> after the search path, must be prefixed by \ else the shell strips it.
>>
>> I did not need to capture, and edit the make output.
>>
>> Things are mostly working now.  It is beyond me why this would not
>> work with stabs.  The only remaining thing is to turn off optimizing,
>> so I can look at local variables.
>
> Just jumping in here after the fact, but I thought I'd give some background:
>
> 	Most of the engineers working in ON got started with kernel
> 	programming on Solaris, and so have a large amount of background
> 	with MDB and assembly-level debugging.  The CTF data provides
> 	all of the structure printing you might need using MDB, and
> 	the experience with assembly provides the "map stack trace to C
> 	program locations", along with figuring out which local variables
> 	are where, etc.
>
> 	Since most kernel engineers aren't familiar with DBX, there's not a
> 	huge community clamoring for the ability to use it.
>
> 	The CTF tools process the debugging information (stabs or dwarf)
> 	in order to generate the C type information MDB uses.

A small nit on this thread and also a wild thought..

1) It's stab (not stabs - I've been corrected on this before). And there
are several Sun doc pages that need updating as well.
2) I'm unaware of an open source compiler and debugger with dwarf3
support, but it's supposed to handle compression well. [1] How this
compares to CTF would have to be tested.


[1] http://reality.sgiweb.org/davea/dwarf3features.pdf

./C


---
OSUNIX - Built from the best of OpenSolaris Technology
http://www.osunix.org

Steve Gonczi

2009-Jul-01 20:23 UTC

[zfs-code] Building a debug version of libzpool

Hello,

1)  Ztest  expects to run from /usr/bin.  It has a hard-coded assumption as to
its location.

2)  To debug it with some success, it should be linked with the -mt option,
because it uses multiple lwp-s.  Currently, it is not being built with that.

Cheers

Steve

Steve Gonczi

2009-Jul-02 15:11 UTC

[zfs-code] Building a debug version of libzpool

Here is another, interesting wrinkle:

Looking at /usr/bin and /usr/sbin, I notice that a whole bunch of seemingly
unrelated utilities appear to be just hard links to a shared file.  (ls -il
reveals a shared inode number, same size, and the same link count to groups
of them).

I am guessing that some groups of these utilities go through common front end
code, that then dispatches to the correct bits based on argv0.  Could someone
confirm this?

This would explain why recompiling ztest and zdb and copying the new bits into
/usr/bin and /usr/sbin respectively resulted in a whole bunch of my utilities
"becoming zdb".

Victor Latushkin

2009-Jul-02 15:25 UTC

[zfs-code] Building a debug version of libzpool

On 02.07.09 19:11, Steve Gonczi wrote:
> Here is another, interesting wrinkle:
>
> Looking at /usr/bin and /usr/sbin, I notice that a whole bunch of seemingly
> unrelated utilities appear to be just hard links to a shared file.  (ls -il
> reveals a shared inode number, same size, and the same link count to groups
> of them).
>
> I am guessing that some groups of these utilities go through common front end
> code, that then dispatches to the correct bits based on argv0.  Could someone
> confirm this?
>
> This would explain why recompiling ztest and zdb and copying the new bits
> into /usr/bin and /usr/sbin respectively resulted in a whole bunch of my
> utilities "becoming zdb".

this is isaexec

you should put your recompiled zdb/ztest into the appropriate directory, like
/usr/bin/i86
/usr/bin/amd64
/usr/bin/sparcv9
/usr/sbin/i86
/usr/sbin/amd64
/usr/sbin/sparcv9

etc

Victor

Darren J Moffat

2009-Jul-02 15:27 UTC

[zfs-code] Building a debug version of libzpool

Steve Gonczi wrote:
> Here is another, interesting wrinkle:
>
> Looking at /usr/bin and /usr/sbin, I notice that a whole bunch of seemingly
> unrelated utilities appear to be just hard links to a shared file.  (ls -il
> reveals a shared inode number, same size, and the same link count to groups
> of them).

Lots are hardlinks to /usr/lib/isaexec

For why see the isaexec man page.

> I am guessing that some groups of these utilities go through common front end
> code, that then dispatches to the correct bits based on argv0.  Could someone
> confirm this?

Some do that but most of them will be the correct "bitness" versions,
i.e. 32 vs 64 bit.

> This would explain why recompiling ztest and zdb and copying the new bits
> into /usr/bin and /usr/sbin respectively resulted in a whole bunch of my
> utilities "becoming zdb".

zdb is in fact an isaexec link.

-- 
Darren J Moffat

Tim Haley

2009-Jul-02 15:30 UTC

[zfs-code] Building a debug version of libzpool

Steve Gonczi wrote:
> Here is another, interesting wrinkle:
>
> Looking at /usr/bin and /usr/sbin, I notice that a whole bunch of seemingly
> unrelated utilities appear to be just hard links to a shared file.  (ls -il
> reveals a shared inode number, same size, and the same link count to groups
> of them).
>
> I am guessing that some groups of these utilities go through common front end
> code, that then dispatches to the correct bits based on argv0.  Could someone
> confirm this?
>
> This would explain why recompiling ztest and zdb and copying the new bits
> into /usr/bin and /usr/sbin respectively resulted in a whole bunch of my
> utilities "becoming zdb".

They are links to isaexec, which is a wrapper executable that runs the most
appropriate version for the current machine.

So zdb points to isaexec which then runs, for example, i86/zdb or amd64/zdb
depending on the machine's architecture.

-tim
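
A rough sketch of the lookup described in these replies, assuming isaexec tries each ISA name in order of preference and runs the first matching executable it finds in a same-named subdirectory next to the link. The helper isa_pick is hypothetical and is not part of Solaris; on a real system the ISA names would come from the isalist command.

```shell
# isa_pick DIR NAME ISA...: print the path of the first executable
# DIR/ISA/NAME, trying the ISA names in the order given. Fails if none
# of the ISA subdirectories contains an executable NAME.
isa_pick() {
    dir=$1
    name=$2
    shift 2
    for isa in "$@"; do
        if [ -x "$dir/$isa/$name" ]; then
            printf '%s\n' "$dir/$isa/$name"
            return 0
        fi
    done
    return 1
}

# Usage on a Solaris-like system (isalist prints the preference order):
# isa_pick /usr/sbin zdb $(isalist)
```

This also shows why copying a rebuilt zdb over /usr/sbin/zdb is harmful: that path is the shared wrapper, so every other hard link to it changes as well, while the real binary lives one directory down.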

Steve Gonczi

2009-Jul-08 17:33 UTC

[zfs-code] Building a debug version of libzpool

OK, thanks to everyone who commented on the isaexec topic.

I am running ztest under dbx control now ( added the -mt option as well).

After running for a while, it dies with the message: 
Running: ztest 
(process id 103450)
child died with signal 5.

Bringing up the core:

Corefile specified executable: "/usr/bin/amd64/ztest"
core file header read successfully
t@530 (l@530) terminated by signal TRAP (breakpoint trap)
0xfffffd7fff3c8e29: rtld_db_dlactivity+0x0001:  movq     %rsp,%rbp
Current function is gzip_compress
   46           if (z_compress_level(d_start, &dstlen, s_start, s_len, n) != Z_OK) {
(dbx 11) where
current thread: t@530
  [1] rtld_db_dlactivity(0xfffffd7fff3fb1e0, 0x3, 0x1, 0xfffffd7fff3feac8, 0xfffffd7fff3c8e28, 0xfffffd7fff360800), at 0xfffffd7fff3c8e29
  [2] 0x40(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x40
  [3] lm_move(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3ca5f5
  [4] relocate_lmc(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3bc0e4
  [5] elf_lazy_load(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3c08de
  [6] _lookup_sym(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3bf96c
  [7] lookup_sym(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3bfdf3
  [8] elf_bndr(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3d70b1
  [9] elf_rtbndr(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3bb4d4
  [10] 0xfffffd7fff350030(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff350030
  [11] 0xfffffd7fff350030(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff350030
=>[12] gzip_compress(s_start = 0x40ef800, d_start = 0x4a23400, s_len = 1024U, d_len = 512U, n = 3), line 46 in "gzip.c"
  [13] zio_compress_data(cpfunc = 7, src = 0x40ef800, srcsize = 1024U, destp = 0xfffffd7fec082ee8, destsizep = 0xfffffd7fec082ed8, destbufsizep = 0xfffffd7fec082ed0), line 115 in "zio_compress.c"
  [14] zio_write_bp_init(zio = 0x6305640), line 907 in "zio.c"
  [15] zio_execute(zio = 0x6305640), line 1051 in "zio.c"
  [16] taskq_thread(arg = 0x4aad00), line 157 in "taskq.c"
  [17] _thrp_setup(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffab4eb85 
  [18] _lwp_start(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7ffab4ee40 

I have no breakpoints set in the code. Ztest runs to completion when not
under dbx control, but when run from the debugger it always terminates with
the above stack. I tried to catch or ignore the signal, or all signals, from
dbx; it makes no difference.
Any suggestions as to what is happening here?

I am considering 2 possibilities:

1) dbx is reacting to ztest threads getting killed?
2) This is a bona fide crash, but dbx is interpreting it incorrectly.

TIA for any insights.

Steve

Steve Gonczi

2009-Jul-09 15:01 UTC

[zfs-code] Building a debug version of libzpool

This issue is beginning to look like a consequence of ztest killing some of
its threads.
I think signal 5 is really CLD_STOPPED (same value as SIGTRAP).

Does anyone have a suggestion for how to get around this in dbx?  What I would
like is for dbx to not terminate when a thread gets killed.

Tried "ignore 5" (no effect).
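
A quick way to check which signal a number refers to, independent of dbx, is the shell's own kill -l. The sketch below wraps it in a hypothetical helper named signame. Note that CLD_STOPPED is an si_code value reported with SIGCHLD and not a signal number, so a "signal 5" report refers to SIGTRAP, which matches the "signal TRAP (breakpoint trap)" line in the dbx output earlier in the thread.

```shell
# signame N: print the symbolic name (without the SIG prefix) that this
# shell associates with signal number N.
signame() {
    kill -l "$1"
}

# On Solaris and Linux, signal number 5 is SIGTRAP, so this prints TRAP.
signame 5
```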
