Martin Cerveny
2009-Oct-10 11:43 UTC
[dtrace-discuss] Dtracing influence functionality (of PCFS)
Hello. The PCFS filesystem does not create files/dirs like ".NET" :-) Ok, I tried to dtrace PCFS flow in kernel, BUT if I run dtrace oneliner, the PCFS works. Can someone explain me why dtrace (non destructive) influences functionality of kernel ?? M.C> (attached output on my snv_123) -- This message posted from opensolaris.org -------------- next part -------------- root at martin-nb:/# mkfile 10m /pcfs_dev root at martin-nb:/# LOFIDEV=`lofiadm -a /pcfs_dev` root at martin-nb:/# RLOFIDEV="`echo $LOFIDEV | sed ''s/lofi/rlofi/''`" root at martin-nb:/# # following two lines is my workaround about bug in libfdisk root at martin-nb:/# mkdir -p /dev/rdsk//dev/rlofi/ root at martin-nb:/# ln -s /dev/rlofi/1 /dev/rdsk/dev/rlofi/1p0 root at martin-nb:/# fdisk -A 4:128:9:36:0:69:56:1:602:19866 $RLOFIDEV root at martin-nb:/# echo y | mkfs -F pcfs $RLOFIDEV:c root at martin-nb:/# mount -F pcfs $LOFIDEV:c /mnt root at martin-nb:/# root at martin-nb:/# cd /mnt root at martin-nb:/mnt# root at martin-nb:/mnt# let i=0; while [[ $i -le 10 ]]; do mkdir .NET; ls -lad .NET; rmdir .NET; let i=i+1; done .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist root at martin-nb:/mnt# root at martin-nb:/mnt# dtrace -Fn ''syscall::mkdir*:entry { self->trc=1 } syscall::mkdir*:return { self->trc=0 } fbt::: /self->trc/ {} '' > /dev/null & [1] 2652 root at martin-nb:/mnt# dtrace: description ''syscall::mkdir*:entry '' matched 88031 probes root at martin-nb:/mnt# root at martin-nb:/mnt# let i=0; while [[ $i -le 10 ]]; do mkdir .NET; ls -lad .NET; rmdir .NET; let i=i+1; done drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET drwxrwxrwx 1 root root 512 Oct 10 13:24 .NET root at martin-nb:/mnt# root at martin-nb:/mnt# kill %1 root at martin-nb:/mnt# root at martin-nb:/mnt# [1]+ Done dtrace -Fn ''syscall::mkdir*:entry { self->trc=1 } syscall::mkdir*:return { self->trc=0 } fbt::: /self->trc/ {} '' > /dev/null root at martin-nb:/mnt# root at martin-nb:/mnt# let i=0; while [[ $i -le 10 ]]; do mkdir .NET; ls -lad .NET; rmdir .NET; let i=i+1; done .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist root at martin-nb:/mnt# root at martin-nb:/mnt# root at martin-nb:/mnt# dtrace -Fn ''syscall::mkdir*:entry { self->trc=1 } syscall::mkdir*:return { self->trc=0 } fbt::: /self->trc/ {} '' > /dev/null & [1] 2719 root at martin-nb:/mnt# root at martin-nb:/mnt# dtrace: description ''syscall::mkdir*:entry '' matched 88031 probes root at martin-nb:/mnt# root at martin-nb:/mnt# let i=0; while [[ $i -le 10 ]]; do mkdir .NET; ls -lad .NET; rmdir .NET; let i=i+1; done drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET drwxrwxrwx 1 root root 512 Oct 10 13:25 .NET root at martin-nb:/mnt# root at martin-nb:/mnt# uname -a SunOS martin-nb 5.11 snv_123 i86pc i386 i86pc
Phil Harman
2009-Oct-10 12:56 UTC
[dtrace-discuss] Dtracing influence functionality (of PCFS)
I can reproduce the bug on snv_82, but not the "fix". However, enabling fbt::: (even with predicate) on my system slows down my mkdir and ls commands by a factor of 10x or more. Interestingly, if I use the "foldcase" mount option and then use "mkdir .net" but leave .NET in the ls and rmdir commands, it works. However, it is still broken for "mkdir .NET". It would seem that there bug is is the PCFS implementation in any case (e.g. why does mkdir return with out an error, and yet the directory has not been created). Your test seems to indicate there is also a race condition, which hides the bug, and which your fbt::: probe effect opens the window for. Quite bizzarre! What happens if you insert a "sleep 1" in the loop between the mkdir and ls? How on earth did you stumble on this? Phil Martin Cerveny wrote:> Hello. > > The PCFS filesystem does not create files/dirs like ".NET" :-) > Ok, I tried to dtrace PCFS flow in kernel, BUT if I run dtrace oneliner, the PCFS works. > > Can someone explain me why dtrace (non destructive) influences functionality of kernel ?? > > M.C> > > (attached output on my snv_123) > > ------------------------------------------------------------------------ > > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
Martin Cerveny
2009-Oct-10 14:56 UTC
[dtrace-discuss] Dtracing influence functionality (of PCFS)
Hello,> What happens if you insert a "sleep 1" in the loop > between the mkdir and ls?Error still exist with sleep or with mount/umount (with stable "dtracing side effect"; not shown here): # let i=0; while [[ $i -le 10 ]]; do mkdir .NET; sleep 1; ls -lad .NET; sleep 1; rmdir .NET; let i=i+1; done .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory rmdir: directory ".NET": Directory does not exist .NET: No such file or directory .... # cd .. # let i=0; while [[ $i -le 10 ]]; do mkdir /mnt/.NET; umount /mnt; mount -F pcfs $LOFIDEV:c /mnt; ls -lad /mnt/.NET; sleep 1; rmdir /mnt/.NET; let i=i+1; done /mnt/.NET: No such file or directory rmdir: directory "/mnt/.NET": Directory does not exist /mnt/.NET: No such file or directory rmdir: directory "/mnt/.NET": Directory does not exist ... M.C> PS: I understand that there is the error in PCFS (I will probably find and correct the problem), but I am very disappointed by "dtracing side effect" !!!! This breaks non intrusiveness of dtrace. -- This message posted from opensolaris.org
Phil Harman
2009-Oct-10 15:38 UTC
[dtrace-discuss] Dtracing influence functionality (of PCFS)
DTrace does allow for some very unobtrusive observation. Enabling every fbt::: probe is always going to have a measurable effect on performance. From what we know of the implementation of DTrace, it would seem very unlikely that DTrace per se has changed the way the already broken PCFS code behaves. Rather, I suspect a latent race condition, the window for which is opened once the code has been slowed down by the sledge hammer of enabling every fbt probe. Your experience doesn''t undermine confidence in DTrace, it simply underlines how complex systems are, and that we need to take things like probe effect into account, no matter how small it may be. Phil Martin Cerveny wrote:> Hello, > >> What happens if you insert a "sleep 1" in the loop >> between the mkdir and ls? >> > > Error still exist with sleep or with mount/umount (with stable "dtracing side effect"; not shown here): > > # let i=0; while [[ $i -le 10 ]]; do mkdir .NET; sleep 1; ls -lad .NET; sleep 1; rmdir .NET; let i=i+1; done > .NET: No such file or directory > rmdir: directory ".NET": Directory does not exist > .NET: No such file or directory > rmdir: directory ".NET": Directory does not exist > .NET: No such file or directory > .... > > # cd .. > # let i=0; while [[ $i -le 10 ]]; do mkdir /mnt/.NET; umount /mnt; mount -F pcfs $LOFIDEV:c /mnt; ls -lad /mnt/.NET; sleep 1; rmdir /mnt/.NET; let i=i+1; done > /mnt/.NET: No such file or directory > rmdir: directory "/mnt/.NET": Directory does not exist > /mnt/.NET: No such file or directory > rmdir: directory "/mnt/.NET": Directory does not exist > ... > > M.C> > > PS: I understand that there is the error in PCFS (I will probably find and correct the problem), but I am very disappointed by "dtracing side effect" !!!! This breaks non intrusiveness of dtrace. >
Martin Cerveny
2009-Oct-10 19:24 UTC
[dtrace-discuss] Dtracing influence functionality (of PCFS)
> DTrace does allow for some very unobtrusive > observation. Enabling every > fbt::: probe is always going to have a measurable > effect on performance. > > Rather, I suspect a latent race condition, the window > for which is > opened once the code has been slowed down by the > sledge hammer of > enabling every fbt probe.Ok, this should be some race condition. I get the "error" even with 61 probes enabled only in PCFS (fired 33146 per 10 mkdir iterations in test line). dtrace -Fn ''syscall::mkdir*:entry { self->trc=1 } syscall::mkdir*:return { self->trc=0 } fbt:pcfs:pc_*:return /self->trc/ { } '' > /dev/null I noticed very funny fact if I changedfbt:pcfs:pc_*:return -> changedfbt:pcfs:pc_*:entry removes the "error". I am very unhappy :-( M.C> -- This message posted from opensolaris.org
Ryan Johnson
2009-Oct-15 07:45 UTC
[dtrace-discuss] Dtracing influence functionality (of PCFS)
Martin Cerveny wrote:>> DTrace does allow for some very unobtrusive >> observation. Enabling every >> fbt::: probe is always going to have a measurable >> effect on performance. >> >> Rather, I suspect a latent race condition, the window >> for which is >> opened once the code has been slowed down by the >> sledge hammer of >> enabling every fbt probe. >> > > Ok, this should be some race condition. > I get the "error" even with 61 probes enabled only in PCFS (fired 33146 per 10 mkdir iterations in test line). > > dtrace -Fn ''syscall::mkdir*:entry { self->trc=1 } syscall::mkdir*:return { self->trc=0 } fbt:pcfs:pc_*:return /self->trc/ { } '' > /dev/null > > I noticed very funny fact if I changedfbt:pcfs:pc_*:return -> changedfbt:pcfs:pc_*:entry removes the "error". > I am very unhappy :-( >Welcome to the joys of parallel programming! Sounds like you''ve got a class "A" Heisenbug on your hands (see, e.g. http://catb.org/jargon/html/H/heisenbug.html). Seriously, I wouldn''t blame this on DTrace. I''ve had Heisenbugs exactly like you describe, and they''re a royal pain precisely because the slightest change can open or close timing windows. The worst was the time I declared a local variable before a problematic section of code, assigned hopefully-helpful constants at various points, then sat back and waited for the seg fault... which never came. Adding even 2-3 extra instructions to any one path through the code of interest was too much. This would be like blaming a debug malloc implementation for failing to catch (or worsening the effects of) a bad pointer access which clobbers malloc''s internal state. All bets are off at that point. If anything, be happy that there''s a relatively painless way to narrow down the bug''s hiding place (by varying which probes are enabled and seeing what happens). The chances of a timely fix are exponentially proportional to the ease of reproducing the problem... Regards, Ryan
Nicolas Williams
2009-Oct-15 15:18 UTC
[dtrace-discuss] Dtracing influence functionality (of PCFS)
On Sat, Oct 10, 2009 at 07:56:14AM -0700, Martin Cerveny wrote:> PS: I understand that there is the error in PCFS (I will probably find > and correct the problem), but I am very disappointed by "dtracing side > effect" !!!! This breaks non intrusiveness of dtrace.All DTrace did here was have some enabled-probe performance impact, which is what it normally does and unavoidable. The fact that there is a race condition which can be triggered by that performance impact is a bug, not in DTrace, but in the affected sub-system (pcfs). The fact that DTrace can have such impact is sometimes a very useful side-effect of DTrace, and it''s why there''s a chill() [destructive] function that you can use in probe actions: race conditions can be very difficult to trigger, making the ability to play with timing via DTrace very useful. Also, as others have pointed out, you enabled all FBT probes; when you do that the per-enabled-probe performance impact of DTrace adds up quickly. You might argue that enabling too many probes is itself as destructive as chill(), but unlike chill(), DTrace cannot judge the overall performance impact of your D scripts; failing to distinguish between destructive and non-destructive actions on the argument that all performance impact of enabled probes, no matter how tiny, is destructive would make it harder to use DTrace safely, not easier. Nico --