flight 13461 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/13461/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-pv           18 leak-check/check          fail REGR. vs. 13459
 test-amd64-amd64-pv          18 leak-check/check          fail REGR. vs. 13459
 test-amd64-i386-xl           18 leak-check/check          fail REGR. vs. 13459
 test-amd64-i386-xl-multivcpu 18 leak-check/check          fail REGR. vs. 13459
 test-amd64-i386-xl-credit2   18 leak-check/check          fail REGR. vs. 13459
 test-amd64-amd64-pair        21 leak-check/check/dst_host fail REGR. vs. 13459
 test-amd64-amd64-pair        20 leak-check/check/src_host fail REGR. vs. 13459
 test-i386-i386-pv            18 leak-check/check          fail REGR. vs. 13459
 test-i386-i386-xl            18 leak-check/check          fail REGR. vs. 13459
 test-amd64-amd64-xl          18 leak-check/check          fail REGR. vs. 13459
 test-i386-i386-pair          20 leak-check/check/src_host fail REGR. vs. 13459
 test-i386-i386-pair          21 leak-check/check/dst_host fail REGR. vs. 13459
 test-amd64-i386-pair         20 leak-check/check/src_host fail REGR. vs. 13459
 test-amd64-i386-pair         21 leak-check/check/dst_host fail REGR. vs. 13459

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf     18 leak-check/check          fail REGR. vs. 13458
 test-amd64-amd64-xl-sedf-pin 18 leak-check/check          fail REGR. vs. 13459
 test-i386-i386-xl-qemuu-winxpsp3      9 guest-localmigrate fail like 13459
 test-amd64-amd64-xl-qemuu-winxpsp3    9 guest-localmigrate fail like 13459
 test-amd64-amd64-xl-qemuu-win7-amd64  9 guest-localmigrate fail like 13459

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pcipt-intel       9 guest-start       fail never pass
 test-amd64-i386-rhel6hvm-amd         11 leak-check/check  fail never pass
 test-amd64-i386-qemuu-rhel6hvm-amd   11 leak-check/check  fail never pass
 test-amd64-i386-rhel6hvm-intel       11 leak-check/check  fail never pass
 test-amd64-i386-xend-winxpsp3        16 leak-check/check  fail never pass
 test-amd64-amd64-win                 16 leak-check/check  fail never pass
 test-amd64-i386-win-vcpus1           16 leak-check/check  fail never pass
 test-amd64-i386-qemuu-rhel6hvm-intel 11 leak-check/check  fail never pass
 test-amd64-i386-win                  16 leak-check/check  fail never pass
 test-i386-i386-win                   16 leak-check/check  fail never pass
 test-i386-i386-xl-win                13 guest-stop        fail never pass
 test-amd64-amd64-xl-win              13 guest-stop        fail never pass
 test-amd64-i386-xl-win7-amd64        13 guest-stop        fail never pass
 test-amd64-amd64-xl-winxpsp3         13 guest-stop        fail never pass
 test-i386-i386-xl-winxpsp3           13 guest-stop        fail never pass
 test-amd64-i386-xl-win-vcpus1        13 guest-stop        fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1   13 guest-stop        fail never pass
 test-amd64-amd64-xl-win7-amd64       13 guest-stop        fail never pass

version targeted for testing:
 xen                  7c0351dc41a5
baseline version:
 xen                  42f76d536b11

------------------------------------------------------------
People who touched revisions under test:
  Daniel P. Berrange <berrange@redhat.com>
  Deep Debroy <ddebroy@gmail.com>
  Ian Campbell <ian.campbell@citrix.com>
  Ian Jackson <ian.jackson@eu.citrix.com>
  Jan Beulich <jbeulich@suse.com>
  Keir Fraser <keir@xen.org>
  Matt Wilson <msw@amazon.com>
  Michael Young <m.a.young@durham.ac.uk>
  Olaf Hering <olaf@aepfle.de>
  Roger Pau Monné <roger.pau@citrix.com>
  Stefano Stabellini <stefano.stabellini@eu.citrix.com>
  W. Michael Petullo <mike@flyn.org>
  Zhigang Wang <zhigang.x.wang@oracle.com>
------------------------------------------------------------

jobs:
 build-amd64                                                  pass
 build-i386                                                   pass
 build-amd64-oldkern                                          pass
 build-i386-oldkern                                           pass
 build-amd64-pvops                                            pass
 build-i386-pvops                                             pass
 test-amd64-amd64-xl                                          fail
 test-amd64-i386-xl                                           fail
 test-i386-i386-xl                                            fail
 test-amd64-i386-rhel6hvm-amd                                 fail
 test-amd64-i386-qemuu-rhel6hvm-amd                           fail
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail
 test-amd64-amd64-xl-win7-amd64                               fail
 test-amd64-i386-xl-win7-amd64                                fail
 test-amd64-i386-xl-credit2                                   fail
 test-amd64-amd64-xl-pcipt-intel                              fail
 test-amd64-i386-rhel6hvm-intel                               fail
 test-amd64-i386-qemuu-rhel6hvm-intel                         fail
 test-amd64-i386-xl-multivcpu                                 fail
 test-amd64-amd64-pair                                        fail
 test-amd64-i386-pair                                         fail
 test-i386-i386-pair                                          fail
 test-amd64-amd64-xl-sedf-pin                                 fail
 test-amd64-amd64-pv                                          fail
 test-amd64-i386-pv                                           fail
 test-i386-i386-pv                                            fail
 test-amd64-amd64-xl-sedf                                     fail
 test-amd64-i386-win-vcpus1                                   fail
 test-amd64-i386-xl-win-vcpus1                                fail
 test-amd64-i386-xl-winxpsp3-vcpus1                           fail
 test-amd64-amd64-win                                         fail
 test-amd64-i386-win                                          fail
 test-i386-i386-win                                           fail
 test-amd64-amd64-xl-win                                      fail
 test-i386-i386-xl-win                                        fail
 test-amd64-amd64-xl-qemuu-winxpsp3                           fail
 test-i386-i386-xl-qemuu-winxpsp3                             fail
 test-amd64-i386-xend-winxpsp3                                fail
 test-amd64-amd64-xl-winxpsp3                                 fail
 test-i386-i386-xl-winxpsp3                                   fail
------------------------------------------------------------

sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary

Not pushing.

------------------------------------------------------------
changeset:   25593:7c0351dc41a5
tag:         tip
user:        Olaf Hering <olaf@aepfle.de>
date:        Wed Jul 04 15:46:17 2012 +0100

    tools/m4: add AC_LANG_SOURCE to fix autoconf warnings

    I see these warnings with autoconf 2.68, add AC_LANG_SOURCE as
    suggested by upstream documentation.

    ...
    # bash autogen.sh
    configure.ac:141: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
    ../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
    ../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
    ../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
    ../../lib/m4sugar/m4sh.m4:606: AS_IF is expanded from...
    ../../lib/autoconf/general.m4:2032: AC_CACHE_VAL is expanded from...
    ../../lib/autoconf/general.m4:2053: AC_CACHE_CHECK is expanded from...
    m4/pthread.m4:21: AX_CHECK_PTHREAD is expanded from...
    configure.ac:141: the top level
    configure.ac:142: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
    ../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
    ../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
    ../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
    ../../lib/m4sugar/m4sh.m4:606: AS_IF is expanded from...
    ../../lib/autoconf/general.m4:2032: AC_CACHE_VAL is expanded from...
    ../../lib/autoconf/general.m4:2053: AC_CACHE_CHECK is expanded from...
    m4/ptyfuncs.m4:1: AX_CHECK_PTYFUNCS is expanded from...
    configure.ac:142: the top level
    configure.ac:141: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
    ../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
    ../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
    ../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
    ../../lib/m4sugar/m4sh.m4:606: AS_IF is expanded from...
    ../../lib/autoconf/general.m4:2032: AC_CACHE_VAL is expanded from...
    ../../lib/autoconf/general.m4:2053: AC_CACHE_CHECK is expanded from...
    m4/pthread.m4:21: AX_CHECK_PTHREAD is expanded from...
    configure.ac:141: the top level
    configure.ac:142: warning: AC_LANG_CONFTEST: no AC_LANG_SOURCE call detected in body
    ../../lib/autoconf/lang.m4:194: AC_LANG_CONFTEST is expanded from...
    ../../lib/autoconf/general.m4:2662: _AC_LINK_IFELSE is expanded from...
    ../../lib/autoconf/general.m4:2679: AC_LINK_IFELSE is expanded from...
    ../../lib/m4sugar/m4sh.m4:606: AS_IF is expanded from...
    ../../lib/autoconf/general.m4:2032: AC_CACHE_VAL is expanded from...
    ../../lib/autoconf/general.m4:2053: AC_CACHE_CHECK is expanded from...
    m4/ptyfuncs.m4:1: AX_CHECK_PTYFUNCS is expanded from...
    configure.ac:142: the top level

    Please rerun autoconf after applying this.

    Signed-off-by: Olaf Hering <olaf@aepfle.de>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Committed-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   25592:197138cc9436
user:        Olaf Hering <olaf@aepfle.de>
date:        Wed Jul 04 15:46:16 2012 +0100

    tools/configure.ac: add version check for glib2

    xen-unstable has failed to build in a SLES10SP4 environment for a
    long time because the included version of glib is slightly older
    than the required glib version. According to the glib docs, version
    2.12 includes base64 support, but SLES10 ships with glib 2.8.6:

    qemu-timer-common.o: In function `init_get_clock':
    /usr/src/packages/BUILD/xen-4.2.25432/non-dbg/tools/qemu-xen-dir/qemu-timer-common.c:57: undefined reference to `clock_gettime'
    qga/guest-agent-commands.o: In function `qmp_guest_file_write':
    qga/guest-agent-commands.c:249: undefined reference to `g_base64_decode'
    qga/guest-agent-commands.o: In function `qmp_guest_file_read':
    qga/guest-agent-commands.c:224: undefined reference to `g_base64_encode'
    collect2: ld returned 1 exit status
    make[3]: *** [qemu-ga] Error 1

    Add a version check to the toplevel configure to require at least
    glib 2.12. This makes sure configure can detect the condition early,
    instead of failing later in the middle of the tools build when
    qemu-upstream errors out.

    Please rerun autoconf after applying this.

    Signed-off-by: Olaf Hering <olaf@aepfle.de>
    Acked-by: Roger Pau Monné <roger.pau@citrix.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Committed-by: Ian Campbell <ian.campbell@citrix.com>
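The configure.ac hunk itself is not quoted in this log. As a rough
illustration of the condition it enforces, a shell-level equivalent
using pkg-config directly might look like the following; the error
message is invented, and the real change is an autoconf test rather
than this script:

    # hypothetical shell equivalent of the new configure-time check
    if ! pkg-config --atleast-version=2.12 glib-2.0; then
        echo "ERROR: glib-2.0 >= 2.12 required (g_base64_* appeared in 2.12)" >&2
        exit 1
    fi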
changeset:   25591:328f466f9ee6
user:        W. Michael Petullo <mike@flyn.org>
date:        Wed Jul 04 15:46:15 2012 +0100

    xl: Allow use of /dev/null with xl create to enable command-line definition

    xm allows specifying /dev/null as the domain configuration argument
    to its create option; add the same functionality to xl. xl treats
    the configuration argument /dev/null as a special case. This allows
    specifying an entire domain configuration on the command line.

    Signed-off-by: W. Michael Petullo <mike@flyn.org>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    [ ijc -- ported to xen-unstable from 4.1 ]
    Committed-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   25590:bb250383a4f5
user:        Zhigang Wang <zhigang.x.wang@oracle.com>
date:        Wed Jul 04 15:46:14 2012 +0100

    hotplug/Linux: use flock based locking

    In the normal case of a single domain booting with one disk, the
    disk hotplug script will fire once. In this case taking out the lock
    will never cause a sleep, because there is no other concurrent
    invocation.

    If a domain has 4 disks configured, then the hotplug script will
    fire 4 times, all 4 invocations at pretty much the same time. If
    there is even a little load on the system, the locking function in
    the shell script will sleep for a few seconds - as many as 5
    seconds - or potentially even time out and fail completely. If 4 or
    more domains, each with 4 disks, start up at once, that is 16
    invocations of the hotplug script running at once. There will be a
    lot of sleeps, and because of the very coarse 1 second granularity
    the delay can add up significantly.

    The change to using flock() removes the arbitrary 1 second sleep:
    the very instant one hotplug script finishes, another can start, and
    there are no repeated attempts and failures to lock which would add
    more delay. In addition, the current locking implementation would
    allow two processes to get the lock simultaneously if one decided
    the other had timed out.

    Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
    Signed-off-by: Zhigang Wang <zhigang.x.wang@oracle.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Committed-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   25589:60f09d1ab1fe
user:        M A Young <m.a.young@durham.ac.uk>
date:        Wed Jul 04 15:46:14 2012 +0100

    pygrub: cope better with big files in the guest.

    Only read the first megabyte of a configuration file (grub etc.) and
    read the kernel and ramdisk files from the guest in one megabyte
    pieces, so pygrub doesn't use a lot of memory if the files are
    large. With the --not-really option, check that the chosen kernel
    and ramdisk files exist. If there are problems writing the copy of
    the kernel or ramdisk, delete the copied files and exit in case they
    have filled the filesystem.

    Signed-off-by: Michael Young <m.a.young@durham.ac.uk>
    Acked-by: Matt Wilson <msw@amazon.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
    Committed-by: Ian Campbell <ian.campbell@citrix.com>

changeset:   25588:42f76d536b11
user:        Stefano Stabellini <stefano.stabellini@eu.citrix.com>
date:        Tue Jul 03 13:39:01 2012 +0100

    xen: event channel remapping for emulated MSIs

    Linux PV-on-HVM guests remap all MSIs onto event channels, including
    MSIs corresponding to QEMU's emulated devices. This patch makes sure
    that we correctly handle the case of an emulated MSI that has been
    remapped, sending a pirq to the guest instead.

    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Tested-by: Deep Debroy <ddebroy@gmail.com>
    Committed-by: Keir Fraser <keir@xen.org>

=======================================

commit fa2e8e3bc869c04d4a4d9b8f70c1cab7e53778d6
Author: Jan Beulich <jbeulich@suse.com>
Date:   Fri Jun 29 17:06:25 2012 +0100

    passthrough: fix off-by-one in PCI config space register index check

    Register 255 (0xff) is still valid to be accessed.

    Reported-by: Rolu <rolu@roce.org>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
On Thu, 2012-07-05 at 05:23 +0100, xen.org wrote:
> flight 13461 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/13461/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-i386-pv  18 leak-check/check   fail REGR. vs. 13459
>  test-amd64-amd64-pv 18 leak-check/check   fail REGR. vs. 13459

http://www.chiark.greenend.org.uk/~xensrcts/logs/13461/test-amd64-amd64-xl/18.ts-leak-check.log:

2012-07-05 01:45:13 Z LEAKED [file /var/run/xen-hotplug/iptables] file: 465271 0 -rw-r--r-- 1 root root 0 Jul  5 02:37 /var/run/xen-hotplug/iptables
2012-07-05 01:45:13 Z LEAKED [file /var/run/xen-hotplug/block] file: 465252 0 -rw-r--r-- 1 root root 0 Jul  5 02:38 /var/run/xen-hotplug/block

This is obviously as a result of the "hotplug/Linux: use flock based
locking" changes.

However it isn't clear to me that this isn't a feature of the way flock
based locking works, since it locks on (I guess) the underlying inode,
not the path, such that cleaning up these files in release_lock would be
incorrect, because two invocations would end up effectively taking
different locks despite using the same path:

    Thread A        Thread B        Thread C
    claim /foo
                    claim /foo
                    blocks
    release /foo
    rm /foo
                    unblock
                                    claim /foo
                                    new /foo -- doesn't block!

Both B and C have "the" lock.

Is that correct?

Ian.
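The failure mode Ian describes can be reproduced with flock(1)
directly. The following stand-alone demonstration is not from the
thread; the path, fd numbers, and sleeps are arbitrary. It plays the
roles of threads B and C above: B takes the lock and removes the path,
and C then locks a freshly created file at the same path without
blocking.

    #!/bin/bash
    lock=/tmp/foo.lock
    (   # "Thread B": lock the file, then unlink the path while holding it
        exec 9>>"$lock"
        flock -x 9
        rm "$lock"
        sleep 3                 # B still holds the flock on the old inode
    ) &
    sleep 1                     # let B acquire the lock and unlink the path
    exec 8>>"$lock"             # "Thread C": O_CREAT makes a brand-new inode
    flock -xn 8 && echo "C got the lock while B still holds one"
    wait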
Daniel P. Berrange
2012-Jul-05 09:28 UTC
Re: [xen-unstable test] 13461: regressions - FAIL
On Thu, Jul 05, 2012 at 06:49:33AM +0100, Ian Campbell wrote:
> On Thu, 2012-07-05 at 05:23 +0100, xen.org wrote:
> > flight 13461 xen-unstable real [real]
> > http://www.chiark.greenend.org.uk/~xensrcts/logs/13461/
> >
> > Regressions :-(
> >
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-i386-pv  18 leak-check/check   fail REGR. vs. 13459
> >  test-amd64-amd64-pv 18 leak-check/check   fail REGR. vs. 13459
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/13461/test-amd64-amd64-xl/18.ts-leak-check.log:
> 2012-07-05 01:45:13 Z LEAKED [file /var/run/xen-hotplug/iptables] file: 465271 0 -rw-r--r-- 1 root root 0 Jul  5 02:37 /var/run/xen-hotplug/iptables
> 2012-07-05 01:45:13 Z LEAKED [file /var/run/xen-hotplug/block] file: 465252 0 -rw-r--r-- 1 root root 0 Jul  5 02:38 /var/run/xen-hotplug/block
>
> This is obviously as a result of the "hotplug/Linux: use flock based
> locking" changes.
>
> However it isn't clear to me that this isn't a feature of the way flock
> based locking works, since it locks on (I guess) the underlying inode,
> not the path, such that cleaning up these files in release_lock would be
> incorrect, because two invocations would end up effectively taking
> different locks despite using the same path:
>
>     Thread A        Thread B        Thread C
>     claim /foo
>                     claim /foo
>                     blocks
>     release /foo
>     rm /foo
>                     unblock
>                                     claim /foo
>                                     new /foo -- doesn't block!
>
> Both B and C have "the" lock.
>
> Is that correct?

Yes, as you say flock() operates on the inode, so if something deletes
and recreates the file, future flocks will operate differently. Ideally
you should just never rm the files at all.

If you need to 'rm' them, then to avoid this, you must do two things:

 - Only 'rm /foo' while holding the lock on /foo
 - Record the inode before acquiring the lock. After acquiring the
   lock, check whether the inode on disk is the same. If not,
   release the lock & repeat.

Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
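Daniel's second rule is the recheck-after-acquisition loop that the
eventual fix implements. A minimal bash sketch of the loop follows. It
is not the committed code, and it compares inodes via /proc/self/fd,
an approach Ian Jackson's patch later avoids on race grounds in favour
of Perl's fstat, so treat it purely as an illustration of the control
flow; the lock path here is made up.

    lock=/var/run/xen-hotplug/example       # hypothetical lock path
    while :; do
        exec 9>>"$lock"                     # open with O_CREAT
        flock -x 9                          # lock whichever inode we opened
        fd_ino=$(stat -Lc %i /proc/self/fd/9)       # inode we hold locked
        path_ino=$(stat -c %i "$lock" 2>/dev/null)  # inode now at the path
        [ "$fd_ino" = "$path_ino" ] && break        # same file: we won
        exec 9>&-                           # lost a race with rm; retry
    done
    # ... critical section ...
    rm "$lock"                              # only a lockholder may unlink
    exec 9>&-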
On Thu, 2012-07-05 at 10:28 +0100, Daniel P. Berrange wrote:
> On Thu, Jul 05, 2012 at 06:49:33AM +0100, Ian Campbell wrote:
> > [...]
> > However it isn't clear to me that this isn't a feature of the way
> > flock based locking works, since it locks on (I guess) the
> > underlying inode, not the path [...]
> >
> > Both B and C have "the" lock.
> >
> > Is that correct?
>
> Yes, as you say flock() operates on the inode, so if something deletes
> and recreates the file, future flocks will operate differently. Ideally
> you should just never rm the files at all.
>
> If you need to 'rm' them, then to avoid this, you must do two things:
>
>  - Only 'rm /foo' while holding the lock on /foo
>  - Record the inode before acquiring the lock. After acquiring the
>    lock, check whether the inode on disk is the same. If not,
>    release the lock & repeat.

Thanks. I'm inclined towards not deleting the lock file.

Ian.
Daniel P. Berrange writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> Yes, as you say flock() operates on the inode, so if something deletes
> and recreates the file, future flocks will operate differently. Ideally
> you should just never rm the files at all.
>
> If you need to 'rm' them, then to avoid this, you must do two things:
>
>  - Only 'rm /foo' while holding the lock on /foo
>  - Record the inode before acquiring the lock. After acquiring the
>    lock, check whether the inode on disk is the same. If not,
>    release the lock & repeat.

It seems more logical to me to check the inum of the open fd against
the file. Something like this perhaps (untested):

diff -r ad08cd8e7097 tools/hotplug/Linux/locking.sh
--- a/tools/hotplug/Linux/locking.sh	Thu Jul 05 11:00:28 2012 +0100
+++ b/tools/hotplug/Linux/locking.sh	Thu Jul 05 11:39:59 2012 +0100
@@ -30,6 +30,7 @@ _setlockfd()
     done
     _lockdict[$i]="$1"
     let _lockfd=200+i
+    let _lockfile="$LOCK_BASEDIR/$1"
 }
 
 
@@ -37,13 +38,32 @@ claim_lock()
 {
     mkdir -p "$LOCK_BASEDIR"
     _setlockfd $1
-    eval "exec $_lockfd>>$LOCK_BASEDIR/$1"
-    flock -x $_lockfd
+    # The locking strategy is identical to that from with-lock-ex(1)
+    # from chiark-utils, except using flock.  It has the benefit of
+    # it being possible to safely remove the lockfile when done.
+    local rightfile
+    while true; do
+        eval "exec $_lockfd>>$lockfile"
+        flock -x $_lockfd
+        # We can't just stat /dev/stdin or /proc/self/fd/$_lockfd or
+        # use bash's test -ef because those all go through what is
+        # actually a synthetic symlink in /proc and we aren't
+        # guaranteed that our stat(2) won't lose the race with an
+        # rm(1) between reading the synthetic link and traversing the
+        # file system to find the inum.  Perl is very fast so use that.
+        rightfile=$( perl -e '
+            open STDIN, "<&'$_lockfd'" or die $!;
+            my $fd_inum = (stat STDIN)[1]; die $! unless defined $fd_inum;
+            my $file_inum = (stat $ARGV[0])[1];
+            print "y\n" if $fd_inum eq $file_inum;
+        ' "$_lockfile" )
+        if [ x$rightfile = xy ]; then break; fi
+    done
 }
 
 
 release_lock()
 {
     _setlockfd $1
-    flock -u $_lockfd
+    rm "$_lockfile"
 }

Ian.
Daniel P. Berrange
2012-Jul-05 10:46 UTC
Re: [xen-unstable test] 13461: regressions - FAIL
On Thu, Jul 05, 2012 at 11:40:04AM +0100, Ian Jackson wrote:
> Daniel P. Berrange writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> > Yes, as you say flock() operates on the inode, so if something deletes
> > and recreates the file, future flocks will operate differently. Ideally
> > you should just never rm the files at all.
> >
> > If you need to 'rm' them, then to avoid this, you must do two things:
> >
> >  - Only 'rm /foo' while holding the lock on /foo
> >  - Record the inode before acquiring the lock. After acquiring the
> >    lock, check whether the inode on disk is the same. If not,
> >    release the lock & repeat.
>
> It seems more logical to me to check the inum of the open fd against
> the file. Something like this perhaps (untested):

Yes, using the open fd is actually what I should have said.

> diff -r ad08cd8e7097 tools/hotplug/Linux/locking.sh
> --- a/tools/hotplug/Linux/locking.sh	Thu Jul 05 11:00:28 2012 +0100
> +++ b/tools/hotplug/Linux/locking.sh	Thu Jul 05 11:39:59 2012 +0100
> [...]
>
>  release_lock()
>  {
>      _setlockfd $1
> -    flock -u $_lockfd
> +    rm "$_lockfile"

I think you still want the 'flock' line here, but have it after the
'rm' line. Otherwise you leave the $_lockfd filehandle open. Yes,
I know the calling script will probably just exit, but it doesn't
hurt to be careful here.

Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
Daniel P. Berrange writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> On Thu, Jul 05, 2012 at 11:40:04AM +0100, Ian Jackson wrote:
> >  release_lock()
> >  {
> >      _setlockfd $1
> > -    flock -u $_lockfd
> > +    rm "$_lockfile"
>
> I think you still want the 'flock' line here, but have it after the
> 'rm' line. Otherwise you leave the $_lockfd filehandle open. Yes,
> I know the calling script will probably just exit, but it doesn't
> hurt to be careful here.

flock -u won't close the fd. If that's what you want then we would
have to do so explicitly.

But, in the meantime, the old lockfile exists and is pointlessly
flocked but is unlinked, which is harmless apart from being a slight
resource leak.

Note that anyone who inherits the fd from the hotplug script will
end up keeping the file in existence, so closing it here (with or
without unlocking it) may not do any good. And if there are no such
children with a copy of the fd, then when the hotplug script dies or
exits, the stale lockfile will be reaped by the kernel.

So I think it's fine as it is.

Ian.
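For completeness, bash can close the descriptor explicitly if that
were ever wanted. An illustrative fragment, using locking.sh's fd
numbering:

    flock -u 200    # releases the lock; fd 200 itself stays open
    exec 200>&-     # closes fd 200; the kernel discards the flock state
                    # once the last fd onto the open-file is closed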
Daniel P. Berrange
2012-Jul-05 10:57 UTC
Re: [xen-unstable test] 13461: regressions - FAIL
On Thu, Jul 05, 2012 at 11:54:14AM +0100, Ian Jackson wrote:
> Daniel P. Berrange writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> > I think you still want the 'flock' line here, but have it after the
> > 'rm' line. Otherwise you leave the $_lockfd filehandle open. Yes,
> > I know the calling script will probably just exit, but it doesn't
> > hurt to be careful here.
>
> flock -u won't close the fd. If that's what you want then we would
> have to do so explicitly.
>
> But, in the meantime, the old lockfile exists and is pointlessly
> flocked but is unlinked, which is harmless apart from being a slight
> resource leak.
>
> Note that anyone who inherits the fd from the hotplug script will
> end up keeping the file in existence, so closing it here (with or
> without unlocking it) may not do any good. And if there are no such
> children with a copy of the fd, then when the hotplug script dies or
> exits, the stale lockfile will be reaped by the kernel.
>
> So I think it's fine as it is.

Ok, agreed.

Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
On Thu, 2012-07-05 at 11:40 +0100, Ian Jackson wrote:
> Daniel P. Berrange writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> > Yes, as you say flock() operates on the inode, so if something deletes
> > and recreates the file, future flocks will operate differently. Ideally
> > you should just never rm the files at all.
> >
> > If you need to 'rm' them, then to avoid this, you must do two things:
> >
> >  - Only 'rm /foo' while holding the lock on /foo
> >  - Record the inode before acquiring the lock. After acquiring the
> >    lock, check whether the inode on disk is the same. If not,
> >    release the lock & repeat.
>
> It seems more logical to me to check the inum of the open fd against
> the file. Something like this perhaps (untested):
>
> diff -r ad08cd8e7097 tools/hotplug/Linux/locking.sh
> --- a/tools/hotplug/Linux/locking.sh	Thu Jul 05 11:00:28 2012 +0100
> +++ b/tools/hotplug/Linux/locking.sh	Thu Jul 05 11:39:59 2012 +0100
> @@ -30,6 +30,7 @@ _setlockfd()
>      done
>      _lockdict[$i]="$1"
>      let _lockfd=200+i
> +    let _lockfile="$LOCK_BASEDIR/$1"
>  }
>
>
> @@ -37,13 +38,32 @@ claim_lock()
>  {
>      mkdir -p "$LOCK_BASEDIR"
>      _setlockfd $1
> -    eval "exec $_lockfd>>$LOCK_BASEDIR/$1"
> -    flock -x $_lockfd
> +    # The locking strategy is identical to that from with-lock-ex(1)
> +    # from chiark-utils, except using flock.  It has the benefit of
> +    # it being possible to safely remove the lockfile when done.
> +    local rightfile
> +    while true; do
> +        eval "exec $_lockfd>>$lockfile"

you mean $_lockfile here I think.

> +        flock -x $_lockfd
> +        # We can't just stat /dev/stdin or /proc/self/fd/$_lockfd or
> +        # use bash's test -ef because those all go through what is
> +        # actually a synthetic symlink in /proc and we aren't
> +        # guaranteed that our stat(2) won't lose the race with an
> +        # rm(1) between reading the synthetic link and traversing the
> +        # file system to find the inum.  Perl is very fast so use that.
> +        rightfile=$( perl -e '

Won't this need to become $(PERL) (or @PERL@ and some seddery at install
time) for the benefit of BSD?

Does stating $_lockfile before and after the flock, in addition to
comparing both with $_lockfd, close the race you are worried about?

> +            open STDIN, "<&'$_lockfd'" or die $!;
> +            my $fd_inum = (stat STDIN)[1]; die $! unless defined $fd_inum;
> +            my $file_inum = (stat $ARGV[0])[1];
> +            print "y\n" if $fd_inum eq $file_inum;
> +        ' "$_lockfile" )
> +        if [ x$rightfile = xy ]; then break; fi
> +    done
>  }
>
>
>  release_lock()
>  {
>      _setlockfd $1
> -    flock -u $_lockfd
> +    rm "$_lockfile"

I suppose there is no point in unlocking too?

>  }
>
> Ian.
Ian Campbell wrote:
> On Thu, 2012-07-05 at 11:40 +0100, Ian Jackson wrote:
>> Daniel P. Berrange writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
>>> Yes, as you say flock() operates on the inode, so if something deletes
>>> and recreates the file, future flocks will operate differently. Ideally
>>> you should just never rm the files at all.
>>>
>>> If you need to 'rm' them, then to avoid this, you must do two things:
>>>
>>>  - Only 'rm /foo' while holding the lock on /foo
>>>  - Record the inode before acquiring the lock. After acquiring the
>>>    lock, check whether the inode on disk is the same. If not,
>>>    release the lock & repeat.
>> It seems more logical to me to check the inum of the open fd against
>> the file. Something like this perhaps (untested):
>>
>> [...]
>> +        # file system to find the inum.  Perl is very fast so use that.
>> +        rightfile=$( perl -e '
>
> Won't this need to become $(PERL) (or @PERL@ and some seddery at install
> time) for the benefit of BSD?

The BSDs don't use these scripts, so you don't have to worry about
this here.
On Thu, 2012-07-05 at 12:00 +0100, Roger Pau Monne wrote:
> Ian Campbell wrote:
> > Won't this need to become $(PERL) (or @PERL@ and some seddery at install
> > time) for the benefit of BSD?
>
> The BSDs don't use these scripts, so you don't have to worry about
> this here.

Right, obviously!

Ian.
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> > +    while true; do
> > +        eval "exec $_lockfd>>$lockfile"
>
> you mean $_lockfile here I think.

Heh. Yes.

> > +        rightfile=$( perl -e '
>
> Won't this need to become $(PERL) (or @PERL@ and some seddery at install
> time) for the benefit of BSD?

This is in tools/hotplug/Linux. AFAICT at least some of the BSDs have
an fstat(1).

> Does stating $_lockfile before and after the flock, in addition to
> comparing both with $_lockfd, close the race you are worried about?

The correct answer to this question is to present a proof that what
I'm doing is correct. So:

 * The lock is owned not by a process but by an open-file (informally
   an fd). Any process with an fd onto this open-file is a lockholder
   and may perform the various operations; such a process should only
   do so when its co-lockholder processes expect. Ie, we will treat
   all processes holding fds onto the open-file as acting in concert
   and not distinguish between them.

 * You are a lockholder if
    - You have an fd onto an open-file which
      currently holds an exclusive flock lock on its inum
    - and that inum is currently linked at the lockfile path

 * The rules are:
    - No-one but a lockholder may unlink the lockfile path (or
      otherwise cause it to stop referring to a file it refers to).
    - Anyone may open the lockfile with O_CREAT

 * The protocol for locking is:
    - Open the file (O_CREAT)
    - flock it
    - fstat the fd you got
    - stat the lockfile
    - if both are equal you have the lock, otherwise try again.

 * Informal proof of exclusivity:
    - No two open-files can hold an fcntl lock onto the same file
      at the same time
    - No two files can have the same name at the same time

 * Informal proof of correctness of locking protocol:
    - After you call flock successfully no-one other than you
      (someone with the same open-file) can stop you having that
      flock lock.
    - Obviously the inum you get from the fstat is fixed
    - At the point where you call stat there are two possibilities:
       (i) the lockfile path referred to some other inum, in which
           case you have failed
       (ii) the lockfile path referred to the same file, in which
            case you had (by definition) at that point the file
            locked.

 * Informal proof that no-one else can steal the lock:
    - After you call flock successfully no-one other than you can
      stop you having that flock lock
    - No-one other than the lockholder is permitted to stop the path
      referring to a particular inum. So if you hold the lock then
      only you are allowed to stop the path referring to the file
      whose flock you hold

Thus once you hold the lock at any instant, you will continue to do
so until you voluntarily stop doing so (eg by unlinking the lockfile
or closing the fd).

This demonstrates, I think, that the design is correct. I will do
some testing on the implementation to try to check that it implements
the design.

> I suppose there is no point in unlocking too?

See my reply to Daniel.

Ian.
On Thu, 2012-07-05 at 12:21 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> > Does stating $_lockfile before and after the flock, in addition to
> > comparing both with $_lockfd, close the race you are worried about?
>
> The correct answer to this question is to present a proof that what
> I'm doing is correct.

I was actually just speculating on ways to avoid $(PERL) rather than
questioning the algorithm, but as Roger and you have pointed out there
is no such concern in this particular file.

The analysis is still useful though.

[...]
> This demonstrates, I think, that the design is correct. I will do
> some testing on the implementation to try to check that it implements
> the design.

I think so too.

Ian.
On Thu, 2012-07-05 at 12:21 +0100, Ian Jackson wrote:
> This demonstrates, I think, that the design is correct. I will do
> some testing on the implementation to try to check that it implements
> the design.

Can we get this checked in today (ideally early this afternoon) do you
think? Otherwise I think we should revert the original locking change,
to unblock testing, and re-do it along with this fix when it is ready.

Ian.
Ian Jackson
2012-Jul-05 11:43 UTC
Re: [xen-unstable test] 13461: regressions - FAIL [and 1 more messages]
Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> On Thu, 2012-07-05 at 12:21 +0100, Ian Jackson wrote:
> > Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> > > Does stating $_lockfile before and after the flock, in addition to
> > > comparing both with $_lockfd, close the race you are worried about?
> >
> > The correct answer to this question is to present a proof that what
> > I'm doing is correct.
>
> I was actually just speculating on ways to avoid $(PERL) rather than
> questioning the algorithm, but as Roger and you have pointed out there
> is no such concern in this particular file.

Ah, yes. However, if you look at my proof you'll see that it's really
best to stat the fd. It might be possible to construct a more
complicated proof that statting the file at some points would be good
enough, but I don't care to do so :-).

> The analysis is still useful though.

I will include it in the file.

> > This demonstrates, I think, that the design is correct. I will do
> > some testing on the implementation to try to check that it implements
> > the design.
>
> I think so too.

It works in my tests; I also tested the "lost race" condition and it
did the right thing.

Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
> Can we get this checked in today (ideally early this afternoon) do you
> think? Otherwise I think we should revert the original locking change,
> to unblock testing, and re-do it along with this fix when it is ready.

I have pushed the fix. The code is nearly identical. I fixed the new
"let" which ought to have been a plain assignment.

I also attach for your interest the test script I used. For the "lost
race" test I also added a "read" to locking.sh before "flock".

Ian.

# HG changeset patch
# User Ian Jackson <Ian.Jackson@eu.citrix.com>
# Date 1341488425 -3600
# Node ID 497e2fe4945594cd2846f0302bef5e5dafa1c2c6
# Parent  ad08cd8e7097ec6da6526cf8ac26f0fa72cd1e01
hotplug/Linux: do not leak lockfiles

25590:bb250383a4f5 introduced a new locking scheme. Unfortunately it
leaves behind files in /var/run/xen-hotplug. These are spotted as
regressions by the autotester. Fix this.

This involves changing the locking protocol to allow lockfiles to be
deleted (as removing lockfiles is unsafe with a naive flock-based
algorithm).

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>

diff -r ad08cd8e7097 -r 497e2fe49455 tools/hotplug/Linux/locking.sh
--- a/tools/hotplug/Linux/locking.sh	Thu Jul 05 11:00:28 2012 +0100
+++ b/tools/hotplug/Linux/locking.sh	Thu Jul 05 12:40:25 2012 +0100
@@ -30,6 +30,7 @@ _setlockfd()
     done
     _lockdict[$i]="$1"
     let _lockfd=200+i
+    _lockfile="$LOCK_BASEDIR/$1"
 }
 
 
@@ -37,13 +38,92 @@ claim_lock()
 {
     mkdir -p "$LOCK_BASEDIR"
     _setlockfd $1
-    eval "exec $_lockfd>>$LOCK_BASEDIR/$1"
-    flock -x $_lockfd
+    # The locking strategy is identical to that from with-lock-ex(1)
+    # from chiark-utils, except using flock.  It has the benefit of
+    # it being possible to safely remove the lockfile when done.
+    # See below for a correctness proof.
+    local rightfile
+    while true; do
+        eval "exec $_lockfd>>$_lockfile"
+        flock -x $_lockfd || return $?
+        # We can't just stat /dev/stdin or /proc/self/fd/$_lockfd or
+        # use bash's test -ef because those all go through what is
+        # actually a synthetic symlink in /proc and we aren't
+        # guaranteed that our stat(2) won't lose the race with an
+        # rm(1) between reading the synthetic link and traversing the
+        # file system to find the inum.  Perl is very fast so use that.
+        rightfile=$( perl -e '
+            open STDIN, "<&'$_lockfd'" or die $!;
+            my $fd_inum = (stat STDIN)[1]; die $! unless defined $fd_inum;
+            my $file_inum = (stat $ARGV[0])[1];
+            print "y\n" if $fd_inum eq $file_inum;
+        ' "$_lockfile" )
+        if [ x$rightfile = xy ]; then break; fi
+    done
 }
 
 
 release_lock()
 {
     _setlockfd $1
-    flock -u $_lockfd
+    rm "$_lockfile"
 }
+
+# Protocol and correctness proof:
+#
+# * The lock is owned not by a process but by an open-file (informally
+#   an fd).  Any process with an fd onto this open-file is a
+#   lockholder and may perform the various operations; such a process
+#   should only do so when its co-lockholder processes expect.  Ie, we
+#   will treat all processes holding fds onto the open-file as acting
+#   in concert and not distinguish between them.
+#
+# * You are a lockholder if
+#    - You have an fd onto an open-file which
+#      currently holds an exclusive flock lock on its inum
+#    - and that inum is currently linked at the lockfile path
+#
+# * The rules are:
+#    - No-one but a lockholder may unlink the lockfile path
+#      (or otherwise cause it to stop referring to a file it
+#      refers to).
+#    - Anyone may open the lockfile with O_CREAT
+#
+# * The protocol for locking is:
+#    - Open the file (O_CREAT)
+#    - flock it
+#    - fstat the fd you have open
+#    - stat the lockfile path
+#    - if both are equal you have the lock, otherwise try again.
+#
+# * Informal proof of exclusivity:
+#    - No two open-files can hold an fcntl lock onto the same file
+#      at the same time
+#    - No two files can have the same name at the same time
+#
+# * Informal proof of correctness of locking protocol:
+#    - After you call flock successfully no-one other than you
+#      (someone with the same open-file) can stop you having
+#      that flock lock.
+#    - Obviously the inum you get from the fstat is fixed
+#    - At the point where you call stat there are two
+#      possibilities:
+#       (i) the lockfile path referred to some other inum
+#           in which case you have failed
+#       (ii) the lockfile path referred to the same file
+#            in which case at that point you were the
+#            lockholder (by definition).
+#
+# * Informal proof that no-one else can steal the lock:
+#    - After you call flock successfully no-one other than you
+#      can stop you having that flock lock
+#    - No-one other than the lockholder is permitted to stop
+#      the path referring to a particular inum.  So if you
+#      hold the lock then only you are allowed to stop the
+#      path referring to the file whose flock you hold; so
+#      it will continue to refer to that file.
+#      That's both the conditions for being the lockholder.
+#
+# Thus once you hold the lock at any instant, you will
+# continue to do so until you voluntarily stop doing so
+# (eg by unlinking the lockfile or closing the fd).
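The test script Ian mentions was an attachment and is not preserved
here. A hypothetical harness in the same spirit, which sources the
patched locking.sh and uses a deliberately non-atomic counter to
detect any failure of mutual exclusion, might look like the sketch
below; the paths, the iteration counts, and the assumption that
LOCK_BASEDIR can be overridden after sourcing are all invented for
illustration.

    #!/bin/bash
    set -e
    . ./locking.sh                  # assumes the patched locking.sh is in cwd
    LOCK_BASEDIR=/tmp/locktest      # assumes the variable may be overridden
    ctr=/tmp/locktest.ctr
    mkdir -p "$LOCK_BASEDIR"; echo 0 > "$ctr"

    worker() {
        local i n
        for i in $(seq 1 100); do
            claim_lock demo
            n=$(cat "$ctr")             # non-atomic read-modify-write:
            echo $((n + 1)) > "$ctr"    # only safe under a working lock
            release_lock demo           # rm's the lockfile each time round
        done
    }

    worker & worker & worker & worker &
    wait
    # 4 workers x 100 increments each
    [ "$(cat "$ctr")" = 400 ] && echo PASS || echo "FAIL: $(cat "$ctr")"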
Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [xen-unstable test] 13461: regressions - FAIL"):
>>> +        flock -x $_lockfd
>>> +        # We can't just stat /dev/stdin or /proc/self/fd/$_lockfd or
>>> +        # use bash's test -ef because those all go through what is
>>> +        # actually a synthetic symlink in /proc and we aren't
>>> +        # guaranteed that our stat(2) won't lose the race with an
>>> +        # rm(1) between reading the synthetic link and traversing the
>>> +        # file system to find the inum.  Perl is very fast so use that.
>>> +        rightfile=$( perl -e '
>> Won't this need to become $(PERL) (or @PERL@ and some seddery at install
>> time) for the benefit of BSD?
>
> This is in tools/hotplug/Linux. AFAICT at least some of the BSDs have
> an fstat(1).

The BSDs don't currently have any of this locking mechanism, which
makes them vulnerable to races when hotplug scripts are launched. I
think libxl would be a better place to perform all this kind of
locking, instead of the hotplug scripts, which are already polluted
enough, but as far as I know we don't have any kind of inter-process
locking mechanism there right now?
On Thu, 2012-07-05 at 12:53 +0100, Roger Pau Monne wrote:
> The BSDs don't currently have any of this locking mechanism, which
> makes them vulnerable to races when hotplug scripts are launched. I
> think libxl would be a better place to perform all this kind of
> locking, instead of the hotplug scripts, which are already polluted
> enough, but as far as I know we don't have any kind of inter-process
> locking mechanism there right now?

The advantage of doing it this way (in the scripts) is that it is much
more fine grained, e.g. the script holds the iptables lock only for as
long as it is manipulating the firewall rules, and so on. If libxl
were doing this it would have to lock for the entire duration of the
script, which may end up needlessly serialising things.

Ian.
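As a concrete sketch of that granularity, a hotplug script using this
interface brackets only the shared-state update; the iptables rule and
the $vif variable here are invented for illustration.

    # inside a hotplug script which has sourced locking.sh
    claim_lock iptables
    iptables -I FORWARD -m physdev --physdev-in "$vif" -j ACCEPT
    release_lock iptables
    # the rest of the script runs without holding the global lock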