Tim Woodall
2025-May-24 16:33 UTC
[Pkg-xen-devel] Bug#1106420: xen-utils-common: block-iscsi script doesn't work when iqn is a prefix of an existing iqn
Package: xen-utils-common
Version: 4.17.5+23-ga4e5191dc0-1+deb12u1
Severity: normal
Two issues with block-iscsi:
1. If the iqn being used as a disk device is a prefix of an existing
in-use iqn, then xl create fails with
libxl: error: libxl_device.c:1337:device_hotplug_child_death_cb: script: Device already opened
My fix is to match a space before and after the iqn.
iscsiadm -m session | grep ' iqn.xen17:trixie17 '
tcp: [32] [fd01:8b0:bfcd:100:230:18ff:fe08:5ad6]:3260,1 iqn.xen17:trixie17 (non-flash)
Without the trailing space:
iscsiadm -m session | grep 'iqn.xen17:trixie17'
tcp: [30] [fd01:8b0:bfcd:100:230:18ff:fe08:5ad6]:3260,1 iqn.xen17:trixie17-build (non-flash)
tcp: [32] [fd01:8b0:bfcd:100:230:18ff:fe08:5ad6]:3260,1 iqn.xen17:trixie17 (non-flash)
2. There's no way to specify the LUN; the script seems to always want
lun-0. All of my LUNs are 1, so I've changed the default to 1 rather
than 0, but the script needs an extra lun=<> parameter.
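For issue 1, an alternative to padding the grep pattern with spaces might be to compare the target-name field exactly, which also avoids the dots in the iqn being treated as regex wildcards. A minimal sketch, assuming the iqn is always the fourth whitespace-separated field of `iscsiadm -m session` output as in the listing above; `iqn_in_sessions` is a hypothetical helper name, not something that exists in block-iscsi:

```shell
# iqn_in_sessions IQN
# Reads `iscsiadm -m session` style lines on stdin and succeeds only if
# the target-name field of some line equals IQN exactly.
# ASSUMPTION: the target name is field 4, as in the output shown above.
iqn_in_sessions() {
    awk -v iqn="$1" '$4 == iqn { found = 1 } END { exit !found }'
}
```

In prepare() this could then be used as `iscsiadm -m session 2>&1 | iqn_in_sessions "$iqn" && fatal "Device already opened"`, with the caveat that I have only checked it against the session lines quoted here.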
diff -u scripts/block-iscsi.distrib scripts/block-iscsi
--- scripts/block-iscsi.distrib 2022-12-29 23:12:25.000000000 +0000
+++ scripts/block-iscsi 2025-05-24 16:24:53.000000000 +0000
@@ -26,6 +26,8 @@
dir=$(dirname "$0")
. "$dir/block-common.sh"
+LUN=1
+
remove_label()
{
echo $1 | sed "s/^\("$2"\)//"
@@ -59,6 +61,9 @@
multipath=*)
multipath=$(remove_label $param "multipath=")
;;
+ lun=*)
+ LUN=$(remove_label $param "lun=")
+ ;;
esac
done
if [ -z "$iqn" ] || [ -z "$portal" ]; then
@@ -73,7 +78,7 @@
find_device()
{
count=0
- while [ ! -e /dev/disk/by-path/*"$iqn"-lun-0 ]; do
+ while [ ! -e /dev/disk/by-path/*"$iqn"-lun-"${LUN}" ]; do
sleep 1
count=`expr $count + 1`
if [ count = 100 ]; then
@@ -81,7 +86,7 @@
fatal "timeout waiting for iSCSI disk to settle"
fi
done
- sddev=$(readlink -f /dev/disk/by-path/*"$iqn"-lun-0 || true)
+ sddev=$(readlink -f /dev/disk/by-path/*"$iqn"-lun-"${LUN}" || true)
if [ ! -b "$sddev" ]; then
fatal "Unable to find attached device path"
fi
@@ -109,9 +114,9 @@
prepare()
{
# Check if target is already opened
- iscsiadm -m session 2>&1 | grep -q "$iqn" && fatal "Device already opened"
+ iscsiadm -m session 2>&1 | grep -q " $iqn " && fatal "Device already opened"
# Discover portal targets
- iscsiadm -m discovery -t st -p $portal 2>&1 | grep -q "$iqn" || \
+ iscsiadm -m discovery -t st -p $portal 2>&1 | grep -q " $iqn\$" || \
fatal "No matching target iqn found"
}
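The lun= handling in the diff could also keep 0 as the default, so that existing configs are unaffected. A minimal sketch under that assumption; `parse_lun` is a hypothetical standalone helper (the real script parses all keys in one loop), and the numeric check is an extra that is not in the diff above:

```shell
# parse_lun TARGET
# Prints the value of an optional lun= key from a comma-separated target
# string, defaulting to 0 when the key is absent.
# ASSUMPTION: keys are separated by commas, as for iqn= and portal=.
parse_lun() {
    lun=0
    old_ifs=$IFS
    IFS=,
    for param in $1; do
        case $param in
        lun=*) lun=${param#lun=} ;;
        esac
    done
    IFS=$old_ifs
    # Reject anything that is not a plain decimal number.
    case $lun in
    ''|*[!0-9]*) echo "invalid lun: $lun" >&2; return 1 ;;
    esac
    echo "$lun"
}
```

With this, `parse_lun "iqn=iqn.xen17:trixie17,lun=1"` prints 1, and a target string without lun= prints 0.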
-- System Information:
Debian Release: 12.11
APT prefers stable-security
APT policy: (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 6.1.0-35-amd64 (SMP w/4 CPU threads; PREEMPT)
Kernel taint flags: TAINT_WARN
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)
Versions of packages xen-utils-common depends on:
ii libc6 2.36-9+deb12u10
ii libxenhypfs1 4.17.5+23-ga4e5191dc0-1+deb12u1
ii libxenstore4 4.17.5+23-ga4e5191dc0-1+deb12u1
ii lsb-base 11.6
ii python3 3.11.2-1+b1
ii sysvinit-utils [lsb-base] 3.06-4
ii ucf 3.0043+nmu1+deb12u1
ii udev 252.36-1~deb12u1
ii xenstore-utils 4.17.5+23-ga4e5191dc0-1+deb12u1
xen-utils-common recommends no packages.
Versions of packages xen-utils-common suggests:
pn xen-doc <none>
-- Configuration Files:
/etc/xen/scripts/block-iscsi changed:
<snip - diff above>
-- no debconf information
Hans van Kranenburg
2025-Nov-28 19:16 UTC
[Pkg-xen-devel] Bug#1106420: xen-utils-common: block-iscsi script doesn't work when iqn is a prefix of an existing iqn
Hi Tim!

Thanks for the report. I see the problem, and the changes make sense indeed.

On 5/24/25 6:33 PM, Tim Woodall wrote:
> Package: xen-utils-common
> Version: 4.17.5+23-ga4e5191dc0-1+deb12u1
> Severity: normal
>
> two issues with block-iscsi
>
> 1. If the iqn that is being used as a disk device is a prefix of an
> existing in use iqn then xl create will fail with
> libxl: error: libxl_device.c:1337:device_hotplug_child_death_cb: script: Device already opened
>
> My fix is to match a space before and after the iqn.
>
> iscsiadm -m session | grep ' iqn.xen17:trixie17 '
> tcp: [32] [fd01:8b0:bfcd:100:230:18ff:fe08:5ad6]:3260,1 iqn.xen17:trixie17 (non-flash)
>
> Without the trailing space:
> iscsiadm -m session | grep 'iqn.xen17:trixie17'
> tcp: [30] [fd01:8b0:bfcd:100:230:18ff:fe08:5ad6]:3260,1 iqn.xen17:trixie17-build (non-flash)
> tcp: [32] [fd01:8b0:bfcd:100:230:18ff:fe08:5ad6]:3260,1 iqn.xen17:trixie17 (non-flash)
>
> 2. There's no way to specify the LUN, it seems to always want lun-0. All
> of my LUNS are 1, so I've changed the default to 1 rather than 0 but it
> needs an extra lun=<> parameter.
>
> diff -u scripts/block-iscsi.distrib scripts/block-iscsi
> --- scripts/block-iscsi.distrib 2022-12-29 23:12:25.000000000 +0000
> +++ scripts/block-iscsi 2025-05-24 16:24:53.000000000 +0000
> @@ -26,6 +26,8 @@
> dir=$(dirname "$0")
> . "$dir/block-common.sh"
>
> +LUN=1
> +
> remove_label()
> {
> [...]

If we want to see these changes in our Debian Xen packages, then they
should first be sent to upstream Xen and be accepted and included there.
If you would like to do this, we can help with some advice and guidance.

Meanwhile, I actually have another suggestion. In this case, when using
iSCSI, I would recommend using multipath, even when there is just 1 path
to the lun. A few reasons for this are:

1. You can use a feature in multipath where all IO is queued in case
there are 0 paths available to the target.
(features='1 queue_if_no_path')

From the perspective of the user of the block device (in this case your
Xen domU), this looks just like it hangs and is super slow, but there
will be no hard IO errors causing problems in the VM. This means you can
'just' unplug network cables or accidentally offline your lun and online
it again, and stuff will hang and will continue after the communication
to the iSCSI target is restored.

Stuff like disappearing and reappearing block devices (while being
attached to a Xen domU) is something that really quickly brings you into
unknown edge case horror territories where bad things might happen.

2. You have a stable device name /dev/mapper/<wwid> which you can use
just with the 'normal/regular' block script.

    disk = [ "/dev/mapper/<wwid>,raw,xvda,rw", ]

It's fun that there is a block-iscsi variant, but while trying to make
fixes I suspect you also feel a bit uncomfortable because of how fragile
the code is, yolo matching strings in a shell script... This block-iscsi
was added in 2013, and there is 1 small fix done to it in 2016, and I
have no idea how much it is used.

---- >8 ----

At work, in the past, I used all of this a lot. We had NetApp storage
and then a whole bunch of Xen dom0 with multipath which would combine
several iSCSI network paths into multipath block devices, on which we
put LVM again (with sanlock and live migration), and then the actual LVs
were the block devices for Xen domUs. And it actually worked really
well.

There's a bunch of knobs and handles in the iscsi and multipath config
to get this right. If you're interested in some been-there-done-that, I
would not mind sharing something.

Thanks,
Hans