thr3ads.net - CentOS - [CentOS] EL9/udev generates wrong device nodes/symlinks with HPE Smart Array controller [Mar 2023]


Simon Matter

2023-Mar-01 11:22 UTC

[CentOS] EL9/udev generates wrong device nodes/symlinks with HPE Smart Array controller

Hi,

I see some strange and dangerous things happening on an HPE server with an
HPE Smart Array controller, where EL9 ends up with wrong device
nodes/symlinks to the attached disks/RAID volumes:

(I didn't touch anything here but at 08:09 some symlinks were changed)
/dev/disk/by-id/:
lrwxrwxrwx 1 root root  9 Mar  1 07:57 scsi-0HP_LOGICAL_VOLUME_00000000 -> ../../sdc
lrwxrwxrwx 1 root root 10 Mar  1 07:57 scsi-0HP_LOGICAL_VOLUME_00000000-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Mar  1 07:57 scsi-0HP_LOGICAL_VOLUME_00000000-part2 -> ../../sdc2
lrwxrwxrwx 1 root root  9 Mar  1 07:57 scsi-0HP_LOGICAL_VOLUME_01000000 -> ../../sdb
lrwxrwxrwx 1 root root  9 Mar  1 08:09 scsi-0HP_LOGICAL_VOLUME_02000000 -> ../../sda
lrwxrwxrwx 1 root root  9 Mar  1 07:57 scsi-0HP_LOGICAL_VOLUME_03000000 -> ../../sdd
lrwxrwxrwx 1 root root  9 Mar  1 08:09 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 -> ../../sda
lrwxrwxrwx 1 root root 10 Mar  1 07:57 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Mar  1 07:57 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2 -> ../../sdc2

/dev/disk/by-path/:
lrwxrwxrwx 1 root root  9 Mar  1 07:57 pci-0000:03:00.0-scsi-0:1:0:0 -> ../../sdc
lrwxrwxrwx 1 root root 10 Mar  1 07:57 pci-0000:03:00.0-scsi-0:1:0:0-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Mar  1 07:57 pci-0000:03:00.0-scsi-0:1:0:0-part2 -> ../../sdc2
lrwxrwxrwx 1 root root  9 Mar  1 07:57 pci-0000:03:00.0-scsi-0:1:0:1 -> ../../sdb
lrwxrwxrwx 1 root root  9 Mar  1 08:09 pci-0000:03:00.0-scsi-0:1:0:2 -> ../../sda
lrwxrwxrwx 1 root root  9 Mar  1 07:57 pci-0000:03:00.0-scsi-0:1:0:3 -> ../../sdd

After rebooting, things are different but also wrong:

(here nothing has changed after boot but symlinks are already wrong)
/dev/disk/by-id/:
lrwxrwxrwx 1 root root   9 Mar  1 10:56 scsi-0HP_LOGICAL_VOLUME_00000000 -> ../../sdb
lrwxrwxrwx 1 root root  10 Mar  1 10:56 scsi-0HP_LOGICAL_VOLUME_00000000-part1 -> ../../sdb1
lrwxrwxrwx 1 root root  10 Mar  1 10:56 scsi-0HP_LOGICAL_VOLUME_00000000-part2 -> ../../sdb2
lrwxrwxrwx 1 root root   9 Mar  1 10:56 scsi-0HP_LOGICAL_VOLUME_01000000 -> ../../sda
lrwxrwxrwx 1 root root   9 Mar  1 10:56 scsi-0HP_LOGICAL_VOLUME_02000000 -> ../../sdd
lrwxrwxrwx 1 root root   9 Mar  1 10:56 scsi-0HP_LOGICAL_VOLUME_03000000 -> ../../sdc
lrwxrwxrwx 1 root root   9 Mar  1 10:56 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 -> ../../sda
lrwxrwxrwx 1 root root  10 Mar  1 10:56 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root  10 Mar  1 10:56 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2 -> ../../sdb2

/dev/disk/by-path/:
lrwxrwxrwx 1 root root   9 Mar  1 10:56 pci-0000:03:00.0-scsi-0:1:0:0 -> ../../sdb
lrwxrwxrwx 1 root root  10 Mar  1 10:56 pci-0000:03:00.0-scsi-0:1:0:0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root  10 Mar  1 10:56 pci-0000:03:00.0-scsi-0:1:0:0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root   9 Mar  1 10:56 pci-0000:03:00.0-scsi-0:1:0:1 -> ../../sda
lrwxrwxrwx 1 root root   9 Mar  1 10:56 pci-0000:03:00.0-scsi-0:1:0:2 -> ../../sdd
lrwxrwxrwx 1 root root   9 Mar  1 10:56 pci-0000:03:00.0-scsi-0:1:0:3 -> ../../sdc

Note that two things are strange:

1) the /dev/sd* nodes are in a random order after every restart.
# lsscsi
[1:0:0:0]    storage HP       P410i            6.64  -
[1:1:0:0]    disk    HP       LOGICAL VOLUME   6.64  /dev/sdb
[1:1:0:1]    disk    HP       LOGICAL VOLUME   6.64  /dev/sda
[1:1:0:2]    disk    HP       LOGICAL VOLUME   6.64  /dev/sdd
[1:1:0:3]    disk    HP       LOGICAL VOLUME   6.64  /dev/sdc

2) some symlinks created by udev are just wrong and therefore very
dangerous to use:
scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 -> ../../sda
scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1 -> ../../sdb1
scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2 -> ../../sdb2

While 1 may be expected(???) I think 2 should really not happen.
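
The inconsistency in point 2 can be checked mechanically. Below is a minimal sketch, not part of the original report: find_inconsistent_links and read_links are hypothetical helper names, and the check assumes plain sdX naming, where partition N of sdX is sdXN.

```python
import os
import re

# Matches the "-partN" suffix udev appends to partition symlinks.
PART_SUFFIX = re.compile(r"-part(\d+)$")


def find_inconsistent_links(links):
    """Return (name, actual, expected) for partition links that disagree
    with their whole-disk link.

    `links` maps a symlink name to its target, e.g. '../../sdb1'.
    Assumption: sdX-style names, so partition N of 'sda' is 'sdaN'.
    """
    problems = []
    for name, target in sorted(links.items()):
        match = PART_SUFFIX.search(name)
        if not match:
            continue  # whole-disk link, nothing to compare
        base = name[:match.start()]
        if base not in links:
            continue  # no whole-disk link to compare against
        disk = os.path.basename(links[base])   # e.g. 'sda'
        expected = disk + match.group(1)       # e.g. 'sda1'
        actual = os.path.basename(target)      # e.g. 'sdb1'
        if actual != expected:
            problems.append((name, actual, expected))
    return problems


def read_links(directory="/dev/disk/by-id"):
    """Collect {name: target} for every symlink in `directory`."""
    result = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            result[name] = os.readlink(path)
    return result


# The relevant by-id entries from the post-reboot listing above.
SAMPLE_LINKS = {
    "scsi-0HP_LOGICAL_VOLUME_00000000": "../../sdb",
    "scsi-0HP_LOGICAL_VOLUME_00000000-part1": "../../sdb1",
    "scsi-0HP_LOGICAL_VOLUME_00000000-part2": "../../sdb2",
    "scsi-SHP_LOGICAL_VOLUME_500143801722C0B0": "../../sda",
    "scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1": "../../sdb1",
    "scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2": "../../sdb2",
}

for name, actual, expected in find_inconsistent_links(SAMPLE_LINKS):
    print(f"{name}: points to {actual}, expected {expected}")
```

On the sample data it flags the two -part links of scsi-SHP_LOGICAL_VOLUME_500143801722C0B0, which point at sdb1/sdb2 while the whole-disk link points at sda. On a live system, find_inconsistent_links(read_links()) would run the same check against /dev/disk/by-id.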

I've tried to find out where things go wrong but the whole udev stuff
started to hurt my brain :)

I'm quite sure HPE Smart Array based servers are quite common so my big
question is: do others see the same?

While it's possible to live with this mess I'd really like to fix it
somehow.
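
As a stopgap for point 1, scripts can key on the SCSI address that lsscsi reports instead of the /dev/sdX name, since the by-path links and lsscsi agree on the LUN order in the listings above. A minimal sketch, assuming the column layout of the lsscsi output shown earlier; parse_lsscsi is a hypothetical helper, not an existing tool.

```python
def parse_lsscsi(text):
    """Map SCSI address 'H:C:T:L' to device node for every disk line.

    Assumes the default lsscsi layout: '[H:C:T:L]' first, the device
    type second, and the device node (or '-') last.
    """
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].startswith("["):
            continue  # not an lsscsi device line
        address = fields[0].strip("[]")
        node = fields[-1]
        # Skip the controller itself ('storage') and entries without a node.
        if fields[1] == "disk" and node.startswith("/dev/"):
            mapping[address] = node
    return mapping


# The lsscsi output quoted in point 1 above.
SAMPLE_LSSCSI = """\
[1:0:0:0]    storage HP       P410i            6.64  -
[1:1:0:0]    disk    HP       LOGICAL VOLUME   6.64  /dev/sdb
[1:1:0:1]    disk    HP       LOGICAL VOLUME   6.64  /dev/sda
[1:1:0:2]    disk    HP       LOGICAL VOLUME   6.64  /dev/sdd
[1:1:0:3]    disk    HP       LOGICAL VOLUME   6.64  /dev/sdc
"""

for address, node in sorted(parse_lsscsi(SAMPLE_LSSCSI).items()):
    print(address, node)
```

This only reads the current mapping and does nothing about the wrong by-id symlinks themselves.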

Thanks,
Simon

d tbsky

2023-Mar-02 04:41 UTC


Simon Matter <simon.matter at invoca.ch> wrote:
> 2) some symlinks created by udev are just wrong and therefore very
> dangerous to use:
> scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 -> ../../sda
> scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1 -> ../../sdb1
> scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2 -> ../../sdb2
   I think it may be caused by the sd driver's asynchronous scanning.
   I am lucky that I haven't seen this before. NVMe may have similar
issues, but NVMe has a boot parameter to avoid it.
   SUSE has a boot parameter to avoid it.
   With EL9 we will have to wait until EL 9.3, if we are lucky.
   I have reported the issue: https://bugzilla.redhat.com/show_bug.cgi?id=2140017
