Hi,

we have a two-node cluster with pacemaker and a SAN.
The resources are inside virtual domains.
The images of the virtual disks reside on the SAN.
On one domain I have errors from the hard disk in my log:

2021-03-24T21:02:28.416504+01:00 geneious kernel: [2159685.909613] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:02:46.505323+01:00 geneious kernel: [2159704.012213] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:02:55.573149+01:00 geneious kernel: [2159713.078560] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:03:23.702946+01:00 geneious kernel: [2159741.202546] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:03:30.289606+01:00 geneious kernel: [2159747.796192] ------------[ cut here ]------------
2021-03-24T21:03:30.289635+01:00 geneious kernel: [2159747.796207] WARNING: CPU: 0 PID: 457 at ../fs/buffer.c:1108 mark_buffer_dirty+0xe8/0x100
2021-03-24T21:03:30.289637+01:00 geneious kernel: [2159747.796208] Modules linked in: st sr_mod cdrom lp parport_pc ppdev parport xfrm_user xfrm_algo binfmt_misc uinput nf_log_ipv6 xt_comment nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_pkttype xt_tcpudp iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack libcrc32c ip6table_filter ip6_tables x_tables joydev virtio_net net_failover failover virtio_balloon i2c_piix4 qemu_fw_cfg pcspkr button ext4 crc16 jbd2 mbcache ata_generic hid_generic usbhid ata_piix sd_mod virtio_rng ahci floppy libahci serio_raw ehci_pci bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm uhci_hcd ehci_hcd usbcore virtio_pci
2021-03-24T21:03:30.289637+01:00 geneious kernel: [2159747.796374] drm_panel_orientation_quirks libata dm_mirror dm_region_hash dm_log sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 [last unloaded: parport_pc]
2021-03-24T21:03:30.289643+01:00 geneious kernel: [2159747.796400] Supported: Yes
2021-03-24T21:03:30.289644+01:00 geneious kernel: [2159747.796405] CPU: 0 PID: 457 Comm: jbd2/dm-0-8 Not tainted 4.12.14-122.57-default #1 SLE12-SP5
2021-03-24T21:03:30.289644+01:00 geneious kernel: [2159747.796406] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
2021-03-24T21:03:30.289645+01:00 geneious kernel: [2159747.796407] task: ffff8ba32766c380 task.stack: ffff99954124c000
2021-03-24T21:03:30.289645+01:00 geneious kernel: [2159747.796409] RIP: 0010:mark_buffer_dirty+0xe8/0x100
2021-03-24T21:03:30.289646+01:00 geneious kernel: [2159747.796409] RSP: 0018:ffff99954124fcf0 EFLAGS: 00010246
2021-03-24T21:03:30.289650+01:00 geneious kernel: [2159747.796413] RAX: 0000000000a20828 RBX: ffff8ba209a58d90 RCX: ffff8ba3292d7958
2021-03-24T21:03:30.289651+01:00 geneious kernel: [2159747.796413] RDX: ffff8ba209a585b0 RSI: ffff8ba24270b690 RDI: ffff8ba3292d7958
2021-03-24T21:03:30.289652+01:00 geneious kernel: [2159747.796414] RBP: ffff8ba3292d7958 R08: ffff8ba209a585b0 R09: 0000000000000001
2021-03-24T21:03:30.289652+01:00 geneious kernel: [2159747.796415] R10: ffff8ba328c1c0b0 R11: ffff8ba287805380 R12: ffff8ba3292d795a
2021-03-24T21:03:30.289653+01:00 geneious kernel: [2159747.796415] R13: 0000000000000000 R14: ffff8ba3292d7958 R15: ffff8ba209a58d90
2021-03-24T21:03:30.289653+01:00 geneious kernel: [2159747.796417] FS: 0000000000000000(0000) GS:ffff8ba333c00000(0000) knlGS:0000000000000000
2021-03-24T21:03:30.289654+01:00 geneious kernel: [2159747.796417] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2021-03-24T21:03:30.289654+01:00 geneious kernel: [2159747.796418] CR2: 0000000099bff000 CR3: 0000000101b06000 CR4: 00000000000006f0
2021-03-24T21:03:30.289655+01:00 geneious kernel: [2159747.796424] Call Trace:
2021-03-24T21:03:30.289656+01:00 geneious kernel: [2159747.796470] __jbd2_journal_refile_buffer+0xbb/0xe0 [jbd2]
2021-03-24T21:03:30.289656+01:00 geneious kernel: [2159747.796479] jbd2_journal_commit_transaction+0xf1a/0x1870 [jbd2]
2021-03-24T21:03:30.289657+01:00 geneious kernel: [2159747.796489] ? __switch_to_asm+0x41/0x70
2021-03-24T21:03:30.289658+01:00 geneious kernel: [2159747.796490] ? __switch_to_asm+0x35/0x70
2021-03-24T21:03:30.289662+01:00 geneious kernel: [2159747.796493] kjournald2+0xbb/0x230 [jbd2]
2021-03-24T21:03:30.289663+01:00 geneious kernel: [2159747.796499] ? wait_woken+0x80/0x80
2021-03-24T21:03:30.289663+01:00 geneious kernel: [2159747.796503] kthread+0xf6/0x130
2021-03-24T21:03:30.289664+01:00 geneious kernel: [2159747.796508] ? commit_timeout+0x10/0x10 [jbd2]
2021-03-24T21:03:30.289664+01:00 geneious kernel: [2159747.796510] ? kthread_bind+0x10/0x10
2021-03-24T21:03:30.289665+01:00 geneious kernel: [2159747.796511] ret_from_fork+0x35/0x40
2021-03-24T21:03:30.289665+01:00 geneious kernel: [2159747.796517] Code: 1b 48 8b 03 48 8b 7b 08 48 83 c3 18 48 89 ee e8 bf 42 76 00 48 8b 03 48 85 c0 75 e8 e9 3c ff ff ff 48 89 df 5b 5d e9 c8 35 fb ff <0f> 0b e9 26 ff ff ff 48 83 e8 01 e9 5b ff ff ff 0f 1f 84 00 00
2021-03-24T21:03:30.289670+01:00 geneious kernel: [2159747.796533] ---[ end trace db796891c8ff94af ]---
2021-03-24T21:03:46.593225+01:00 geneious kernel: [2159764.100145] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:05:09.372772+01:00 geneious kernel: [2159846.877201] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:06:39.943519+01:00 geneious kernel: [2159937.381068] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:07:42.364311+01:00 geneious kernel: [2159999.793805] JBD2: Detected IO errors while flushing file data on dm-1-8
2021-03-24T21:07:57.822133+01:00 geneious kernel: [2160015.291776] JBD2: Detected IO errors while flushing file data on dm-1-8

First I'm wondering: what is dm-1-8?
I don't have a device like that.

geneious:~ # find /dev -iname '*dm*'
/dev/dm-1
/dev/dm-0
/dev/disk/by-id/dm-uuid-LVM-a9Cy1cweHgXlAEECqZL5KZBfnuigUG6lq0ntdZJxxLIIp5G8XihsuYrTbx7Rs0vc
/dev/disk/by-id/dm-name-vg_local-lv_var
/dev/disk/by-id/dm-uuid-LVM-a9Cy1cweHgXlAEECqZL5KZBfnuigUG6l3fdsOpBFoDWral3Fa7c6ZeYECmLd6FFj
/dev/disk/by-id/dm-name-vg_local-lv_root
/dev/cpu_dma_latency

The only thing I find is /proc/fs/jbd2/dm-1-8.
There is a file /proc/fs/jbd2/dm-1-8/info:
453005 transactions (319055 requested), each up to 8192 blocks
average:
  0ms waiting for transaction
  12ms request delay
  5124ms running transaction
  0ms transaction was being locked
  0ms flushing data (in ordered mode)
  44ms logging transaction
  8031us average transaction commit time
  64 handles per transaction
  5 blocks per transaction
  6 logged blocks per transaction

What is that?

The logfile also says something about dm-0-8:
2021-03-24T21:03:30.289644+01:00 geneious kernel: [2159747.796405] CPU: 0 PID: 457 Comm: jbd2/dm-0-8 Not tainted 4.12.14-122.57-default #1 SLE12-SP5

geneious:~ # find / -iname dm-0-8
/proc/fs/jbd2/dm-0-8
geneious:~ # ll /proc/fs/jbd2/dm-0-8
total 0
-r--r--r-- 1 root root 0 Mar 29 12:56 info
geneious:~ # cat /proc/fs/jbd2/dm-0-8/info
7356 transactions (556 requested), each up to 8192 blocks
average:
  0ms waiting for transaction
  20ms request delay
  5628ms running transaction
  4ms transaction was being locked
  0ms flushing data (in ordered mode)
  132ms logging transaction
  134769us average transaction commit time
  52 handles per transaction
  18 blocks per transaction
  19 logged blocks per transaction
geneious:~ #

I assume I have a hard disk problem. I'm currently checking the SAN with its own tools, via a web interface.
Afterwards I want to stop the domain, boot it from a live CD and run badblocks and fsck.ext3.
What else can I do?
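On the question of what dm-1-8 refers to, here is a minimal sketch. It rests on an assumption that is not stated in the post: as far as I know, jbd2 names an internal journal after the block device plus the inode number of the journal, and inode 8 is the reserved journal inode on ext3/ext4. Under that reading, dm-1-8 would be the journal of the filesystem on /dev/dm-1, not a separate device. The helper names `split_journal_name` and `dm_names` are made up for illustration. The second helper resolves the `dm-name-*` symlinks that appear in the `find` output to see which logical volume sits behind each `dm-N` node. The post itself does not show which of the two volumes is dm-1.

```python
import os
from pathlib import Path


def split_journal_name(name):
    """Split a jbd2 journal name such as 'dm-1-8' into (block device, inode number).

    Assumption: the part after the last '-' is the journal inode number.
    """
    device, _, inode = name.rpartition("-")
    return device, int(inode)


def dm_names(by_id_dir="/dev/disk/by-id"):
    """Map kernel names (dm-0, dm-1, ...) to device-mapper names.

    Follows each dm-name-* symlink and uses the basename of its target
    as the kernel name.
    """
    mapping = {}
    for link in sorted(Path(by_id_dir).glob("dm-name-*")):
        kernel_name = os.path.basename(os.path.realpath(link))
        mapping[kernel_name] = link.name[len("dm-name-"):]
    return mapping


if __name__ == "__main__":
    names = dm_names()
    for journal in ("dm-0-8", "dm-1-8"):
        device, inode = split_journal_name(journal)
        print(journal, "->", device, "journal inode", inode,
              "volume", names.get(device, "unknown"))
```

Run inside the guest, this should print one line per journal with the matching volume name, or "unknown" if no `dm-name-*` link points at that device.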
Bernd

-- 
Bernd Lentes
System Administrator
Institute for Metabolism and Cell Death (MCD)
Building 25 - office 122
HelmholtzZentrum München
bernd.lentes at helmholtz-muenchen.de
phone: +49 89 3187 1241
phone: +49 89 3187 3827
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/mcd
----- On Mar 29, 2021, at 12:58 PM, Bernd Lentes bernd.lentes at helmholtz-muenchen.de wrote:

> I assume i have a harddisk problem. I'm checking currently the SAN with its own
> tools, via a web interface.
> Afterwards i want to stop the domain, boot it with a live cd and run badblocks
> and fsck.ext3.
> What else can i do ?

I forgot: host is SLES 12 SP5, virtual domain too. The image file is in raw format.
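To see at a glance which journals report IO errors in a log like the one in the original message, a small tally can help. This is only a sketch: the function name `count_jbd2_errors` is hypothetical, and the pattern is taken from the message text as it appears in the excerpt. The sample lines are copied from the post; the second one is the WARNING header, which the pattern is meant to ignore.

```python
import re
from collections import Counter

# Message text as it appears in the log excerpt; the journal name follows "on".
JBD2_ERROR = re.compile(r"JBD2: Detected IO errors while flushing file data on (\S+)")


def count_jbd2_errors(lines):
    """Count JBD2 IO error messages per journal name."""
    counts = Counter()
    for line in lines:
        match = JBD2_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2021-03-24T21:02:28.416504+01:00 geneious kernel: [2159685.909613] "
        "JBD2: Detected IO errors while flushing file data on dm-1-8",
        "2021-03-24T21:03:30.289635+01:00 geneious kernel: [2159747.796207] "
        "WARNING: CPU: 0 PID: 457 at ../fs/buffer.c:1108 mark_buffer_dirty+0xe8/0x100",
        "2021-03-24T21:03:46.593225+01:00 geneious kernel: [2159764.100145] "
        "JBD2: Detected IO errors while flushing file data on dm-1-8",
    ]
    for journal, count in count_jbd2_errors(sample).items():
        print(journal, count)
```

Feeding the full log file to the same function, line by line, would show whether dm-0-8 ever reports IO errors itself or only shows up as the thread name in the WARNING trace.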