Hello!

I have an Ubuntu server running as an HVM domU. After a hang, the domU does not start again.

The domU could not mount its virtual disk, so I booted it from a live CD and tried to mount the root partition (/), without success. I then ran fsck.ext3, which gave an input/output error in the domU, and in dom0 the kernel says:

Feb 14 18:22:28 scofield last message repeated 5 times
Feb 14 18:22:30 scofield kernel: sd 0:2:3:0: [sdd] Unhandled error code
Feb 14 18:22:30 scofield kernel: sd 0:2:3:0: [sdd] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Feb 14 18:22:30 scofield kernel: end_request: I/O error, dev sdd, sector 1073919
Feb 14 18:22:30 scofield Server Administrator: Storage Service EventID: 2095 SCSI sense data Sense key: 3 Sense code: 11 Sense qualifier: 0: Physical Disk 0:0:3 Controller 0, Connector 0

Another important thing: in the domU booted from the live CD I could mount the other partitions of the disk with no errors, but not the root partition.

My first thought was a hard disk problem, so I tried to copy the img to another hard disk in my dom0, but input/output errors were reported. After that I reinstalled the machine on the same hard disk and it works perfectly.

How is this possible? Can a corrupt image be impossible to copy? I've had this problem twice and had to reinstall the machine both times.

Maybe I'm doing something wrong when I create the image with dd. I use the following command for a 900 GB image:

dd if=/dev/zero of=file900G.img bs=1M count=921600

I can't remember exactly, but I think the issue may be that I created an image with dd that was bigger than the hard disk. When I do that, dd reports nothing. Let's see:

# df -h
            Size  Used Avail Use% Mounted on
/dev/sdd1   917G  1,7G  869G   1% /vserver/images/domains/mahone

Now let's create a 900 GB image. It should report errors, but NO errors are reported!

# dd if=/dev/zero of=/vserver/images/domains/mahone/mahone.img bs=1M count=921600
921600+0 records in
921600+0 records out
966367641600 bytes (966 GB) copied, 26901,7 s, 35,9 MB/s

# df -h
            Size  Used Avail Use% Mounted on
/dev/sdd1   917G  902G     0 100% /vserver/images/domains/mahone

Thanks in advance

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
On Tue, Feb 15, 2011 at 7:57 AM, Alberto Asuero Arroyo <albertoasuero@gmail.com> wrote:
> The first thing I thought was that it was a hard disk problem, so I tried
> to copy the img to another hard disk in my dom0, but input/output errors
> were reported. After that I reinstalled the machine on the same hard disk
> and it works perfectly.
>
> How is it possible? Could a corrupt image make it impossible to copy? I've
> had this problem twice and I've had to reinstall the machine.

File system corruption can happen after improper shutdowns, or when the file system is mounted writable twice without the support of a clustered or network file system.

> Maybe I'm doing something wrong when creating the image with dd. I use the
> following command for an image of 900GB
>
> dd if=/dev/zero of=file900G.img bs=1M count=921600

Your dd commands seem fine.

I think that your file system was corrupt on the disk image, and yes, copying a corrupt disk image would still leave the file system corrupt.
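A quick check of the numbers in the original post, as a sketch rather than a diagnosis: with bs=1M, count=921600 is exactly 900 GiB (966367641600 bytes), which is what dd reported. df showed 917G total but only 869G available; the difference is most likely the space ext3 reserves for root (5% by default), so a dd run as root can write past "Avail" without any error. On those figures the image was not larger than the disk. The helper names image_bytes, avail_bytes and image_fits are made up for this illustration.

```shell
#!/bin/sh
# Size in bytes of an image written with: dd bs=1M count=<count>
image_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Bytes available to unprivileged users on the filesystem holding <dir>
# (the "Avail" column of df; root-reserved blocks are not counted).
avail_bytes() {
    df -Pk "$1" | awk 'NR == 2 { printf "%.0f\n", $4 * 1024 }'
}

# image_fits <dir> <count>: succeeds only if <count> MiB fits in "Avail".
image_fits() {
    [ "$(image_bytes "$2")" -le "$(avail_bytes "$1")" ]
}

# The count used in the original post: 900 * 1024 MiB.
image_bytes 921600
```

image_fits compares against df's Avail column, so it is conservative: it would have refused the 900 GiB image even though root was able to write it into the reserved space.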
On Tue, Feb 15, 2011 at 3:11 PM, Todd Deshane <todd.deshane@xen.org> wrote:
> I think that your file system was corrupt on the disk image and yes
> copying a corrupt disk image would still leave the file system
> corrupt.

I know, but why can't I copy this image if the hard disk in dom0 is fine?
On Tue, Feb 15, 2011 at 2:57 PM, Alberto Asuero Arroyo <albertoasuero@gmail.com> wrote:
> fsck.ext3 and it gave me input/output error in the domU and in dom0 the
> kernel says:
>
> Feb 14 18:22:30 scofield kernel: sd 0:2:3:0: [sdd] Unhandled error code
> Feb 14 18:22:30 scofield kernel: sd 0:2:3:0: [sdd] Result:
> hostbyte=DID_ERROR driverbyte=DRIVER_OK
> Feb 14 18:22:30 scofield kernel: end_request: I/O error, dev sdd, sector
> 1073919

Is the message on dom0? If there's a disk error on dom0, then it makes perfect sense that the domU will have an error too (regardless of whether the disk is passed directly as phy:/ or as a file on that disk).

> After that I've reinstalled the machine on the same hard disk and it works
> perfectly.

How do you determine "works perfectly"? Two things can happen when you reuse a disk with a bad sector:

- the disk marks the sector as bad and assigns a spare sector to replace it when another write occurs (something like https://guust.tuxes.nl/~bas/wordpress/?p=12)
- the bad sector is still there, but your new installation simply does not use it yet (you'll get an error later when you eventually use that sector)

If your data is more valuable than the cost of a new disk, I'd simply replace the disk, possibly with enough redundancy in place (RAID5, etc.)

--
Fajar
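To put the "why can't I copy the image" question in concrete terms: copying the image file means reading every sector underneath it, so a plain copy stops at the first sector the disk cannot return. The sketch below assumes the sector number in the end_request line is in 512-byte units counted from the start of sdd (the usual kernel convention, but an assumption here). The function names are hypothetical, and the two dd helpers are only defined, not run.

```shell
#!/bin/sh
# Byte offset of a kernel sector number (512-byte units) on the device.
sector_offset() {
    echo $(( $1 * 512 ))
}

# read_sector <dev> <sector>: read one sector directly from the device,
# bypassing the page cache. Needs root; an I/O error here confirms that
# the sector is still unreadable.
read_sector() {
    dd if="$1" of=/dev/null bs=512 skip="$2" count=1 iflag=direct
}

# copy_past_errors <src> <dst>: copy an image while continuing past read
# errors. Unreadable blocks are replaced with zeros, so the rest of the
# image keeps its offsets, but the data in those blocks is lost.
copy_past_errors() {
    dd if="$1" of="$2" bs=4096 conv=noerror,sync
}

# Sector from the dom0 kernel log: about 524 MiB into the disk.
sector_offset 1073919
```

On dom0 one might then run read_sector /dev/sdd 1073919 as root. If the read succeeds after a reinstall, that would be consistent with the first case above (the disk remapped the sector on write), though it does not prove the disk is healthy.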
On Tue, Feb 15, 2011 at 3:53 PM, Fajar A. Nugraha <list@fajar.net> wrote:
> Is the message on dom0?

Yes.

> the bad sector is still there, but your new installation simply does
> not use it yet (you'll get an error later when you eventually use that
> sector)

I know, but after the errors I made a new file system on the partition (mkfs.ext3 /dev/sdd1) with no errors in my kernel log, and I created an image filling 99% of the disk with dd, also without errors. And an fsck.ext4 /dev/sdd1 as well.

I have a Dell server, and the Dell panel says nothing about a faulty disk. Given what I've already told you, do you think the disk has a bad sector?

My data is worth much more than the cost of the disk, so I'm going to replace the disk, but I'm curious, because I thought mkfs or fsck would find bad sectors on a disk.

Thanks in advance
On Wed, Feb 16, 2011 at 4:34 AM, Alberto Asuero Arroyo <albertoasuero@gmail.com> wrote:
> I know but after the errors I've done a new partition (mfks.ext3 /dev/sdd1)
> without errors in my kernel and I've created a image of 99% of the disk
> with dd without errors too. And an fsck.ext4 /dev/sdd1

Then it's most likely you encountered #1, and the disk assigned a new sector when you attempted to write to the bad sector.

> I've a dell server and the dell panel dont tell nothing about wrong disk...

Logs don't lie :)
Does the panel handle bad sector cases as well, or does it simply test whether the disk is accessible?
And if this is a server, you should definitely use RAID1/RAID5.

If the disk is passed to the OS directly (i.e. not managed via hardware RAID) then you should be able to see the disk logs (whether or not it has experienced a bad sector, and which one) using smartctl from the smartmontools package.

> after that I've already told, do you think the disk has bad sector?

Definitely.

> My data is much more valuable of the cost of the disk so I'm going to
> replace the disk...but I'm curious because I though mkfs or fsck find bad
> sector on a disk.

You should have tried "fsck -c" before attempting mkfs or dd. It does a read test, which should catch the bad sector before it gets a new sector reassigned (after the writes from mkfs/dd).

--
Fajar
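The checks suggested above could be collected roughly as follows. This is a sketch only: the device names come from the thread, the run and check_disk helpers are invented for the example, and by default it only prints the commands. Behind a Dell RAID controller smartctl may also need a device-type option such as -d megaraid,N; whether that applies to this server is an assumption.

```shell
#!/bin/sh
# Print each command instead of executing it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# check_disk <disk> <partition>: read-oriented checks, to be done before
# any mkfs or dd write gives the disk a chance to remap a bad sector.
check_disk() {
    dev=$1
    part=$2
    # Overall SMART health plus the attribute table (reallocated and
    # pending sector counts live there).
    run smartctl -H -A "$dev"
    # Read-only surface scan; this is the default mode of badblocks.
    run badblocks -sv "$part"
    # Forced check of the unmounted file system; -c runs a read-only
    # badblocks scan and records any bad blocks it finds.
    run e2fsck -f -c "$part"
}

check_disk /dev/sdd /dev/sdd1
```

Running it with RUN=1 as root on dom0, with the file system unmounted, would execute the three commands for real.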
On Tue, Feb 15, 2011 at 10:53 PM, Fajar A. Nugraha <list@fajar.net> wrote:
> > after that I've already told, do you think the disk has bad sector?
>
> Definitely.

The disk is definitely bad: after an fsck the disk died, and the Dell panel now shows it as faulty!

Thanks everyone!