I am experiencing a weird issue with a RAID 5+0 under dom0. I am running Xen 3.2 from Debian Lenny, which has the 2.6.26-2-xen-amd64 dom0 kernel. There are six 1 TB SATA disks arranged as two 3-disk RAID 5 sets, which are RAID 0'd together. The chunk size on all arrays is 64k. I was able to create and sync all arrays with no issues, then initialized LVM on the RAID 0 and created two LVs, all with no issues. I was able to install two guests with no apparent problems; however, after two days I noticed errors in the guests indicating that their disks had bad blocks. I checked dom0 and noticed lots of messages like these:

[305012.467758] raid0_make_request bug: can't convert block across chunks or bigger than 64k 2385277 4

I posted this to the linux-raid mailing list, where they indicated that this bug is likely due to the xenified kernel.

A quote from the linux-raid mailing list:

> This looks like a bug in 'dm' or, more likely, Xen.
> Assuming you are using a recent kernel (you didn't say), raid0 is
> receiving a request that does not fit entirely in one chunk, and
> which has more than one page in the bi_iovec,
> i.e. bi_vcnt != 1 or bi_idx != 0.
>
> As raid0 has a merge_bvec_fn, dm should not be sending bios with more than 1
> page without first checking that the merge_bvec_fn accepts the extra page.
> But the raid0 merge_bvec_fn will reject any bio which does not fit in
> a chunk.
>
> dm-linear appears to honour the merge_bvec_fn of the underlying device
> in the implementation of its own merge_bvec_fn. So presumably the Xen client
> is not making the appropriate merge_bvec_fn call.
> I am not very familiar with Xen: how exactly are you making the logical
> volume available to Xen?
> Also, what kernel are you running?
>
> NeilBrown

Unfortunately, since I am running 3.2, from what I understand there are limited dom0 options, so I am not sure if there is any advice on this mailing list or whether I should bring this up on xen-devel.
I have detailed RAID information and errors at http://pastebin.com/f6a52db74

I would appreciate any advice or input on this issue.

- chris

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
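[Editor's note: the trailing numbers in that kernel message can be sanity-checked by hand. Assuming, as in the 2.6-era raid0 printk format, that 2385277 is the starting sector and 4 is the bio size in KiB, a 64 KiB chunk spans 128 sectors of 512 bytes, and this request does indeed straddle a chunk boundary. A minimal sketch under those assumptions:]

```python
# Decode the raid0_make_request error, assuming the two trailing numbers
# are the starting sector and the bio size in KiB (2.6-era printk format).
# A 64 KiB chunk = 128 sectors of 512 B.
CHUNK_SECTORS = 64 * 1024 // 512   # 128

def crosses_chunk(start_sector, size_kib, chunk_sectors=CHUNK_SECTORS):
    """Return True if the request does not fit inside one raid0 chunk."""
    nr_sectors = size_kib * 1024 // 512
    offset_in_chunk = start_sector % chunk_sectors
    return offset_in_chunk + nr_sectors > chunk_sectors

# The request from the log starts 125 sectors into its chunk, so only
# 3 sectors remain before the boundary, but it asks for 8 -> rejected.
print(crosses_chunk(2385277, 4))   # True
```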
On Sun, Feb 21, 2010 at 05:48:47AM -0500, chris wrote:
> I am experiencing a weird issue with a RAID 5+0 under dom0. I am
> running Xen 3.2 from Debian Lenny, which has the 2.6.26-2-xen-amd64
> dom0 kernel.
> [...]
> [305012.467758] raid0_make_request bug: can't convert block across
> chunks or bigger than 64k 2385277 4
> [...]
> Unfortunately, since I am running 3.2, from what I understand there are
> limited dom0 options, so I am not sure if there is any advice on this
> mailing list or whether I should bring this up on xen-devel. I have detailed
> RAID information and errors at http://pastebin.com/f6a52db74
>
> I would appreciate any advice or input on this issue.

Try with a different dom0 kernel:
http://wiki.xensource.com/xenwiki/XenDom0Kernels

I'd suggest linux-2.6.18-xen or some forward-port of it (2.6.31).

If a different dom0 kernel doesn't help, then try emailing xen-devel with the info/quote above included.

-- Pasi
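[Editor's note: NeilBrown's description of the merge_bvec_fn contract can be illustrated with a small toy model. This is purely illustrative Python, not the real kernel API: a well-behaved stacking driver consults the lower device's merge_bvec_fn before adding each extra page to a bio, and raid0's version rejects any growth that would spill out of a chunk. The function names and sizes below are hypothetical.]

```python
# Toy model of the merge_bvec_fn contract -- illustrative only, not
# actual kernel code. All sizes are in 512 B sectors.
CHUNK_SECTORS = 128            # 64 KiB chunks, as on the arrays here
PAGE_SECTORS = 8               # one 4 KiB page

def raid0_merge_bvec(start_sector, current_sectors, extra_sectors):
    """raid0 accepts an extra page only if the grown bio still fits in one chunk."""
    offset = start_sector % CHUNK_SECTORS
    return offset + current_sectors + extra_sectors <= CHUNK_SECTORS

def build_bio(start_sector, wanted_pages, merge_fn):
    """A well-behaved stacking driver grows a bio one page at a time,
    consulting the lower device's merge_bvec_fn before each addition."""
    sectors = PAGE_SECTORS                 # a bio holds at least one page
    for _ in range(wanted_pages - 1):
        if not merge_fn(start_sector, sectors, PAGE_SECTORS):
            break                          # submit this bio, start a new one
        sectors += PAGE_SECTORS
    return sectors

# A bio starting 120 sectors into a chunk can only grow to 8 sectors,
# because adding a second page would cross the 128-sector boundary:
print(build_bio(2385272, 4, raid0_merge_bvec))   # 8, not 32
```

A driver that skips the merge_fn check and blindly builds a 4-page bio here would hand raid0 exactly the kind of chunk-crossing, multi-page request the error message complains about.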
I've switched to RAID 6 on this machine, which is comparable enough in space/speed/redundancy, as I don't have the resources available to debug this further. Hopefully on the next similar system I will have more time to figure out what the bug actually is, and hopefully we can resolve it. From what I can tell, it appears to only affect raid0.

- chris

On Sun, Feb 21, 2010 at 8:19 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> On Sun, Feb 21, 2010 at 05:48:47AM -0500, chris wrote:
>> I am experiencing a weird issue with a RAID 5+0 under dom0. I am
>> running Xen 3.2 from Debian Lenny, which has the 2.6.26-2-xen-amd64
>> dom0 kernel.
>> [...]
>
> Try with a different dom0 kernel:
> http://wiki.xensource.com/xenwiki/XenDom0Kernels
>
> I'd suggest linux-2.6.18-xen or some forward-port of it (2.6.31).
>
> If a different dom0 kernel doesn't help, then try emailing xen-devel with the info/quote above included.
>
> -- Pasi