Niels de Vos
2015-Jul-19 20:47 UTC
[Gluster-users] back to problems: gluster 3.5.4, qemu and debian 8
On Sat, Jul 18, 2015 at 03:56:37PM +0000, Michael Mol wrote:
> I think you'll find it's the write-behind that was killing you.
> Write-behind opens you up to a number of data consistency issues, and I
> strongly recommend against it unless you have a rock-solid infrastructure
> from the writer all the way to the disk the data ultimately sits on.

The suggestion to disable these two options was to change the access
pattern generated by Qemu+libgfapi. Without looking into the source code, I
do not know how write-behind and read-ahead interact. Anything that is
"written behind" should get flushed when a "read ahead" operation overlaps
a "written behind" area. There have been issues with write-behind before,
where the flushing was not done in some corner case
(https://github.com/gluster/glusterfs/commit/b0515e2a). Maybe this is
something similar.

> I bet that if you re-enabled read-ahead, you won't see the problem. Just
> leave write-behind off.

Indeed, write-behind is the most likely culprit. Results with each of the
two options disabled on its own would be interesting to have. Once it is
clear which option causes the problem, we can analyze the access pattern
and hopefully fix the xlator.

Thanks,
Niels

> On Sat, Jul 18, 2015, 10:44 AM Roman <romeo.r at gmail.com> wrote:
>
> Solved after I added (thanks to Niels de Vos) these options to the
> volumes:
>
>   performance.read-ahead: off
>   performance.write-behind: off
>
> 2015-07-15 17:23 GMT+03:00 Roman <romeo.r at gmail.com>:
>
> Hey,
> I've updated the bug; if someone has ideas, please share.
> https://bugzilla.redhat.com/show_bug.cgi?id=1242913
>
> 2015-07-14 19:14 GMT+03:00 Kaushal M <kshlmster at gmail.com>:
>
> Just a wild guess: what is the filesystem used for the Debian 8
> installation? It could be the culprit.
>
> On Tue, Jul 14, 2015 at 7:27 PM, Roman <romeo.r at gmail.com> wrote:
> > I did it this way: installed Debian 8 on local disks using the
> > netinstall ISO, created a template of it and then cloned it (full
> > clone) to the GlusterFS storage backend. The VM boots and runs fine...
> > until I start to install something massive (a desktop environment, for
> > example). Last time it was MATE that failed to install, due to
> > python-gtk2 package problems (complaining that it could not compile it).
> >
> > 2015-07-14 16:37 GMT+03:00 Scott Harvanek <scott.harvanek at login.com>:
> >>
> >> What happens if you install from a full CD and not a net-install?
> >>
> >> Limit the variables. Currently you are relying on remote mirrors and
> >> Internet connectivity.
> >>
> >> It's either a Proxmox or Debian issue; I really don't think it's
> >> Gluster. We have hundreds of Jessie installs running on GlusterFS
> >> backends.
> >>
> >> --
> >> Scott H.
> >> Login, LLC.
> >>
> >> Roman
> >> July 14, 2015 at 9:30 AM
> >> Hey,
> >>
> >> Thanks for the reply.
> >> If it were networking related, it would affect everything, but it is
> >> only Debian 8 that won't install.
> >> And yes, I did an iperf test between the Gluster and Proxmox nodes;
> >> it's OK. Installation fails on every node where I try to install
> >> Debian 8. Sometimes it goes well (today 1 of 6 tries was fine). Other
> >> distros install fine. Sometimes the installation process finishes, but
> >> the VM won't start and just hangs with errors like the attached.
> >>
> >> --
> >> Best regards,
> >> Roman.
> >>
> >> Scott Harvanek
> >> July 14, 2015 at 9:17 AM
> >> We don't have this issue, but I'll take a stab at it:
> >>
> >> Have you confirmed everything is good on the network side of things?
> >> MTU/loss/errors?
> >>
> >> Is your inconsistency linked to one specific brick? Have you tried
> >> running a replica instead of distributed?
> >>
> >> _______________________________________________
> >> Gluster-users mailing list
> >> Gluster-users at gluster.org
> >> http://www.gluster.org/mailman/listinfo/gluster-users
> >>
> >> Roman
> >> July 14, 2015 at 6:38 AM
> >> Here is one example of the errors. It's as if the files that the
> >> Debian installer copies to the virtual disk located on GlusterFS
> >> storage are getting corrupted. in-target is /dev/vda1.
> >>
> >> --
> >> Best regards,
> >> Roman.
> >>
> >> Roman
> >> July 14, 2015 at 4:50 AM
> >> Ubuntu 14.04 LTS base install and then a MATE install were fine!
> >>
> >> --
> >> Best regards,
> >> Roman.
> >>
> >> Roman
> >> July 13, 2015 at 7:35 PM
> >> Bah... the randomness of this issue is killing me.
> >> Not only HA volumes are affected: I got an error during installation
> >> of Debian 8 with MATE (on the python-gtk2 package) on a Distributed
> >> volume as well. I've checked the MD5SUM of the installation ISO; it's
> >> OK.
> >>
> >> Shortly after that, on the same VE node, I installed Debian 7 with
> >> GNOME without any problem on the HA GlusterFS volume.
> >>
> >> And on the same VE node I installed Debian 8 with both MATE and GNOME
> >> using local storage disks without problems. There is a bug somewhere
> >> in Gluster or QEMU... Proxmox uses a RH kernel, by the way:
> >>
> >>   Linux services 2.6.32-37-pve
> >>   QEMU emulator version 2.2.1
> >>   glusterfs 3.6.4
> >>
> >> Any ideas? I'm ready to help investigate this bug.
> >> When the sun shines again, I'll try to install the latest Ubuntu too.
> >> But now I'm going to sleep.
> >>
> >> --
> >> Best regards,
> >> Roman.
> >
> > --
> > Best regards,
> > Roman.
>
> --
> Best regards,
> Roman.
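The one-option-at-a-time isolation test Niels asks for can be sketched as a short script. This is a hedged illustration, not a tested procedure from the thread: the volume name `vm-storage` is a placeholder, and the block is guarded so it does nothing on a machine without the Gluster CLI.

```shell
# Sketch of the isolation test: re-enable read-ahead while keeping
# write-behind disabled, then retry the Debian 8 installation in the VM.
# "vm-storage" is a hypothetical volume name -- substitute your own.
VOL=vm-storage

if command -v gluster >/dev/null 2>&1; then
    gluster volume set "$VOL" performance.read-ahead on
    gluster volume set "$VOL" performance.write-behind off

    # Check the result under "Options Reconfigured" in the volume info.
    gluster volume info "$VOL"
else
    echo "gluster CLI not found; run this on a storage node"
fi

# A separate run would then try the opposite combination
# (read-ahead off, write-behind on) to confirm which xlator is at fault.
```

Whichever combination reproduces the corruption points at the translator whose access pattern needs analysis.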
Roman
2015-Jul-19 22:53 UTC
[Gluster-users] back to problems: gluster 3.5.4, qemu and debian 8
Thanks for your reply.
I will test these options as soon as I'm back from my vacation (two weeks
from now). I'll be too far from the servers to change anything, even on a
testing volume. 8)

2015-07-19 23:47 GMT+03:00 Niels de Vos <ndevos at redhat.com>:

> Indeed, write-behind is the most likely culprit. Results with each of the
> two options disabled on its own would be interesting to have. Once it is
> clear which option causes the problem, we can analyze the access pattern
> and hopefully fix the xlator.
>
> Thanks,
> Niels

--
Best regards,
Roman.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150720/5d0d9c41/attachment.html>
Roman
2015-Aug-04 13:06 UTC
[Gluster-users] back to problems: gluster 3.5.4, qemu and debian 8
Hi all,

I'm back and have tested those settings. Michael was right: I re-enabled
the read-ahead option and nothing changed. So the setting that causes the
problem with libgfapi and the Debian 8 virtio drivers is
performance.write-behind. With it off, everything works perfectly; if I
set it on, various problems result (which were confirmed by another user
with the same configuration).

2015-07-20 1:53 GMT+03:00 Roman <romeo.r at gmail.com>:

> Thanks for your reply.
> I will test these options as soon as I'm back from my vacation (two weeks
> from now). I'll be too far from the servers to change anything, even on a
> testing volume. 8)

--
Best regards,
Roman.
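Roman's conclusion reduces the workaround to a single setting: read-ahead can stay enabled, only write-behind needs to be turned off. A minimal sketch follows; the volume name and image path are hypothetical, and the block is guarded so it is a no-op without the Gluster CLI.

```shell
# Workaround distilled from this thread: disable only write-behind on the
# volume that backs the VM images. "vm-storage" is a placeholder name.
VOL=vm-storage

if command -v gluster >/dev/null 2>&1; then
    gluster volume set "$VOL" performance.write-behind off
else
    echo "gluster CLI not found; run this on a storage node"
fi

# QEMU reaches the volume over libgfapi via a gluster:// drive URL,
# e.g. (hypothetical host and image name):
#   -drive file=gluster://gluster-host/vm-storage/debian8.qcow2,if=virtio
```

Note that disabling write-behind trades write latency for consistency; it is a workaround for the suspected xlator bug, not a general tuning recommendation.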