Lindsay Mathieson
2016-Oct-19 21:27 UTC
[Gluster-users] [URGENT] Add-bricks to a volume corrupted the files
On 20/10/2016 7:01 AM, Kevin Lemonnier wrote:
> Yes, you need to add a full replica set at once.
> I don't remember, but according to my history, looks like I've used this:
>
> gluster volume add-brick VMs host1:/brick host2:/brick host3:/brick force
>
> (I have the same without force just before that, so I assume force is needed)

Ok, I did a:

gluster volume add-brick datastore1 vna.proxmox.softlog:/tank/vmdata/datastore1-2 vnb.proxmox.softlog:/tank/vmdata/datastore1-2 vng.proxmox.softlog:/tank/vmdata/datastore1-2

I had added a 2nd Windows VM as well.

It looked like it was going ok for a while, then blew up. The first Windows VM, which was running diskmark, died and won't boot; qemu-img check shows the image hopelessly corrupted. The 2nd VM has also crashed and is unbootable, though qemu-img shows its qcow2 file as ok.

I have a sneaking suspicion it's related to active IO. VM1 was doing heavy IO compared to VM2; perhaps that's why its image was corrupted worse.

The rebalance status looks odd to me:

root@vna:~# gluster volume rebalance datastore1 status
                Node  Rebalanced-files    size  scanned  failures  skipped       status  run time in h:m:s
           ---------  ----------------  ------  -------  --------  -------  -----------  -----------------
           localhost                 0  0Bytes        0         0        0    completed              0:0:1
 vnb.proxmox.softlog                 0  0Bytes        0         0        0    completed              0:0:1
 vng.proxmox.softlog               328  19.2GB     1440         0        0  in progress            0:11:55

Don't know why vng is taking so much longer; the nodes are identical. But maybe this is normal?

When I get time, I'll try again with:

- all VMs shut down (no IO)
- all VMs running off the gluster FUSE mount (no gfapi)

cheers,

--
Lindsay Mathieson
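For anyone wanting to reproduce this, the sequence above boils down to roughly the following (a sketch; the heal check is an extra sanity step I'd suggest, and the qcow2 path is just an example, substitute a real image on the volume):

    # check volume layout and any pending heals before expanding
    gluster volume info datastore1
    gluster volume heal datastore1 info

    # add a full replica set at once, per Kevin's suggestion
    gluster volume add-brick datastore1 \
        vna.proxmox.softlog:/tank/vmdata/datastore1-2 \
        vnb.proxmox.softlog:/tank/vmdata/datastore1-2 \
        vng.proxmox.softlog:/tank/vmdata/datastore1-2

    # start the rebalance and watch it (status output as shown above)
    gluster volume rebalance datastore1 start
    gluster volume rebalance datastore1 status

    # afterwards, verify image integrity
    qemu-img check /path/to/vm-disk.qcow2   # example path only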
Kevin Lemonnier
2016-Oct-19 21:39 UTC
[Gluster-users] [URGENT] Add-bricks to a volume corrupted the files
> Looked like it was going ok for a while, then blew up. The first windows
> vm which was running diskmark died and won't boot. qemu-img check shows
> the image hopelessly corrupted. 2nd VM has also crashed and is
> unbootable, though qemu-img shows the qcow2 file as ok.

Ha, glad you could reproduce this! (Well, all things considered.) Looks very much like what I had indeed. So it's still a problem in recent versions; glad I didn't try again then. Thanks for taking the time, let's hope that'll help them :)

--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
Krutika Dhananjay
2016-Oct-20 11:13 UTC
[Gluster-users] [URGENT] Add-bricks to a volume corrupted the files
Thanks a lot, Lindsay! Appreciate the help.

It would be awesome if you could tell us whether you see the issue with FUSE as well, while we get around to setting up the environment and running the test ourselves.

-Krutika

On Thu, Oct 20, 2016 at 2:57 AM, Lindsay Mathieson <lindsay.mathieson at gmail.com> wrote:
> [...]
>
> When I get time, I'll try again with:
>
> - all VMs shut down (no IO)
> - all VMs running off the gluster FUSE mount (no gfapi)
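For the FUSE side of that test, running the VMs through the gluster FUSE client instead of gfapi would look something like this (a sketch; the mount point and image path are examples, not from the setup above):

    # mount the volume via the gluster FUSE client,
    # with the other two nodes as fallback volfile servers
    mkdir -p /mnt/datastore1
    mount -t glusterfs \
        -o backup-volfile-servers=vnb.proxmox.softlog:vng.proxmox.softlog \
        vna.proxmox.softlog:/datastore1 /mnt/datastore1

    # point the VMs at images under the mount rather than gluster:// URIs,
    # then check integrity the same way
    qemu-img check /mnt/datastore1/images/vm-disk.qcow2   # example path only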