Marco Marino
2014-Oct-03 18:07 UTC
[Gluster-users] Brick is not connected (and other funny things)
Hi,
I'm trying to use GlusterFS with my OpenStack private cloud for storing ephemeral disks. Each compute node mounts GlusterFS at /nova and saves instances on a remote GlusterFS volume (shared between the compute nodes, so live migration is very fast). I have two storage nodes (storage1 and storage2) with replica 2.

In a first configuration I used NFS on the clients. In /etc/fstab of the compute nodes I have:

storage1:/cloud_rootdisk /nova nfs mountproto=tcp,vers=3 0 0

This creates a single point of failure: if storage1 goes down, I have to remount manually on storage2, and that causes complete disk corruption of all the VMs running on all the compute nodes. Really funny...

In a second configuration I used the gluster native client with "backupvolfile-server=storage2". I've only made a few tests, but it seems to work. What I tested: on the compute node I have:

mount -t glusterfs -o backupvolfile-server=server2,fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log server1:/test-volume /gluster_mount

Then I booted a VM and started downloading a large file (1 GB) from inside the VM (so I'm writing to the ephemeral disk stored via GlusterFS). During the download I rebooted storage1 and the VM does not seem to be corrupted (so the VM wrote only to storage2). Can I have a confirmation of this? Is this the right way?

Next question: when I rebooted storage1, it failed to start; it told me that /dev/sdc1 (the partition I'm using for the test) was corrupted. That could be normal behaviour, since the server went down during a write. So I started storage1 in single-user mode and ran xfs_repair /dev/sdc1, which made storage1 bootable again. (yuppy)

GlusterFS starts correctly, but now I have "brick1 is not connected", where /export/brick1 is the brick I'm using on storage1 for the test volume. On storage2 everything is OK, but I have a split-brain condition, and /export/brick1 on storage1 doesn't contain any data...

What do I have to do to restore /export/brick1 on storage1?

Thanks
MM
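For reference, a minimal fstab version of the native-client mount above, assuming server1/server2 in the mount command are the same hosts as storage1/storage2 and that mount.glusterfs on the compute nodes accepts these options (a sketch, not a tested configuration):

    # /etc/fstab on each compute node -- gluster native client instead of NFS.
    # backupvolfile-server is only consulted when fetching the volume file at
    # mount time; once mounted, the FUSE client talks to both replica bricks.
    storage1:/test-volume  /gluster_mount  glusterfs  defaults,_netdev,backupvolfile-server=storage2,log-level=WARNING,log-file=/var/log/gluster.log  0 0

Because the volume is replica 2, the native client itself writes to both bricks, which is consistent with the VM continuing to write to storage2 while storage1 was rebooting; backupvolfile-server only matters if storage1 happens to be unreachable at mount time.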
Marco Marino
2014-Oct-08 16:09 UTC
[Gluster-users] Brick is not connected (and other funny things)
Can someone help me? I'd like to restore /export/brick1 on server1. At the moment I have data only on server2. I think the right steps are:

1) setfattr -n ... on server1 (this is a bug; more info here -> http://www.joejulian.name/blog/replacing-a-brick-on-glusterfs-340/ , I have the same error in my logs)
2) Restart the volume, after which I should see an automatic healing procedure
3) All the data gets replicated back to server1

Can I have a confirmation of this procedure? Are other volumes affected? Please, I cannot lose my data.

Thanks

2014-10-03 20:07 GMT+02:00 Marco Marino <marino.mrc at gmail.com>:
> [...]
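A sketch along the lines of the blog post linked above, assuming the affected volume is called test-volume (substitute the real name from gluster volume info) and that glusterd on server1 still has its configuration under /var/lib/glusterd; double-check the exact commands against your GlusterFS version:

    # On server1, after xfs_repair has left /export/brick1 empty
    VOL=test-volume          # assumption: the real volume name may differ
    BRICK=/export/brick1

    # 1) Put back the volume-id xattr the brick root lost, so glusterfsd
    #    accepts the brick again (the error described in Joe Julian's post)
    VOLID=$(grep volume-id /var/lib/glusterd/vols/$VOL/info | cut -d= -f2 | sed 's/-//g')
    setfattr -n trusted.glusterfs.volume-id -v 0x$VOLID $BRICK

    # 2) Restart the brick process
    gluster volume start $VOL force

    # 3) Trigger a full self-heal so the data is copied back from server2
    gluster volume heal $VOL full
    gluster volume heal $VOL info    # repeat until the list of entries drains

Only the volume with a brick on the repaired /dev/sdc1 should need this; healing then runs from server2's copy, which matches steps 2) and 3) above.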