Marco Marino
2014-Oct-03 18:07 UTC
[Gluster-users] Brick is not connected (and other funny things)
Hi,
I'm trying to use GlusterFS with my OpenStack private cloud for storing ephemeral disks. Each compute node mounts GlusterFS at /nova and saves instances on a remote GlusterFS volume (shared between the compute nodes, so live migration is very fast). I have two storage nodes (storage1 and storage2) with replica 2.

In a first configuration I used NFS on the clients. In /etc/fstab of the compute nodes I have:

storage1:/cloud_rootdisk /nova nfs mountproto=tcp,vers=3 0 0

This creates a single point of failure: if storage1 goes down, I have to remount manually on storage2, and that causes complete disk corruption of all the VMs running on all the compute nodes. Really funny...

In a second configuration I used the gluster native client with "backupvolfile-server=storage2". I've only made a few tests, but it seems to work. What I tested: on the compute node I have:

mount -t glusterfs -o backupvolfile-server=server2,fetch-attempts=2,log-level=WARNING,log-file=/var/log/gluster.log server1:/test-volume /gluster_mount

Then I booted a VM and started downloading a large file (1 GB) from inside the VM (so I'm writing to the ephemeral disk stored via GlusterFS). During the download I rebooted storage1 and the VM does not seem to be corrupted (so the VM wrote only to storage2). Can I have a confirmation of this? Is this the right way?

Next question: when I rebooted storage1, it failed to start; it told me that /dev/sdc1 (the partition I'm using for the test) was corrupted. That could be normal behaviour, since the server went down during a write. So I started storage1 in single-user mode and ran xfs_repair /dev/sdc1, which made storage1 bootable again. (yuppy)

GlusterFS starts correctly, but now I have "brick1 is not connected", where /export/brick1 is the brick I'm using on storage1 for the test volume. On storage2 everything is OK, but I have a split-brain condition, and /export/brick1 on storage1 doesn't contain any data...

What do I have to do to restore /export/brick1 on storage1?

Thanks
MM
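For reference, a minimal fstab version of the native-client mount above, assuming server1/server2 in the mount command are the same hosts as storage1/storage2 and that mount.glusterfs on the compute nodes accepts these options (a sketch, not a tested configuration):

    # /etc/fstab on each compute node -- gluster native client instead of NFS.
    # backupvolfile-server is only consulted when fetching the volume file at
    # mount time; once mounted, the FUSE client talks to both replica bricks.
    storage1:/test-volume  /gluster_mount  glusterfs  defaults,_netdev,backupvolfile-server=storage2,log-level=WARNING,log-file=/var/log/gluster.log  0 0

Because the volume is replica 2, the native client itself writes to both bricks, which is consistent with the VM continuing to write to storage2 while storage1 was rebooting; backupvolfile-server only matters if storage1 happens to be unreachable at mount time.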
Marco Marino
2014-Oct-08 16:09 UTC
[Gluster-users] Brick is not connected (and other funny things)
Can someone help me? I'd like to restore /export/brick1 on server1. At the moment I have data only on server2. I think the right steps are:

1) setfattr -n ... on server1 (this is a bug; more info here -> http://www.joejulian.name/blog/replacing-a-brick-on-glusterfs-340/ , I have the same error in my logs)
2) Restart the volume, after which I should see an automatic healing procedure
3) All the data gets replicated back to server1

Can I have a confirmation of this procedure? Are other volumes affected? Please, I cannot lose my data.

Thanks

2014-10-03 20:07 GMT+02:00 Marco Marino <marino.mrc at gmail.com>:
> [...]
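A sketch along the lines of the blog post linked above, assuming the affected volume is called test-volume (substitute the real name from gluster volume info) and that glusterd on server1 still has its configuration under /var/lib/glusterd; double-check the exact commands against your GlusterFS version:

    # On server1, after xfs_repair has left /export/brick1 empty
    VOL=test-volume          # assumption: the real volume name may differ
    BRICK=/export/brick1

    # 1) Put back the volume-id xattr the brick root lost, so glusterfsd
    #    accepts the brick again (the error described in Joe Julian's post)
    VOLID=$(grep volume-id /var/lib/glusterd/vols/$VOL/info | cut -d= -f2 | sed 's/-//g')
    setfattr -n trusted.glusterfs.volume-id -v 0x$VOLID $BRICK

    # 2) Restart the brick process
    gluster volume start $VOL force

    # 3) Trigger a full self-heal so the data is copied back from server2
    gluster volume heal $VOL full
    gluster volume heal $VOL info    # repeat until the list of entries drains

Only the volume with a brick on the repaired /dev/sdc1 should need this; healing then runs from server2's copy, which matches steps 2) and 3) above.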