Justice London
2009-Aug-06 00:01 UTC
[Gluster-users] 'Primary' brick outage or reboot issues
If the first brick in a replicated/distributed configuration is rebooted or suffers some sort of temporary issue, two things appear to happen: the brick does not seem to be dropped from the cluster after 10 seconds, and after it comes back up, pending transactions have issues for the next 10 minutes or so. Is this a locks issue, or is this a bug?

Justice London
E-mail: jlondon at lawinfo.com
----- "Justice London" <jlondon at lawinfo.com> wrote:
> It appears that if the first brick in a replicated/distributed
> configuration is rebooted or suffers some sort of a temporary issue, it both means
> that the system doesn't appear to be dropped after 10 seconds from the
> cluster and also that after it comes back up, pending transactions have issues
> for the next 10 minutes or so. Is this a locks issue or is this a bug?

If the first subvolume silently goes down (without resetting the connection), then an 'ls' will hang for 10 seconds (this is the "ping-pong" timeout), because replicate will not notice until then that the server has failed. Other operations should work fine, though.

Can you elaborate on what you mean by 'pending transactions' and what kind of issues they face?

Vikas
--
Engineer - http://gluster.com/
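
The 10-second hang described in the reply can be illustrated with a toy model of a ping-based failure detector. This is only a sketch of the general technique, not GlusterFS code: the function name `simulate`, its parameters, and the 1-second ping interval are assumptions made for illustration. Only the 10-second timeout comes from the thread.

```python
def simulate(fail_at, timeout=10, ping_interval=1, horizon=60):
    """Return the time at which a client declares a silent server dead.

    Hypothetical model: the client pings every `ping_interval` seconds.
    The server answers until `fail_at`, then goes silent without
    resetting the connection. The client gives up once no reply has
    been seen for `timeout` seconds.
    """
    last_reply = 0
    t = 0
    while t <= horizon:
        if t < fail_at:
            # Server is still alive and answers this ping.
            last_reply = t
        elif t - last_reply >= timeout:
            # No reply for `timeout` seconds: declare the server down.
            return t
        t += ping_interval
    return None  # failure not detected within the simulated horizon


detected_at = simulate(fail_at=3.5)
# An operation issued right after the failure blocks until detection.
blocked_for = detected_at - 4
print(detected_at, blocked_for)
```

In this model the failure is only noticed a full timeout after the last successful reply, so an operation routed to the dead server shortly after it fails blocks for close to the whole timeout, which matches the behaviour Vikas describes for 'ls'.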