Lindsay Mathieson
2016-Jan-23 03:50 UTC
[Gluster-users] More Peculiar heal behaviour after removing brick
Maybe I'm doing something wrong here but I'm not sure what, or maybe this
is normal behaviour?
All of the following is performed from my vna node, which has the
highest-numbered UUID. Indenting applied by me for readability.
Spoiler, because it happens at the end: removing the vng brick followed
by a full heal gives this error:
"*Commit failed on vng.proxmox.softlog. Please check log file for
details."*
Steps to recreate:
1. Create a test volume:
vna$ gluster volume create test3 rep 3 transport tcp
vnb.proxmox.softlog:/vmdata/test3 vng.proxmox.softlog:/vmdata/test3
vna.proxmox.softlog:/vmdata/test3
vna$ gluster volume set test3 group softlog
vna$ gluster volume info test3
Volume Name: test3
Type: Replicate
Volume ID: 0be89d63-775c-4eb5-9d98-0a4a87f30fbf
Status: Created
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/vmdata/test3
Brick2: vng.proxmox.softlog:/vmdata/test3
Brick3: vna.proxmox.softlog:/vmdata/test3
Options Reconfigured:
cluster.data-self-heal-algorithm: full
network.remote-dio: enable
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.stat-prefetch: off
performance.strict-write-ordering: on
performance.write-behind: off
nfs.enable-ino32: off
nfs.addr-namelookup: off
nfs.disable: on
performance.cache-refresh-timeout: 4
performance.io-thread-count: 32
performance.low-prio-threads: 32
cluster.server-quorum-type: server
cluster.quorum-type: auto
client.event-threads: 4
server.event-threads: 4
cluster.self-heal-window-size: 256
features.shard-block-size: 512MB
features.shard: on
performance.readdir-ahead: off
vna$ gluster volume start test3
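Not something I captured above, but as a quick sanity check before the
next step, confirming that all three brick processes and the self-heal
daemon are online is just:

vna$ gluster volume status test3
vna$ gluster peer status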
2. Immediately remove the vng brick:
vna$ gluster volume remove-brick test3 replica 2
vng.proxmox.softlog:/vmdata/test3 force
vna$ gluster volume info test3
Volume Name: test3
Type: Replicate
Volume ID: 36421a23-68c4-455d-8d4c-e21d9428e1da
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: vnb.proxmox.softlog:/vmdata/test3
Brick2: vna.proxmox.softlog:/vmdata/test3
Options Reconfigured:
cluster.data-self-heal-algorithm: full
network.remote-dio: enable
cluster.eager-lock: enable
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.stat-prefetch: off
performance.strict-write-ordering: on
performance.write-behind: off
nfs.enable-ino32: off
nfs.addr-namelookup: off
nfs.disable: on
performance.cache-refresh-timeout: 4
performance.io-thread-count: 32
performance.low-prio-threads: 32
cluster.server-quorum-type: server
cluster.quorum-type: auto
client.event-threads: 4
server.event-threads: 4
cluster.self-heal-window-size: 256
features.shard-block-size: 512MB
features.shard: on
performance.readdir-ahead: off
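Again, not from the original run, but to double-check that glusterd on
each node agrees the volume is now down to two bricks, something like
this works (the /var/lib/glusterd path is the default location glusterd
keeps its volume definitions in):

vng$ gluster volume info test3
# glusterd's on-disk view of the volume's bricks, default workdir assumed
vng$ ls /var/lib/glusterd/vols/test3/bricks/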
3. Then run a full heal:
vna$ gluster volume heal test3 full
Commit failed on vng.proxmox.softlog. Please check log file for details.
Weird, because of course the vng brick has been removed. This happens
every time.
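The "Commit failed" message comes from glusterd rather than the self-heal
daemon, so the glusterd log on vng is probably the place to dig. On a
default install that is /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
(not something I checked in this run), and something like this pulls out
the recent error lines:

vng$ grep ' E \[' /var/log/glusterfs/etc-glusterfs-glusterd.vol.log | tail -n 20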
I have preserved the glustershd logs from vna & vng if needed. There
were no heal logs.
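If it saves a round trip, this is roughly how I'd bundle them up from
each node for sharing; the archive name is just an example and the paths
assume the default /var/log/glusterfs layout:

vna$ tar czf ~/test3-logs-vna.tar.gz \
    /var/log/glusterfs/glustershd.log \
    /var/log/glusterfs/etc-glusterfs-glusterd.vol.log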
--
Lindsay Mathieson
Krutika Dhananjay
2016-Jan-25 07:40 UTC
[Gluster-users] More Peculiar heal behaviour after removing brick
Could you share the logs? I'd like to look at the glustershd logs and
etc-glusterfs-glusterd.vol.log files.

-Krutika
Anuradha Talur
2016-Jan-25 07:41 UTC
[Gluster-users] More Peculiar heal behaviour after removing brick
> vna$ gluster volume heal test3 full
> Commit failed on vng.proxmox.softlog. Please check log file for details.

Hi,

Could you provide glusterd logs from vna? It will be in the same
directory as glustershd logs.

--
Thanks,
Anuradha.