Okay, here's what I did;

Volume Name: v
Type: Distributed-Replicate
Volume ID: b348fd8e-b117-469d-bcc0-56a56bdfc930
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: gfs001:/bricks/b001/v
Brick2: gfs001:/bricks/b002/v
Brick3: gfs001:/bricks/b003/v
Brick4: gfs002:/bricks/b004/v
Brick5: gfs002:/bricks/b005/v
Brick6: gfs002:/bricks/b006/v
Options Reconfigured:
features.shard-block-size: 128MB
features.shard: enable
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.readdir-ahead: on

Same error, and mounting it using glusterfs still works just fine.

Respectfully,
Mahdi A. Mahdi <mahdi.adnan at outlook.com>

On 03/15/2016 11:04 AM, Krutika Dhananjay wrote:
> OK, but what if you use it with replication? Do you still see the
> error? I think not.
> Could you give it a try and tell me what you find?
>
> -Krutika
>
> On Tue, Mar 15, 2016 at 1:23 PM, Mahdi Adnan
> <mahdi.adnan at earthlinktele.com> wrote:
>
> Hi,
>
> I have created the following volume;
>
> Volume Name: v
> Type: Distribute
> Volume ID: 90de6430-7f83-4eda-a98f-ad1fabcf1043
> Status: Started
> Number of Bricks: 3
> Transport-type: tcp
> Bricks:
> Brick1: gfs001:/bricks/b001/v
> Brick2: gfs001:/bricks/b002/v
> Brick3: gfs001:/bricks/b003/v
> Options Reconfigured:
> features.shard-block-size: 128MB
> features.shard: enable
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> performance.readdir-ahead: on
>
> After mounting it in ESXi and trying to clone a VM to it, I got the
> same error.
>
> Respectfully,
> Mahdi A. Mahdi
>
> On 03/15/2016 10:44 AM, Krutika Dhananjay wrote:
>> Hi,
>>
>> Do not use sharding and stripe together in the same volume, because
>> a) It is not recommended and there is no point in using both.
>> Using sharding alone on your volume should work fine.
>> b) Nobody has tested it.
>> c) Like Niels said, the stripe feature is virtually deprecated.
>>
>> I would suggest that you create an nx3 volume, where n is the
>> number of distribute subvols you prefer, enable the group virt
>> options on it, enable sharding on it,
>> set the shard-block-size that you feel is appropriate, and then just
>> start off with VM image creation etc.
>> If you run into any issues even after you do this, let us know
>> and we'll help you out.
>>
>> -Krutika
>>
>> On Tue, Mar 15, 2016 at 1:07 PM, Mahdi Adnan
>> <mahdi.adnan at earthlinktele.com> wrote:
>>
>> Thanks Krutika,
>>
>> I have deleted the volume and created a new one.
>> I found that it may be an issue with NFS itself: I created a new
>> striped volume with sharding enabled and mounted it via glusterfs,
>> and it worked just fine; if I mount it with nfs it fails and gives
>> me the same errors.
>>
>> Respectfully,
>> Mahdi A. Mahdi
>>
>> On 03/15/2016 06:24 AM, Krutika Dhananjay wrote:
>>> Hi,
>>>
>>> So could you share the xattrs associated with the file at
>>> <BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>>
>>> Here's what you need to execute:
>>>
>>> # getfattr -d -m . -e hex
>>> /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>> on the first node, and
>>>
>>> # getfattr -d -m . -e hex
>>> /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>> on the second.
>>>
>>> Also, it is normally advised to use a replica 3 volume as
>>> opposed to a replica 2 volume to guard against split-brains.
>>>
>>> -Krutika
>>>
>>> On Mon, Mar 14, 2016 at 3:17 PM, Mahdi Adnan
>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>> Sorry for serial posting, but I got new logs that might help.
>>>
>>> This message appears during the migration in
>>> /var/log/glusterfs/nfs.log:
>>>
>>> [2016-03-14 09:45:04.573765] I [MSGID: 109036]
>>> [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal]
>>> 0-testv-dht: Setting layout of /New Virtual Machine_1
>>> with [Subvol_name: testv-stripe-0, Err: -1 , Start: 0 ,
>>> Stop: 4294967295 , Hash: 1 ],
>>> [2016-03-14 09:45:04.957499] E
>>> [shard.c:369:shard_modify_size_and_block_count]
>>> (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f)
>>> [0x7f27a13c067f]
>>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc)
>>> [0x7f27a116681c]
>>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd)
>>> [0x7f27a116584d] ) 0-testv-shard: Failed to get
>>> trusted.glusterfs.shard.file-size for
>>> c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>>> [2016-03-14 09:45:04.957577] W [MSGID: 112199]
>>> [nfs3-helpers.c:3418:nfs3_log_common_res] 0-nfs-nfsv3:
>>> /New Virtual Machine_1/New Virtual Machine-flat.vmdk =>
>>> (XID: 3fec5a26, SETATTR: NFS: 22(Invalid argument for
>>> operation), POSIX: 22(Invalid argument)) [Invalid argument]
>>> [2016-03-14 09:45:05.079657] E [MSGID: 112069]
>>> [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No such
>>> file or directory: (192.168.221.52:826) testv :
>>> 00000000-0000-0000-0000-000000000001
>>>
>>> Respectfully,
>>> Mahdi A. Mahdi
>>>
>>> On 03/14/2016 11:14 AM, Mahdi Adnan wrote:
>>>> So I have deployed a new server "Cisco UCS C220M4" and
>>>> created a new volume;
>>>>
>>>> Volume Name: testv
>>>> Type: Stripe
>>>> Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b
>>>> Status: Started
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: 10.70.0.250:/mnt/b1/v
>>>> Brick2: 10.70.0.250:/mnt/b2/v
>>>> Options Reconfigured:
>>>> nfs.disable: off
>>>> features.shard-block-size: 64MB
>>>> features.shard: enable
>>>> cluster.server-quorum-type: server
>>>> cluster.quorum-type: auto
>>>> network.remote-dio: enable
>>>> cluster.eager-lock: enable
>>>> performance.stat-prefetch: off
>>>> performance.io-cache: off
>>>> performance.read-ahead: off
>>>> performance.quick-read: off
>>>> performance.readdir-ahead: off
>>>>
>>>> Same error ..
>>>>
>>>> Can anyone share with me the info of a working striped volume?
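For reference, a volume along the lines Krutika recommends above (replica-3 subvolumes, the packaged "virt" option group, and sharding instead of stripe) could be created roughly as follows. This is only a sketch: the volume name, hostnames, brick paths and shard block size are illustrative placeholders, not values taken from this thread.

# gluster volume create vmstore replica 3 gfs001:/bricks/b001/vmstore gfs002:/bricks/b002/vmstore gfs003:/bricks/b003/vmstore
# gluster volume set vmstore group virt
# gluster volume set vmstore features.shard on
# gluster volume set vmstore features.shard-block-size 64MB
# gluster volume start vmstore

Additional replica-3 brick sets can be appended to the create command (in multiples of three) to reach the nx3 Distributed-Replicate layout; "gluster volume info vmstore" should then list the shard options under Options Reconfigured.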
>>>> On 03/14/2016 09:02 AM, Mahdi Adnan wrote:
>>>>> I have a pool of two bricks in the same server;
>>>>>
>>>>> Volume Name: k
>>>>> Type: Stripe
>>>>> Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b
>>>>> Status: Started
>>>>> Number of Bricks: 1 x 2 = 2
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: gfs001:/bricks/t1/k
>>>>> Brick2: gfs001:/bricks/t2/k
>>>>> Options Reconfigured:
>>>>> features.shard-block-size: 64MB
>>>>> features.shard: on
>>>>> cluster.server-quorum-type: server
>>>>> cluster.quorum-type: auto
>>>>> network.remote-dio: enable
>>>>> cluster.eager-lock: enable
>>>>> performance.stat-prefetch: off
>>>>> performance.io-cache: off
>>>>> performance.read-ahead: off
>>>>> performance.quick-read: off
>>>>> performance.readdir-ahead: off
>>>>>
>>>>> Same issue ...
>>>>> glusterfs 3.7.8 built on Mar 10 2016 20:20:45.
>>>>>
>>>>> Respectfully,
>>>>> Mahdi A. Mahdi
>>>>>
>>>>> Systems Administrator
>>>>> IT Department
>>>>> Earthlink Telecommunications <https://www.facebook.com/earthlinktele>
>>>>>
>>>>> Cell: 07903316180
>>>>> Work: 3352
>>>>> Skype: mahdi.adnan at outlook.com
>>>>>
>>>>> On 03/14/2016 08:11 AM, Niels de Vos wrote:
>>>>>> On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
>>>>>>> It would be better to use sharding over stripe for your VM use case. It
>>>>>>> offers better distribution and utilisation of bricks and better heal
>>>>>>> performance.
>>>>>>> And it is well tested.
>>>>>>
>>>>>> Basically the "striping" feature is deprecated; "sharding" is its
>>>>>> improved replacement. I expect to see "striping" completely dropped in
>>>>>> the next major release.
>>>>>>
>>>>>> Niels
>>>>>>
>>>>>>> A couple of things to note before you do that:
>>>>>>> 1. Most of the bug fixes in sharding have gone into 3.7.8. So it is advised
>>>>>>> that you use 3.7.8 or above.
>>>>>>> 2. When you enable sharding on a volume, already existing files in the
>>>>>>> volume do not get sharded. Only the files that are newly created from the
>>>>>>> time sharding is enabled will.
>>>>>>> If you do want to shard the existing files, then you would need to cp
>>>>>>> them to a temp name within the volume, and then rename them back to the
>>>>>>> original file name.
>>>>>>>
>>>>>>> HTH,
>>>>>>> Krutika
>>>>>>>
>>>>>>> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan
>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>> I couldn't find anything related to cache in the HBAs.
>>>>>>>> What logs are useful in my case? I see only the brick logs, which contain
>>>>>>>> nothing during the failure.
>>>>>>>>
>>>>>>>> ###
>>>>>>>> [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>> 0-vmware-posix: mknod on
>>>>>>>> /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed
>>>>>>>> [File exists]
>>>>>>>> [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod]
>>>>>>>> 0-vmware-posix: mknod on
>>>>>>>> /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed
>>>>>>>> [File exists]
>>>>>>>> [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash:
>>>>>>>> rmdir issued on /.trashcan/, which is not permitted
>>>>>>>> [2016-03-13 18:07:55.027635] I [MSGID: 115056]
>>>>>>>> [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR
>>>>>>>> /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op)
>>>>>>>> ==> (Operation not permitted) [Operation not permitted]
>>>>>>>> [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>> [2016-03-13 18:11:34.353463] I [MSGID: 115029]
>>>>>>>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>> from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version:
>>>>>>>> 3.7.8)
>>>>>>>> [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed
>>>>>>>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>>> [2016-03-13 18:11:34.591173] I [MSGID: 115029]
>>>>>>>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client
>>>>>>>> from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version:
>>>>>>>> 3.7.8)
>>>>>>>> ###
>>>>>>>>
>>>>>>>> ESXi just keeps telling me "Cannot clone T: The virtual disk is either
>>>>>>>> corrupted or not a supported format.
>>>>>>>> error
>>>>>>>> 3/13/2016 9:06:20 PM
>>>>>>>> Clone virtual machine
>>>>>>>> T
>>>>>>>> VCENTER.LOCAL\Administrator
>>>>>>>> "
>>>>>>>>
>>>>>>>> My setup is 2 servers with a floating IP controlled by CTDB, and my ESXi
>>>>>>>> server mounts the NFS via the floating IP.
>>>>>>>>
>>>>>>>> On 03/13/2016 08:40 PM, pkoelle wrote:
>>>>>>>>> On 13.03.2016 at 18:22, David Gossage wrote:
>>>>>>>>>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan
>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> My HBAs are LSISAS1068E, and the filesystem is XFS.
>>>>>>>>>>> I tried EXT4 and it did not help.
>>>>>>>>>>> I have created a striped volume in one server with two bricks, same
>>>>>>>>>>> issue.
>>>>>>>>>>> And I tried a replicated volume with just "sharding enabled", same
>>>>>>>>>>> issue; as soon as I disable the sharding it works just fine. Neither
>>>>>>>>>>> sharding nor striping works for me.
>>>>>>>>>>> I did follow up with some of the threads in the mailing list and tried
>>>>>>>>>>> some of the fixes that worked for the others; none worked for me. :(
>>>>>>>>>>>
>>>>>>>>>> Is it possible the LSI has write-cache enabled?
>>>>>>>>>>
>>>>>>>>> Why is that relevant? Even the backing filesystem has no idea if there is
>>>>>>>>> a RAID or write cache or whatever. There are blocks and sync(), end of
>>>>>>>>> story.
>>>>>>>>> If you lose power and screw up your recovery OR do funky stuff with SAS
>>>>>>>>> multipathing, that might be an issue with a controller cache. AFAIK that's
>>>>>>>>> not what we are talking about.
>>>>>>>>>
>>>>>>>>> I'm afraid that unless the OP has some logs from the server, a
>>>>>>>>> reproducible testcase or a backtrace from client or server, this isn't
>>>>>>>>> getting us anywhere.
>>>>>>>>>
>>>>>>>>> cheers
>>>>>>>>> Paul
>>>>>>>>>
>>>>>>>>>> On 03/13/2016 06:54 PM, David Gossage wrote:
>>>>>>>>>>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan
>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Okay, so I have enabled shard in my test volume and it did not help;
>>>>>>>>>>>> stupidly enough, I have enabled it in a production volume
>>>>>>>>>>>> "Distributed-Replicate" and it corrupted half of my VMs.
>>>>>>>>>>>> I have updated Gluster to the latest and nothing seems to have changed
>>>>>>>>>>>> in my situation.
>>>>>>>>>>>> Below is the info of my volume;
>>>>>>>>>>>>
>>>>>>>>>>> I was pointing at the settings in that email as an example for
>>>>>>>>>>> corruption fixing. I wouldn't recommend enabling sharding if you
>>>>>>>>>>> haven't gotten the base working yet on that cluster. What HBAs are you
>>>>>>>>>>> using and what is the layout of the filesystem for the bricks?
>>>>>>>>>>>
>>>>>>>>>>>> Number of Bricks: 3 x 2 = 6
>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>> Bricks:
>>>>>>>>>>>> Brick1: gfs001:/bricks/b001/vmware
>>>>>>>>>>>> Brick2: gfs002:/bricks/b004/vmware
>>>>>>>>>>>> Brick3: gfs001:/bricks/b002/vmware
>>>>>>>>>>>> Brick4: gfs002:/bricks/b005/vmware
>>>>>>>>>>>> Brick5: gfs001:/bricks/b003/vmware
>>>>>>>>>>>> Brick6: gfs002:/bricks/b006/vmware
>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>> performance.strict-write-ordering: on
>>>>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>>>>> network.remote-dio: enable
>>>>>>>>>>>> performance.stat-prefetch: disable
>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>> features.shard-block-size: 16MB
>>>>>>>>>>>> features.shard: on
>>>>>>>>>>>> performance.readdir-ahead: off
>>>>>>>>>>>>
>>>>>>>>>>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan
>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Both servers have HBAs, no RAID, and I can set up replicated or
>>>>>>>>>>>>> dispersed volumes without any issues.
>>>>>>>>>>>>> Logs are clean, and when I tried to migrate a VM and got the error,
>>>>>>>>>>>>> nothing showed up in the logs.
>>>>>>>>>>>>> I tried mounting the volume on my laptop and it mounted fine, but
>>>>>>>>>>>>> if I use dd to create a data file it just hangs and I can't cancel
>>>>>>>>>>>>> it, and I can't unmount it or anything; I just have to reboot.
>>>>>>>>>>>>> The same servers have another volume on other bricks in a distributed
>>>>>>>>>>>>> replica, and it works fine.
>>>>>>>>>>>>> I have even tried the same setup in a virtual environment (created two
>>>>>>>>>>>>> VMs, installed gluster and created a replicated striped volume) and
>>>>>>>>>>>>> again the same thing, data corruption.
>>>>>>>>>>>>>
>>>>>>>>>>>> I'd look through the mail archives for a topic "Shard in Production" I
>>>>>>>>>>>> think it's called.
>>>>>>>>>>>> The shard portion may not be relevant, but it does discuss
>>>>>>>>>>>> certain settings that had to be applied with regard to avoiding
>>>>>>>>>>>> corruption with VMs. You may want to try disabling
>>>>>>>>>>>> performance.readdir-ahead as well.
>>>>>>>>>>>>
>>>>>>>>>>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan
>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks David,
>>>>>>>>>>>>>> My settings are all defaults; I have just created the pool and
>>>>>>>>>>>>>> started it.
>>>>>>>>>>>>>> I have set the settings as you recommended and it seems to be the
>>>>>>>>>>>>>> same issue;
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Type: Striped-Replicate
>>>>>>>>>>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>>>>>>>>>>> Status: Started
>>>>>>>>>>>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>>> Brick1: gfs001:/bricks/t1/s
>>>>>>>>>>>>>> Brick2: gfs002:/bricks/t1/s
>>>>>>>>>>>>>> Brick3: gfs001:/bricks/t2/s
>>>>>>>>>>>>>> Brick4: gfs002:/bricks/t2/s
>>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a raid controller perhaps doing any caching?
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the gluster logs, are any errors being reported during the
>>>>>>>>>>>>> migration process?
>>>>>>>>>>>>> Since they aren't in use yet, have you tested making just mirrored
>>>>>>>>>>>>> bricks using different pairings of servers, two at a time, to see if
>>>>>>>>>>>>> the problem follows a certain machine or network ports?
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan
>>>>>>>>>>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>>> I have created a replicated striped volume with two bricks and two
>>>>>>>>>>>>>>> servers, but I can't use it because when I mount it in ESXi and try
>>>>>>>>>>>>>>> to migrate a VM to it, the data gets corrupted.
>>>>>>>>>>>>>>> Does anyone have any idea why this is happening?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dell 2950 x2
>>>>>>>>>>>>>>> Seagate 15k 600GB
>>>>>>>>>>>>>>> CentOS 7.2
>>>>>>>>>>>>>>> Gluster 3.7.8
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Most reports of this I have seen end up being settings related. Post
>>>>>>>>>>>>>> gluster volume info. Below is what I have seen as the most commonly
>>>>>>>>>>>>>> recommended settings.
>>>>>>>>>>>>>> I'd hazard a guess you may have some of the read-ahead cache or
>>>>>>>>>>>>>> prefetch on.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> quick-read=off
>>>>>>>>>>>>>> read-ahead=off
>>>>>>>>>>>>>> io-cache=off
>>>>>>>>>>>>>> stat-prefetch=off
>>>>>>>>>>>>>> eager-lock=enable
>>>>>>>>>>>>>> remote-dio=on
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Mahdi Adnan
>>>>>>>>>>>>>>> System Admin
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
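Applied through the CLI, the shorthand settings quoted above map onto the full gluster option names that appear in the volume listings earlier in the thread. As a sketch, they could be set on an existing volume like this, with VOLNAME standing in for the actual volume name:

# gluster volume set VOLNAME performance.quick-read off
# gluster volume set VOLNAME performance.read-ahead off
# gluster volume set VOLNAME performance.io-cache off
# gluster volume set VOLNAME performance.stat-prefetch off
# gluster volume set VOLNAME cluster.eager-lock enable
# gluster volume set VOLNAME network.remote-dio on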
Hmm ok. Could you share the nfs.log content?

-Krutika

On Tue, Mar 15, 2016 at 1:45 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
> Okay, here's what I did;
>
> Volume Name: v
> Type: Distributed-Replicate
> Volume ID: b348fd8e-b117-469d-bcc0-56a56bdfc930
> Status: Started
> Number of Bricks: 3 x 2 = 6
> Transport-type: tcp
> Bricks:
> Brick1: gfs001:/bricks/b001/v
> Brick2: gfs001:/bricks/b002/v
> Brick3: gfs001:/bricks/b003/v
> Brick4: gfs002:/bricks/b004/v
> Brick5: gfs002:/bricks/b005/v
> Brick6: gfs002:/bricks/b006/v
> Options Reconfigured:
> features.shard-block-size: 128MB
> features.shard: enable
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> performance.readdir-ahead: on
>
> Same error, and mounting it using glusterfs still works just fine.
>
> Respectfully,
> Mahdi A. Mahdi
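One way to follow up on Krutika's request and on the "Failed to get trusted.glusterfs.shard.file-size" error in nfs.log quoted earlier is to dump the xattrs of the affected file directly on each brick. A rough sketch, using the test-volume brick paths and the GFID named in the log (both would need adjusting for the volume actually being debugged):

# getfattr -d -m . -e hex /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
# getfattr -d -m . -e hex /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
# getfattr -n trusted.glusterfs.shard.file-size -e hex /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c

If trusted.glusterfs.shard.file-size is absent from that output on the bricks, it would be consistent with the setattr failure the shard translator logs during the clone.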