Sorry for serial posting, but I got new logs that might help.
The message appears during the migration;

/var/log/glusterfs/nfs.log

[2016-03-14 09:45:04.573765] I [MSGID: 109036] [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-testv-dht: Setting layout of /New Virtual Machine_1 with [Subvol_name: testv-stripe-0, Err: -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
[2016-03-14 09:45:04.957499] E [shard.c:369:shard_modify_size_and_block_count] (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f) [0x7f27a13c067f] -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc) [0x7f27a116681c] -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd) [0x7f27a116584d] ) 0-testv-shard: Failed to get trusted.glusterfs.shard.file-size for c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
[2016-03-14 09:45:04.957577] W [MSGID: 112199] [nfs3-helpers.c:3418:nfs3_log_common_res] 0-nfs-nfsv3: /New Virtual Machine_1/New Virtual Machine-flat.vmdk => (XID: 3fec5a26, SETATTR: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument)) [Invalid argument]
[2016-03-14 09:45:05.079657] E [MSGID: 112069] [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No such file or directory: (192.168.221.52:826) testv : 00000000-0000-0000-0000-000000000001

Respectfully,
Mahdi A. Mahdi

On 03/14/2016 11:14 AM, Mahdi Adnan wrote:
> So I have deployed a new server "Cisco UCS C220M4" and created a new
> volume;
>
> Volume Name: testv
> Type: Stripe
> Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.70.0.250:/mnt/b1/v
> Brick2: 10.70.0.250:/mnt/b2/v
> Options Reconfigured:
> nfs.disable: off
> features.shard-block-size: 64MB
> features.shard: enable
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> performance.readdir-ahead: off
>
> Same error ...
>
> Can anyone share with me the info of a working striped volume?
>
> On 03/14/2016 09:02 AM, Mahdi Adnan wrote:
>> I have a pool of two bricks in the same server;
>>
>> Volume Name: k
>> Type: Stripe
>> Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs001:/bricks/t1/k
>> Brick2: gfs001:/bricks/t2/k
>> Options Reconfigured:
>> features.shard-block-size: 64MB
>> features.shard: on
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> performance.readdir-ahead: off
>>
>> Same issue ...
>> glusterfs 3.7.8 built on Mar 10 2016 20:20:45.
>>
>> Respectfully,
>> Mahdi A. Mahdi
>>
>> Systems Administrator
>> IT Department
>> Earthlink Telecommunications <https://www.facebook.com/earthlinktele>
>>
>> Cell: 07903316180
>> Work: 3352
>> Skype: mahdi.adnan at outlook.com
>>
>> On 03/14/2016 08:11 AM, Niels de Vos wrote:
>>> On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
>>>> It would be better to use sharding over stripe for your VM use case. It
>>>> offers better distribution and utilisation of bricks and better heal
>>>> performance.
>>>> And it is well tested.
>>> Basically the "striping" feature is deprecated, "sharding" is its
>>> improved replacement. I expect to see "striping" completely dropped in
>>> the next major release.
>>>
>>> Niels
>>>
>>>> Couple of things to note before you do that:
>>>> 1. Most of the bug fixes in sharding have gone into 3.7.8. So it is advised
>>>> that you use 3.7.8 or above.
>>>> 2. When you enable sharding on a volume, already existing files in the
>>>> volume do not get sharded. Only the files that are newly created from the
>>>> time sharding is enabled will.
>>>> If you do want to shard the existing files, then you would need to cp
>>>> them to a temp name within the volume, and then rename them back to the
>>>> original file name.
>>>>
>>>> HTH,
>>>> Krutika
>>>>
>>>> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>> I couldn't find anything related to cache in the HBAs.
>>>>> What logs are useful in my case? I see only the brick logs, which contain
>>>>> nothing during the failure.
>>>>>
>>>>> ###
>>>>> [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed [File exists]
>>>>> [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed [File exists]
>>>>> [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash: rmdir issued on /.trashcan/, which is not permitted
>>>>> [2016-03-13 18:07:55.027635] I [MSGID: 115056] [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) ==> (Operation not permitted) [Operation not permitted]
>>>>> [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>> [2016-03-13 18:11:34.353463] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version: 3.7.8)
>>>>> [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>> [2016-03-13 18:11:34.591173] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version: 3.7.8)
>>>>> ###
>>>>>
>>>>> ESXi just keeps telling me "Cannot clone T: The virtual disk is either
>>>>> corrupted or not a supported format.
>>>>> error
>>>>> 3/13/2016 9:06:20 PM
>>>>> Clone virtual machine
>>>>> T
>>>>> VCENTER.LOCAL\Administrator
>>>>> "
>>>>>
>>>>> My setup is 2 servers with a floating IP controlled by CTDB, and my ESXi
>>>>> server mounts the NFS via the floating IP.
>>>>>
>>>>> On 03/13/2016 08:40 PM, pkoelle wrote:
>>>>>> On 13.03.2016 at 18:22, David Gossage wrote:
>>>>>>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>> My HBAs are LSISAS1068E, and the filesystem is XFS.
>>>>>>>> I tried EXT4 and it did not help.
>>>>>>>> I have created a striped volume in one server with two bricks, same
>>>>>>>> issue.
>>>>>>>> And I tried a replicated volume with just "sharding enabled", same issue;
>>>>>>>> as soon as I disable the sharding it works just fine. Neither sharding
>>>>>>>> nor striping works for me.
>>>>>>>> I did follow up with some of the threads in the mailing list and tried
>>>>>>>> some of the fixes that worked for the others; none worked for me. :(
>>>>>>>>
>>>>>>> Is it possible the LSI has write-cache enabled?
>>>>>>>
>>>>>> Why is that relevant? Even the backing filesystem has no idea if there is
>>>>>> a RAID or write cache or whatever. There are blocks and sync(), end of
>>>>>> story.
>>>>>> If you lose power and screw up your recovery OR do funky stuff with SAS
>>>>>> multipathing, that might be an issue with a controller cache. AFAIK that's
>>>>>> not what we are talking about.
>>>>>>
>>>>>> I'm afraid that unless the OP has some logs from the server, a
>>>>>> reproducible testcase or a backtrace from client or server, this isn't
>>>>>> getting us anywhere.
>>>>>>
>>>>>> cheers
>>>>>> Paul
>>>>>>
>>>>>>> On 03/13/2016 06:54 PM, David Gossage wrote:
>>>>>>>>
>>>>>>>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>
>>>>>>>>> Okay, so I have enabled shard in my test volume and it did not help;
>>>>>>>>> stupidly enough, I have enabled it in a production volume
>>>>>>>>> "Distributed-Replicate" and it corrupted half of my VMs.
>>>>>>>>> I have updated Gluster to the latest and nothing seems to have changed
>>>>>>>>> in my situation.
>>>>>>>>> Below is the info of my volume;
>>>>>>>>>
>>>>>>>> I was pointing at the settings in that email as an example for
>>>>>>>> corruption fixing. I wouldn't recommend enabling sharding if you haven't
>>>>>>>> gotten the base working yet on that cluster. What HBAs are you using,
>>>>>>>> and what is the layout of the filesystem for the bricks?
>>>>>>>>
>>>>>>>>> Number of Bricks: 3 x 2 = 6
>>>>>>>>> Transport-type: tcp
>>>>>>>>> Bricks:
>>>>>>>>> Brick1: gfs001:/bricks/b001/vmware
>>>>>>>>> Brick2: gfs002:/bricks/b004/vmware
>>>>>>>>> Brick3: gfs001:/bricks/b002/vmware
>>>>>>>>> Brick4: gfs002:/bricks/b005/vmware
>>>>>>>>> Brick5: gfs001:/bricks/b003/vmware
>>>>>>>>> Brick6: gfs002:/bricks/b006/vmware
>>>>>>>>> Options Reconfigured:
>>>>>>>>> performance.strict-write-ordering: on
>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>> network.remote-dio: enable
>>>>>>>>> performance.stat-prefetch: disable
>>>>>>>>> performance.io-cache: off
>>>>>>>>> performance.read-ahead: off
>>>>>>>>> performance.quick-read: off
>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>> features.shard-block-size: 16MB
>>>>>>>>> features.shard: on
>>>>>>>>> performance.readdir-ahead: off
>>>>>>>>>
>>>>>>>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>>>>>>>
>>>>>>>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>
>>>>>>>>>> Both servers have HBAs, no RAID, and I can set up a replicated or
>>>>>>>>>> dispersed volume without any issues.
>>>>>>>>>> Logs are clean, and when I tried to migrate a VM and got the error,
>>>>>>>>>> nothing showed up in the logs.
>>>>>>>>>> I tried mounting the volume on my laptop and it mounted fine but, if I
>>>>>>>>>> use dd to create a data file, it just hangs and I can't cancel it, and
>>>>>>>>>> I can't unmount it or anything; I just have to reboot.
>>>>>>>>>> The same servers have another volume on other bricks in a distributed
>>>>>>>>>> replica; it works fine.
>>>>>>>>>> I have even tried the same setup in a virtual environment (created two
>>>>>>>>>> VMs, installed Gluster and created a replicated striped volume) and
>>>>>>>>>> again the same thing, data corruption.
>>>>>>>>>>
>>>>>>>>> I'd look through the mail archives for a topic "Shard in Production" I
>>>>>>>>> think it's called. The shard portion may not be relevant, but it does
>>>>>>>>> discuss certain settings that had to be applied with regard to avoiding
>>>>>>>>> corruption with VMs. You may want to try and disable
>>>>>>>>> performance.readdir-ahead also.
>>>>>>>>>
>>>>>>>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>>>>>>>
>>>>>>>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks David,
>>>>>>>>>>> My settings are all defaults; I have just created the pool and
>>>>>>>>>>> started it.
>>>>>>>>>>> I have set the settings as per your recommendation and it seems to be
>>>>>>>>>>> the same issue;
>>>>>>>>>>>
>>>>>>>>>>> Type: Striped-Replicate
>>>>>>>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>>>>>>>> Status: Started
>>>>>>>>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>> Bricks:
>>>>>>>>>>> Brick1: gfs001:/bricks/t1/s
>>>>>>>>>>> Brick2: gfs002:/bricks/t1/s
>>>>>>>>>>> Brick3: gfs001:/bricks/t2/s
>>>>>>>>>>> Brick4: gfs002:/bricks/t2/s
>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>
>>>>>>>>>> Is there a RAID controller perhaps doing any caching?
>>>>>>>>>>
>>>>>>>>>> In the gluster logs, are any errors being reported during the
>>>>>>>>>> migration process?
>>>>>>>>>> Since they aren't in use yet, have you tested making just mirrored
>>>>>>>>>> bricks, using different pairings of servers two at a time, to see if
>>>>>>>>>> the problem follows a certain machine or network port?
>>>>>>>>>>
>>>>>>>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Dears,
>>>>>>>>>>>> I have created a replicated striped volume with two bricks and two
>>>>>>>>>>>> servers, but I can't use it because when I mount it in ESXi and try
>>>>>>>>>>>> to migrate a VM to it, the data gets corrupted.
>>>>>>>>>>>> Does anyone have any idea why this is happening?
>>>>>>>>>>>>
>>>>>>>>>>>> Dell 2950 x2
>>>>>>>>>>>> Seagate 15k 600GB
>>>>>>>>>>>> CentOS 7.2
>>>>>>>>>>>> Gluster 3.7.8
>>>>>>>>>>>>
>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>
>>>>>>>>>>> Most reports of this I have seen end up being settings related. Post
>>>>>>>>>>> gluster volume info. Below is what I have seen as the most commonly
>>>>>>>>>>> recommended settings.
>>>>>>>>>>> I'd hazard a guess you may have some of the read-ahead cache or
>>>>>>>>>>> prefetch on.
>>>>>>>>>>>
>>>>>>>>>>> quick-read=off
>>>>>>>>>>> read-ahead=off
>>>>>>>>>>> io-cache=off
>>>>>>>>>>> stat-prefetch=off
>>>>>>>>>>> eager-lock=enable
>>>>>>>>>>> remote-dio=on
>>>>>>>>>>>
>>>>>>>>>>>> Mahdi Adnan
>>>>>>>>>>>> System Admin
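For reference, the shorthand options David lists above correspond to the full volume-option names that appear in the gluster volume info output quoted elsewhere in this thread. A rough sketch of how they would be applied, with <VOLNAME> standing in for the actual volume name:

# gluster volume set <VOLNAME> performance.quick-read off
# gluster volume set <VOLNAME> performance.read-ahead off
# gluster volume set <VOLNAME> performance.io-cache off
# gluster volume set <VOLNAME> performance.stat-prefetch off
# gluster volume set <VOLNAME> cluster.eager-lock enable
# gluster volume set <VOLNAME> network.remote-dio on

The resulting values can then be confirmed with "gluster volume info <VOLNAME>", which is the output format shown throughout this thread.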
Hi,

So could you share the xattrs associated with the file at
<BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c

Here's what you need to execute:

# getfattr -d -m . -e hex /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c

on the first node, and

# getfattr -d -m . -e hex /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c

on the second.

Also, it is normally advised to use a replica 3 volume as opposed to a replica 2 volume to guard against split-brains.

-Krutika
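As context for what that command returns: on a sharded file, getfattr -d -m . -e hex normally lists the shard xattrs alongside the gfid. A hypothetical example for a 64MB shard block size (the values below are illustrative and are not taken from the affected volume) might look something like:

# file: mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
trusted.gfid=0xc3e88cc17e0a4d4696852d12131a5e1c
trusted.glusterfs.shard.block-size=0x0000000004000000
trusted.glusterfs.shard.file-size=0x<size-and-block-count, hex-encoded>

If trusted.glusterfs.shard.file-size is missing from the actual output, that would line up with the "Failed to get trusted.glusterfs.shard.file-size" error in the nfs.log excerpt earlier in the thread.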
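One more practical note, tied to Krutika's earlier point that files which already exist when sharding is enabled do not get sharded: the copy-and-rename step she describes is ordinary shell work on a mount of the volume. A minimal sketch, assuming a FUSE or NFS mount at /mnt/testv and a hypothetical file name, and with any VM using the file powered off:

# cd /mnt/testv
# cp disk-flat.vmdk disk-flat.vmdk.tmp     (the copy is a newly created file, so it is written out as shards)
# mv disk-flat.vmdk.tmp disk-flat.vmdk     (rename the sharded copy back over the original name)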