Sorry for serial posting, but I got new logs that might help.
The message appears during the migration;

/var/log/glusterfs/nfs.log

[2016-03-14 09:45:04.573765] I [MSGID: 109036] [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-testv-dht: Setting layout of /New Virtual Machine_1 with [Subvol_name: testv-stripe-0, Err: -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
[2016-03-14 09:45:04.957499] E [shard.c:369:shard_modify_size_and_block_count] (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f) [0x7f27a13c067f] -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc) [0x7f27a116681c] -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd) [0x7f27a116584d] ) 0-testv-shard: Failed to get trusted.glusterfs.shard.file-size for c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
[2016-03-14 09:45:04.957577] W [MSGID: 112199] [nfs3-helpers.c:3418:nfs3_log_common_res] 0-nfs-nfsv3: /New Virtual Machine_1/New Virtual Machine-flat.vmdk => (XID: 3fec5a26, SETATTR: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument)) [Invalid argument]
[2016-03-14 09:45:05.079657] E [MSGID: 112069] [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No such file or directory: (192.168.221.52:826) testv : 00000000-0000-0000-0000-000000000001

Respectfully,
Mahdi A. Mahdi

On 03/14/2016 11:14 AM, Mahdi Adnan wrote:
> So I have deployed a new server "Cisco UCS C220M4" and created a new
> volume;
>
> Volume Name: testv
> Type: Stripe
> Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.70.0.250:/mnt/b1/v
> Brick2: 10.70.0.250:/mnt/b2/v
> Options Reconfigured:
> nfs.disable: off
> features.shard-block-size: 64MB
> features.shard: enable
> cluster.server-quorum-type: server
> cluster.quorum-type: auto
> network.remote-dio: enable
> cluster.eager-lock: enable
> performance.stat-prefetch: off
> performance.io-cache: off
> performance.read-ahead: off
> performance.quick-read: off
> performance.readdir-ahead: off
>
> Same error ...
>
> Can anyone share with me the info of a working striped volume?
>
> On 03/14/2016 09:02 AM, Mahdi Adnan wrote:
>> I have a pool of two bricks in the same server;
>>
>> Volume Name: k
>> Type: Stripe
>> Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b
>> Status: Started
>> Number of Bricks: 1 x 2 = 2
>> Transport-type: tcp
>> Bricks:
>> Brick1: gfs001:/bricks/t1/k
>> Brick2: gfs001:/bricks/t2/k
>> Options Reconfigured:
>> features.shard-block-size: 64MB
>> features.shard: on
>> cluster.server-quorum-type: server
>> cluster.quorum-type: auto
>> network.remote-dio: enable
>> cluster.eager-lock: enable
>> performance.stat-prefetch: off
>> performance.io-cache: off
>> performance.read-ahead: off
>> performance.quick-read: off
>> performance.readdir-ahead: off
>>
>> Same issue ...
>> glusterfs 3.7.8 built on Mar 10 2016 20:20:45.
>>
>> Respectfully,
>> Mahdi A. Mahdi
>>
>> Systems Administrator
>> IT Department
>> Earthlink Telecommunications <https://www.facebook.com/earthlinktele>
>>
>> Cell: 07903316180
>> Work: 3352
>> Skype: mahdi.adnan at outlook.com
>>
>> On 03/14/2016 08:11 AM, Niels de Vos wrote:
>>> On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
>>>> It would be better to use sharding over stripe for your VM use case. It
>>>> offers better distribution and utilisation of bricks and better heal
>>>> performance.
>>>> And it is well tested.
>>> Basically the "striping" feature is deprecated, "sharding" is its
>>> improved replacement. I expect to see "striping" completely dropped in
>>> the next major release.
>>>
>>> Niels
>>>
>>>> Couple of things to note before you do that:
>>>> 1. Most of the bug fixes in sharding have gone into 3.7.8. So it is advised
>>>> that you use 3.7.8 or above.
>>>> 2. When you enable sharding on a volume, already existing files in the
>>>> volume do not get sharded. Only the files that are newly created from the
>>>> time sharding is enabled will.
>>>> If you do want to shard the existing files, then you would need to cp
>>>> them to a temp name within the volume, and then rename them back to the
>>>> original file name.
>>>>
>>>> HTH,
>>>> Krutika
>>>>
>>>> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>> I couldn't find anything related to cache in the HBAs.
>>>>> What logs are useful in my case? I see only the brick logs, which contain
>>>>> nothing during the failure.
>>>>>
>>>>> ###
>>>>> [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed [File exists]
>>>>> [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed [File exists]
>>>>> [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash: rmdir issued on /.trashcan/, which is not permitted
>>>>> [2016-03-13 18:07:55.027635] I [MSGID: 115056] [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) ==> (Operation not permitted) [Operation not permitted]
>>>>> [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>> [2016-03-13 18:11:34.353463] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version: 3.7.8)
>>>>> [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>> [2016-03-13 18:11:34.591173] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version: 3.7.8)
>>>>> ###
>>>>>
>>>>> ESXi just keeps telling me "Cannot clone T: The virtual disk is either
>>>>> corrupted or not a supported format.
>>>>> error
>>>>> 3/13/2016 9:06:20 PM
>>>>> Clone virtual machine
>>>>> T
>>>>> VCENTER.LOCAL\Administrator
>>>>> "
>>>>>
>>>>> My setup is 2 servers with a floating IP controlled by CTDB, and my ESXi
>>>>> server mounts the NFS via the floating IP.
>>>>>
>>>>> On 03/13/2016 08:40 PM, pkoelle wrote:
>>>>>> On 13.03.2016 at 18:22, David Gossage wrote:
>>>>>>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>> My HBAs are LSISAS1068E, and the filesystem is XFS.
>>>>>>>> I tried EXT4 and it did not help.
>>>>>>>> I have created a striped volume in one server with two bricks, same
>>>>>>>> issue.
>>>>>>>> And I tried a replicated volume with just "sharding enabled", same issue;
>>>>>>>> as soon as I disable the sharding it works just fine. Neither sharding
>>>>>>>> nor striping works for me.
>>>>>>>> I did follow up with some of the threads in the mailing list and tried
>>>>>>>> some of the fixes that worked for the others; none worked for me. :(
>>>>>>>>
>>>>>>> Is it possible the LSI has write-cache enabled?
>>>>>>>
>>>>>> Why is that relevant? Even the backing filesystem has no idea if there is
>>>>>> a RAID or write cache or whatever. There are blocks and sync(), end of
>>>>>> story.
>>>>>> If you lose power and screw up your recovery OR do funky stuff with SAS
>>>>>> multipathing, that might be an issue with a controller cache. AFAIK that's
>>>>>> not what we are talking about.
>>>>>>
>>>>>> I'm afraid that unless the OP has some logs from the server, a
>>>>>> reproducible testcase or a backtrace from client or server, this isn't
>>>>>> getting us anywhere.
>>>>>>
>>>>>> cheers
>>>>>> Paul
>>>>>>
>>>>>>> On 03/13/2016 06:54 PM, David Gossage wrote:
>>>>>>>>
>>>>>>>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>
>>>>>>>>> Okay, so I have enabled shard in my test volume and it did not help;
>>>>>>>>> stupidly enough, I have enabled it in a production volume
>>>>>>>>> "Distributed-Replicate" and it corrupted half of my VMs.
>>>>>>>>> I have updated Gluster to the latest and nothing seems to have changed
>>>>>>>>> in my situation.
>>>>>>>>> Below is the info of my volume;
>>>>>>>>>
>>>>>>>> I was pointing at the settings in that email as an example for
>>>>>>>> corruption fixing. I wouldn't recommend enabling sharding if you haven't
>>>>>>>> gotten the base working yet on that cluster. What HBAs are you using,
>>>>>>>> and what is the layout of the filesystem for the bricks?
>>>>>>>>
>>>>>>>>> Number of Bricks: 3 x 2 = 6
>>>>>>>>> Transport-type: tcp
>>>>>>>>> Bricks:
>>>>>>>>> Brick1: gfs001:/bricks/b001/vmware
>>>>>>>>> Brick2: gfs002:/bricks/b004/vmware
>>>>>>>>> Brick3: gfs001:/bricks/b002/vmware
>>>>>>>>> Brick4: gfs002:/bricks/b005/vmware
>>>>>>>>> Brick5: gfs001:/bricks/b003/vmware
>>>>>>>>> Brick6: gfs002:/bricks/b006/vmware
>>>>>>>>> Options Reconfigured:
>>>>>>>>> performance.strict-write-ordering: on
>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>> network.remote-dio: enable
>>>>>>>>> performance.stat-prefetch: disable
>>>>>>>>> performance.io-cache: off
>>>>>>>>> performance.read-ahead: off
>>>>>>>>> performance.quick-read: off
>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>> features.shard-block-size: 16MB
>>>>>>>>> features.shard: on
>>>>>>>>> performance.readdir-ahead: off
>>>>>>>>>
>>>>>>>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>>>>>>>
>>>>>>>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>
>>>>>>>>>> Both servers have HBAs, no RAID, and I can set up a replicated or
>>>>>>>>>> dispersed volume without any issues.
>>>>>>>>>> Logs are clean, and when I tried to migrate a VM and got the error,
>>>>>>>>>> nothing showed up in the logs.
>>>>>>>>>> I tried mounting the volume on my laptop and it mounted fine but, if I
>>>>>>>>>> use dd to create a data file, it just hangs and I can't cancel it, and
>>>>>>>>>> I can't unmount it or anything; I just have to reboot.
>>>>>>>>>> The same servers have another volume on other bricks in a distributed
>>>>>>>>>> replica; it works fine.
>>>>>>>>>> I have even tried the same setup in a virtual environment (created two
>>>>>>>>>> VMs, installed Gluster and created a replicated striped volume) and
>>>>>>>>>> again the same thing, data corruption.
>>>>>>>>>>
>>>>>>>>> I'd look through the mail archives for a topic "Shard in Production" I
>>>>>>>>> think it's called. The shard portion may not be relevant, but it does
>>>>>>>>> discuss certain settings that had to be applied with regard to avoiding
>>>>>>>>> corruption with VMs. You may want to try and disable
>>>>>>>>> performance.readdir-ahead also.
>>>>>>>>>
>>>>>>>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>>>>>>>
>>>>>>>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks David,
>>>>>>>>>>> My settings are all defaults; I have just created the pool and
>>>>>>>>>>> started it.
>>>>>>>>>>> I have set the settings as per your recommendation and it seems to be
>>>>>>>>>>> the same issue;
>>>>>>>>>>>
>>>>>>>>>>> Type: Striped-Replicate
>>>>>>>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>>>>>>>> Status: Started
>>>>>>>>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>> Bricks:
>>>>>>>>>>> Brick1: gfs001:/bricks/t1/s
>>>>>>>>>>> Brick2: gfs002:/bricks/t1/s
>>>>>>>>>>> Brick3: gfs001:/bricks/t2/s
>>>>>>>>>>> Brick4: gfs002:/bricks/t2/s
>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>
>>>>>>>>>> Is there a RAID controller perhaps doing any caching?
>>>>>>>>>>
>>>>>>>>>> In the gluster logs, are any errors being reported during the
>>>>>>>>>> migration process?
>>>>>>>>>> Since they aren't in use yet, have you tested making just mirrored
>>>>>>>>>> bricks, using different pairings of servers two at a time, to see if
>>>>>>>>>> the problem follows a certain machine or network port?
>>>>>>>>>>
>>>>>>>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Dears,
>>>>>>>>>>>> I have created a replicated striped volume with two bricks and two
>>>>>>>>>>>> servers, but I can't use it because when I mount it in ESXi and try
>>>>>>>>>>>> to migrate a VM to it, the data gets corrupted.
>>>>>>>>>>>> Does anyone have any idea why this is happening?
>>>>>>>>>>>>
>>>>>>>>>>>> Dell 2950 x2
>>>>>>>>>>>> Seagate 15k 600GB
>>>>>>>>>>>> CentOS 7.2
>>>>>>>>>>>> Gluster 3.7.8
>>>>>>>>>>>>
>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>
>>>>>>>>>>> Most reports of this I have seen end up being settings related. Post
>>>>>>>>>>> gluster volume info. Below is what I have seen as the most commonly
>>>>>>>>>>> recommended settings.
>>>>>>>>>>> I'd hazard a guess you may have some of the read-ahead cache or
>>>>>>>>>>> prefetch on.
>>>>>>>>>>>
>>>>>>>>>>> quick-read=off
>>>>>>>>>>> read-ahead=off
>>>>>>>>>>> io-cache=off
>>>>>>>>>>> stat-prefetch=off
>>>>>>>>>>> eager-lock=enable
>>>>>>>>>>> remote-dio=on
>>>>>>>>>>>
>>>>>>>>>>>> Mahdi Adnan
>>>>>>>>>>>> System Admin
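For reference, the shorthand options David lists above correspond to the full volume-option names that appear in the gluster volume info output quoted elsewhere in this thread. A rough sketch of how they would be applied, with <VOLNAME> standing in for the actual volume name:

# gluster volume set <VOLNAME> performance.quick-read off
# gluster volume set <VOLNAME> performance.read-ahead off
# gluster volume set <VOLNAME> performance.io-cache off
# gluster volume set <VOLNAME> performance.stat-prefetch off
# gluster volume set <VOLNAME> cluster.eager-lock enable
# gluster volume set <VOLNAME> network.remote-dio on

The resulting values can then be confirmed with "gluster volume info <VOLNAME>", which is the output format shown throughout this thread.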
Hi,

So could you share the xattrs associated with the file at
<BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c

Here's what you need to execute:

# getfattr -d -m . -e hex /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c

on the first node, and

# getfattr -d -m . -e hex /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c

on the second.

Also, it is normally advised to use a replica 3 volume as opposed to a replica 2 volume to guard against split-brains.

-Krutika
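As context for what that command returns: on a sharded file, getfattr -d -m . -e hex normally lists the shard xattrs alongside the gfid. A hypothetical example for a 64MB shard block size (the values below are illustrative and are not taken from the affected volume) might look something like:

# file: mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
trusted.gfid=0xc3e88cc17e0a4d4696852d12131a5e1c
trusted.glusterfs.shard.block-size=0x0000000004000000
trusted.glusterfs.shard.file-size=0x<size-and-block-count, hex-encoded>

If trusted.glusterfs.shard.file-size is missing from the actual output, that would line up with the "Failed to get trusted.glusterfs.shard.file-size" error in the nfs.log excerpt earlier in the thread.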
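One more practical note, tied to Krutika's earlier point that files which already exist when sharding is enabled do not get sharded: the copy-and-rename step she describes is ordinary shell work on a mount of the volume. A minimal sketch, assuming a FUSE or NFS mount at /mnt/testv and a hypothetical file name, and with any VM using the file powered off:

# cd /mnt/testv
# cp disk-flat.vmdk disk-flat.vmdk.tmp     (the copy is a newly created file, so it is written out as shards)
# mv disk-flat.vmdk.tmp disk-flat.vmdk     (rename the sharded copy back over the original name)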