Hi,

I have created the following volume:

Volume Name: v
Type: Distribute
Volume ID: 90de6430-7f83-4eda-a98f-ad1fabcf1043
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: gfs001:/bricks/b001/v
Brick2: gfs001:/bricks/b002/v
Brick3: gfs001:/bricks/b003/v
Options Reconfigured:
features.shard-block-size: 128MB
features.shard: enable
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
performance.readdir-ahead: on

After mounting it in ESXi and trying to clone a VM to it, I got the same error.

Respectfully,
Mahdi A. Mahdi

On 03/15/2016 10:44 AM, Krutika Dhananjay wrote:
> Hi,
>
> Do not use sharding and stripe together in the same volume, because:
> a) It is not recommended and there is no point in using both. Using sharding alone on your volume should work fine.
> b) Nobody tested it.
> c) Like Niels said, the stripe feature is virtually deprecated.
>
> I would suggest that you create an nx3 volume, where n is the number of distribute subvols you prefer, enable the group virt options on it, enable sharding on it, set the shard-block-size that you feel is appropriate, and then just start off with VM image creation etc.
> If you run into any issues even after you do this, let us know and we'll help you out.
>
> -Krutika
>
> On Tue, Mar 15, 2016 at 1:07 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>
> Thanks Krutika,
>
> I have deleted the volume and created a new one.
> I found that it may be an issue with NFS itself: I created a new striped volume, enabled sharding, and mounted it via glusterfs, and it worked just fine; if I mount it with NFS, it fails and gives me the same errors.
>
> Respectfully,
> Mahdi A. Mahdi
>
> On 03/15/2016 06:24 AM, Krutika Dhananjay wrote:
>> Hi,
>>
>> So could you share the xattrs associated with the file at <BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c?
>>
>> Here's what you need to execute:
>>
>> # getfattr -d -m . -e hex /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>> on the first node, and
>>
>> # getfattr -d -m . -e hex /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>> on the second.
>>
>> Also, it is normally advised to use a replica 3 volume as opposed to a replica 2 volume, to guard against split-brains.
>>
>> -Krutika
>>
>> On Mon, Mar 14, 2016 at 3:17 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>
>> Sorry for the serial posting, but I got new logs that might help.
>>
>> These messages appear during the migration, in /var/log/glusterfs/nfs.log:
>>
>> [2016-03-14 09:45:04.573765] I [MSGID: 109036] [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-testv-dht: Setting layout of /New Virtual Machine_1 with [Subvol_name: testv-stripe-0, Err: -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ],
>> [2016-03-14 09:45:04.957499] E [shard.c:369:shard_modify_size_and_block_count] (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f) [0x7f27a13c067f] -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc) [0x7f27a116681c] -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd) [0x7f27a116584d] ) 0-testv-shard: Failed to get trusted.glusterfs.shard.file-size for c3e88cc1-7e0a-4d46-9685-2d12131a5e1c
>> [2016-03-14 09:45:04.957577] W [MSGID: 112199] [nfs3-helpers.c:3418:nfs3_log_common_res] 0-nfs-nfsv3: /New Virtual Machine_1/New Virtual Machine-flat.vmdk => (XID: 3fec5a26, SETATTR: NFS: 22(Invalid argument for operation), POSIX: 22(Invalid argument)) [Invalid argument]
>> [2016-03-14 09:45:05.079657] E [MSGID: 112069] [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No such file or directory: (192.168.221.52:826) testv : 00000000-0000-0000-0000-000000000001
>>
>> Respectfully,
>> Mahdi A. Mahdi
>>
>> On 03/14/2016 11:14 AM, Mahdi Adnan wrote:
>>> So I have deployed a new server ("Cisco UCS C220M4") and created a new volume:
>>>
>>> Volume Name: testv
>>> Type: Stripe
>>> Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b
>>> Status: Started
>>> Number of Bricks: 1 x 2 = 2
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 10.70.0.250:/mnt/b1/v
>>> Brick2: 10.70.0.250:/mnt/b2/v
>>> Options Reconfigured:
>>> nfs.disable: off
>>> features.shard-block-size: 64MB
>>> features.shard: enable
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> network.remote-dio: enable
>>> cluster.eager-lock: enable
>>> performance.stat-prefetch: off
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> performance.readdir-ahead: off
>>>
>>> Same error.
>>>
>>> Can anyone share with me the info of a working striped volume?
>>>
>>> On 03/14/2016 09:02 AM, Mahdi Adnan wrote:
>>>> I have a pool of two bricks in the same server:
>>>>
>>>> Volume Name: k
>>>> Type: Stripe
>>>> Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b
>>>> Status: Started
>>>> Number of Bricks: 1 x 2 = 2
>>>> Transport-type: tcp
>>>> Bricks:
>>>> Brick1: gfs001:/bricks/t1/k
>>>> Brick2: gfs001:/bricks/t2/k
>>>> Options Reconfigured:
>>>> features.shard-block-size: 64MB
>>>> features.shard: on
>>>> cluster.server-quorum-type: server
>>>> cluster.quorum-type: auto
>>>> network.remote-dio: enable
>>>> cluster.eager-lock: enable
>>>> performance.stat-prefetch: off
>>>> performance.io-cache: off
>>>> performance.read-ahead: off
>>>> performance.quick-read: off
>>>> performance.readdir-ahead: off
>>>>
>>>> Same issue...
>>>> glusterfs 3.7.8 built on Mar 10 2016 20:20:45.
>>>> Respectfully,
>>>> Mahdi A. Mahdi
>>>>
>>>> Systems Administrator
>>>> IT Department
>>>> Earthlink Telecommunications <https://www.facebook.com/earthlinktele>
>>>>
>>>> Cell: 07903316180
>>>> Work: 3352
>>>> Skype: mahdi.adnan at outlook.com
>>>>
>>>> On 03/14/2016 08:11 AM, Niels de Vos wrote:
>>>>> On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote:
>>>>>> It would be better to use sharding over stripe for your VM use case. It offers better distribution and utilisation of bricks and better heal performance.
>>>>>> And it is well tested.
>>>>> Basically, the "striping" feature is deprecated; "sharding" is its improved replacement. I expect to see "striping" completely dropped in the next major release.
>>>>>
>>>>> Niels
>>>>>
>>>>>> A couple of things to note before you do that:
>>>>>> 1. Most of the bug fixes in sharding have gone into 3.7.8, so it is advised that you use 3.7.8 or above.
>>>>>> 2. When you enable sharding on a volume, already-existing files in the volume do not get sharded. Only the files that are newly created from the time sharding is enabled will.
>>>>>> If you do want to shard the existing files, then you would need to cp them to a temp name within the volume, and then rename them back to the original file name.
>>>>>>
>>>>>> HTH,
>>>>>> Krutika
>>>>>>
>>>>>> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>> I couldn't find anything related to cache in the HBAs.
>>>>>>> What logs are useful in my case? I see only the brick logs, which contain nothing during the failure.
>>>>>>>
>>>>>>> ###
>>>>>>> [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed [File exists]
>>>>>>> [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod] 0-vmware-posix: mknod on /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed [File exists]
>>>>>>> [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash: rmdir issued on /.trashcan/, which is not permitted
>>>>>>> [2016-03-13 18:07:55.027635] I [MSGID: 115056] [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) ==> (Operation not permitted) [Operation not permitted]
>>>>>>> [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>> [2016-03-13 18:11:34.353463] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version: 3.7.8)
>>>>>>> [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4
>>>>>>> [2016-03-13 18:11:34.591173] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version: 3.7.8)
>>>>>>> ###
>>>>>>>
>>>>>>> ESXi just keeps telling me:
>>>>>>> "Cannot clone T: The virtual disk is either corrupted or not a supported format.
>>>>>>> error
>>>>>>> 3/13/2016 9:06:20 PM
>>>>>>> Clone virtual machine
>>>>>>> T
>>>>>>> VCENTER.LOCAL\Administrator
>>>>>>> "
>>>>>>>
>>>>>>> My setup is two servers with a floating IP controlled by CTDB, and my ESXi server mounts the NFS via the floating IP.
>>>>>>>
>>>>>>> On 03/13/2016 08:40 PM, pkoelle wrote:
>>>>>>>> On 13.03.2016 at 18:22, David Gossage wrote:
>>>>>>>>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>> My HBAs are LSISAS1068E, and the filesystem is XFS.
>>>>>>>>>> I tried EXT4 and it did not help.
>>>>>>>>>> I have created a striped volume in one server with two bricks: same issue.
>>>>>>>>>> And I tried a replicated volume with just sharding enabled: same issue. As soon as I disable sharding, it works just fine; neither sharding nor striping works for me.
>>>>>>>>>> I did follow up on some of the threads in the mailing list and tried some of the fixes that worked for the others; none worked for me. :(
>>>>>>>>>>
>>>>>>>>> Is it possible the LSI has write-cache enabled?
>>>>>>>>>
>>>>>>>> Why is that relevant? Even the backing filesystem has no idea if there is a RAID or write cache or whatever. There are blocks and sync(), end of story.
>>>>>>>> If you lose power and screw up your recovery, OR do funky stuff with SAS multipathing, that might be an issue with a controller cache. AFAIK, that's not what we are talking about.
>>>>>>>>
>>>>>>>> I'm afraid that unless the OP has some logs from the server, a reproducible test case, or a backtrace from client or server, this isn't getting us anywhere.
>>>>>>>>
>>>>>>>> cheers
>>>>>>>> Paul
>>>>>>>>
>>>>>>>>> On 03/13/2016 06:54 PM, David Gossage wrote:
>>>>>>>>>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>> Okay, so I have enabled shard in my test volume and it did not help. Stupidly enough, I have enabled it in a production volume ("Distributed-Replicate") and it corrupted half of my VMs.
>>>>>>>>>>> I have updated Gluster to the latest and nothing seems to have changed in my situation.
>>>>>>>>>>> Below is the info of my volume:
>>>>>>>>>>>
>>>>>>>>>> I was pointing at the settings in that email as an example of corruption fixing. I wouldn't recommend enabling sharding if you haven't gotten the base working yet on that cluster. What HBAs are you using, and what is the layout of the filesystem for the bricks?
>>>>>>>>>>
>>>>>>>>>>> Number of Bricks: 3 x 2 = 6
>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>> Bricks:
>>>>>>>>>>> Brick1: gfs001:/bricks/b001/vmware
>>>>>>>>>>> Brick2: gfs002:/bricks/b004/vmware
>>>>>>>>>>> Brick3: gfs001:/bricks/b002/vmware
>>>>>>>>>>> Brick4: gfs002:/bricks/b005/vmware
>>>>>>>>>>> Brick5: gfs001:/bricks/b003/vmware
>>>>>>>>>>> Brick6: gfs002:/bricks/b006/vmware
>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>> performance.strict-write-ordering: on
>>>>>>>>>>> cluster.server-quorum-type: server
>>>>>>>>>>> cluster.quorum-type: auto
>>>>>>>>>>> network.remote-dio: enable
>>>>>>>>>>> performance.stat-prefetch: disable
>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>> features.shard-block-size: 16MB
>>>>>>>>>>> features.shard: on
>>>>>>>>>>> performance.readdir-ahead: off
>>>>>>>>>>>
>>>>>>>>>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>>>>>>>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>> Both servers have HBAs, no RAID, and I can set up replicated or dispersed volumes without any issues.
>>>>>>>>>>>> The logs are clean, and when I tried to migrate a VM and got the error, nothing showed up in the logs.
>>>>>>>>>>>> I tried mounting the volume on my laptop and it mounted fine, but if I use dd to create a data file it just hangs and I can't cancel it, and I can't unmount it or anything; I just have to reboot.
>>>>>>>>>>>> The same servers have another volume on other bricks in a distributed replica; it works fine.
>>>>>>>>>>>> I have even tried the same setup in a virtual environment (created two VMs, installed Gluster, and created a replicated striped volume), and again the same thing: data corruption.
>>>>>>>>>>>>
>>>>>>>>>>> I'd look through the mail archives for a topic called "Shard in Production", I think. The shard portion may not be relevant, but it does discuss certain settings that had to be applied with regard to avoiding corruption with VMs. You may want to try to disable performance.readdir-ahead also.
>>>>>>>>>>>
>>>>>>>>>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>>>>>>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>> Thanks David,
>>>>>>>>>>>>> My settings are all defaults; I have just created the pool and started it.
>>>>>>>>>>>>> I have set the settings per your recommendation, and it seems to be the same issue:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Type: Striped-Replicate
>>>>>>>>>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>>>>>>>>>> Status: Started
>>>>>>>>>>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>>>>>>>>>>> Transport-type: tcp
>>>>>>>>>>>>> Bricks:
>>>>>>>>>>>>> Brick1: gfs001:/bricks/t1/s
>>>>>>>>>>>>> Brick2: gfs002:/bricks/t1/s
>>>>>>>>>>>>> Brick3: gfs001:/bricks/t2/s
>>>>>>>>>>>>> Brick4: gfs002:/bricks/t2/s
>>>>>>>>>>>>> Options Reconfigured:
>>>>>>>>>>>>> performance.stat-prefetch: off
>>>>>>>>>>>>> network.remote-dio: on
>>>>>>>>>>>>> cluster.eager-lock: enable
>>>>>>>>>>>>> performance.io-cache: off
>>>>>>>>>>>>> performance.read-ahead: off
>>>>>>>>>>>>> performance.quick-read: off
>>>>>>>>>>>>> performance.readdir-ahead: on
>>>>>>>>>>>>>
>>>>>>>>>>>> Is there a RAID controller perhaps doing any caching?
>>>>>>>>>>>>
>>>>>>>>>>>> Are any errors being reported in the gluster logs during the migration process? Since they aren't in use yet, have you tested making just mirrored bricks using different pairings of servers, two at a time, to see if the problem follows a certain machine or network ports?
>>>>>>>>>>>>
>>>>>>>>>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>>>>>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>>>>>>>>>>>> Dears,
>>>>>>>>>>>>>> I have created a replicated striped volume with two bricks and two servers, but I can't use it, because when I mount it in ESXi and try to migrate a VM to it, the data gets corrupted.
>>>>>>>>>>>>>> Does anyone have any idea why this is happening?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dell 2950 x2
>>>>>>>>>>>>>> Seagate 15k 600GB
>>>>>>>>>>>>>> CentOS 7.2
>>>>>>>>>>>>>> Gluster 3.7.8
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Appreciate your help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Most reports of this that I have seen end up being settings-related. Post your gluster volume info. Below is what I have seen as the most commonly recommended settings.
>>>>>>>>>>>>> I'd hazard a guess that you may have the read-ahead cache or prefetch on.
>>>>>>>>>>>>>
>>>>>>>>>>>>> quick-read=off
>>>>>>>>>>>>> read-ahead=off
>>>>>>>>>>>>> io-cache=off
>>>>>>>>>>>>> stat-prefetch=off
>>>>>>>>>>>>> eager-lock=enable
>>>>>>>>>>>>> remote-dio=on
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Mahdi Adnan
>>>>>>>>>>>>>> System Admin
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
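
[Editor's note: Krutika's recommendation above (an nx3 replicated volume with the group virt options and sharding enabled) translates roughly into the following commands. This is a hedged sketch, not a tested recipe: the hostnames (gfs001-gfs003) and brick paths are placeholders taken from the thread's naming scheme, and `group virt` assumes the virt options group file shipped with the Gluster 3.7 packages.]

```shell
# Sketch of the recommended setup: one distribute subvol (n = 1), replica 3.
# Adjust hostnames and brick paths to your environment.
gluster volume create v replica 3 \
    gfs001:/bricks/b001/v gfs002:/bricks/b001/v gfs003:/bricks/b001/v

# Apply the virt option group (turns on remote-dio and eager-lock, and
# disables the client-side performance caches that corrupt VM images).
gluster volume set v group virt

# Enable sharding and choose a block size BEFORE creating any VM images;
# files that already exist when sharding is enabled are not sharded.
gluster volume set v features.shard enable
gluster volume set v features.shard-block-size 64MB

gluster volume start v
```

Note that the shard-block-size values in the thread vary (16MB, 64MB, 128MB); 64MB here is just an illustrative choice, not a recommendation from the participants.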
OK but what if you use it with replication? Do you still see the error? I think not. Could you give it a try and tell me what you find? -Krutika On Tue, Mar 15, 2016 at 1:23 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:> Hi, > > I have created the following volume; > > Volume Name: v > Type: Distribute > Volume ID: 90de6430-7f83-4eda-a98f-ad1fabcf1043 > Status: Started > Number of Bricks: 3 > Transport-type: tcp > Bricks: > Brick1: gfs001:/bricks/b001/v > Brick2: gfs001:/bricks/b002/v > Brick3: gfs001:/bricks/b003/v > Options Reconfigured: > features.shard-block-size: 128MB > features.shard: enable > cluster.server-quorum-type: server > cluster.quorum-type: auto > network.remote-dio: enable > cluster.eager-lock: enable > performance.stat-prefetch: off > performance.io-cache: off > performance.read-ahead: off > performance.quick-read: off > performance.readdir-ahead: on > > and after mounting it in ESXi and trying to clone a VM to it, i got the > same error. > > > Respectfully > *Mahdi A. Mahdi* > > > On 03/15/2016 10:44 AM, Krutika Dhananjay wrote: > > Hi, > > Do not use sharding and stripe together in the same volume because > a) It is not recommended and there is no point in using both. Using > sharding alone on your volume should work fine. > b) Nobody tested it. > c) Like Niels said, stripe feature is virtually deprecated. > > I would suggest that you create an nx3 volume where n is the number of > distribute subvols you prefer, enable group virt options on it, and enable > sharding on it, > set the shard-block-size that you feel appropriate and then just start off > with VM image creation etc. > If you run into any issues even after you do this, let us know and we'll > help you out. > > -Krutika > > On Tue, Mar 15, 2016 at 1:07 PM, Mahdi Adnan < > mahdi.adnan at earthlinktele.com> wrote: > >> Thanks Krutika, >> >> I have deleted the volume and created a new one. 
>> I found that it may be an issue with the NFS itself, i have created a new >> striped volume and enabled sharding and mounted it via glusterfs and it >> worked just fine, if i mount it with nfs it will fail and gives me the same >> errors. >> >> Respectfully >> *Mahdi A. Mahdi* >> >> On 03/15/2016 06:24 AM, Krutika Dhananjay wrote: >> >> Hi, >> >> So could you share the xattrs associated with the file at >> <BRICK_PATH>/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c >> >> Here's what you need to execute: >> >> # getfattr -d -m . -e hex >> /mnt/b1/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c on the >> first node and >> >> # getfattr -d -m . -e hex >> /mnt/b2/v/.glusterfs/c3/e8/c3e88cc1-7e0a-4d46-9685-2d12131a5e1c on the >> second. >> >> >> Also, it is normally advised to use a replica 3 volume as opposed to >> replica 2 volume to guard against split-brains. >> >> -Krutika >> >> On Mon, Mar 14, 2016 at 3:17 PM, Mahdi Adnan < >> <mahdi.adnan at earthlinktele.com>mahdi.adnan at earthlinktele.com> wrote: >> >>> sorry for serial posting but, i got new logs it might help.. 
>>> >>> the message appear during the migration; >>> >>> /var/log/glusterfs/nfs.log >>> >>> >>> [2016-03-14 09:45:04.573765] I [MSGID: 109036] >>> [dht-common.c:8043:dht_log_new_layout_for_dir_selfheal] 0-testv-dht: >>> Setting layout of /New Virtual Machine_1 with [Subvol_name: testv-stripe-0, >>> Err: -1 , Start: 0 , Stop: 4294967295 , Hash: 1 ], >>> [2016-03-14 09:45:04.957499] E >>> [shard.c:369:shard_modify_size_and_block_count] >>> (-->/usr/lib64/glusterfs/3.7.8/xlator/cluster/distribute.so(dht_file_setattr_cbk+0x14f) >>> [0x7f27a13c067f] >>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_common_setattr_cbk+0xcc) >>> [0x7f27a116681c] >>> -->/usr/lib64/glusterfs/3.7.8/xlator/features/shard.so(shard_modify_size_and_block_count+0xdd) >>> [0x7f27a116584d] ) 0-testv-shard: Failed to get >>> trusted.glusterfs.shard.file-size for c3e88cc1-7e0a-4d46-9685-2d12131a5e1c >>> [2016-03-14 09:45:04.957577] W [MSGID: 112199] >>> [nfs3-helpers.c:3418:nfs3_log_common_res] 0-nfs-nfsv3: /New Virtual >>> Machine_1/New Virtual Machine-flat.vmdk => (XID: 3fec5a26, SETATTR: NFS: >>> 22(Invalid argument for operation), POSIX: 22(Invalid argument)) [Invalid >>> argument] >>> [2016-03-14 09:45:05.079657] E [MSGID: 112069] >>> [nfs3.c:3649:nfs3_rmdir_resume] 0-nfs-nfsv3: No such file or directory: ( >>> 192.168.221.52:826) testv : 00000000-0000-0000-0000-000000000001 >>> >>> >>> >>> Respectfully >>> >>> >>> *Mahdi A. 
Mahd * >>> On 03/14/2016 11:14 AM, Mahdi Adnan wrote: >>> >>> So i have deployed a new server "Cisco UCS C220M4" and created a new >>> volume; >>> >>> Volume Name: testv >>> Type: Stripe >>> Volume ID: 55cdac79-fe87-4f1f-90c0-15c9100fe00b >>> Status: Started >>> Number of Bricks: 1 x 2 = 2 >>> Transport-type: tcp >>> Bricks: >>> Brick1: 10.70.0.250:/mnt/b1/v >>> Brick2: 10.70.0.250:/mnt/b2/v >>> Options Reconfigured: >>> nfs.disable: off >>> features.shard-block-size: 64MB >>> features.shard: enable >>> cluster.server-quorum-type: server >>> cluster.quorum-type: auto >>> network.remote-dio: enable >>> cluster.eager-lock: enable >>> performance.stat-prefetch: off >>> performance.io-cache: off >>> performance.read-ahead: off >>> performance.quick-read: off >>> performance.readdir-ahead: off >>> >>> same error .. >>> >>> can anyone share with me the info of a working striped volume ? >>> >>> On 03/14/2016 09:02 AM, Mahdi Adnan wrote: >>> >>> I have a pool of two bricks in the same server; >>> >>> Volume Name: k >>> Type: Stripe >>> Volume ID: 1e9281ce-2a8b-44e8-a0c6-e3ebf7416b2b >>> Status: Started >>> Number of Bricks: 1 x 2 = 2 >>> Transport-type: tcp >>> Bricks: >>> Brick1: gfs001:/bricks/t1/k >>> Brick2: gfs001:/bricks/t2/k >>> Options Reconfigured: >>> features.shard-block-size: 64MB >>> features.shard: on >>> cluster.server-quorum-type: server >>> cluster.quorum-type: auto >>> network.remote-dio: enable >>> cluster.eager-lock: enable >>> performance.stat-prefetch: off >>> performance.io-cache: off >>> performance.read-ahead: off >>> performance.quick-read: off >>> performance.readdir-ahead: off >>> >>> same issue ... >>> glusterfs 3.7.8 built on Mar 10 2016 20:20:45. >>> >>> >>> Respectfully >>> *Mahdi A. Mahdi* >>> >>> Systems Administrator >>> IT. 
Department >>> Earthlink Telecommunications <https://www.facebook.com/earthlinktele> >>> >>> Cell: 07903316180 >>> Work: 3352 >>> Skype: <mahdi.adnan at outlook.com>mahdi.adnan at outlook.com >>> On 03/14/2016 08:11 AM, Niels de Vos wrote: >>> >>> On Mon, Mar 14, 2016 at 08:12:27AM +0530, Krutika Dhananjay wrote: >>> >>> It would be better to use sharding over stripe for your vm use case. It >>> offers better distribution and utilisation of bricks and better heal >>> performance. >>> And it is well tested. >>> >>> Basically the "striping" feature is deprecated, "sharding" is its >>> improved replacement. I expect to see "striping" completely dropped in >>> the next major release. >>> >>> Niels >>> >>> >>> >>> Couple of things to note before you do that: >>> 1. Most of the bug fixes in sharding have gone into 3.7.8. So it is advised >>> that you use 3.7.8 or above. >>> 2. When you enable sharding on a volume, already existing files in the >>> volume do not get sharded. Only the files that are newly created from the >>> time sharding is enabled will. >>> If you do want to shard the existing files, then you would need to cp >>> them to a temp name within the volume, and then rename them back to the >>> original file name. >>> >>> HTH, >>> Krutika >>> >>> On Sun, Mar 13, 2016 at 11:49 PM, Mahdi Adnan <mahdi.adnan at earthlinktele.com >>> >>> wrote: >>> >>> I couldn't find anything related to cache in the HBAs. >>> what logs are useful in my case ? i see only bricks logs which contains >>> nothing during the failure. 
>>> >>> ### >>> [2016-03-13 18:05:19.728614] E [MSGID: 113022] [posix.c:1232:posix_mknod] >>> 0-vmware-posix: mknod on >>> /bricks/b003/vmware/.shard/17d75e20-16f1-405e-9fa5-99ee7b1bd7f1.511 failed >>> [File exists] >>> [2016-03-13 18:07:23.337086] E [MSGID: 113022] [posix.c:1232:posix_mknod] >>> 0-vmware-posix: mknod on >>> /bricks/b003/vmware/.shard/eef2d538-8eee-4e58-bc88-fbf7dc03b263.4095 failed >>> [File exists] >>> [2016-03-13 18:07:55.027600] W [trash.c:1922:trash_rmdir] 0-vmware-trash: >>> rmdir issued on /.trashcan/, which is not permitted >>> [2016-03-13 18:07:55.027635] I [MSGID: 115056] >>> [server-rpc-fops.c:459:server_rmdir_cbk] 0-vmware-server: 41987: RMDIR >>> /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) >>> ==> (Operation not permitted) [Operation not permitted] >>> [2016-03-13 18:11:34.353441] I [login.c:81:gf_auth] 0-auth/login: allowed >>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4 >>> [2016-03-13 18:11:34.353463] I [MSGID: 115029] >>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client >>> from gfs002-2727-2016/03/13-20:17:43:613597-vmware-client-4-0-0 (version: >>> 3.7.8) >>> [2016-03-13 18:11:34.591139] I [login.c:81:gf_auth] 0-auth/login: allowed >>> user names: c0c72c37-477a-49a5-a305-3372c1c2f2b4 >>> [2016-03-13 18:11:34.591173] I [MSGID: 115029] >>> [server-handshake.c:612:server_setvolume] 0-vmware-server: accepted client >>> from gfs002-2719-2016/03/13-20:17:42:609388-vmware-client-4-0-0 (version: >>> 3.7.8) >>> ### >>> >>> ESXi just keeps telling me "Cannot clone T: The virtual disk is either >>> corrupted or not a supported format. >>> error >>> 3/13/2016 9:06:20 PM >>> Clone virtual machine >>> T >>> VCENTER.LOCAL\Administrator >>> " >>> >>> My setup is 2 servers with a floating ip controlled by CTDB and my ESXi >>> server mount the NFS via the floating ip. 
>>> >>> >>> >>> >>> >>> On 03/13/2016 08:40 PM, pkoelle wrote: >>> >>> >>> Am 13.03.2016 um 18:22 schrieb David Gossage: >>> >>> >>> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com >>> >>> wrote: >>> >>> >>> My HBAs are LSISAS1068E, and the filesystem is XFS. >>> >>> I tried EXT4 and it did not help. >>> I have created a stripted volume in one server with two bricks, same >>> issue. >>> and i tried a replicated volume with just "sharding enabled" same issue, >>> as soon as i disable the sharding it works just fine, niether sharding >>> nor >>> striping works for me. >>> i did follow up with some of threads in the mailing list and tried some >>> of >>> the fixes that worked with the others, none worked for me. :( >>> >>> >>> >>> Is it possible the LSI has write-cache enabled? >>> >>> >>> Why is that relevant? Even the backing filesystem has no idea if there is >>> a RAID or write cache or whatever. There are blocks and sync(), end of >>> story. >>> If you lose power and screw up your recovery OR do funky stuff with SAS >>> multipathing that might be an issue with a controller cache. AFAIK thats >>> not what we are talking about. >>> >>> I'm afraid but unless the OP has some logs from the server, a >>> reproducible testcase or a backtrace from client or server this isn't >>> getting us anywhere. >>> >>> cheers >>> Paul >>> >>> >>> >>> On 03/13/2016 06:54 PM, David Gossage wrote: >>> >>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote: >>> >>> Okay so i have enabled shard in my test volume and it did not help, >>> >>> stupidly enough, i have enabled it in a production volume >>> "Distributed-Replicate" and it currpted half of my VMs. >>> I have updated Gluster to the latest and nothing seems to be changed in >>> my situation. >>> below the info of my volume; >>> >>> >>> >>> I was pointing at the settings in that email as an example for >>> corruption >>> fixing. 
I wouldn't recommend enabling sharding if you haven't gotten the base setup working yet on that cluster. What HBAs are you using, and what is the filesystem layout for the bricks?
>>>
>>> Number of Bricks: 3 x 2 = 6
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfs001:/bricks/b001/vmware
>>> Brick2: gfs002:/bricks/b004/vmware
>>> Brick3: gfs001:/bricks/b002/vmware
>>> Brick4: gfs002:/bricks/b005/vmware
>>> Brick5: gfs001:/bricks/b003/vmware
>>> Brick6: gfs002:/bricks/b006/vmware
>>> Options Reconfigured:
>>> performance.strict-write-ordering: on
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> network.remote-dio: enable
>>> performance.stat-prefetch: disable
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> cluster.eager-lock: enable
>>> features.shard-block-size: 16MB
>>> features.shard: on
>>> performance.readdir-ahead: off
>>>
>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>
>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>> Both servers have HBAs, no RAID, and I can set up replicated or disperse volumes without any issues.
>>> The logs are clean, and when I tried to migrate a VM and got the error, nothing showed up in them.
>>> I tried mounting the volume on my laptop and it mounted fine, but if I use dd to create a data file it just hangs and I can't cancel it, and I can't unmount it or anything; I just have to reboot.
>>> The same servers have another volume on other bricks in a distributed replica, which works fine.
>>> I have even tried the same setup in a virtual environment (created two VMs, installed Gluster, and created a replicated striped volume), and again the same thing: data corruption.
>>>
>>> I'd look through the mail archives for a topic called "Shard in Production", I think.
The shard portion may not be relevant, but it does discuss certain settings that had to be applied to avoid corruption with VMs. You may also want to try disabling performance.readdir-ahead.
>>>
>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>
>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>> Thanks David,
>>>
>>> My settings are all defaults; I have just created the pool and started it.
>>> I applied the settings you recommended and it seems to be the same issue:
>>>
>>> Type: Striped-Replicate
>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>> Status: Started
>>> Number of Bricks: 1 x 2 x 2 = 4
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfs001:/bricks/t1/s
>>> Brick2: gfs002:/bricks/t1/s
>>> Brick3: gfs001:/bricks/t2/s
>>> Brick4: gfs002:/bricks/t2/s
>>> Options Reconfigured:
>>> performance.stat-prefetch: off
>>> network.remote-dio: on
>>> cluster.eager-lock: enable
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> performance.readdir-ahead: on
>>>
>>> Is there a RAID controller perhaps doing any caching?
>>> Are any errors being reported in the gluster logs during the migration process?
>>> Since the servers aren't in use yet, have you tested making just mirrored bricks using different pairings of servers, two at a time, to see if the problem follows a certain machine or network port?
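For reference, the option combinations being compared back and forth in this thread are applied with `gluster volume set`. A sketch, assuming the volume is named `vmware` (substitute your own volume name), run from any server in the trusted pool:

```shell
# Disable the client-side caching translators repeatedly flagged in this
# thread as problematic for VM image workloads
gluster volume set vmware performance.quick-read off
gluster volume set vmware performance.read-ahead off
gluster volume set vmware performance.io-cache off
gluster volume set vmware performance.stat-prefetch off
gluster volume set vmware performance.readdir-ahead off

# Settings recommended earlier in the thread for VM storage
gluster volume set vmware cluster.eager-lock enable
gluster volume set vmware network.remote-dio enable

# Confirm the options took effect
gluster volume info vmware
```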
>>>
>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>
>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>> Dears,
>>>
>>> I have created a replicated striped volume with two bricks and two servers, but I can't use it: when I mount it in ESXi and try to migrate a VM to it, the data gets corrupted.
>>> Does anyone have any idea why this is happening?
>>>
>>> Dell 2950 x2
>>> Seagate 15k 600GB
>>> CentOS 7.2
>>> Gluster 3.7.8
>>>
>>> Appreciate your help.
>>>
>>> Most reports of this I have seen end up being settings-related. Post your gluster volume info. Below are what I have seen as the most commonly recommended settings.
>>> I'd hazard a guess you may have the read-ahead cache or prefetch on.
>>>
>>> quick-read=off
>>> read-ahead=off
>>> io-cache=off
>>> stat-prefetch=off
>>> eager-lock=enable
>>> remote-dio=on
>>>
>>> Mahdi Adnan
>>> System Admin
>>>
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://www.gluster.org/mailman/listinfo/gluster-users
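Krutika's suggestion earlier in the thread, a plain n x 3 replica volume with the virt option group and sharding instead of the stripe translator, would look roughly like the following. This is a hypothetical sketch: the third host (gfs003), the volume name `vm-store`, its brick paths, and the 64MB block size are all assumptions for illustration.

```shell
# Hypothetical 1 x 3 replica volume; a third server (gfs003) is assumed here
gluster volume create vm-store replica 3 \
    gfs001:/bricks/b001/vm-store \
    gfs002:/bricks/b001/vm-store \
    gfs003:/bricks/b001/vm-store

# Apply the predefined "virt" option group, then enable sharding
gluster volume set vm-store group virt
gluster volume set vm-store features.shard on
gluster volume set vm-store features.shard-block-size 64MB

gluster volume start vm-store
```

With replica 3, quorum can be enforced without the split-brain exposure of the replica 2 layouts discussed above.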