On 13.03.2016 at 18:22, David Gossage wrote:
> On Sun, Mar 13, 2016 at 11:07 AM, Mahdi Adnan
> <mahdi.adnan at earthlinktele.com> wrote:
>
>> My HBAs are LSISAS1068E, and the filesystem is XFS.
>> I tried EXT4 and it did not help.
>> I have created a striped volume on one server with two bricks, same
>> issue, and I tried a replicated volume with just sharding enabled, same
>> issue; as soon as I disable sharding it works just fine. Neither
>> sharding nor striping works for me.
>> I did follow up on some threads in the mailing list and tried some of
>> the fixes that worked for others; none worked for me. :(
>>
>
> Is it possible the LSI has write-cache enabled?
Why is that relevant? Even the backing filesystem has no idea whether there
is a RAID or a write cache or whatever underneath; there are blocks and
sync(), end of story.
If you lose power and screw up your recovery, or do funky stuff with SAS
multipathing, then a controller cache might be an issue. AFAIK that's not
what we are talking about here.
I'm afraid that unless the OP can provide some logs from the servers, a
reproducible test case, or a backtrace from the client or server, this
isn't getting us anywhere.
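For what it's worth, a minimal reproduction would already help a lot.
Something along these lines, assuming the volume is called VOLNAME and is
mounted via the FUSE client (names and paths are only placeholders):

    # mount the volume on any Linux client
    mount -t glusterfs gfs001:/VOLNAME /mnt/test

    # write a test file with direct I/O, bypassing client-side caches
    dd if=/dev/zero of=/mnt/test/ddtest.bin bs=1M count=512 oflag=direct

    # read it back and compare the checksum from a second client
    md5sum /mnt/test/ddtest.bin

If dd hangs or the checksums differ between clients, the client log for
that mount under /var/log/glusterfs/ from that exact moment would be the
interesting bit.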
cheers
Paul
>
>
>
>
>> On 03/13/2016 06:54 PM, David Gossage wrote:
>>
>>
>>
>>
>> On Sun, Mar 13, 2016 at 8:16 AM, Mahdi Adnan
>> <mahdi.adnan at earthlinktele.com> wrote:
>>
>>> Okay, so I have enabled shard on my test volume and it did not help.
>>> Stupidly enough, I also enabled it on a production volume
>>> "Distributed-Replicate" and it corrupted half of my VMs.
>>> I have updated Gluster to the latest version and nothing seems to have
>>> changed in my situation.
>>> Below is the info of my volume:
>>>
>>
>> I was pointing at the settings in that email as an example of fixing
>> corruption. I wouldn't recommend enabling sharding if you haven't gotten
>> the base working yet on that cluster. What HBAs are you using, and what
>> is the layout of the filesystem for the bricks?
>>
>>
>>> Number of Bricks: 3 x 2 = 6
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: gfs001:/bricks/b001/vmware
>>> Brick2: gfs002:/bricks/b004/vmware
>>> Brick3: gfs001:/bricks/b002/vmware
>>> Brick4: gfs002:/bricks/b005/vmware
>>> Brick5: gfs001:/bricks/b003/vmware
>>> Brick6: gfs002:/bricks/b006/vmware
>>> Options Reconfigured:
>>> performance.strict-write-ordering: on
>>> cluster.server-quorum-type: server
>>> cluster.quorum-type: auto
>>> network.remote-dio: enable
>>> performance.stat-prefetch: disable
>>> performance.io-cache: off
>>> performance.read-ahead: off
>>> performance.quick-read: off
>>> cluster.eager-lock: enable
>>> features.shard-block-size: 16MB
>>> features.shard: on
>>> performance.readdir-ahead: off
>>>
>>>
>>> On 03/12/2016 08:11 PM, David Gossage wrote:
>>>
>>>
>>> On Sat, Mar 12, 2016 at 10:21 AM, Mahdi Adnan
>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>
>>>> Both servers have HBAs, no RAID, and I can set up a replicated or
>>>> dispersed volume without any issues.
>>>> The logs are clean, and when I tried to migrate a VM and got the
>>>> error, nothing showed up in the logs.
>>>> I tried mounting the volume on my laptop and it mounted fine, but if
>>>> I use dd to create a data file it just hangs and I can't cancel it,
>>>> and I can't unmount it or anything; I just have to reboot.
>>>> The same servers have another volume on other bricks in a distributed
>>>> replica, and it works fine.
>>>> I have even tried the same setup in a virtual environment (created
>>>> two VMs, installed Gluster, and created a replicated striped volume)
>>>> and again the same thing: data corruption.
>>>>
>>>
>>> I'd look through the mail archives for a topic called "Shard in
>>> Production", I think. The shard portion may not be relevant, but it
>>> does discuss certain settings that had to be applied to avoid
>>> corruption with VMs. You may also want to try disabling
>>> performance.readdir-ahead.
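(That option can be toggled on a live volume; a minimal example, assuming
the volume is named "vmware" after its brick directories, which is only a
guess based on the paths above:

    gluster volume set vmware performance.readdir-ahead off
    gluster volume info vmware   # the change shows up under "Options Reconfigured"

Adjust the volume name to whatever "gluster volume info" reports.)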
>>>
>>>
>>>>
>>>> On 03/12/2016 07:02 PM, David Gossage wrote:
>>>>
>>>>
>>>>
>>>> On Sat, Mar 12, 2016 at 9:51 AM, Mahdi Adnan
>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>
>>>>> Thanks David,
>>>>>
>>>>> My settings are all defaults; I have just created the pool and
>>>>> started it.
>>>>> I have set the settings as you recommended and it seems to be the
>>>>> same issue:
>>>>>
>>>>> Type: Striped-Replicate
>>>>> Volume ID: 44adfd8c-2ed1-4aa5-b256-d12b64f7fc14
>>>>> Status: Started
>>>>> Number of Bricks: 1 x 2 x 2 = 4
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: gfs001:/bricks/t1/s
>>>>> Brick2: gfs002:/bricks/t1/s
>>>>> Brick3: gfs001:/bricks/t2/s
>>>>> Brick4: gfs002:/bricks/t2/s
>>>>> Options Reconfigured:
>>>>> performance.stat-prefetch: off
>>>>> network.remote-dio: on
>>>>> cluster.eager-lock: enable
>>>>> performance.io-cache: off
>>>>> performance.read-ahead: off
>>>>> performance.quick-read: off
>>>>> performance.readdir-ahead: on
>>>>>
>>>>
>>>>
>>>> Is there a RAID controller perhaps doing any caching?
>>>>
>>>> In the Gluster logs, are any errors being reported during the
>>>> migration process?
>>>> Since they aren't in use yet, have you tested making just mirrored
>>>> bricks using different pairings of servers, two at a time, to see if
>>>> the problem follows a certain machine or network port?
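A quick way to do that is a throwaway replica-2 volume per server pair,
for example (the volume name and brick paths are only illustrative):

    gluster volume create testrep replica 2 \
        gfs001:/bricks/t1/testrep gfs002:/bricks/t1/testrep
    gluster volume start testrep

Run the same dd/migration test against each pairing, then clean up with
"gluster volume stop testrep" and "gluster volume delete testrep".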
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 03/12/2016 03:25 PM, David Gossage wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Mar 12, 2016 at 1:55 AM, Mahdi Adnan
>>>>> <mahdi.adnan at earthlinktele.com> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I have created a replicated striped volume with two bricks and two
>>>>>> servers, but I can't use it because when I mount it in ESXi and try
>>>>>> to migrate a VM to it, the data gets corrupted.
>>>>>> Does anyone have any idea why this is happening?
>>>>>>
>>>>>> Dell 2950 x2
>>>>>> Seagate 15k 600GB
>>>>>> CentOS 7.2
>>>>>> Gluster 3.7.8
>>>>>>
>>>>>> Appreciate your help.
>>>>>>
>>>>>
>>>>> Most reports of this I have seen end up being settings related. Post
>>>>> your "gluster volume info" output. Below are what I have seen as the
>>>>> most commonly recommended settings.
>>>>> I'd hazard a guess you may have the read-ahead cache or prefetch
>>>>> turned on.
>>>>>
>>>>> quick-read=off
>>>>> read-ahead=off
>>>>> io-cache=off
>>>>> stat-prefetch=off
>>>>> eager-lock=enable
>>>>> remote-dio=on
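For reference, those correspond to the full option names already visible in
the volume info above and can be applied with "gluster volume set"; a
sketch, assuming the volume is named "testvol":

    for opt in performance.quick-read=off performance.read-ahead=off \
               performance.io-cache=off performance.stat-prefetch=off \
               cluster.eager-lock=enable network.remote-dio=on; do
        gluster volume set testvol "${opt%%=*}" "${opt##*=}"
    done

Afterwards they show up under "Options Reconfigured" in
"gluster volume info testvol".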
>>>>>
>>>>>>
>>>>>> Mahdi Adnan
>>>>>> System Admin
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>