On 30 March 2016 at 14:24, Eric Ren <zren at suse.com> wrote:
> Hi,
>
>> We're seeing very poor write performance on a cluster that was built
>> roughly a year ago. I am by no means an expert on OCFS2, nor on the DRBD
>> layer that we have under it. We do have several clusters that are
>> configured in much the same way via our puppet infrastructure, yet this
>> particular one gives us write speeds of around 15 kilobytes/sec, where
>> some of our other clients get 55 megabytes/sec on similar hardware.
>
> How did you perform the testing? It really matters. If you write a file
> on the shared disk from one node and read that file from another node
> with no, or very little, interval in between, the write IO speed can
> drop by roughly 20 times, according to my previous testing (just as a
> reference). That is an extremely bad situation for a 2-node cluster,
> isn't it?
>
> But it's incredible that in your case the write speed drops by more than
> 3000 times!

I simply used 'dd' to create a file with /dev/zero as the source. If there
is a better way to do this, I am all ears.

>> I realise that this is all very vague, so for now I am just hoping for
>> general pointers on where to start in diagnosing this, from which I can
>> do more research and then hopefully revisit the thread with more
>> detailed questions and data.
>>
>> Some basic info to get started:
>>
>> O/S: Debian Wheezy
>> Kernel: Linux hostname 3.2.0-4-amd64 #1 SMP Debian 3.2.73-2+deb7u3
>> x86_64 GNU/Linux
>> ocfs2-tools: 1.6.4-1+deb7u1
>> 2 servers in the cluster. The OCFS2 filesystem lives on a DRBD
>> dual-primary device, which itself is built on an LVM volume, whose VG
>> lives on a RAID1 pair of 1TB SATA HDDs.
>
> Could you first test on LVM, then DRBD, and then OCFS2? Let's apportion
> the blame more fairly.
>
> Eric

If I do a similar write of a file to a directory that exists on an LVM LV,
I get roughly 100 megabytes/sec.

I can't write straight to the DRBD device, as that would entail wiping the
customer's OCFS2 filesystem, which I cannot do.

Graeme
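
(A dd write test of the sort described above would typically look something
like the following; the mount point and file size are placeholders, not the
values actually used in this thread:

    # write through to the disk, bypassing the page cache
    dd if=/dev/zero of=/mnt/ocfs2/ddtest bs=1M count=512 oflag=direct

    # or: use the page cache, but flush before dd reports its throughput
    dd if=/dev/zero of=/mnt/ocfs2/ddtest bs=1M count=512 conv=fsync

Without oflag=direct or conv=fsync, dd largely measures how quickly the page
cache absorbs the writes rather than the speed of the OCFS2/DRBD/RAID stack.)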
Hi,

>> How did you perform the testing? It really matters. If you write a file
>> on the shared disk from one node and read that file from another node
>> with no, or very little, interval in between, the write IO speed can
>> drop by roughly 20 times, according to my previous testing (just as a
>> reference). That is an extremely bad situation for a 2-node cluster,
>> isn't it?
>>
>> But it's incredible that in your case the write speed drops by more
>> than 3000 times!
>
> I simply used 'dd' to create a file with /dev/zero as the source. If
> there is a better way to do this, I am all ears.

Alright, so you just did local IO on ocfs2; the performance shouldn't be
that bad. I would guess the ocfs2 volume is more than 60% full, or
seriously fragmented?

Please give us the output of `df -h`, the super block from debugfs.ocfs2,
and the exact `dd` command you ran. Additionally, run `dd` on each node.

You know, ocfs2 is a shared-disk filesystem, so the 3 basic testing cases
I can think of are:

1. only one node of the cluster does IO;

2. more than one node of the cluster does IO, but each node reads/writes
   only its own file on the shared disk;

3. like 2), but some nodes read and some nodes write the same file on the
   shared disk.

This model is much simplified, of course. Practical scenarios can be far
more complicated, like the fragmentation issue that your case most likely
is.

>> Could you first test on LVM, then DRBD, and then OCFS2? Let's apportion
>> the blame more fairly.
>
> If I do a similar write of a file to a directory that exists on an LVM
> LV, I get roughly 100 megabytes/sec.
>
> I can't write straight to the DRBD device, as that would entail wiping
> the customer's OCFS2 filesystem, which I cannot do.

OK, it's a production environment; I understand.

Eric
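
(A sketch of the diagnostics requested above, assuming the OCFS2 volume
sits on the DRBD device /dev/drbd0 and is mounted at /mnt/ocfs2; both names
are placeholders for this setup:

    # overall usage of the ocfs2 mount
    df -h /mnt/ocfs2

    # dump the ocfs2 super block: block size, cluster size, feature flags
    debugfs.ocfs2 -R "stats" /dev/drbd0

    # test case 1: one node writes its own file, direct IO
    dd if=/dev/zero of=/mnt/ocfs2/ddtest.$(hostname) bs=1M count=512 oflag=direct

Running the dd line on one node at a time covers case 1; running it on both
nodes at once against separate files covers case 2; having one node write
the file while the other reads it covers case 3, which is where cluster-lock
contention shows up.)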