David Schüler
2009-Jan-20 09:45 UTC
[Ocfs2-users] ocfs2 slows down with more servers in a cluster
Hello to everybody on the list,

I have a problem regarding the file operations per second on an ocfs2 volume. I don't mean the read or write speeds, just the number of operations per second the filesystem can handle.

Let's start at the beginning; here's what I have and what I'm doing: I have a big fibre storage device with around 6TB of space, RAID6 on SATA-II drives. I have 8 servers in my cluster, connected to the storage via two fibre switches. Switches and HBAs are QLogic 4Gbps, and the storage is an Infortrend EONStor. The servers are connected through gigabit network switches. I don't think the hardware is what causes the problem.

I'm using Ubuntu Server 8.04.1 LTS on all the machines. I create one 6TB partition with parted using the gpt disklabel and format the partition with ocfs2. On all servers the o2cb service is running, configured with the same heartbeat, network and other timeout values, and with the same cluster.conf on every server as well.
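Roughly, the format and mount steps looked like this (a sketch rather than a verbatim transcript; the flag values shown are just one of the many combinations I tried, and device name and mount point are mine):

# one of the tried combinations: 4K blocks, 32K clusters, 8 node slots, "mail" type hint
mkfs.ocfs2 -b 4K -C 32K -N 8 -T mail -L daten /dev/sdb1
# bring the o2cb cluster stack online (same cluster.conf on every node)
/etc/init.d/o2cb online
# mount on each node
mount -t ocfs2 /dev/sdb1 /daten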
I mount the ocfs2 volume on one machine and everything works fine. I did some bonnie++ testing and got a write speed of 70MB/s and a read speed of 140MB/s. I mount the volume on all servers and everything still works well: concurrent reads and writes, no errors, no fencing. Nevertheless, everything feels slow. I started an rsync to bring 1.4TB of data onto the volume. With the volume mounted on one server this takes around a day and a half; with the volume mounted on all servers I stopped it after two days, with not more than 200GB synced.

This made me wonder, so I ran some more tests. bonnie++ with the volume mounted on all servers still reported a write speed of 70MB/s and a read speed of 130MB/s, so raw throughput does not seem to be the problem. Looking closer at the bonnie results, which also measure file operations per second, I saw that with the ocfs2 volume mounted on one server it reaches around 2,500 operations/s; with the volume mounted on all servers it drops to 16 operations/s. To do the maths: at a write speed of 70MB/s I should be able to write seventy 1MB files in one second, but I can't, because the metadata handles no more than 16 file creates per second. That would still be 16MB/s, but I don't have 1MB files; mine are around 100kB each, so my high-speed fibre storage is effectively slowed down to 1.6MB/s (or even less).

Some more testing showed that the more servers are in the cluster, the slower everything gets. After reading nearly everything about ocfs2 I could find, I did additional tests: I reduced the volume size to 500GB, no longer using a gpt disklabel. I tried different cluster and block sizes. I reduced the number of node slots. I used -T mail and -T datafiles. All these options in nearly every combination; nothing helped. I even switched to Ubuntu Server 8.10 because of the newer ocfs2 kernel module, but nothing changed.

I think I must be doing something wrong, because I have never read about such a problem before, and if it were caused by a bug I would expect more people to be reporting it.

Here are my bonnie++ tests.

Volume mounted on one server:

root@upload1:/daten# bonnie++ -d /daten -n 1 -u 0 -g 0
Using uid:0, gid:0.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
upload1          8G 40434  76 74376  27 42018  18 41361  76 137827  25 538.8   1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
upload1,8G,40434,76,74376,27,42018,18,41361,76,137827,25,538.8,1,1,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

Volume mounted on all 8 servers:

root@upload1:/daten# bonnie++ -d /daten -n 1 -u 0 -g 0
Using uid:0, gid:0.
Writing with putc()...done
Writing intelligently...done
Rewriting...done
Reading with getc()...done
Reading intelligently...done
start 'em...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
upload1          8G 39445  74 75634  29 42245  20 42198  76 128275  22 573.1   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                  1    16   1 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
upload1,8G,39445,74,75634,29,42245,20,42198,76,128275,22,573.1,0,1,16,1,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++

It's this Create/sec figure that my problem with slow speeds seems to come from.

In some more tests I found that Create/sec goes up to around 2,500 with one server mounting the volume (the -n option has to be raised from 1 to 10 to see it). I formatted the volume with ext3 and got 70,000 Create/sec, so it seems that even with one server mounting the volume something is not right. With tunefs.ocfs2 I switched the ocfs2 volume to 'local' and got Create/sec up to 3,500, but this still seems very slow to me.

With this in mind I started testing on one machine with ocfs2 on a local drive, but I can't get more than 2,500 Create/sec, no matter which cluster and block sizes I use. I can't test this volume on more than one server because I can't export the local drive to the other machines, but I'm sure it would get slower with more servers mounting the volume.

I did one last test yesterday: I installed CentOS 5.2 with the latest ocfs2 module and tools from Oracle for EL5. I still can't get more than 2,500 Create/sec with one server mounting the volume. Next I'll do some testing with more than one CentOS server, but perhaps someone has a good idea for me, or a hint at what I'm doing wrong. I'm sure ocfs2 can perform much better than in my tests.

Oh, I forgot: I even tested on different hardware, a dual Xeon machine with 4GB RAM as well as a Core 2 Duo machine with 2GB RAM; no change. I used 32-bit and 64-bit versions of Ubuntu Server; again, no change.

I'm sorry for the long post, but I'm new to the list and I think every little piece of information could be helpful.

Kind regards,
David
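P.S. A quick way to get a comparable Create/sec number without bonnie++ would be something like the following (an untested sketch; the path and file count are arbitrary):

# create 10000 empty files in one directory and time it
mkdir -p /daten/createtest
time sh -c 'i=0; while [ $i -lt 10000 ]; do : > /daten/createtest/f$i; i=$((i+1)); done'

Dividing 10000 by the elapsed seconds gives a rough Create/sec figure.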
Sunil Mushran
2009-Jan-20 23:24 UTC
[Ocfs2-users] ocfs2 slows down with more servers in a cluster
In my run, bonnie created over 20 thousand files in one directory. This is problematic because ocfs2 currently lacks indexed directories, meaning that during each create it has to scan the entire directory to ensure there is no name clash. The good news is that we are in the process of addressing this shortcoming.

Sunil
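To make the cost concrete: with a linear scan, the k-th create has to be checked against the k entries already present, so creating N files in one directory costs on the order of N^2/2 name comparisons (roughly 200 million for the ~20,000 files bonnie creates), and each create presumably also holds the directory's cluster lock, which would explain why adding nodes makes it so much worse. Until indexed directories arrive, a common mitigation on any filesystem that is slow on large directories (a general technique, not one suggested in this thread) is to hash files into subdirectories so that no single directory grows large. A minimal sketch, with an arbitrary file name and bucket count:

# spread files over 256 hashed subdirectories (00..ff)
f=somefile.dat                              # hypothetical file name
d=$(printf '%s' "$f" | md5sum | cut -c1-2)  # first two hex chars pick the bucket
mkdir -p "/daten/$d"
mv "$f" "/daten/$d/$f"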