Hi Geoffrey,

The file content deletion is caused by the vi editor's behaviour of
truncating the file when writing the updated content.

Regarding the quota size/usage problem, can you please execute the
attached script on each brick and provide us the output generated?
This will help us analyse why 'quota list' is showing the wrong size.
The script basically crawls the directory given as argument.
It collects the quota "contri" and "size" extended attributes, and
also the block size from a stat call.

Usage:

./quota-verify -b <brick_path> | tee brick_name.log

Thanks,
Vijay

On Tuesday 09 June 2015 03:45 PM, Vijaikumar M wrote:
>
>
> On Tuesday 09 June 2015 03:40 PM, Geoffrey Letessier wrote:
>> Hi Vijay,
>>
>> Thanks for having replied.
>>
>> Unfortunately, I checked each brick in my storage pool and don't
>> find any backup file... a pity!
>
> Please check for the backup file on the client machine where the file
> was edited, and in the home dir of the user (the login used to edit
> the file).
>
> Thanks,
> Vijay
>
>
>>
>> Thank you again!
>> Good luck and see you,
>> Geoffrey
>> ------------------------------------------------------
>> Geoffrey Letessier
>> Responsable informatique & ingénieur système
>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>> Institut de Biologie Physico-Chimique
>> 13, rue Pierre et Marie Curie - 75005 Paris
>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>
>>> Le 9 juin 2015 à 10:05, Vijaikumar M <vmallika at redhat.com> a écrit :
>>>
>>>
>>>
>>> On Tuesday 09 June 2015 01:08 PM, Geoffrey Letessier wrote:
>>>> Hi,
>>>>
>>>> Yes of course:
>>>> [root at lucifer ~]# pdsh -w cl-storage[1,3] du -s /export/brick_home/brick*/amyloid_team
>>>> cl-storage1: 1608522280  /export/brick_home/brick1/amyloid_team
>>>> cl-storage3: 1619630616  /export/brick_home/brick1/amyloid_team
>>>> cl-storage1: 1614057836  /export/brick_home/brick2/amyloid_team
>>>> cl-storage3: 1602653808  /export/brick_home/brick2/amyloid_team
>>>>
>>>> The sum is 6444864540 (around 6.4-6.5TB) while the quota list
>>>> displays 7.7TB.
>>>> So the discrepancy is roughly 1.2-1.3TB, in other words around 16%,
>>>> which is too big, no?
>>>>
>>>> In addition, since the quota was exceeded, I note a lot of files
>>>> like the following:
>>>> [root at lucifer ~]# pdsh -w cl-storage[1,3] "cd /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/; ls -ail remd_100.sh 2> /dev/null" 2>/dev/null
>>>> cl-storage3: 133325688 ---------T 2 tarus amyloid_team 0 16 févr. 10:20 remd_100.sh
>>>> Note the « T » at the end of the perms and the file size of 0B.
>>>>
>>>> And, yesterday, some files were duplicated but are not anymore...
>>>>
>>>> The worst is that, previously, all these files were OK. In other
>>>> words, did exceeding the quota cause file or content deletions or
>>>> corruption? What can I do to prevent this situation in the future,
>>>> because I guess I cannot do anything to roll back this situation
>>>> now, right?
>>>>
>>>
>>> Hi Geoffrey,
>>>
>>> I tried re-creating the problem.
>>>
>>> Here is the behaviour of the vi editor.
>>> When a file is saved in vi, it creates a backup file under the home
>>> dir and opens the original file with the O_TRUNC flag, hence the
>>> file is truncated.
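As an aside on Geoffrey's question above about preventing this in the
future: if the editor involved is vim (an assumption), its backup file
can be made permanent, so the pre-edit content survives a save that
fails with EDQUOT. A minimal sketch:

# keep vim's backup file ('file~') after every successful write instead
# of deleting it (assumes the editor is vim and ~/.vimrc is in use)
mkdir -p ~/.vim/backup
cat >> ~/.vimrc <<'EOF'
set backup
set writebackup
set backupdir=~/.vim/backup
EOF

With 'backup' set, the 'remd_115.sh~'-style file that vi creates is
kept after each write rather than removed, so a truncated save can
always be recovered from it.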
>>>
>>>
>>> Here is the strace of the vi editor when it gets the EDQUOT error:
>>>
>>> open("hello", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 3
>>> write(3, "line one\nline two\n", 18) = 18
>>> fsync(3) = 0
>>> close(3) = -1 EDQUOT (Disk quota exceeded)
>>> chmod("hello", 0100644) = 0
>>> open("/root/hello~", O_RDONLY) = 3
>>> *open("hello", O_WRONLY|O_CREAT|O_TRUNC, 0644) = 7*
>>> read(3, "line one\n", 256) = 9
>>> write(7, "line one\n", 9) = 9
>>> read(3, "", 256) = 0
>>> close(7) = -1 EDQUOT (Disk quota exceeded)
>>> close(3) = 0
>>>
>>>
>>> To recover the truncated file, please check whether there is a backup
>>> file 'remd_115.sh~' under '~/' or in the same dir where this file
>>> exists. If it exists, you can copy this file back.
>>>
>>> Thanks,
>>> Vijay
>>>
>>>
>>>> Geoffrey
>>>> ------------------------------------------------------
>>>> Geoffrey Letessier
>>>> Responsable informatique & ingénieur système
>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>> Institut de Biologie Physico-Chimique
>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>
>>>>> Le 9 juin 2015 à 09:01, Vijaikumar M <vmallika at redhat.com> a écrit :
>>>>>
>>>>>
>>>>>
>>>>> On Monday 08 June 2015 07:11 PM, Geoffrey Letessier wrote:
>>>>>> In addition, I notice a very big difference between the sum of DU
>>>>>> on each brick and the « quota list » display, as you can read below:
>>>>>> [root at lucifer ~]# pdsh -w cl-storage[1,3] du -sh /export/brick_home/brick*/amyloid_team
>>>>>> cl-storage1: 1,6T  /export/brick_home/brick1/amyloid_team
>>>>>> cl-storage3: 1,6T  /export/brick_home/brick1/amyloid_team
>>>>>> cl-storage1: 1,6T  /export/brick_home/brick2/amyloid_team
>>>>>> cl-storage3: 1,6T  /export/brick_home/brick2/amyloid_team
>>>>>> [root at lucifer ~]# gluster volume quota vol_home list /amyloid_team
>>>>>>                   Path                   Hard-limit  Soft-limit     Used  Available
>>>>>> --------------------------------------------------------------------------------
>>>>>> /amyloid_team                                 9.0TB         90%    7.8TB      1.2TB
>>>>>>
>>>>>> As you can notice, the sum of all bricks gives me roughly 6.4TB
>>>>>> and « quota list » around 7.8TB; so there is a difference of
>>>>>> 1.4TB I'm not able to explain… Do you have any idea?
>>>>>>
>>>>>
>>>>> There were a few issues with quota accounting of the size; we have
>>>>> fixed some of these issues in 3.7.
>>>>> 'du -h' will round off the values; can you please provide the
>>>>> output of 'du' without the -h option?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Geoffrey
>>>>>> ------------------------------------------------------
>>>>>> Geoffrey Letessier
>>>>>> Responsable informatique & ingénieur système
>>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>>>> Institut de Biologie Physico-Chimique
>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>>>
>>>>>>> Le 8 juin 2015 à 14:30, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> a écrit :
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> With the 3.5.3 version of GlusterFS, I met a strange issue this
>>>>>>> morning writing a file when the quota is exceeded.
>>>>>>>
>>>>>>> A person in my lab, whose quota is exceeded (but she didn't know
>>>>>>> about it), tried to modify a file but, because of the exceeded
>>>>>>> quota, she was unable to and decided to exit vi.
>>>>>>> Now, her file is empty/blank, as you can read below:
>>>>> We suspect 'vi' might have created a tmp file before writing to
>>>>> the file. We are working on re-creating this problem and will
>>>>> update you on the same.
>>>>>
>>>>>
>>>>>>> pdsh at lucifer: cl-storage3: ssh exited with exit code 2
>>>>>>> cl-storage1: ---------T 2 tarus amyloid_team 0 19 févr. 12:34 /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
>>>>>>> cl-storage1: -rwxrw-r-- 2 tarus amyloid_team 0  8 juin 12:38 /export/brick_home/brick2/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh
>>>>>>>
>>>>>>> In addition, I don't understand why, my volume being a
>>>>>>> distributed volume inside a replica (cl-storage[1,3] is
>>>>>>> replicated only on cl-storage[2,4]), I have 2 « same » files
>>>>>>> (same complete path) on 2 different bricks (as you can read
>>>>>>> above).
>>>>>>>
>>>>>>> Thanks in advance for your help and clarification.
>>>>>>> Geoffrey
>>>>>>> ------------------------------------------------------
>>>>>>> Geoffrey Letessier
>>>>>>> Responsable informatique & ingénieur système
>>>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>>>>> Institut de Biologie Physico-Chimique
>>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>>>>
>>>>>>>> Le 2 juin 2015 à 23:45, Geoffrey Letessier <geoffrey.letessier at cnrs.fr> a écrit :
>>>>>>>>
>>>>>>>> Hi Ben,
>>>>>>>>
>>>>>>>> I just checked my messages log files, both on client and server,
>>>>>>>> and I don't find any hung tasks like the ones you noticed on
>>>>>>>> yours...
>>>>>>>>
>>>>>>>> As you can read below, I don't see the performance issue with a
>>>>>>>> simple DD, but I think my issue concerns sets of small files
>>>>>>>> (tens of thousands, or even more)…
>>>>>>>>
>>>>>>>> [root at nisus test]# ddt -t 10g /mnt/test/
>>>>>>>> Writing to /mnt/test/ddt.8362 ... syncing ... done.
>>>>>>>> sleeping 10 seconds ... done.
>>>>>>>> Reading from /mnt/test/ddt.8362 ... done.
>>>>>>>>        10240MiB    KiB/s   CPU%
>>>>>>>> Write     114770       4
>>>>>>>> Read       40675       4
>>>>>>>>
>>>>>>>> For info: /mnt/test is the single (v2) GlusterFS volume.
>>>>>>>>
>>>>>>>> [root at nisus test]# ddt -t 10g /mnt/fhgfs/
>>>>>>>> Writing to /mnt/fhgfs/ddt.8380 ... syncing ... done.
>>>>>>>> sleeping 10 seconds ... done.
>>>>>>>> Reading from /mnt/fhgfs/ddt.8380 ... done.
>>>>>>>>        10240MiB    KiB/s   CPU%
>>>>>>>> Write     102591       1
>>>>>>>> Read       98079       2
>>>>>>>>
>>>>>>>> Do you have an idea how to tune/optimize performance settings
>>>>>>>> and/or TCP settings (MTU, etc.)?
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> |             | UNTAR  |  DU   |  FIND  |  TAR   |   RM   |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> | single      | ~3m45s | ~43s  | ~47s   | ~3m10s | ~3m15s |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> | replicated  | ~5m10s | ~59s  | ~1m6s  | ~1m19s | ~1m49s |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> | distributed | ~4m18s | ~41s  | ~57s   | ~2m24s | ~1m38s |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> | dist-repl   | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> | native FS   | ~11s   | ~4s   | ~2s    | ~56s   | ~10s   |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> | BeeGFS      | ~3m43s | ~15s  | ~3s    | ~1m33s | ~46s   |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> | single (v2) | ~3m6s  | ~14s  | ~32s   | ~1m2s  | ~44s   |
>>>>>>>> ---------------------------------------------------------------
>>>>>>>> For info:
>>>>>>>> - BeeGFS is a distributed FS (4 bricks, 2 bricks per server and 2 servers)
>>>>>>>> - single (v2): simple gluster volume with default settings
>>>>>>>>
>>>>>>>> I also note I get the same tar/untar performance issue with
>>>>>>>> FhGFS/BeeGFS, but the rest (DU, FIND, RM) looks to be OK.
>>>>>>>>
>>>>>>>> Thank you very much for your reply and help.
>>>>>>>> Geoffrey
>>>>>>>> -----------------------------------------------
>>>>>>>> Geoffrey Letessier
>>>>>>>>
>>>>>>>> Responsable informatique & ingénieur système
>>>>>>>> CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
>>>>>>>> Institut de Biologie Physico-Chimique
>>>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at cnrs.fr
>>>>>>>>
>>>>>>>> Le 2 juin 2015 à 21:53, Ben Turner <bturner at redhat.com> a écrit :
>>>>>>>>
>>>>>>>>> I am seeing problems on 3.7 as well. Can you check
>>>>>>>>> /var/log/messages on both the clients and servers for hung
>>>>>>>>> tasks like:
>>>>>>>>>
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: iozone        D 0000000000000001     0 21999      1 0x00000080
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: ffff880611321cc8 0000000000000082 ffff880611321c18 ffffffffa027236e
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: ffff880611321c48 ffffffffa0272c10 ffff88052bd1e040 ffff880611321c78
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: ffff88052bd1e0f0 ffff88062080c7a0 ffff880625addaf8 ffff880611321fd8
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: Call Trace:
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa027236e>] ? rpc_make_runnable+0x7e/0x80 [sunrpc]
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffffa0272c10>] ? rpc_execute+0x50/0xa0 [sunrpc]
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff810aaa21>] ? ktime_get_ts+0xb1/0xf0
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811242d0>] ? sync_page+0x0/0x50
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152a1b3>] io_schedule+0x73/0xc0
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112430d>] sync_page+0x3d/0x50
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8152ac7f>] __wait_on_bit+0x5f/0x90
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124543>] wait_on_page_bit+0x73/0x80
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8109eb80>] ? wake_bit_function+0x0/0x50
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8113a525>] ? pagevec_lookup_tag+0x25/0x40
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8112496b>] wait_on_page_writeback_range+0xfb/0x190
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff81124b38>] filemap_write_and_wait_range+0x78/0x90
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c07ce>] vfs_fsync_range+0x7e/0x100
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08bd>] vfs_fsync+0x1d/0x20
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c08fe>] do_fsync+0x3e/0x60
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff811c0950>] sys_fsync+0x10/0x20
>>>>>>>>> Jun 2 15:23:14 gqac006 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
>>>>>>>>>
>>>>>>>>> Do you see a perf problem with just a simple DD, or do you need
>>>>>>>>> a more complex workload to hit the issue? I think I saw an
>>>>>>>>> issue with metadata performance that I am trying to run down;
>>>>>>>>> let me know if you can see the problem with simple DD reads /
>>>>>>>>> writes or if we need to do some sort of dir / metadata access
>>>>>>>>> as well.
>>>>>>>>>
>>>>>>>>> -b
>>>>>>>>>
>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: "Geoffrey Letessier" <geoffrey.letessier at cnrs.fr>
>>>>>>>>>> To: "Pranith Kumar Karampuri" <pkarampu at redhat.com>
>>>>>>>>>> Cc: gluster-users at gluster.org
>>>>>>>>>> Sent: Tuesday, June 2, 2015 8:09:04 AM
>>>>>>>>>> Subject: Re: [Gluster-users] GlusterFS 3.7 - slow/poor performances
>>>>>>>>>>
>>>>>>>>>> Hi Pranith,
>>>>>>>>>>
>>>>>>>>>> I'm sorry, but I cannot bring you any comparison because the
>>>>>>>>>> comparison would be distorted by the fact that in my HPC
>>>>>>>>>> cluster in production the network technology is InfiniBand QDR
>>>>>>>>>> and my volumes are quite different (bricks in RAID6 (12x2TB),
>>>>>>>>>> 2 bricks per server and 4 servers in my pool).
>>>>>>>>>>
>>>>>>>>>> Concerning your request, in the attachments you can find all
>>>>>>>>>> the expected results, hoping it can help you solve this
>>>>>>>>>> serious performance issue (maybe I need to play with glusterfs
>>>>>>>>>> parameters?).
>>>>>>>>>>
>>>>>>>>>> Thank you very much in advance,
>>>>>>>>>> Geoffrey
>>>>>>>>>> ------------------------------------------------------
>>>>>>>>>> Geoffrey Letessier
>>>>>>>>>> Responsable informatique & ingénieur système
>>>>>>>>>> UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
>>>>>>>>>> Institut de Biologie Physico-Chimique
>>>>>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Le 2 juin 2015 à 10:09, Pranith Kumar Karampuri <pkarampu at redhat.com> a écrit :
>>>>>>>>>>
>>>>>>>>>> hi Geoffrey,
>>>>>>>>>> Since you are saying it happens on all types of volumes, let's
>>>>>>>>>> do the following:
>>>>>>>>>> 1) Create a dist-repl volume
>>>>>>>>>> 2) Set the options etc. you need.
>>>>>>>>>> 3) Enable gluster volume profiling using "gluster volume profile <volname> start"
>>>>>>>>>> 4) Run the workload
>>>>>>>>>> 5) Give the output of "gluster volume profile <volname> info"
>>>>>>>>>>
>>>>>>>>>> Repeat the steps above on the new and old versions you are
>>>>>>>>>> comparing. That should give us insight into what could be
>>>>>>>>>> causing the slowness.
>>>>>>>>>>
>>>>>>>>>> Pranith
>>>>>>>>>> On 06/02/2015 03:22 AM, Geoffrey Letessier wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Dear all,
>>>>>>>>>>
>>>>>>>>>> I have a crash-test cluster where I've tested the new version
>>>>>>>>>> of GlusterFS (v3.7) before upgrading my HPC cluster in
>>>>>>>>>> production.
>>>>>>>>>> But… all my tests show very, very low performance.
>>>>>>>>>>
>>>>>>>>>> For my benches, as you can read below, I run some actions
>>>>>>>>>> (untar, du, find, tar, rm) on the linux kernel sources,
>>>>>>>>>> dropping caches, each on distributed, replicated,
>>>>>>>>>> distributed-replicated, single (single brick) volumes and the
>>>>>>>>>> native FS of one brick.
>>>>>>>>>>
>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; tar xJf ~/linux-4.1-rc5.tar.xz; sync; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; du -sh linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; find linux-4.1-rc5/ | wc -l; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; tar czf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>> # time (echo 3 > /proc/sys/vm/drop_caches; rm -rf linux-4.1-rc5.tgz linux-4.1-rc5/; echo 3 > /proc/sys/vm/drop_caches)
>>>>>>>>>>
>>>>>>>>>> And here are the process times:
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>> |             | UNTAR  |  DU   |  FIND  |  TAR   |   RM   |
>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>> | single      | ~3m45s | ~43s  | ~47s   | ~3m10s | ~3m15s |
>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>> | replicated  | ~5m10s | ~59s  | ~1m6s  | ~1m19s | ~1m49s |
>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>> | distributed | ~4m18s | ~41s  | ~57s   | ~2m24s | ~1m38s |
>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>> | dist-repl   | ~8m18s | ~1m4s | ~1m11s | ~1m24s | ~2m40s |
>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>> | native FS   | ~11s   | ~4s   | ~2s    | ~56s   | ~10s   |
>>>>>>>>>> ---------------------------------------------------------------
>>>>>>>>>>
>>>>>>>>>> I get the same results whether with default configurations or
>>>>>>>>>> with custom configurations.
>>>>>>>>>>
>>>>>>>>>> If I look at the ifstat output, I note my IO write processes
>>>>>>>>>> never exceed 3MB/s...
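For reference, steps 1-5 that Pranith lists above correspond roughly to
the following command sequence; the volume name, server names and brick
paths here are hypothetical and should be adapted to the real layout:

# 4 bricks with replica 2 gives a 2x2 distributed-replicated volume
gluster volume create testvol replica 2 \
    server1:/export/brick1/testvol server2:/export/brick1/testvol \
    server1:/export/brick2/testvol server2:/export/brick2/testvol
gluster volume start testvol
gluster volume profile testvol start
# ... mount the volume on a client and run the untar/du/find/tar/rm workload ...
gluster volume profile testvol info | tee profile-testvol.log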
>>>>>>>>>>
>>>>>>>>>> The EXT4 native FS seems to be faster (roughly 15-20%, but no
>>>>>>>>>> more) than the XFS one.
>>>>>>>>>>
>>>>>>>>>> My [test] storage cluster config is composed of 2 identical
>>>>>>>>>> servers (dual-CPU Intel Xeon X5355, 8GB of RAM, 2x2TB HDD
>>>>>>>>>> (no RAID) and Gb Ethernet).
>>>>>>>>>>
>>>>>>>>>> My volume settings:
>>>>>>>>>> single: 1 server, 1 brick
>>>>>>>>>> replicated: 2 servers, 1 brick each
>>>>>>>>>> distributed: 2 servers, 2 bricks each
>>>>>>>>>> dist-repl: 2 bricks on the same server and replica 2
>>>>>>>>>>
>>>>>>>>>> All seems to be OK in the gluster status command output.
>>>>>>>>>>
>>>>>>>>>> Do you have an idea why I obtain such bad results?
>>>>>>>>>> Thanks in advance.
>>>>>>>>>> Geoffrey
>>>>>>>>>> -----------------------------------------------
>>>>>>>>>> Geoffrey Letessier
>>>>>>>>>>
>>>>>>>>>> Responsable informatique & ingénieur système
>>>>>>>>>> CNRS - UPR 9080 - Laboratoire de Biochimie Théorique
>>>>>>>>>> Institut de Biologie Physico-Chimique
>>>>>>>>>> 13, rue Pierre et Marie Curie - 75005 Paris
>>>>>>>>>> Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at cnrs.fr
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Gluster-users mailing list
>>>>>>>>>> Gluster-users at gluster.org
>>>>>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users at gluster.org
>>>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: quota-verify.gz
Type: application/gzip
Size: 1929 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150609/39d6d7c6/attachment.bin>
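For context, here is a rough sketch of the kind of crawl the attached
quota-verify script performs, per Vijay's description above (dump the
quota "contri"/"size" extended attributes plus the stat block usage for
everything under a brick); the actual attached script may well differ:

#!/bin/bash
# usage: ./quota-verify-sketch.sh <brick_path>
# (hypothetical stand-in for the real quota-verify script attached above)
BRICK="$1"
find "$BRICK" -print0 | while IFS= read -r -d '' entry; do
    # quota "size" and per-subvolume "contri" xattrs, hex-encoded
    getfattr -h -d -m 'trusted.glusterfs.quota' -e hex "$entry" 2>/dev/null
    # allocated blocks and block size, as quota accounting sees them
    stat -c 'blocks=%b block_size=%B %n' "$entry"
done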
Hello Vijay,

Quota-verify has been running for quite a few hours now (more than 10)
and each output file (4 files, since there are 4 bricks per replica) is
very large: around 800MB per file on the first server and 5GB per file
on the second one.
Do you still want these? How can I send them to you?

Nice night (in France),
Geoffrey
------------------------------------------------------
Geoffrey Letessier
Responsable informatique & ingénieur système
UPR 9080 - CNRS - Laboratoire de Biochimie Théorique
Institut de Biologie Physico-Chimique
13, rue Pierre et Marie Curie - 75005 Paris
Tel: 01 58 41 50 93 - eMail: geoffrey.letessier at ibpc.fr

Le 9 juin 2015 à 12:46, Vijaikumar M <vmallika at redhat.com> a écrit :

> [...]
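One last aside on the zero-byte '---------T' entries shown earlier in
the thread: on a GlusterFS brick, that mode (sticky bit only, size 0)
normally marks a DHT link-to file, a pointer to the brick actually
holding the data, which would explain seeing the « same » path on two
different bricks. Its target subvolume can be inspected with something
like the following (path taken from the thread):

# read the DHT link-to xattr on the suspected pointer file
getfattr -n trusted.glusterfs.dht.linkto -e text \
    /export/brick_home/brick1/amyloid_team/tarus/project/ab1-40-x1_sen304-x2_inh3-x2/remd_charmm22star_scripts/remd_115.sh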