gregoire.pichon at bull.net
2012-Jan-30  10:09 UTC
[Lustre-discuss] low performance maybe related to quota
Hi,
If someone could have a look, this would be very helpful. I have no idea 
what to look at.
I am running a performance test (ES4) on a Lustre file system installed 
with Lustre 2.1 plus a few Bull patches, and I observe very low throughput 
compared to what I usually measure on the same hardware.
Write bandwidth varies between 150 MB/s and 500 MB/s when running as a 
standard user. With the exact same parameters and configuration, but 
running as root, I get around 2000 MB/s write bandwidth. This second 
value is what I usually observe.
Profiling the Lustre client shows that more than 50% of the time is 
spent in the osc_quota_chkdq() routine. So this seems related to the quota 
subsystem, which would also explain why the root user is not affected by 
the problem.
Quotas are disabled on the client:
# lfs quota /b9
user quotas are not enabled.
group quotas are not enabled
There is no quota parameter stored on the MDT, nor on the 15 OSTs:
# tunefs.lustre /dev/loop1
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
   Read previous values:
Target:     b9-MDT0000
Index:      0
Lustre FS:  b9
Mount type: ldiskfs
Flags:      0x1
              (MDT )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
lov.stripecount=2 lov.stripesize=1048576 network=o2ib0
# for dev in `mount -t lustre | cut -d' ' -f1`; do
tunefs.lustre $dev |
grep "^Parameters" | sort -u; done
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=60.64.0.37@o2ib failover.node=60.64.0.39@o2ib failover.node=60.64.0.36@o2ib network=o2ib0
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=60.64.0.37@o2ib failover.node=60.64.0.39@o2ib failover.node=60.64.0.36@o2ib network=o2ib0
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.36@o2ib2 failover.node=61.64.0.37@o2ib2 failover.node=61.64.0.39@o2ib2 network=o2ib2
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.36@o2ib2 failover.node=61.64.0.37@o2ib2 failover.node=61.64.0.39@o2ib2 network=o2ib2
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.39@o2ib1 failover.node=160.64.0.36@o2ib1 failover.node=160.64.0.37@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.36@o2ib3 failover.node=161.64.0.37@o2ib3 failover.node=161.64.0.39@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.37@o2ib1 failover.node=160.64.0.39@o2ib1 failover.node=160.64.0.36@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=60.64.0.39@o2ib failover.node=60.64.0.36@o2ib failover.node=60.64.0.37@o2ib network=o2ib0
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.36@o2ib1 failover.node=160.64.0.37@o2ib1 failover.node=160.64.0.39@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.36@o2ib2 failover.node=61.64.0.37@o2ib2 failover.node=61.64.0.39@o2ib2 network=o2ib2
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.37@o2ib1 failover.node=160.64.0.39@o2ib1 failover.node=160.64.0.36@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.36@o2ib3 failover.node=161.64.0.37@o2ib3 failover.node=161.64.0.39@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.37@o2ib3 failover.node=161.64.0.39@o2ib3 failover.node=161.64.0.36@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.39@o2ib3 failover.node=161.64.0.36@o2ib3 failover.node=161.64.0.37@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.39@o2ib2 failover.node=61.64.0.36@o2ib2 failover.node=61.64.0.37@o2ib2 network=o2ib2
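For reference, the "no quota parameter" check on the targets could be scripted roughly as follows. This is only a minimal sketch: check_quota_params is a hypothetical helper name (not a Lustre tool), and it assumes that any quota-related setting would show up as the string "quota" somewhere in the tunefs.lustre output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): reads tunefs.lustre output on
# stdin and reports whether any line mentions "quota" (case-insensitive).
# Assumption: a stored quota setting would contain the string "quota".
check_quota_params() {
    if grep -qi 'quota'; then
        echo "quota parameter found"
    else
        echo "no quota parameter"
    fi
}

# Example, using the same invocation as above:
#   tunefs.lustre /dev/loop1 | check_quota_params
```

On the output shown above, such a check would be expected to report that no quota parameter is present, since none of the Parameters lines mention quota.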
Thanks in advance,
Grégoire.
--
Grégoire PICHON
Software Developer, Lustre - Extreme Computing R&D
Bull, Architect of an Open World
Phone: +33 4 76 29 70 63
http://www.bull.com