cliffw@clusterfs.com
2007-Jan-17 15:07 UTC
[Lustre-devel] [Bug 10112] high cpu usage on linux mounted fs
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10112

Sandia remains concerned - Do we have any reports of performance improvements/changes/regressions with 1.4.8 and O_DIRECT?
adilger@clusterfs.com
2007-Jan-25 02:20 UTC
[Lustre-devel] [Bug 10112] high cpu usage on linux mounted fs
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10112

What              |Removed                         |Added
----------------------------------------------------------------------------
Status Whiteboard |2006-12-01: Reopened for        |2007-01-25: Reopened for
                  |further investigation because   |further investigation because
                  |it seems that this problem is   |it seems that this problem is
                  |not completely solved in 1.4.7; |not completely solved in 1.4.7;
                  |CFS to respond to latest        |CPU usage points to data copy
                  |information from Sandia         |as largest CPU user

(In reply to comment #43)
> (In reply to comment #42)
> > (In reply to comment #31)
> > > first obvious thing is that you don't use direct-io (-I option to iozone)
> >
> > Alex, our apps (in this case) aren't currently using direct-io, so I don't
> > see the relevance of this? Please explain.
>
> exactly. I'm saying that because the apps don't use the direct-io interface,
> they have to copy everything. this is very expensive. to achieve good
> throughput and have free cpu cycles you need to use direct-io.
>
> it also seems that you have 2 CPUs, but use only one. 1 CPU is definitely
> loaded to 100%, mostly by copying.

I agree with Alex here that the oprofile data shows the CPU is mostly busy doing data copies from kernel to user-space or vice versa (40% of one CPU on reads, 65% of one CPU on writes), and that a single CPU is 100% busy while the other is nearly idle. While O_DIRECT will avoid this data copy overhead, the drawback is that O_DIRECT also enforces a synchronous RPC for EACH write() syscall, so any pipelining of IOs on the wire is lost. You could likely improve O_DIRECT network utilization by using much larger IO sizes (64MB), but that isn't a very common IO scenario for many applications.
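For anyone who wants to experiment with this, here is a minimal sketch (mine, not from the bug report) of a writer that uses O_DIRECT with a large page-aligned buffer, so that each write() syscall maps onto one big synchronous RPC; the 64MB IO size and the /mnt/lustre/testfile path are illustrative assumptions:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const size_t io_size = 64UL << 20;   /* one 64MB write per syscall */
        void *buf;
        int fd;

        /* O_DIRECT requires aligned buffers; page alignment is safe. */
        if (posix_memalign(&buf, sysconf(_SC_PAGESIZE), io_size) != 0) {
                perror("posix_memalign");
                return 1;
        }
        memset(buf, 0xab, io_size);

        /* hypothetical test file on a Lustre mount point */
        fd = open("/mnt/lustre/testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* each O_DIRECT write() is a synchronous RPC, so the large IO
         * size is what compensates for the lost pipelining */
        if (write(fd, buf, io_size) != (ssize_t)io_size) {
                perror("write");
                close(fd);
                return 1;
        }

        close(fd);
        free(buf);
        return 0;
}

Note that O_DIRECT generally also requires the IO length and file offset to be suitably aligned, which is why the buffer and size here are both page-aligned.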