cliffw@clusterfs.com
2007-Jan-17 15:07 UTC
[Lustre-devel] [Bug 10112] high cpu usage on linux mounted fs
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10112

Sandia remains concerned - Do we have any reports of performance improvements/changes/regressions with 1.4.8 and O_DIRECT?
adilger@clusterfs.com
2007-Jan-25 02:20 UTC
[Lustre-devel] [Bug 10112] high cpu usage on linux mounted fs
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10112

What              |Removed                         |Added
----------------------------------------------------------------------------
Status Whiteboard |2006-12-01: Reopened for        |2007-01-25: Reopened for
                  |further investigation because   |further investigation because
                  |it seems that this problem is   |it seems that this problem is
                  |not completely solved in 1.4.7; |not completely solved in 1.4.7;
                  |CFS to respond to latest        |CPU usage points to data copy
                  |information from Sandia         |as largest CPU user

(In reply to comment #43)
> (In reply to comment #42)
> > (In reply to comment #31)
> > > first obvious thing is that you don't use direct-io (-I option to iozone)
> >
> > Alex, our apps (in this case) aren't currently using direct-io, so I don't
> > see the relevance of this? Please explain.
>
> exactly. I'm saying that because the apps don't use the direct-io interface,
> they have to copy everything. this is very expensive. to achieve good
> throughput and have free cpu cycles you need to use direct-io.
>
> it also seems that you have 2 CPUs, but use only one. 1 CPU is definitely
> loaded to 100%, mostly by copying.

I agree with Alex here that the oprofile data shows the CPU is mostly busy doing data copies from kernel to user-space or vice versa (40% of one CPU on reads, 65% of one CPU on writes), and that a single CPU is 100% busy while the other is nearly idle. While O_DIRECT will avoid this data copy overhead, the drawback is that O_DIRECT also enforces a synchronous RPC for EACH write() syscall, so any pipelining of IOs on the wire is lost. You could likely improve O_DIRECT network utilization by using much larger IO sizes (64MB), but that isn't a very common IO scenario for many applications.
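For anyone who wants to experiment with this, here is a minimal sketch (mine, not from the bug report) of a writer that uses O_DIRECT with a large page-aligned buffer, so that each write() syscall maps onto one big synchronous RPC; the 64MB IO size and the /mnt/lustre/testfile path are illustrative assumptions:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const size_t io_size = 64UL << 20;   /* one 64MB write per syscall */
        void *buf;
        int fd;

        /* O_DIRECT requires aligned buffers; page alignment is safe. */
        if (posix_memalign(&buf, sysconf(_SC_PAGESIZE), io_size) != 0) {
                perror("posix_memalign");
                return 1;
        }
        memset(buf, 0xab, io_size);

        /* hypothetical test file on a Lustre mount point */
        fd = open("/mnt/lustre/testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* each O_DIRECT write() is a synchronous RPC, so the large IO
         * size is what compensates for the lost pipelining */
        if (write(fd, buf, io_size) != (ssize_t)io_size) {
                perror("write");
                close(fd);
                return 1;
        }

        close(fd);
        free(buf);
        return 0;
}

Note that O_DIRECT generally also requires the IO length and file offset to be suitably aligned, which is why the buffer and size here are both page-aligned.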