i.vlad at yahoo.com
2007-Apr-02 12:57 UTC
parallel I/O on shared-memory multi-CPU machines
Dear ext3-users,

I write scientific number-crunching codes which deal with large input and output files (GB, tens of GB, as much as hundreds of gigabytes on occasion). Now there are multi-core setups like the Intel Core 2 Duo becoming available at low cost. Disk I/O is one of the biggest bottlenecks. I would very much like to put the multiple processors to use to read or write in parallel (probably using OMP directives in a C program).

My question is -- can the filesystem actually read/write files truly in parallel (more processors -- faster read/write), or if I write such a code will the I/O commands from the CPUs just queue after each other, making it the same as using a single CPU? If parallel I/O is possible, can it be accomplished entirely transparently, or using special libraries, or only in special circumstances, like reading in parallel with N CPUs from N different physical disks? Or only on some types of hardware? Is there a maximum number of threads/processes that can write to disk in parallel? If ext3 does not do this, which (stable) Linux filesystem does?

I know that the vast majority of clusters use something other than ext3 (NFS, Lustre, etc.), but the question still stands because:
(1) individual nodes in commodity clusters very often do have individual ext3 disks that are used for temporary files (intermediate computational results);
(2) grid computers made of standalone user machines are likely to have the most common filesystem, ext3;
(3) there are scientific data processing steps that need to be done on a single shared-memory machine because of intensive data exchange between CPUs;
(4) software development is easier to do on a single machine (i.e. a powerful multi-core laptop).

Thank you,
I. Vlad
On Apr 02, 2007 05:57 -0700, i.vlad at yahoo.com wrote:
> My question is -- can the filesystem actually read/write to files truly
> in parallel (more processors -- faster read/write), or if I write such
> a code the I/O commands from the CPUs will just queue after each other
> and it would be the same thing as using a single CPU? If parallel I/O
> is possible, can it be accomplished entirely transparently, or using
> special libraries, or only in special circumstances, like reading in
> parallel with N CPUs from N different physical disks? Or only on some
> types of hardware? Is there a max nr of threads/processes that can
> write to disk in parallel? If ext3 does not do this, which (stable)
> Linux filesystem does it?

You can do this with Lustre (www.lustre.org), which is a GPL parallel, distributed filesystem that runs on top of ext3 (essentially ext4, though). Since Lustre does its own distributed extent locking on a file, it is possible to do fully parallel IO to a single file from multiple nodes.

> I know that the vast majority of clusters use something else than ext3
> (NFS, Lustre, etc), but the question still stands because: (1) individual
> nodes in commodity clusters do have very often individual ext3 disks that
> are used for temporary files (intermediate computational results); (2)
> grid computers made of standalone user machines are likely to have the
> most common filesystem, ext3; (3) There are scientific data processing
> steps that need to be done on a single shared-memory machine because of
> intensive data exchange between CPUs. (4) Software development is easier
> to do on a single machine (i.e. powerful multi-core laptop).

We are looking to make even single-node file locking better than just i_mutex (i.e. fine-grained extent locking in the VFS), but this is not a high-priority task for us, since we target large systems mostly. If anyone is interested in doing such work we would definitely want to help out with it.
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
> I write scientific number-crunching codes which deal with large
> input and output files (Gb, tens of Gb, as much as hundreds of
> Gygabytes on occasions). Now there are these multi-core setups
> like Intel Core 2 Duo becoming available at a low cost. Disk I/O
> is one of the biggest bottlenecks. I would very much like to put
> to use the multiple processors to read or write in parallel
> (probably using OMP directives in a C program).

How do you think more processors are going to help? Disk I/O doesn't take much processor. It's hard to imagine a realistic disk I/O application that is somehow limited by available CPU.

> My question is -- can the filesystem actually read/write to files
> truly in parallel

Yes.

> (more processors -- faster read/write),

No. Parallel I/O does not require more processors.

> or if I write such a code the I/O commands from the CPUs will just
> queue after each other and it would be the same thing as using a
> single CPU?

Why do you think the I/O commands just queue after each other on a single CPU? That is a very mistaken view. Whatever further progress can be made on each operation, a single CPU will make that progress. But that doesn't mean it works one operation from start to finish before it gets to the next one.

> If parallel I/O is possible, can it be accomplished entirely
> transparently, or using special libraries, or only in special
> circumstances, like reading in parallel with N CPUs from N
> different physical disks? Or only on some types of hardware? Is
> there a max nr of threads/processes that can write to disk in
> parallel? If ext3 does not do this, which (stable) Linux
> filesystem does it?

You're asking the wrong questions because you are asking about CPUs. One CPU can issue multiple I/O requests without waiting for each one to finish before issuing the next one. It is a legitimate question how you perform I/O asynchronously. There are a few ways, such as threads and POSIX asynchronous I/O.

DS