We have an SGE array task that we wish to run with elements 1-70000. Each task generates output and takes roughly 20 seconds to 4 minutes of CPU time. We're doing them on a machine with about 144 8-core nodes, and we've divvied the job up to do about 500 at a time.

So, we have 500 jobs at a time writing to the same ZFS partition.

What is the best way to collect the results of the task? Currently we are having each task write to STDOUT and then are combining the results. This nails our ZFS partition to the wall and kills performance for other users of the system. We tried setting up a MySQL server to receive the results, but it couldn't take 1000 simultaneous inbound connections.

Jeff
Have each node record results locally, and then merge pair-wise until a single node is left with the final results? If you can do merges that way while reducing the size of the result set, then that's probably going to be the most scalable way to generate overall results.

On Thu, Jul 16, 2009 at 10:51 AM, Jeff Haferman <jeff at haferman.com> wrote:
> We have an SGE array task that we wish to run with elements 1-70000.
> Each task generates output and takes roughly 20 seconds to 4 minutes
> of CPU time. We're doing them on a machine with about 144 8-core nodes,
> and we've divvied the job up to do about 500 at a time.
>
> So, we have 500 jobs at a time writing to the same ZFS partition.
>
> What is the best way to collect the results of the task? Currently we
> are having each task write to STDOUT and then are combining the
> results. This nails our ZFS partition to the wall and kills
> performance for other users of the system. We tried setting up a
> MySQL server to receive the results, but it couldn't take 1000
> simultaneous inbound connections.
>
> Jeff
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
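As a rough illustration of the pair-wise merge suggested in this reply (not code from the thread), here is a minimal Python sketch. It assumes a hypothetical record format of one "<task_id><TAB><result>" line per task, and that each node's local file is already sorted by task id. The names task_id, merge_pair, pairwise_merge and demo are made up for the example. It runs every merge sequentially in one process; on a real cluster each merge within a round could run on a different node, so only the final file would need to land on the shared ZFS partition.

```python
import heapq
import os
import tempfile


def task_id(line):
    # Hypothetical record format (an assumption): "<task_id>\t<result>\n"
    return int(line.split("\t", 1)[0])


def merge_pair(path_a, path_b, out_path):
    """Merge two files already sorted by task id into one sorted file."""
    with open(path_a) as a, open(path_b) as b, open(out_path, "w") as out:
        # heapq.merge streams both inputs, so neither file is loaded whole.
        out.writelines(heapq.merge(a, b, key=task_id))


def pairwise_merge(paths, workdir):
    """Merge files two at a time, round by round, until one file remains."""
    if not paths:
        raise ValueError("no result files to merge")
    round_no = 0
    while len(paths) > 1:
        next_paths = []
        for i in range(0, len(paths) - 1, 2):
            out = os.path.join(workdir, f"merge_r{round_no}_{i // 2}.txt")
            merge_pair(paths[i], paths[i + 1], out)
            next_paths.append(out)
        if len(paths) % 2:
            # An unpaired file is carried unchanged into the next round.
            next_paths.append(paths[-1])
        paths = next_paths
        round_no += 1
    return paths[0]


def demo():
    """Simulate 5 nodes holding interleaved task results, then merge them."""
    with tempfile.TemporaryDirectory() as workdir:
        paths = []
        for node in range(5):
            path = os.path.join(workdir, f"node{node}.txt")
            with open(path, "w") as f:
                for t in range(node + 1, 21, 5):
                    f.write(f"{t}\tresult-{t}\n")
            paths.append(path)
        final = pairwise_merge(paths, workdir)
        with open(final) as f:
            return [task_id(line) for line in f]


if __name__ == "__main__":
    print(demo())
```

With N per-node files this takes about log2(N) rounds. If the merge step can also reduce the data (for example by summing or deduplicating), the files shrink as they move up the tree, which is the case the reply describes as most scalable.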
On Thu, 2009-07-16 at 10:51 -0700, Jeff Haferman wrote:
> We have an SGE array task that we wish to run with elements 1-70000.
> Each task generates output and takes roughly 20 seconds to 4 minutes
> of CPU time. We're doing them on a machine with about 144 8-core nodes,
> and we've divvied the job up to do about 500 at a time.
>
> So, we have 500 jobs at a time writing to the same ZFS partition.

Sorry, no answers, just some questions that first came to mind.

Where is your bottleneck? Is it drive I/O or network? Are all nodes accessing/writing via NFS? Is this an NFS sync issue? Might an SSD ZIL help?

--
Louis-Frédéric Feuillette <jebnor at gmail.com>