There is a particular kind of application, single-client and serial process, for which a striped file system using RAM disk would be very useful. Consider reading small blocks at random locations on a hard disk. The latency of the HDD could be large, a few milliseconds. Adding more HDDs does not solve the problem, unlike an application based on streaming, because a serial process waits for each read to finish before issuing the next. Adding more disks and parallelizing the program could be a solution, but sometimes there is no time to parallelize the program.

A possible solution is RAM disk. But if we put, for example, 64 GB of RAM on a single computer, then that computer becomes specialized and expensive, whereas the need for a huge amount of RAM may be only temporary. An alternative is to use a cluster of nodes, a typical Beowulf cluster: for example, a striped file system over 16 nodes where each node has 4 GB of RAM. Each node would have a normal amount of RAM and yet the cluster could provide an aggregate storage of 64 GB when the need arises. While we have not yet created this configuration, I suppose that Gbit Ethernet could provide 100 microsecond latency and Infiniband or Myrinet could provide 10 microsecond latency. That is much, much less than the seek time of a HDD.

The idea is so simple that I imagine it has already been done. I would be interested in learning from other sites that have used this method with the Lustre file system.

best regards,
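To make the latency argument above concrete, here is a minimal back-of-envelope sketch in Python. The 5 ms HDD figure is an assumed value for "a few milliseconds", the network latencies are the supposed values from the post (not measurements), and the workload of one million reads is hypothetical. Transfer time for the small block itself is ignored.

```python
# Rough estimate for a serial process that issues one small random read
# at a time, so total time is simply (number of reads) x (latency per read).
# All latencies below are assumptions taken from or implied by the post.

LATENCY_SECONDS = {
    "HDD seek (assumed 5 ms)": 5e-3,
    "RAM disk over Gbit Ethernet (supposed 100 us)": 100e-6,
    "RAM disk over Infiniband/Myrinet (supposed 10 us)": 10e-6,
}


def serial_read_time(num_reads, latency_s):
    """Wall time when each read must complete before the next one starts."""
    return num_reads * latency_s


def aggregate_ram_gb(nodes, gb_per_node):
    """Total RAM-disk capacity if every node contributes gb_per_node."""
    return nodes * gb_per_node


if __name__ == "__main__":
    num_reads = 1_000_000  # hypothetical workload size
    for name, latency in LATENCY_SECONDS.items():
        print(f"{name}: {serial_read_time(num_reads, latency):.0f} s")
    print(f"Aggregate capacity: {aggregate_ram_gb(16, 4)} GB")
```

Under these assumptions the same million reads take on the order of an hour and a half on a HDD, versus minutes or seconds over the network. In practice some of each node's RAM is needed by the operating system, so the usable aggregate would be somewhat below 64 GB.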
Sounds like the SiCortex Fabricache:

http://sicortex.com/5832_newsletter/the_sicortex_fabricache/the_sicortex_fabricache_measure_its_abilities_in_genomes_sec

--bob

On Nov 23, 2007, at 2:45 PM, Antonio Concas wrote:
> There is a particular kind of application, single-client
> and serial process, for which a striped file system using
> RAM disk would be very useful. [...]
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
There are a number of solid-state disk solutions on the market. I recall seeing the RAM SAN as early as 2001, and a quick visit to their site shows they are still going strong and have IB-based and FC-based solutions, although this kind of solution tends to be somewhat pricey.

Another possible solution is something like Fusion IO -- essentially, a PCI device loaded with memory chips that can be used as high-performance memory.

Klaus

________________________________
From: lustre-discuss-bounces at clusterfs.com on behalf of Antonio Concas
Sent: Fri 11/23/2007 12:45 PM
To: Lustre-discuss
Cc: Alan Scheinine
Subject: [Lustre-discuss] ram disk

There is a particular kind of application, single-client and serial process, for which a striped file system using RAM disk would be very useful. [...]