Hi all,

I am just trying to consider my options for storing a large mass of data (tens of terabytes of files) and one idea is to build a clustered FS of some kind. Has anybody had any experience with that? Any recommendations?

Thanks in advance for any and all advice.

Boris.
Boris wrote:
> I am just trying to consider my options for storing a large mass of
> data (tens of terabytes of files) and one idea is to build a
> clustered FS of some kind. Has anybody had any experience with that?
> Any recommendations?

We've been looking at GlusterFS here. It's under active development and has some problems, but it does work, and it is in use in a number of places around the world.

mark
Boris Epstein wrote, On 06/16/2010 03:33 PM:
> Hi all,
>
> I am just trying to consider my options for storing a large mass of
> data (tens of terabytes of files) and one idea is to build a
> clustered FS of some kind. Has anybody had any experience with that?
> Any recommendations?
>
> Thanks in advance for any and all advice.
>
> Boris.

I have not used a cluster FS, but I have seen some discussions of them over on the drbd list[1]. You did not mention what kind of backing devices you were going to have for the filesystem. The DRBD documentation[2] has some discussion of GFS and OCFS2, which may be of some help.

In short, if you are considering DRBD as a backing device, definitely ask over on their mailing list; I suspect that list has a higher percentage of folks who use cluster FSs.

[1] http://lists.linbit.com/mailman/listinfo/drbd-user
[2] http://www.drbd.org/docs/applications/
    http://www.drbd.org/users-guide-emb/ch-gfs.html#s-gfs-primer
    http://www.drbd.org/users-guide-emb/ch-ocfs2.html#s-ocfs2-primer

-- 
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
>I am just trying to consider my options for storing a large mass of
>data (tens of terabytes of files) and one idea is to build a
>clustered FS of some kind. Has anybody had any experience with that?
>Any recommendations?

You haven't actually stated whether you want the backing devices distributed, or whether you need the file system mounted on more than one host at once.

You likely don't need a cluster-aware FS: if you only need to access the data from more than one place, any of several file-sharing methods will work.

I suspect that, as your storage need is large, you need to distribute it across more than one block device, probably on several servers. DRBD is of no use for that.

Clarify what you're after...

jlc
On Thu, Jun 17, 2010 at 1:03 AM, Boris Epstein <borepstein at gmail.com> wrote:
> I am just trying to consider my options for storing a large mass of
> data (tens of terabytes of files) and one idea is to build a
> clustered FS of some kind. Has anybody had any experience with that?
> Any recommendations?

You need a shared SAN back end to run traditional cluster file systems. If your environment is all Linux, then Lustre (lustre.org) works well. If you need support for other operating systems, commercial products like Quantum StorNext and IBRIX (now acquired by HP) are good alternatives.

- Raja
Boris Epstein sent a missive on 2010-06-16:
> Hi all,
>
> I am just trying to consider my options for storing a large mass of
> data (tens of terabytes of files) and one idea is to build a
> clustered FS of some kind. Has anybody had any experience with that?
> Any recommendations?
>
> Thanks in advance for any and all advice.

Take a look at Hadoop (http://hadoop.apache.org) and specifically HDFS, the Hadoop Distributed File System (http://hadoop.apache.org/hdfs/).

I've used it in conjunction with Nutch across 20-odd servers (circa 10TB). When I used it, the downside was a single metadata node, but this may have changed by now. The data is stored redundantly across the nodes and doesn't seem to require any special hardware (I ran it on Dell 1425s).

HTH
Simon.
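Since the data is stored redundantly across the nodes, the raw disk you need is a multiple of the data you actually want to keep. A rough sizing sketch in Python follows, assuming full-copy replication. The function names and the 20% free-space headroom are illustrative assumptions, not anything taken from Hadoop or the other file systems mentioned in this thread.

```python
def raw_capacity_tb(usable_tb: float, replicas: int, headroom: float = 0.2) -> float:
    """Estimate raw disk (TB) needed to store usable_tb of data.

    Assumes every file is kept as `replicas` full copies and that a
    `headroom` fraction of the raw space is left free. Both the model
    and the 0.2 default are assumptions for illustration only.
    """
    if usable_tb < 0:
        raise ValueError("usable_tb must be non-negative")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    return usable_tb * replicas / (1.0 - headroom)


def per_node_tb(usable_tb: float, replicas: int, nodes: int,
                headroom: float = 0.2) -> float:
    """Spread the raw capacity evenly over `nodes` storage servers."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return raw_capacity_tb(usable_tb, replicas, headroom) / nodes


if __name__ == "__main__":
    # Hypothetical numbers: 10 TB of data, 3 copies, 20 servers.
    print(f"raw total: {raw_capacity_tb(10, 3):.1f} TB")
    print(f"per node:  {per_node_tb(10, 3, 20):.2f} TB")
```

For "tens of terabytes" this kind of estimate shows quickly how the replica count drives the number and size of disks per server, whichever file system ends up being used.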
On Wed, 16 Jun 2010 15:33:02 -0400 Boris Epstein <borepstein at gmail.com> wrote:
> Hi all,
>
> I am just trying to consider my options for storing a large mass of
> data (tens of terabytes of files) and one idea is to build a
> clustered FS of some kind. Has anybody had any experience with that?
> Any recommendations?
>
> Thanks in advance for any and all advice.
>
> Boris.

Hi,

You can take a look at http://www.moosefs.org. It is a fault-tolerant network FS that is POSIX compliant, allows snapshots, and uses FUSE, so your code doesn't need to be changed to access the FS. You can easily choose the number of replicas you want for files and directories. It is easy to deploy and runs in user space. Some people run it successfully on 500+TB.

Plus, I've made a CentOS repo here: http://centos.kodros.fr/moosefs.repo

Regards,
Laurent