Craig Thomson
2006-May-19 07:36 UTC
[Lustre-discuss] Lustre as a distributed replicated file store
I am looking for a distributed filesystem to manage data across four sites with varying connection speeds (up to 2MBit fibre and as low as 2Meg DSL for the smaller site). The setup I would like is as follows: 4 identical Linux machines with equal-sized hard disks in mirrored RAID. All machines would be data stores and clients (if I understand a client correctly to be a kind of access point for the data). There would obviously be a metadata server running on one of these, and perhaps a failover. These machines would be networked by VPN.

I would like data replicated to all sites and for data to propagate through the system. How I am thinking this will work is as follows: A user at any of the sites accesses the filesystem. It locks any files on the metadata server and performs the work to be done, reading and writing from the closest storage target (on the same machine as its client). When changes are applied, the metadata server keeps track that the latest version of the file is on that particular storage target, and when it is requested it is replicated to whoever requested it. Full replication of any outstanding files can then be done overnight so that network bandwidth is kept down during the day.

So basically I want to replicate on the fly only when required, then run a cron job or some other timed function to do the rest later. Hopefully you understand what I am trying to do.

Is this possible? Is this possible with Lustre?

Thanks in advance
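To make the scheme above concrete, here is a minimal in-memory sketch in Python of the bookkeeping it implies: a tracker records which site holds the latest version of each file, copies a file on demand when a stale site reads it, and a nightly pass copies whatever is still outstanding. This is only an illustration of the idea, not anything Lustre provides; the names `MetadataTracker`, `FileRecord`, `write`, `read` and `nightly_sync` are all hypothetical, the "copy" is just a log entry, and file locking is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FileRecord:
    """Hypothetical per-file bookkeeping held by the metadata server."""
    version: int = 0
    latest_site: Optional[str] = None
    # Version of the file that each site currently holds.
    site_versions: Dict[str, int] = field(default_factory=dict)


class MetadataTracker:
    """Toy model of lazy replication; no real I/O or locking is done."""

    def __init__(self, sites: List[str]) -> None:
        self.sites = list(sites)
        self.files: Dict[str, FileRecord] = {}
        # Log of simulated transfers as (path, source site, destination site).
        self.transfers: List[Tuple[str, str, str]] = []

    def write(self, path: str, site: str) -> None:
        """A client at `site` writes locally; that site now holds the latest version."""
        rec = self.files.setdefault(path, FileRecord())
        rec.version += 1
        rec.latest_site = site
        rec.site_versions[site] = rec.version

    def read(self, path: str, site: str) -> int:
        """Replicate on demand if `site` is stale, then return the version read."""
        rec = self.files[path]
        if rec.site_versions.get(site) != rec.version:
            self._copy(path, rec, site)
        return rec.version

    def nightly_sync(self) -> None:
        """The cron-style pass: bring every stale site up to date."""
        for path, rec in self.files.items():
            for site in self.sites:
                if rec.site_versions.get(site) != rec.version:
                    self._copy(path, rec, site)

    def _copy(self, path: str, rec: FileRecord, dst: str) -> None:
        # Stand-in for an actual transfer over the VPN.
        self.transfers.append((path, rec.latest_site, dst))
        rec.site_versions[dst] = rec.version


# Example with four made-up site names.
tracker = MetadataTracker(["a", "b", "c", "d"])
tracker.write("/report.txt", "a")   # written at site a
tracker.read("/report.txt", "b")    # on-demand copy from a to b
tracker.nightly_sync()              # remaining copies to c and d
```

In this sketch only the read from site b causes daytime traffic; the copies to c and d wait for the nightly pass. A real system would also have to handle concurrent writers at different sites, which is what the locking step in the description is for.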
Andreas Dilger
2006-May-19 07:36 UTC
[Lustre-discuss] Lustre as a distributed replicated file store
On Feb 06, 2006 09:50 -0600, Craig Thomson wrote:
> I would like data replicated to all sites and for data to propagate
> through the system. How I am thinking this will work is as follows:

Lustre currently does not do data replication itself, though this is a feature we are working toward.

> A user at any of the sites accesses the filesystem. It locks any files
> on the Meta server and performs the work to be done, reading and writing
> from the closest storage target (on the same machine as its client).
> When changes are applied the Metadata keeps track that the latest
> version of the file is on that particular storage target and when it is
> requested it is replicated to whoever requested it. Full replication of
> any outstanding files can then be done overnight so that network
> bandwidth is kept down during the day.
>
> So basically I want to replicate on the fly only when required, then run
> a cron job or some other timed function to do the rest later. Hopefully
> you understand what I am trying to do.

What you describe is essentially the "InterMezzo" filesystem, which CFS's predecessor Stelias developed.

> Is this possible with Lustre?

At some point in the future this will be possible with Lustre. Currently, it is not possible to work in this manner.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.