Attachments to the original message:

  HSM-imgs-cea-cfs-v1-web.pdf (application/pdf, 48908 bytes)
  http://mail.clusterfs.com/pipermail/lustre-devel/attachments/20060812/0274b230/HSM-imgs-cea-cfs-v1-web-0001.pdf

  hsm-cea-reqs-v1-web.pdf (application/pdf, 97927 bytes)
  http://mail.clusterfs.com/pipermail/lustre-devel/attachments/20060812/0274b230/hsm-cea-reqs-v1-web-0001.pdf
On Sat, 12 Aug 2006, Peter J Braam wrote:

> During the last few months, CEA and CFS have discussed HSM requirements
> and architectural elements for Lustre. The goal of the discussions has
> been to provide a setting which can address CEA's operational
> requirements, can adapt to different environments and can evolve to a
> more integrated Lustre HSM solution.
>
> I attach a few slides with diagrams and a short requirements discussion.
> We look forward to your comments.

It seems that you have covered both the basic need of easily hooking into
an existing HSM/tape system by using an upcall, and the more advanced
performance need of doing parallel I/O to and from tape for the large sites
out there. For example, on our site the integration with the tape subsystem
would be a simple script using the Tivoli Storage Manager Archive
capability.

Just to get the discussion going, I'll describe the "standard HSM problems"
as we have experienced them.

Whole-directory recalls:

Even if you provide a tool to do this efficiently, the user won't use it if
a simple "cp * /dest/" gets the job done. Ideally you would detect file
recalls arriving in opendir or sorted order, start prefetching, and keep
prefetching as long as requests keep coming in that order.

Admittedly, this is only a real problem if you allow small files to be
migrated; files large enough that the mount-and-seek delay is hidden in the
actual restore time are fine. However, given increasing tape transfer rates
but only moderately reduced seek times, you quickly end up with a policy
that doesn't migrate as much to tape as you would want. It would be nice if
the design leaves room to reduce or solve this problem; maybe it already
does.

Out-of-order requests:

Another problem that usually shows up is requesting items out of order,
i.e. not in the optimal recall-from-tape order. There are multiple ways to
solve this; when you have an intelligent system like TSM, where you have no
clue about tape affinity, the best solution is usually to throw all
requests at the tape management system and let it resolve them in the most
efficient order.

Again, whole-directory recalls are a problem here because of their
one-request-at-a-time nature: there is simply no chance to do any batching.
In addition to the "detect whole-directory recall" idea above, you can
simply recall the entire directory if it is small.

Migration of complete directories:

Users usually group their data sets in separate directories, and it would
be nice if migration took that into account: for example, keep a common LRU
for the entire directory and migrate the whole directory when migration day
comes. That way all related data ends up together, and a whole-directory
recall suddenly doesn't mean that much tape swapping. Migration schemes
using the "migrate the largest file first" philosophy, on the other hand,
usually manage to spread the files of a directory across an impressive
number of tapes, making a whole-directory recall an absolute nightmare.

These are some of my experiences with an HSM system that users are allowed
to place files on. Some might argue that they use it the wrong way, but at
the same time HSM systems have done very little (if anything?) to adapt to
the usage patterns of the average user. It would be very nice if the Lustre
HSM support excelled in this area.
/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
 Some people think "asphalt" is a rectal disorder.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
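
To make the "simple script" integration concrete, here is a minimal sketch
of such an upcall handler. It assumes the standard TSM client commands
"dsmc archive" and "dsmc retrieve"; the (action, path) calling convention
is invented for illustration and is not the actual Lustre upcall interface.

    #!/usr/bin/env python
    # Hypothetical HSM upcall handler: hands a file to TSM on "archive"
    # and brings it back on "restore".  Only a sketch of the idea.
    import subprocess
    import sys

    def archive(path):
        # "dsmc archive" stores a copy of the file in the TSM archive pool.
        subprocess.check_call(["dsmc", "archive", path])

    def restore(path):
        # "dsmc retrieve" fetches the archived copy back from TSM.
        subprocess.check_call(["dsmc", "retrieve", path])

    if __name__ == "__main__":
        action, path = sys.argv[1], sys.argv[2]
        if action == "archive":
            archive(path)
        elif action == "restore":
            restore(path)
        else:
            sys.exit("unknown action: %s" % action)

A real handler would also have to release the on-disk copy after a
successful archive and update whatever state Lustre keeps for the file, but
the point is that the tape-system side of the integration can stay this
thin.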
Hi Nikke,

> -----Original Message-----
> From: Niklas Edmundsson [mailto:Niklas.Edmundsson@hpc2n.umu.se]
> Sent: Monday, August 14, 2006 2:49 AM
> To: Peter J. Braam
> Cc: lustre-devel@clusterfs.com
> Subject: Re: [Lustre-devel] HSM plans
>
> Whole-directory recalls:
> [...]
> It would be nice if the design leaves room to reduce or solve this
> problem; maybe it already does.

This was not yet added as a requirement, but it is a good idea to have this
option.

> Out-of-order requests:
> [...]
> Again, whole-directory recalls are a problem here because of their
> one-request-at-a-time nature: there is simply no chance to do any
> batching. In addition to the "detect whole-directory recall" idea above,
> you can simply recall the entire directory if it is small.

Yes ... Again, nothing like this was mentioned, but I like this.

What would you suggest we do about subdirectories?

> Migration of complete directories:
> [...]
> Migration schemes using the "migrate the largest file first" philosophy,
> on the other hand, usually manage to spread the files of a directory
> across an impressive number of tapes, making a whole-directory recall an
> absolute nightmare.

Nothing like this was mentioned - but it is certainly a good idea again.
(And fortunately, it doesn't look that hard to me.)

> These are some of my experiences with an HSM system that users are
> allowed to place files on. [...] It would be very nice if the Lustre HSM
> support excelled in this area.

Thanks a lot Nikke!
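
As a rough illustration of the detection heuristic being discussed here
(recall a small directory outright, otherwise watch for "cp * /dest/"-style
in-order recalls and prefetch ahead): the names, thresholds and the
recall() callback below are invented for the sketch, and subdirectories are
simply skipped, as Nikke suggests later in the thread.

    import os
    from collections import defaultdict

    PREFETCH_AFTER = 3        # consecutive in-order recalls before prefetching
    PREFETCH_WINDOW = 16      # how far ahead to prefetch
    SMALL_DIR_LIMIT = 64      # recall a directory outright if it has few entries

    recent = defaultdict(list)  # directory -> names recalled so far, in order

    def on_recall(path, recall):
        """Handle one file recall; recall() triggers the actual restore."""
        dirname, name = os.path.split(path)
        entries = sorted(e for e in os.listdir(dirname)
                         if os.path.isfile(os.path.join(dirname, e)))

        # Small directory: just recall the whole thing.
        if len(entries) <= SMALL_DIR_LIMIT:
            for e in entries:
                recall(os.path.join(dirname, e))
            return

        recall(path)

        # Large directory: look for successive recalls arriving in sorted
        # order and, once the pattern is clear, prefetch a window ahead.
        seen = recent[dirname]
        seen.append(name)
        tail = seen[-PREFETCH_AFTER:]
        if len(seen) >= PREFETCH_AFTER and tail == sorted(tail) and name in entries:
            start = entries.index(name) + 1
            for e in entries[start:start + PREFETCH_WINDOW]:
                recall(os.path.join(dirname, e))

A real implementation would also deduplicate requests and hand the whole
batch to the tape system in one go, which is exactly the "throw everything
at TSM and let it sort out the mount order" point above.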
Peter,

I just wanted to echo a few of Nikke's points. I think the design should
allow multiple on-disk files to be packed into a single tape file.
Furthermore, these groups of files should be recovered together (much the
way a cache line works in a cache-based computer architecture). As Nikke
pointed out, one of the critical issues with modern tape archive systems is
doing some type of small-file aggregation. HPSS is looking at how some of
this could be implemented in their 7.1 release, but I think it would be
worthwhile for the Lustre HSM project to consider implementing something
similar directly. If you would like a more detailed explanation, let me
know.

I was also curious how important the phase 2 deliverables are. They look
like a lot of work, and I'm not sure I see the value for many of our
applications. I could see them being useful if you stored HUGE files (10s
of TB) and only needed to read relatively small chunks, but I'm not aware
of a strong need for this from our users.

--Shane
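
One way to picture the small-file aggregation Shane describes (a sketch
only; neither HPSS 7.1 nor the proposed Lustre design is being described
here) is to pack a migration group into a single container object and keep
an index so that recalling any member brings the whole group back,
cache-line style. The container format and index file below are invented
for illustration.

    import json
    import os
    import tarfile

    INDEX = "aggregate-index.json"   # member path -> container name (illustrative)

    def pack(files, container):
        """Pack a group of small files into one container destined for tape."""
        with tarfile.open(container, "w") as tar:
            for f in files:
                tar.add(f)
        index = json.load(open(INDEX)) if os.path.exists(INDEX) else {}
        for f in files:
            index[f] = container
        with open(INDEX, "w") as out:
            json.dump(index, out)

    def recall_group(path, destdir):
        """Recalling any one member restores the whole container."""
        index = json.load(open(INDEX))
        with tarfile.open(index[path]) as tar:
            tar.extractall(destdir)

The interesting policy question is how the groups are chosen in the first
place, which is where Nikke's directory-based migration idea comes in.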
We at Sandia National Labs are very interested in this discussion. I think
I can start with two basic questions.

I have a long familiarity with the old Cray/SGI Data Migration Facility and
with the recent HPSS project effort, which I find essentially equivalent. I
would describe them both as sharing the same basic paradigm: a user-space
utility package that leverages the XDSM data management API, which is a
dependency for their respective software and must be supported by the
source filesystem. It appears to me, from the information I see, that
Clusterfs is proposing a different basic model.

Also, on a more detailed level, I note that a small bubble at the bottom of
the fifth page of the HSM Illustrations PDF references the HPSS mover
protocol. In general this seems accurate to me. However, specifically for
our cluster set-up, we are interested in a newer HPSS protocol option
called Local File Movement. My guess is that, however the actual
data-moving protocol is specified or configured, it would not be a big
problem to substitute this one. Do you see any issues that would make this
a significant challenge?

Marty Barnaby
Hi Shane,

I completely agree, and I immediately made note of Nikke's comments. We'll
work it into the policy handler.

- Peter -

> -----Original Message-----
> From: Canon, Richard Shane [mailto:canonrs@ornl.gov]
> Sent: Monday, August 14, 2006 10:52 AM
> To: Peter J. Braam; Niklas Edmundsson
> Cc: lustre-devel@clusterfs.com
> Subject: RE: [Lustre-devel] HSM plans
>
> I just wanted to echo a few of Nikke's points. I think the design should
> allow multiple on-disk files to be packed into a single tape file.
> Furthermore, these groups of files should be recovered together (much
> the way a cache line works in a cache-based computer architecture). [...]
On Mon, 14 Aug 2006, Peter J. Braam wrote:

> > Again, whole-directory recalls are a problem here because of their
> > one-request-at-a-time nature: there is simply no chance to do any
> > batching. In addition to the "detect whole-directory recall" idea
> > above, you can simply recall the entire directory if it is small.
>
> Yes ... Again, nothing like this was mentioned, but I like this.
>
> What would you suggest we do about subdirectories?

Without thinking too much about it, I'd say ignore them. Otherwise it seems
far too easy to end up with "I accessed a file in the root directory and
now it's recalling my entire filesystem" ;) Also, I've seen a lot of users
who build directory trees where each level holds totally unrelated data.

However, the recall policy should probably be rather tightly coupled to the
migration policy. If, for example, you detected that an entire tree of
smallish files was ready for migration and therefore migrated it as a unit,
it makes sense to recall the entire thing... I have the feeling that if you
let migration make the tough decisions on how to group data (remember that
owner/group memberships can help here), you can get away with letting
recall follow those decisions.

I'm sure there are people lurking on this list with experience of far
larger HSM systems than mine who can share their insights on whether this
makes sense :-)

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
 Three can keep a secret, if two of them are dead.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
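
A small sketch of the "let recall follow the migration decisions" idea: the
migration policy records which group each file went out with (per directory
subtree, possibly refined by owner/group), and recall simply restores
whatever was grouped together. The group bookkeeping shown here is invented
for illustration.

    migration_group = {}      # path -> group id, stamped at migration time
    group_members = {}        # group id -> member paths
    GROUP_RECALL_LIMIT = 256  # don't fan out recalls for very large groups

    def migrate_group(group_id, paths, migrate):
        """Migrate a set of related files (e.g. one subtree) as a unit."""
        for p in paths:
            migration_group[p] = group_id
            group_members.setdefault(group_id, []).append(p)
            migrate(p)

    def recall(path, restore):
        """Recall a file, pulling its whole migration group back if modest."""
        members = group_members.get(migration_group.get(path), [path])
        if len(members) <= GROUP_RECALL_LIMIT:
            for p in members:   # same tape neighbourhood, so restore it all
                restore(p)
        else:
            restore(path)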
Barnaby, Marty L wrote:

> Also, on a more detailed level, I note that a small bubble at the bottom
> of the fifth page of the HSM Illustrations PDF references the HPSS mover
> protocol. In general this seems accurate to me. However, specifically
> for our cluster set-up, we are interested in a newer HPSS protocol
> option called Local File Movement. My guess is that, however the actual
> data-moving protocol is specified or configured, it would not be a big
> problem to substitute this one. Do you see any issues that would make
> this a significant challenge?
>
> Marty Barnaby

The data-moving protocol is implemented in the copy tool, which is a
"third-party" tool (an HPSS tool in our case). The slide shows the tool we
have made at CEA to copy files to/from HPSS (we will share it with the HPSS
community), but any tool can be used with the Lustre HSM design. The design
only specifies what the tool must do, and what it may do optionally.

JC
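
To illustrate what "the design only specifies what the tool must do" could
mean in practice, here is a hypothetical copy-tool skeleton; the
command-line convention and the backing-store layout are invented, and the
real CEA/HPSS tool and the Lustre-defined interface will look different.
The point is that the data-moving protocol (HPSS movers, Local File
Movement, or anything else) lives entirely inside the tool.

    #!/usr/bin/env python
    # Hypothetical copy tool: Lustre (or its policy engine) invokes it with
    # an action and a file path; how the bytes move is the tool's business.
    import os
    import shutil
    import sys

    HSM_ROOT = "/hsm/backing-store"   # stand-in for the real archive backend

    def copy_out(path):
        """Archive: push the file's data into the backend."""
        dest = HSM_ROOT + path
        if not os.path.isdir(os.path.dirname(dest)):
            os.makedirs(os.path.dirname(dest))
        shutil.copy2(path, dest)

    def copy_in(path):
        """Restore: bring the archived data back into the filesystem."""
        shutil.copy2(HSM_ROOT + path, path)

    if __name__ == "__main__":
        action, path = sys.argv[1], sys.argv[2]
        {"copyout": copy_out, "copyin": copy_in}[action](path)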
I would like to completely confirm what was said already: any file mover
should work with this. We have even left room for native data transport to
the HSM OBD by Lustre, instead of using a client file system that a
user-level mover could access. So I don't see particular challenges,
although I do think that the devil is in the details and the final
implementation, as usual.

- Peter -

> -----Original Message-----
> From: lustre-devel-bounces@clusterfs.com
> [mailto:lustre-devel-bounces@clusterfs.com] On Behalf Of Barnaby, Marty L
> Sent: Monday, August 14, 2006 2:05 PM
> To: lustre-devel@clusterfs.com
> Subject: RE: [Lustre-devel] HSM plans
>
> We at Sandia National Labs are very interested in this discussion. [...]
> My guess is that, however the actual data-moving protocol is specified
> or configured, it would not be a big problem to substitute this one. Do
> you see any issues that would make this a significant challenge?