Attachments to the original message:

  HSM-imgs-cea-cfs-v1-web.pdf (application/pdf, 48908 bytes)
  http://mail.clusterfs.com/pipermail/lustre-devel/attachments/20060812/0274b230/HSM-imgs-cea-cfs-v1-web-0001.pdf

  hsm-cea-reqs-v1-web.pdf (application/pdf, 97927 bytes)
  http://mail.clusterfs.com/pipermail/lustre-devel/attachments/20060812/0274b230/hsm-cea-reqs-v1-web-0001.pdf
On Sat, 12 Aug 2006, Peter J Braam wrote:

> During the last few months, CEA and CFS have discussed HSM requirements
> and architectural elements for Lustre. The goal of the discussions has
> been to provide a setting which can address CEA's operational
> requirements, can adapt to different environments and can evolve to a
> more integrated Lustre HSM solution.
>
> I attach a few slides with diagrams and a short requirements discussion.
> We look forward to your comments.

It seems that you have covered both the basic need of easily hooking into
an existing HSM/tape system by using an upcall, and the more advanced
performance need of doing parallel I/O to and from tape for the large sites
out there. For example, on our site the integration with the tape subsystem
would be a simple script using the Tivoli Storage Manager Archive
capability.

Just to get the discussion going, I'll describe the "standard HSM problems"
as we have experienced them.

Whole-directory recalls:

Even if you provide a tool to do this efficiently, the user won't use it if
a simple "cp * /dest/" gets the job done. Ideally you would detect file
recalls arriving in opendir or sorted order, start prefetching, and keep
prefetching as long as requests keep coming in that order.

Admittedly, this is only a real problem if you allow small files to be
migrated; files large enough that the mount-and-seek delay is hidden in the
actual restore time are fine. However, given increasing tape transfer rates
but only moderately reduced seek times, you quickly end up with a policy
that doesn't migrate as much to tape as you would want. It would be nice if
the design leaves room to reduce or solve this problem; maybe it already
does.

Out-of-order requests:

Another problem that usually shows up is requesting items out of order,
i.e. not in the optimal recall-from-tape order. There are multiple ways to
solve this; when you have an intelligent system like TSM, where you have no
clue about tape affinity, the best solution is usually to throw all
requests at the tape management system and let it resolve them in the most
efficient order.

Again, whole-directory recalls are a problem here because of their
one-request-at-a-time nature: there is simply no chance to do any batching.
In addition to the "detect whole-directory recall" idea above, you can
simply recall the entire directory if it is small.

Migration of complete directories:

Users usually group their data sets in separate directories, and it would
be nice if migration took that into account: for example, keep a common LRU
for the entire directory and migrate the whole directory when migration day
comes. That way all related data ends up together, and a whole-directory
recall suddenly doesn't mean that much tape swapping. Migration schemes
using the "migrate the largest file first" philosophy, on the other hand,
usually manage to spread the files of a directory across an impressive
number of tapes, making a whole-directory recall an absolute nightmare.

These are some of my experiences with an HSM system that users are allowed
to place files on. Some might argue that they use it the wrong way, but at
the same time HSM systems have done very little (if anything?) to adapt to
the usage patterns of the average user. It would be very nice if the Lustre
HSM support excelled in this area.
/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
 Some people think "asphalt" is a rectal disorder.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
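
To make the "simple script" integration concrete, here is a minimal sketch
of such an upcall handler. It assumes the standard TSM client commands
"dsmc archive" and "dsmc retrieve"; the (action, path) calling convention
is invented for illustration and is not the actual Lustre upcall interface.

    #!/usr/bin/env python
    # Hypothetical HSM upcall handler: hands a file to TSM on "archive"
    # and brings it back on "restore".  Only a sketch of the idea.
    import subprocess
    import sys

    def archive(path):
        # "dsmc archive" stores a copy of the file in the TSM archive pool.
        subprocess.check_call(["dsmc", "archive", path])

    def restore(path):
        # "dsmc retrieve" fetches the archived copy back from TSM.
        subprocess.check_call(["dsmc", "retrieve", path])

    if __name__ == "__main__":
        action, path = sys.argv[1], sys.argv[2]
        if action == "archive":
            archive(path)
        elif action == "restore":
            restore(path)
        else:
            sys.exit("unknown action: %s" % action)

A real handler would also have to release the on-disk copy after a
successful archive and update whatever state Lustre keeps for the file, but
the point is that the tape-system side of the integration can stay this
thin.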
Hi Nikke,

> -----Original Message-----
> From: Niklas Edmundsson [mailto:Niklas.Edmundsson@hpc2n.umu.se]
> Sent: Monday, August 14, 2006 2:49 AM
> To: Peter J. Braam
> Cc: lustre-devel@clusterfs.com
> Subject: Re: [Lustre-devel] HSM plans
>
> Whole-directory recalls:
> [...]
> It would be nice if the design leaves room to reduce or solve this
> problem; maybe it already does.

This was not yet added as a requirement, but it is a good idea to have this
option.

> Out-of-order requests:
> [...]
> Again, whole-directory recalls are a problem here because of their
> one-request-at-a-time nature: there is simply no chance to do any
> batching. In addition to the "detect whole-directory recall" idea above,
> you can simply recall the entire directory if it is small.

Yes ... Again, nothing like this was mentioned, but I like this.

What would you suggest we do about subdirectories?

> Migration of complete directories:
> [...]
> Migration schemes using the "migrate the largest file first" philosophy,
> on the other hand, usually manage to spread the files of a directory
> across an impressive number of tapes, making a whole-directory recall an
> absolute nightmare.

Nothing like this was mentioned - but it is certainly a good idea again.
(And fortunately, it doesn't look that hard to me.)

> These are some of my experiences with an HSM system that users are
> allowed to place files on. [...] It would be very nice if the Lustre HSM
> support excelled in this area.

Thanks a lot Nikke!
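
As a rough illustration of the detection heuristic being discussed here
(recall a small directory outright, otherwise watch for "cp * /dest/"-style
in-order recalls and prefetch ahead): the names, thresholds and the
recall() callback below are invented for the sketch, and subdirectories are
simply skipped, as Nikke suggests later in the thread.

    import os
    from collections import defaultdict

    PREFETCH_AFTER = 3        # consecutive in-order recalls before prefetching
    PREFETCH_WINDOW = 16      # how far ahead to prefetch
    SMALL_DIR_LIMIT = 64      # recall a directory outright if it has few entries

    recent = defaultdict(list)  # directory -> names recalled so far, in order

    def on_recall(path, recall):
        """Handle one file recall; recall() triggers the actual restore."""
        dirname, name = os.path.split(path)
        entries = sorted(e for e in os.listdir(dirname)
                         if os.path.isfile(os.path.join(dirname, e)))

        # Small directory: just recall the whole thing.
        if len(entries) <= SMALL_DIR_LIMIT:
            for e in entries:
                recall(os.path.join(dirname, e))
            return

        recall(path)

        # Large directory: look for successive recalls arriving in sorted
        # order and, once the pattern is clear, prefetch a window ahead.
        seen = recent[dirname]
        seen.append(name)
        tail = seen[-PREFETCH_AFTER:]
        if len(seen) >= PREFETCH_AFTER and tail == sorted(tail) and name in entries:
            start = entries.index(name) + 1
            for e in entries[start:start + PREFETCH_WINDOW]:
                recall(os.path.join(dirname, e))

A real implementation would also deduplicate requests and hand the whole
batch to the tape system in one go, which is exactly the "throw everything
at TSM and let it sort out the mount order" point above.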
Peter,

I just wanted to echo a few of Nikke's points. I think the design should
allow multiple on-disk files to be packed into a single tape file.
Furthermore, these groups of files should be recovered together (much the
way a cache line works in a cache-based computer architecture). As Nikke
pointed out, one of the critical issues with modern tape archive systems is
doing some type of small-file aggregation. HPSS is looking at how some of
this could be implemented in their 7.1 release, but I think it would be
worthwhile for the Lustre HSM project to consider implementing something
similar directly. If you would like a more detailed explanation, let me
know.

I was also curious how important the phase 2 deliverables are. They look
like a lot of work, and I'm not sure I see the value for many of our
applications. I could see them being useful if you stored HUGE files (10s
of TB) and only needed to read relatively small chunks, but I'm not aware
of a strong need for this from our users.

--Shane
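
One way to picture the small-file aggregation Shane describes (a sketch
only; neither HPSS 7.1 nor the proposed Lustre design is being described
here) is to pack a migration group into a single container object and keep
an index so that recalling any member brings the whole group back,
cache-line style. The container format and index file below are invented
for illustration.

    import json
    import os
    import tarfile

    INDEX = "aggregate-index.json"   # member path -> container name (illustrative)

    def pack(files, container):
        """Pack a group of small files into one container destined for tape."""
        with tarfile.open(container, "w") as tar:
            for f in files:
                tar.add(f)
        index = json.load(open(INDEX)) if os.path.exists(INDEX) else {}
        for f in files:
            index[f] = container
        with open(INDEX, "w") as out:
            json.dump(index, out)

    def recall_group(path, destdir):
        """Recalling any one member restores the whole container."""
        index = json.load(open(INDEX))
        with tarfile.open(index[path]) as tar:
            tar.extractall(destdir)

The interesting policy question is how the groups are chosen in the first
place, which is where Nikke's directory-based migration idea comes in.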
We at Sandia National Labs are very interested in this discussion. I think
I can start with two basic questions.

I have a long familiarity with the old Cray/SGI Data Migration Facility and
with the recent HPSS project effort, which I find essentially equivalent. I
would describe them both as sharing the same basic paradigm: a user-space
utility package that leverages the XDSM data management API, which is a
dependency for their respective software and must be supported by the
source filesystem. It appears to me, from the information I see, that
Clusterfs is proposing a different basic model.

Also, on a more detailed level, I note that a small bubble at the bottom of
the fifth page of the HSM Illustrations PDF references the HPSS mover
protocol. In general this seems accurate to me. However, specifically for
our cluster set-up, we are interested in a newer HPSS protocol option
called Local File Movement. My guess is that, however the actual
data-moving protocol is specified or configured, it would not be a big
problem to substitute this one. Do you see any issues that would make this
a significant challenge?

Marty Barnaby
Hi Shane,

I completely agree, and I immediately made note of Nikke's comments. We'll
work it into the policy handler.

- Peter -

> -----Original Message-----
> From: Canon, Richard Shane [mailto:canonrs@ornl.gov]
> Sent: Monday, August 14, 2006 10:52 AM
> To: Peter J. Braam; Niklas Edmundsson
> Cc: lustre-devel@clusterfs.com
> Subject: RE: [Lustre-devel] HSM plans
>
> I just wanted to echo a few of Nikke's points. I think the design should
> allow multiple on-disk files to be packed into a single tape file.
> Furthermore, these groups of files should be recovered together (much
> the way a cache line works in a cache-based computer architecture). [...]
On Mon, 14 Aug 2006, Peter J. Braam wrote:

> > Again, whole-directory recalls are a problem here because of their
> > one-request-at-a-time nature: there is simply no chance to do any
> > batching. In addition to the "detect whole-directory recall" idea
> > above, you can simply recall the entire directory if it is small.
>
> Yes ... Again, nothing like this was mentioned, but I like this.
>
> What would you suggest we do about subdirectories?

Without thinking too much about it, I'd say ignore them. Otherwise it seems
far too easy to end up with "I accessed a file in the root directory and
now it's recalling my entire filesystem" ;) Also, I've seen a lot of users
who build directory trees where each level holds totally unrelated data.

However, the recall policy should probably be rather tightly coupled to the
migration policy. If, for example, you detected that an entire tree of
smallish files was ready for migration and therefore migrated it as a unit,
it makes sense to recall the entire thing... I have the feeling that if you
let migration make the tough decisions on how to group data (remember that
owner/group memberships can help here), you can get away with letting
recall follow those decisions.

I'm sure there are people lurking on this list with experience of far
larger HSM systems than mine who can share their insights on whether this
makes sense :-)

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |    nikke@hpc2n.umu.se
---------------------------------------------------------------------------
 Three can keep a secret, if two of them are dead.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
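
A small sketch of the "let recall follow the migration decisions" idea: the
migration policy records which group each file went out with (per directory
subtree, possibly refined by owner/group), and recall simply restores
whatever was grouped together. The group bookkeeping shown here is invented
for illustration.

    migration_group = {}      # path -> group id, stamped at migration time
    group_members = {}        # group id -> member paths
    GROUP_RECALL_LIMIT = 256  # don't fan out recalls for very large groups

    def migrate_group(group_id, paths, migrate):
        """Migrate a set of related files (e.g. one subtree) as a unit."""
        for p in paths:
            migration_group[p] = group_id
            group_members.setdefault(group_id, []).append(p)
            migrate(p)

    def recall(path, restore):
        """Recall a file, pulling its whole migration group back if modest."""
        members = group_members.get(migration_group.get(path), [path])
        if len(members) <= GROUP_RECALL_LIMIT:
            for p in members:   # same tape neighbourhood, so restore it all
                restore(p)
        else:
            restore(path)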
Barnaby, Marty L wrote:

> Also, on a more detailed level, I note that a small bubble at the bottom
> of the fifth page of the HSM Illustrations PDF references the HPSS mover
> protocol. In general this seems accurate to me. However, specifically
> for our cluster set-up, we are interested in a newer HPSS protocol
> option called Local File Movement. My guess is that, however the actual
> data-moving protocol is specified or configured, it would not be a big
> problem to substitute this one. Do you see any issues that would make
> this a significant challenge?
>
> Marty Barnaby

The data-moving protocol is implemented in the copy tool, which is a
"third-party" tool (an HPSS tool in our case). The slide shows the tool we
have made at CEA to copy files to/from HPSS (we will share it with the HPSS
community), but any tool can be used with the Lustre HSM design. The design
only specifies what the tool must do, and what it may do optionally.

JC
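
To illustrate what "the design only specifies what the tool must do" could
mean in practice, here is a hypothetical copy-tool skeleton; the
command-line convention and the backing-store layout are invented, and the
real CEA/HPSS tool and the Lustre-defined interface will look different.
The point is that the data-moving protocol (HPSS movers, Local File
Movement, or anything else) lives entirely inside the tool.

    #!/usr/bin/env python
    # Hypothetical copy tool: Lustre (or its policy engine) invokes it with
    # an action and a file path; how the bytes move is the tool's business.
    import os
    import shutil
    import sys

    HSM_ROOT = "/hsm/backing-store"   # stand-in for the real archive backend

    def copy_out(path):
        """Archive: push the file's data into the backend."""
        dest = HSM_ROOT + path
        if not os.path.isdir(os.path.dirname(dest)):
            os.makedirs(os.path.dirname(dest))
        shutil.copy2(path, dest)

    def copy_in(path):
        """Restore: bring the archived data back into the filesystem."""
        shutil.copy2(HSM_ROOT + path, path)

    if __name__ == "__main__":
        action, path = sys.argv[1], sys.argv[2]
        {"copyout": copy_out, "copyin": copy_in}[action](path)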
I would like to completely confirm what was said already: any file mover
should work with this. We have even left room for native data transport to
the HSM OBD by Lustre, instead of using a client file system that a
user-level mover could access. So I don't see particular challenges,
although I do think that the devil is in the details and the final
implementation, as usual.

- Peter -

> -----Original Message-----
> From: lustre-devel-bounces@clusterfs.com
> [mailto:lustre-devel-bounces@clusterfs.com] On Behalf Of Barnaby, Marty L
> Sent: Monday, August 14, 2006 2:05 PM
> To: lustre-devel@clusterfs.com
> Subject: RE: [Lustre-devel] HSM plans
>
> We at Sandia National Labs are very interested in this discussion. [...]
> My guess is that, however the actual data-moving protocol is specified
> or configured, it would not be a big problem to substitute this one. Do
> you see any issues that would make this a significant challenge?