Greetings,

We (System Fabric Works) have been retained by Sun to prove the concept of
integrating the Lustre client filesystem with a local disk cache
(specifically fscache from Red Hat).

Eric Barton and I discussed this several days ago, but I would appreciate
others' feedback on the requirements and approach documented below. I'm
relatively new to Lustre, so there is a very real possibility that I "don't
know what I don't know".

Motivation

This work is primarily motivated by the need to improve the performance of
Lustre clients acting as SMB servers to Windows nodes. As I understand it,
this need is primarily for file readers.

Requirements

1. Enabling fscache should be a mount option, and there should be ioctl
   support for enabling, disabling and querying a file's fscache usage.
2. Data read into the page cache will be asynchronously copied to the
   disk-based fscache upon arrival.
3. If requested data is not present in the page cache, it will be retrieved
   preferentially from the fscache. If not present in the fscache, data
   will be read via RPC.
4. When pages are reclaimed due to memory pressure, they should remain in
   the fscache.
5. When a user writes a page (if we support fscache for non-read-only
   opens), the corresponding fscache page must either be invalidated or
   (more likely) rewritten.
6. When a DLM lock is revoked, the entire extent of the lock must be
   dropped from the fscache (in addition to dropping any page-cache
   resident pages) - regardless of whether any pages are currently resident
   in the page cache.
7. As sort of a corollary to #6, DLM locks must not be canceled by the
   owner as long as pages are resident in the fscache, even if memory
   pressure reclamation has emptied the page cache for a given file.
8. Utilities and test programs will be needed, of course.
9. The fscache must be cleared upon mount or dismount.

High Level Design Points

The following is written based primarily on review of the 1.6.5.1 code.
I'm aware that this is not the place for new development, but it was
deemed a stable place for initial experimentation.

Req. Notes

1. In current Red Hat distributions, fscache is included and NFS includes
   fscache support, enabled by a mount option. We don't see any problems
   with doing something similar. A per-file ioctl to enable/disable fscache
   usage is also seen as straightforward.

2. When an RPC read (into the page cache) completes, in the
   ll_ap_completion() function, an asynchronous read to the same offset in
   the file's fscache object will be initiated. This should not materially
   impact access time (think dirty page to fscache filesystem).

3. When the readpage method is invoked because a page is not already
   resident in the page cache, the page will be read first from the
   fscache. This is non-blocking and (presumably) fast for the non-resident
   case. If available, the fscache read will proceed asynchronously, after
   which the page will be valid in the page cache. If not available in the
   fscache, the RPC read will proceed normally. (A rough sketch of this
   flow is appended at the end of this message.)

4. Page removal due to memory pressure is triggered by a call to the
   llap_shrink_cache function. This function should not require any
   material change, since pages can be removed from the page cache without
   removal from the fscache in this case. In fact, if this doesn't happen,
   the fscache will never be read. (note: test coverage will be important
   here)
5. It may be reasonable in early code to enable fscache only for read-only
   opens. However, we don't see any inherent problems with running an
   asynchronous write to the fscache concurrently with a Lustre RPC write.
   Note that this approach would *never* have dirty pages exist only in the
   fscache; if a page is dirty it stays in the page cache until it has been
   written via RPC (or RPC and fscache, if we're writing to both places).

6 & 7. This is where it gets a little more tedious. Let me revert to
   paragraph form to address these cases below.

8. Testing will require the following:
   * ability to query and punch holes in the page cache (already done).
   * ability to query and punch holes in the fscache (nearly done).

9. I presume that all locks are canceled when a client dismounts a
   filesystem, in which case it would never be safe to use data in the
   fscache from a prior mount.

Lock Revocation

Please apply an "it looks to me like this is how things work" filter here;
I am still pretty new to Lustre (thanks). My questions are summarized after
the text of this section.

As of 1.6.5.1, DLM locks keep a list of page-cached pages
(lock->l_extents_list contains osc_async_page structs for all currently
cached pages - and I think the word "extent" is used both for each page
cached under a lock and to describe a locked region...is this right?). If a
lock is revoked, that list is torn down and the pages are freed. Pages are
also removed from that list when they are freed due to memory pressure,
making that list sparse with regard to the actual region of the lock.

Adding fscache, there will be zero or more page-cache pages in the extent
list, as well as zero or more pages in the file object in the fscache. The
primary question, then, is whether a lock will remain valid (i.e. not be
voluntarily released) if all of the page-cache pages are freed for
non-lock-related reasons (see question 3 below).

The way I foresee cleaning up the fscache is by looking at the overall
extent of the lock (at release or revocation time), and punching a
lock-extent-sized hole in the fscache object prior to looping through the
page list (possibly in cache_remove_lock() prior to calling
cache_remove_extents_from_lock()).

However, that would require finding the inode, which (AFAICS) is not
available in that context (ironically, unless l_extents_list is non-empty,
in which case the inode can be found via any of the page structs in the
list). I have put in a hack to solve this, but see question 6 below.

Summarized questions:

Q1: Where can I read up on the unit testing infrastructure for Lustre?
Q2: Is stale cache already covered by existing unit tests?
Q3: Will a DLM lock remain valid (i.e. not be canceled) even if its page
    list is empty (i.e. all pages have been freed due to memory pressure)?
Q4: Will there *always* be a call to cache_remove_lock() when a lock is
    canceled or revoked? (i.e. is this the place to punch a hole in the
    fscache object?)
Q5: For the purpose of punching a hole in a cache object upon lock
    revocation, can I rely on the lock->l_req_extent structure as the
    actual extent of the lock?
Q6: a) Is there a way to find the inode that I've missed? and
    b) if not, what is the preferred way of giving that function a way to
    find the inode?

...

FYI, we have done some experimenting and we have the read path in a
demonstrable state, including crude code to effect lock revocation on the
fscache contents. The NFS code modularized the fscache hooks pretty nicely,
and we have followed that example.
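To make requirement 1 and the NFS-style hooks a bit more concrete, the
per-inode glue might look roughly like the following. This is only a
sketch: ll_i2info() is real Lustre code, but lli_fscache_cookie,
ll_fscache_super_cookie() and ll_fscache_inode_def are invented names, and
the fscache calls assume the same API the NFS client uses:

        /* Sketch only: per-inode fscache cookie management modelled on the
         * NFS fscache glue.  lli_fscache_cookie, ll_fscache_super_cookie()
         * and ll_fscache_inode_def are invented names for this example. */
        #include <linux/fscache.h>

        void ll_fscache_enable_inode(struct inode *inode)
        {
                struct ll_inode_info *lli = ll_i2info(inode);

                if (lli->lli_fscache_cookie == NULL)
                        lli->lli_fscache_cookie =
                                fscache_acquire_cookie(
                                        ll_fscache_super_cookie(inode->i_sb),
                                        &ll_fscache_inode_def, inode);
        }

        void ll_fscache_disable_inode(struct inode *inode)
        {
                struct ll_inode_info *lli = ll_i2info(inode);

                if (lli->lli_fscache_cookie != NULL) {
                        /* retire == 0 keeps the on-disk object; pass 1 to
                         * delete it, e.g. when clearing the cache at
                         * mount/dismount time (requirement 9) */
                        fscache_relinquish_cookie(lli->lli_fscache_cookie, 0);
                        lli->lli_fscache_cookie = NULL;
                }
        }

The mount-option and ioctl paths would then simply call these two hooks.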
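And the read path from design note 3 might look roughly like this - again
just a sketch, with ll_fscache_cookie(), ll_fscache_read_done() and
ll_readpage_rpc() standing in for the real pieces:

        /* Sketch of design note 3: try the local fscache first, fall back
         * to the normal RPC read.  All ll_fscache_* names are placeholders. */
        static int ll_readpage_cached(struct file *file, struct page *page)
        {
                struct inode *inode = page->mapping->host;
                struct fscache_cookie *cookie = ll_fscache_cookie(inode);
                int rc;

                if (cookie != NULL) {
                        /* Returns 0 if a read was dispatched: the page is
                         * filled asynchronously and ll_fscache_read_done()
                         * unlocks it.  -ENODATA/-ENOBUFS mean "not cached":
                         * fall through to the RPC path. */
                        rc = fscache_read_or_alloc_page(cookie, page,
                                                        ll_fscache_read_done,
                                                        NULL, GFP_KERNEL);
                        if (rc == 0)
                                return 0;
                }

                /* Not in the local cache: read via RPC as today; the
                 * completion handler would then copy the page into fscache
                 * (design note 2). */
                return ll_readpage_rpc(file, page);
        }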
Thanks,
John Groves
John at SystemFabricWorks.com
+1-512-302-4005
John,

Unfortunately I can't help you much with the coding, but I just thought I'd
drop a quick email to let you know that we are extremely interested in this
feature. We have already used NFS export servers with large memory caches
to speed up small-file metadata workloads, and we are looking to use the
same system to provide WAN access. The idea is that our London renderfarm
is closely coupled to the storage, yet our remote (New York and Iceland)
offices can work interactively with the resulting data. The disk cache will
obviously greatly increase the cache retention.

We are more than willing to test this code when it gets to a working
state - good luck!

Daire

----- "John Groves" <jgl at johngroves.net> wrote:

> [John's original proposal quoted in full - see above]
On Nov 11, 2008 13:23 -0600, John Groves wrote:
> This work is primarily motivated by the need to improve the performance
> of Lustre clients as SMB servers to Windows nodes. As I understand it,
> this need is primarily for file readers.
>
> Requirements
>
> 1. Enabling fscache should be a mount option, and there should be ioctl
>    support for enabling, disabling and querying a file's fscache usage.

For Lustre there should also be the ability to do this via /proc/fs/lustre
tunables/stats.

> [...]
>
> High Level Design Points
>
> The following is written based primarily on review of the 1.6.5.1 code.
> I'm aware that this is not the place for new development, but it was
> deemed a stable place for initial experimentation.

Note that the client IO code was substantially re-written for the 2.0
release. The client IO code from 1.6.5 is still present through the 1.8.x
releases.

> 2. When an RPC read (into the page cache) completes, in the
>    ll_ap_completion() function, an asynchronous read to the same offset
>    in the file's fscache object will be initiated. This should not
>    materially impact access time (think dirty page to fscache filesystem).

Do you mean an "asynchronous write to the ... fscache object"?

> [...]
>
> 5. It may be reasonable in early code to enable fscache only for
>    read-only opens. However, we don't see any inherent problems with
>    running an asynchronous write to the fscache concurrently with a
>    Lustre RPC write. Note that this approach would *never* have dirty
>    pages exist only in the fscache; if it's dirty it stays in the page
>    cache until it's written via RPC (or RPC and fscache if we're writing
>    to both places).

This is dangerous from the point of view that the write to the fscache
may succeed, but the RPC may fail for a number of reasons (e.g. client
eviction), so it would seem that the write to the fscache cannot start
until the RPC completes successfully.
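In other words, the copy into fscache would have to be driven from the RPC
completion path rather than issued up front, something along these lines
(purely illustrative - ll_page_fscache_cookie() and the exact hook point
are made-up names, not existing Lustre or fscache interfaces):

        /* Illustrative only: copy a page into fscache from the write-RPC
         * completion path, and only when the RPC succeeded. */
        static void ll_fscache_write_completion(struct page *page, int cmd,
                                                int rc)
        {
                struct fscache_cookie *cookie;

                if (rc != 0 || !(cmd & OBD_BRW_WRITE))
                        return;                 /* failed, or not a write RPC */

                cookie = ll_page_fscache_cookie(page);  /* made-up helper */
                if (cookie == NULL)
                        return;

                /* fscache_write_page() copies the (still page-cache
                 * resident, now clean) page to the backing cache
                 * asynchronously */
                if (fscache_write_page(cookie, page, GFP_ATOMIC) != 0)
                        fscache_uncache_page(cookie, page);
        }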
> [...]
>
> 9. I presume that all locks are canceled when a client dismounts a
>    filesystem, in which case it would never be safe to use data in the
>    fscache from a prior mount.

A potential future improvement in the second generation of this feature
might be the ability to revalidate the files in the local disk cache by
the MDT and OST object versions, if those are also stored in fscache.

> Lock Revocation
>
> [...]
>
> Adding fscache, there will be zero or more page-cache pages in the extent
> list, as well as zero or more pages in the file object in the fscache.
> The primary question, then, is whether a lock will remain valid (i.e. not
> be voluntarily released) if all of the page-cache pages are freed for
> non-lock-related reasons (see question 3 below).

Yes, the lock can remain valid on the client even when no pages are
protected by the lock. However, locks with few pages are more likely to be
cancelled by the DLM LRU because the cost of re-fetching those locks is
much smaller compared to locks covering lots of data. The lock "weight"
function would need to be enhanced to include pages that are in fscache
instead of just those in memory.

> The way I foresee cleaning up the fscache is by looking at the overall
> extent of the lock (at release or revocation time), and punching a
> lock-extent-sized hole in the fscache object prior to looping through
> the page list (possibly in cache_remove_lock() prior to calling
> cache_remove_extents_from_lock()).
>
> However, that would require finding the inode, which (AFAICS) is not
> available in that context (ironically, unless l_extents_list is
> non-empty, in which case the inode can be found via any of the page
> structs in the list). I have put in a hack to solve this, but see
> question 6 below.

Actually, each lock has a back-pointer to the inode that is referencing it,
in l_ast_data, so that lock_cancel->mapping->page_removal can work.
Use ll_inode_from_lock() for this.
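For example (sketch only - ll_fscache_drop_extent() is a stand-in for
whatever the fscache cleanup ends up being called; ll_inode_from_lock() is
the existing llite helper):

        /* Sketch: find the inode from the lock's l_ast_data back-pointer
         * at cancel/revocation time and drop the corresponding fscache
         * data for that lock's extent. */
        static void ll_fscache_remove_lock(struct ldlm_lock *lock)
        {
                struct inode *inode = ll_inode_from_lock(lock); /* takes a ref */

                if (inode == NULL)
                        return;         /* no inode attached, nothing cached */

                ll_fscache_drop_extent(inode, lock);    /* made-up hook */
                iput(inode);
        }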
> Summarized questions:
> Q1: Where can I read up on the unit testing infrastructure for Lustre?

There is an internal wiki page with some information on this; it should
probably be moved to the public wiki.

> Q2: Is stale cache already covered by existing unit tests?

I'm not sure what you mean. There is no such thing as stale cache in
Lustre.

> Q3: Will a DLM lock remain valid (i.e. not be canceled) even if its page
>     list is empty (i.e. all pages have been freed due to memory pressure)?

Yes, though the reverse is impossible.

> Q4: Will there *always* be a call to cache_remove_lock() when a lock is
>     canceled or revoked? (i.e. is this the place to punch a hole in the
>     fscache object?)
> Q5: For the purpose of punching a hole in a cache object upon lock
>     revocation, can I rely on the lock->l_req_extent structure as the
>     actual extent of the lock?

No, there are two different extent ranges on each lock: the requested
extent and the granted extent. The requested extent is the minimum extent
size that the server could possibly grant to the client to finish the
operation (e.g. large enough to handle a single read or write syscall).
The server may decide to grant a larger lock if the resource (object) is
not contended.

In the current implementation, the DLM will always grant a full-file lock
to the first client that requests it, because the most common application
case is that only a single client is accessing the file. This avoids any
future lock requests for this file in the majority of cases.

> Q6: a) Is there a way to find the inode that I've missed? and
>     b) if not, what is the preferred way of giving that function a way to
>     find the inode?

See above.

> FYI, we have done some experimenting and we have the read path in a
> demonstrable state, including crude code to effect lock revocation on the
> fscache contents. The NFS code modularized the fscache hooks pretty
> nicely, and we have followed that example.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Andreas, thanks for the thoughtful reply, and sorry for being so slow to
acknowledge and respond to it. Responses are below.

On Fri, Nov 14, 2008 at 6:00 PM, Andreas Dilger <adilger at sun.com> wrote:

> On Nov 11, 2008 13:23 -0600, John Groves wrote:
> > 1. Enabling fscache should be a mount option, and there should be ioctl
> >    support for enabling, disabling and querying a file's fscache usage.
>
> For Lustre there should also be the ability to do this via
> /proc/fs/lustre tunables/stats.

Makes sense, thanks.

> > [...]
>
> Note that the client IO code was substantially re-written for the 2.0
> release. The client IO code from 1.6.5 is still present through the
> 1.8.x releases.

Understood.

> > 2. When an RPC read (into the page cache) completes, in the
> >    ll_ap_completion() function, an asynchronous read to the same offset
> >    in the file's fscache object will be initiated.
>
> Do you mean an "asynchronous write to the ... fscache object"?

Yes - write it is.
> > 5. It may be reasonable in early code to enable fscache only for
> >    read-only opens. However, we don't see any inherent problems with
> >    running an asynchronous write to the fscache concurrently with a
> >    Lustre RPC write. [...]
>
> This is dangerous from the point of view that the write to the fscache
> may succeed, but the RPC may fail for a number of reasons (e.g. client
> eviction), so it would seem that the write to the fscache cannot start
> until the RPC completes successfully.

Good catch, thanks.

> > 9. I presume that all locks are canceled when a client dismounts a
> >    filesystem, in which case it would never be safe to use data in the
> >    fscache from a prior mount.
>
> A potential future improvement in the second generation of this feature
> might be the ability to revalidate the files in the local disk cache by
> the MDT and OST object versions, if those are also stored in fscache.

Cool idea.

> > The primary question, then, is whether a lock will remain valid (i.e.
> > not be voluntarily released) if all of the page-cache pages are freed
> > for non-lock-related reasons (see question 3 below).
>
> Yes, the lock can remain valid on the client even when no pages are
> protected by the lock. However, locks with few pages are more likely
> to be cancelled by the DLM LRU because the cost of re-fetching those
> locks is much smaller compared to locks covering lots of data. The
> lock "weight" function would need to be enhanced to include pages that
> are in fscache instead of just those in memory.

Got it, thanks. That would have eluded me...
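If I follow, the weighting would just need to count both kinds of cached
pages, something like this (both fields are purely hypothetical - they
stand in for "pages on l_extents_list" and "pages held in the fscache
object under this lock"):

        /* Hypothetical: what the DLM LRU "weight" of a lock would have to
         * reflect once fscache-resident pages matter.  Neither field below
         * exists today. */
        static unsigned long ll_lock_cache_weight(struct ldlm_lock *lock)
        {
                unsigned long weight;

                weight  = lock->l_pagecache_pages; /* pages on l_extents_list */
                weight += lock->l_fscache_pages;   /* pages cached on local disk */

                return weight;
        }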
> > The way I foresee cleaning up the fscache is by looking at the overall
> > extent of the lock (at release or revocation time), and punching a
> > lock-extent-sized hole in the fscache object prior to looping through
> > the page list (possibly in cache_remove_lock() prior to calling
> > cache_remove_extents_from_lock()).

FYI, it turns out that fscache doesn't have the ability to punch a hole.
The whole file has to be dropped at present.
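So for now, lock revocation would retire the whole cache object and start
over, roughly like this (sketch only; lli_fscache_cookie,
ll_fscache_super_cookie() and ll_fscache_inode_def are the same invented
names used in the sketches earlier in the thread):

        /* Sketch: since fscache cannot punch holes today, drop the whole
         * cache object on lock revocation by retiring the cookie, then
         * re-acquire an empty one. */
        static void ll_fscache_invalidate_inode(struct inode *inode)
        {
                struct ll_inode_info *lli = ll_i2info(inode);

                if (lli->lli_fscache_cookie == NULL)
                        return;

                /* retire == 1 asks the cache to delete the backing object */
                fscache_relinquish_cookie(lli->lli_fscache_cookie, 1);

                lli->lli_fscache_cookie =
                        fscache_acquire_cookie(
                                ll_fscache_super_cookie(inode->i_sb),
                                &ll_fscache_inode_def, inode);
        }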
> > However, that would require finding the inode, which (AFAICS) is not
> > available in that context [...]
>
> Actually, each lock has a back-pointer to the inode that is referencing
> it, in l_ast_data, so that lock_cancel->mapping->page_removal can work.
> Use ll_inode_from_lock() for this.

That's much nicer than my hack...thanks.

> > Q1: Where can I read up on the unit testing infrastructure for Lustre?
>
> There is an internal wiki page with some information on this; it should
> probably be moved to the public wiki.

If there's a way to let me know when that happens, I'd appreciate it. I'm
not a full-time lustre-devel reader (at least currently).

> > Q2: Is stale cache already covered by existing unit tests?
>
> I'm not sure what you mean. There is no such thing as stale cache in
> Lustre.

What I was driving at is a test to verify that any page-cache data is
discarded when a lock is revoked. The same test would catch a failure to
discard fscache data, that being a potentially stale place to reload the
page cache from. Perhaps that's implicitly covered somehow.

> > Q5: For the purpose of punching a hole in a cache object upon lock
> >     revocation, can I rely on the lock->l_req_extent structure as the
> >     actual extent of the lock?
>
> No, there are two different extent ranges on each lock: the requested
> extent and the granted extent. [...]
>
> In the current implementation, the DLM will always grant a full-file lock
> to the first client that requests it, because the most common application
> case is that only a single client is accessing the file. This avoids any
> future lock requests for this file in the majority of cases.

Thanks. Given that fscache invalidation turns out to be full-file anyway,
this becomes moot for the time being.

Thanks again!
John