thr3ads.net - Ocfs2 devel - [Ocfs2-devel] [PATCH 0/3] ocfs2: fix slow deleting [Jul 2011]

Wengang Wang

2011-Jul-06 04:38 UTC

[Ocfs2-devel] [PATCH 0/3] ocfs2: fix slow deleting

There is a use case where the app deletes a huge number (XX thousand) of files
every 5 minutes. The deletion of some specific files is extremely slow (costing
xx~xxx seconds). That is unacceptable.

Reading the directory entries and the relevant inodes takes time. We do that
with i_mutex held, which makes the unlink path wait on the mutex for a long
time.

fix:
We drop and retake the mutex during the scan, giving unlink a chance to proceed.
Also, for live nodes, each node scans and recovers only the slot where it
resides (this helps performance), and always does so at each scan time. For
dead (not mounted) slots, we do it when we "should". And for dead slots, no
dropping and retaking of the mutex is needed.
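
To illustrate the drop-and-retake idea, here is a minimal userspace sketch in C. It is not the patch itself: the pthread mutex `dir_lock` merely stands in for i_mutex, and `SCAN_BATCH`, `struct scan_stats`, `count_visit` and `scan_with_lock_breaks` are hypothetical names invented for this example. The scan holds the lock for one batch of entries at a time and releases it between batches.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the directory's i_mutex. */
static pthread_mutex_t dir_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical tuning knob: entries processed per lock hold. */
#define SCAN_BATCH 64

struct scan_stats {
    size_t visited;     /* entries handed to the callback */
    size_t lock_drops;  /* times the lock was released mid-scan */
};

/* Example callback: counts the entries it is shown. */
void count_visit(size_t idx, void *arg)
{
    (void)idx;
    ++*(size_t *)arg;
}

/*
 * Visit nr_entries entries under dir_lock, releasing and retaking the
 * lock after every SCAN_BATCH entries so that another thread waiting
 * on the lock (the "unlink" path) gets an opportunity to run.
 */
struct scan_stats scan_with_lock_breaks(size_t nr_entries,
                                        void (*visit)(size_t, void *),
                                        void *arg)
{
    struct scan_stats st = { 0, 0 };

    assert(visit != NULL);
    pthread_mutex_lock(&dir_lock);
    while (st.visited < nr_entries) {
        visit(st.visited, arg);
        st.visited++;

        /* End of a batch with work remaining: let waiters in. */
        if (st.visited % SCAN_BATCH == 0 && st.visited < nr_entries) {
            pthread_mutex_unlock(&dir_lock);
            st.lock_drops++;
            /* A real scan must revalidate its position here, since
             * the directory may have changed while unlocked. */
            pthread_mutex_lock(&dir_lock);
        }
    }
    pthread_mutex_unlock(&dir_lock);
    return st;
}
```

Calling `scan_with_lock_breaks(200, count_visit, &seen)` visits all 200 entries and releases the lock three times (after entries 64, 128 and 192). Note that a plain pthread mutex does not guarantee that a waiter actually acquires the lock in the gap, so this only shows where the lock breaks go, not the fairness of the hand-off.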

Sunil Mushran

2011-Jul-06 06:17 UTC

On 07/05/2011 09:38 PM, Wengang Wang wrote:
> There is a use case where the app deletes a huge number (XX thousand) of files
> every 5 minutes. The deletion of some specific files is extremely slow (costing
> xx~xxx seconds). That is unacceptable.
>
> Reading the directory entries and the relevant inodes takes time. We do that
> with i_mutex held, which makes the unlink path wait on the mutex for a long
> time.
>
> fix:
> We drop and retake the mutex during the scan, giving unlink a chance to proceed.
> Also, for live nodes, each node scans and recovers only the slot where it
> resides (this helps performance), and always does so at each scan time. For
> dead (not mounted) slots, we do it when we "should". And for dead slots, no
> dropping and retaking of the mutex is needed.

Yes, this is a good issue to tackle. I will read the patch in greater detail
later. But offhand, I have two comments.

1. "should" is not descriptive. I am assuming you mean do it only during
actual recovery. If so, that would be incorrect. Say node 0 unlinks a file
that was being used by node 1. Node 0 dies. Recovery will notice that
that inode is active and not delete it. If node 1 dies, or is unable to
delete the file for any other reason, then our only hope is orphan scan.

2. All nodes have to scan all slots, even live slots. I remember we did that
for a reason, and that reason should be in the comment in the patch written
by Srini.
