Hi all,
In 1.4 and 1.2 there is a hole that leaves orphans in an orphan directory
until fsck is run or recovery occurs on that slot.
In the current implementation, suppose node A does an rm of a file while it
is still open on node B, so the file ends up in node A's orphan directory.
If node A then umounts and node B dies and is restarted first, node B only
recovers its own slot, leaving the orphan file in node A's orphan
directory.
To fix the problem, I am looking for your input on which of the following
approaches is best :)
1) Recover all slots (osb->max_slots) whenever there is a recovery. This
can trigger simultaneous recoveries if there are multiple node failures.
2) Queue recovery during mount (even if the journal is clean). This can
still leave the file in the orphan directory until the slot is used.
3) Initiate recovery periodically after a certain interval. I do not see
much benefit in this.
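
To make option 1 concrete, here is a rough, standalone sketch of the idea.
It is not the real ocfs2 code: struct slot_state, recover_orphans_in_slot()
and recover_all_slots() are hypothetical names, MAX_SLOTS stands in for
osb->max_slots, and it assumes that a slot owned by a live node can be
skipped because the owner cleans its own orphan directory.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 4   /* stand-in for osb->max_slots (assumption) */

/* Hypothetical, simplified view of one slot; not the real ocfs2 structures. */
struct slot_state {
    bool in_use;      /* a live node currently owns this slot */
    int  orphans;     /* entries left in this slot's orphan directory */
};

/* Clean one slot's orphan directory; returns the number of entries removed. */
static int recover_orphans_in_slot(struct slot_state *s)
{
    int cleaned = s->orphans;
    s->orphans = 0;
    return cleaned;
}

/*
 * Option 1: whenever any recovery runs, walk every slot rather than only
 * the slot of the node that died. Slots owned by a live node are skipped,
 * on the assumption that the owner handles its own orphan directory.
 */
static int recover_all_slots(struct slot_state slots[], size_t max_slots)
{
    int total = 0;
    for (size_t i = 0; i < max_slots; i++) {
        if (slots[i].in_use)
            continue;
        total += recover_orphans_in_slot(&slots[i]);
    }
    return total;
}
```

In the scenario above, node A's unused slot would be visited when node B
recovers, so the orphan would not wait for fsck. The sketch does not model
the concern about several nodes running this walk at the same time.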
thanks,
--Srini