Looks like it's time for the "Dilger Procedure", yes?
See http://wiki.hpc.ufl.edu/index.php/Lustre
At least, it sounds like the same problem we encountered, and this
procedure worked for us.
Charlie Taylor
UF HPC Center
On Sep 15, 2008, at 10:52 AM, Dan wrote:
> Hi,
>
> One of my OSSs crashed last week. All OSTs on it recover and mount
> except one that causes a kernel panic when it starts replaying
> (after waiting for clients to connect). I fscked it this weekend
> and found no errors but still it panics the system.
>
> My only idea on how to fix it was to run lctl --device 12
> abort_recovery. The instant you run this it causes a kernel panic.
> Something's not right about the replay info I guess. I've brought up
> the other OSTs and deactivated them on the MGS/MDT but I cannot get
> the clients to mount anyway. Do I need to deactivate it on the
> clients too?
>
> Help!
>
> Dan