Lev Serebryakov
2014-Apr-12 13:09 UTC
One process which would not die force me to power-cycle server and ALL UFS SUJ FSes are completely broken after that AGAIN!
Hello, Freebsd-fs.

On my 10-STABLE (r263965) system, transmission-daemon stopped working and could not be killed (it waits forever in STOP state after "kill -KILL"). The kernel reports an overflowing TCP accept queue for its socket (sonewconn: pcb 012345678FFFFFFF: Listen queue overflow).

I tried "shutdown -r now"; the shutdown aborted because of the process which would not die, and nothing could be done: the system doesn't react to the keyboard after that.

I waited one hour (!). No result, only more "Listen queue overflow" messages on the console.

Power-off. Power-on.

None of the UFS2 filesystems could be recovered by the automated fsck, due to journal/softupdate inconsistencies. I needed to run "fsck -f" TWICE for each of them (as the first run asks to re-run fsck).

Please note, they are filesystems on an MBR slice + BSD label on a simple SATA disk attached to a chipset port: no RAID, no "strange" GEOM modules, nothing fancy. A plain and easy install: MBR with one slice, BSD label, filesystems, that's all.

So, there are two questions:

(1) Does UFS2 SUJ work at all on a STABLE system? Should it?!

(2) How could I avoid such a situation? How could I reboot the system WITHOUT such a disaster when one process refuses to die?

--
// Black Lion AKA Lev Serebryakov <lev at FreeBSD.org>
Erich Dollansky
2014-Apr-12 13:28 UTC
One process which would not die force me to power-cycle server and ALL UFS SUJ FSes are completely broken after that AGAIN!
Hi,

On Sat, 12 Apr 2014 17:09:53 +0400 Lev Serebryakov <lev at FreeBSD.org> wrote:

> (1) Does UFS2 SUJ works at all on STABLE system? Should it?!

It should.

> (2) How could I avoid such situation, how could I reboot system
> WITHOUT such disaster when one process refuse to die?

Do you know the name of the program which refuses to stop?

Erich
Lev Serebryakov
2014-Apr-13 10:10 UTC
UFS2 SU+J could not recover after power-off again (was: One process which would not die force me to power-cycle server and ALL UFS SUJ FSes are completely broken after that AGAIN!)
Hello, Freebsd-fs.

You wrote on 12 April 2014, 17:09:53:

LS> All UFS2 filesystems can not be recovered with using of automated fsck, due
LS> to journal/softupdate inconsistencies. I need to run "fsck -f" TWICE for
LS> each of them (as first run ask to re-run fsck).

"shutdown -h" rebooted the system, the UPS switched the power off after that (with a delay), and 2 out of 5 filesystems could not be checked with the journal automatically. A manual full "fsck" run didn't find any serious problems, only one or two unlinked files (recovered to lost+found) and free block bitmaps!

WHY?! How can I trust UFS2 now?!

Both filesystems show the same scenario:

/dev/ufs/tmp: Journal file sequence mismatch 233263 != 231707
/dev/ufs/tmp: UNEXPECTED SU+J INCONSISTENCY
/dev/ufs/tmp: INTERNAL ERROR: GOT TO reply()
/dev/ufs/tmp: UNEXPECTED SOFT UPDATE INCONSISTENCY. RUN fsck MANUALLY.

/dev/ufs/usr: Journal file sequence mismatch 287936 != 282572
/dev/ufs/usr: UNEXPECTED SU+J INCONSISTENCY
/dev/ufs/usr: INTERNAL ERROR: GOT TO reply()
/dev/ufs/usr: UNEXPECTED SOFT UPDATE INCONSISTENCY. RUN fsck MANUALLY.

Again: these filesystems were checked with a full fsck two days ago. They reside on a SATA HDD without any non-standard or complex GEOM modules (only geom_part), and the HDD is attached to a chipset SATA port; there are no RAID controllers or anything like that.

EVERY non-clean reboot of the server leads to "RUN fsck MANUALLY".

--
// Black Lion AKA Lev Serebryakov <lev at FreeBSD.org>
Chris H
2014-Apr-14 04:34 UTC
One process which would not die force me to power-cycle server and ALL UFS SUJ FSes are completely broken after that AGAIN!
> Hello, Freebsd-fs.
>
> On my 10-STABLE (r263965) system transmission-daemon stops to work, could not be
> killed (waits forever in STOP state after "kill -KILL), kernel reports about
> overfilled accept TCP queue for its socket (sonewconn: pcb 012345678FFFFFFF: Listen queue
> overflow).
>
> Try "shutdown -r now", process aborted due to process which would not die,
> nothing could be done: system doesn't react on keyboard after that.

Does using halt work better?

--Chris

>
> Wait one hour (!). No result, only more "Listen queue overflow" messages on
> console.
>
> Power-off. Power-on.
>
> All UFS2 filesystems can not be recovered with using of automated fsck, due
> to journal/softupdate inconsistencies. I need to run "fsck -f" TWICE for
> each of them (as first run ask to re-run fsck).
>
> Please note, they are filesystems on MBR slice + BSD label on simple SATA
> disk attached to chipset port, no RAID, no "strange" GEOM modules, nothing
> fancy. Plain and easy install -- MBR with one slice, BSD label, filesystems,
> it's all.
>
> So, there are two questions:
>
> (1) Does UFS2 SUJ works at all on STABLE system? Should it?!
>
> (2) How could I avoid such situation, how could I reboot system WITHOUT such
> disaster when one process refuse to die?
>
> --
> // Black Lion AKA Lev Serebryakov <lev at FreeBSD.org>
>
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"