Hi,

I have a severe filesystem crash on a machine with a single ext3 filesystem. This is the second time it has happened to me (on two different machines that are identical in hardware setup and mostly in software - I cloned them when I started to work). It looks like the journal (or something else, I am not an expert here) has gone wild and overwritten loads of inodes. I have a single filesystem for everything (I know, a bad habit), which has been remounted read-only. Using an ssh tunnel that is still open, I had a look at the fs and found this:

[root@wally /]# ls -l sbin/
ls: sbin/ipfwadm: Eingabe-/Ausgabefehler
ls: sbin/depmod.modutils: Eingabe-/Ausgabefehler
ls: sbin/update-modules.modutils: Eingabe-/Ausgabefehler
ls: sbin/modprobe.Lmodutils: Eingabe-/Ausgabefehler
ls: sbin/mount.smbfs: Eingabe-/Ausgabefehler
ls: sbin/mount.smb: Eingabe-/Ausgabefehler
insgesamt 1075231521
-rwxr-xr-x 1 root root 14408 22. Mär 2002 badblocks
-rwxr-xr-x 1 root root 7360 31. Mai 2003 blockdev
-rwxr-xr-x 1 root root 47944 31. Mai 2003 cfdisk
[... normal entries salvaged]
-rwxr-xr-x 1 root root 413 29. Mai 2002 fsck.nfs
lrwxrwxrwx 1 root root 7 20. Feb 2003 fsck.vfat -> dosfsck
?rwxrw-rwt 13639 21845 18754 23 27. Jän 1994 genksyms
-rwxr-xr-x 1 root root 14184 31. Mai 2003 getty
-rwxr-xr-x 1 root root 121032 16. Apr 2003 grub
-rwxr-xr-x 1 root root 2944 16. Apr 2003 grub-floppy
-rwxr-xr-x 1 root root 12110 16. Apr 2003 grub-install
-rwxr-xr-x 1 root root 2301 16. Apr 2003 grub-md5-crypt
-rwxr-xr-x 1 root root 2473 16. Apr 2003 grub-terminfo
-rwxr-xr-x 1 root root 9104 29. Mai 2002 halt
?-----xr-x 2240 1931476992 1952802008 1700749935 1. Jän 1970 hciattach
cr-Srwsr-T 2240 487270 25816 5, 16 1. Jän 1970 hciconfig
-rwxr-xr-x 1 root root 27048 10. Apr 2003 hcid
[...]
-rwxr-xr-x 1 root root 14092 24. Nov 2001 iptunnel
lrwxrwxrwx 1 root root 15 23. Jul 08:53 kallsyms -> insmod.modutils
--wsrwxrw- 16406 3257942038 1075218782 4618131408204237070 27. Jän 2004 kallsyms.modutils
-rwxr-xr-x 1 root root 5776 12. Nov 2001 kbdrate
?rwxrw-rwt 16896 21845 12079 26 27. Jän 1994 kernelversion
-rwxr-xr-x 1 root root 8948 29. Mai 2002 killall5
-rwxr-xr-x 1 root root 19672 3. Jän 2002 klogd
lrwxrwxrwx 1 root root 15 23. Jul 08:53 ksyms -> insmod.modutils
sr-Srw---- 16407 3423617046 1075238984 1075656696 27. Jän 2004 ksyms.modutils
-rwxr-xr-x 1 root root 430636 8. Apr 2003 ldconfig
-rwxr-xr-x 1 root root 20600 31. Mai 2003 losetup
lrwxrwxrwx 1 root root 10 23. Jul 08:53 lsmod -> /bin/lsmod
?-wxrwSrwx 2063 21845 36537 73 1. Jän 1970 lsmod.Lmodutils
s--x--S--- 16406 131072 10376 13 1. Jän 1970 lsmod.modutils
--wxrwsrwT 16406 2903523350 1074366302 4614300640973588238 27. Jän 2004 lspci
-rwxr-xr-x 1 root root 43485 3. Mär 2002 MAKEDEV
-rwxr-xr-x 1 root root 10188 24. Nov 2001 mii-tool
[...]
lrwxrwxrwx 1 root root 7 20. Feb 2003 mkfs.msdos -> mkdosfs
lrwxrwxrwx 1 root root 7 20. Feb 2003 mkfs.vfat -> mkdosfs
?-ws------ 16407 85999638 1075773488 1075233008 28. Jän 2004 mkswap
-rwxr-xr-x 1 root root 8752 10. Jul 2003 modinfo
-rwxr-xr-x 1 root root 39912 10. Apr 2003 modinfo.modutils
-rwxr-xr-x 1 root root 19796 10. Jul 2003 modprobe
lrwx---rwT 2053 2209351685 1074963494 134587350 23. Jän 2004 modprobe.Lmodutils
lrwxrwxrwx 1 root root 15 4. Apr 2003 modprobe.modutils -> insmod.modutils
-rwxr-xr-x 1 root root 7108 24. Nov 2001 nameif
-rwxr-xr-x 1 root root 2808 31. Mai 2003 pivot_root
-rwxr-xr-x 1 root root 4716 24. Nov 2001 plipconfig
-rwxr-xr-x 1 root root 3324 18. Mär 2001 pmap_dump
-rwxr-xr-x 1 root root 3372 18. Mär 2001 pmap_set
-rwxr-xr-x 1 root root 11692 18. Mär 2001 portmap
?-ws--s--T 16409 694698007 1075434216 1075242124 31. Jän 2004 poweroff
-rwxr-xr-x 1 root root 18860 24. Nov 2001 rarp
-rwxr-xr-x 1 root root 5088 31. Mai 2003 raw
?rwSrwx--T 16406 113786907 1075297776 1075751024 27. Jän 2004 reboot
-rwxr-xr-x 1 root root 19056 22. Mär 2002 resize2fs
-rwxr-xr-x 1 root root 7812 10. Jul 2003 rmmod
lrwxrwxrwx 1 root root 6 16. Apr 2003 rmmod.Lmodutils -> insmod
lrwxrwxrwx 1 root root 15 4. Apr 2003 rmmod.modutils -> insmod.modutils
?--xr----T 16408 276316184 1075252624 1075773640 27. Jän 2004 rmt
-rwxr-xr-x 1 root root 42060 24. Nov 2001 route
-rwxr-xr-x 1 root root 3196 9. Jul 2003 rpc.lockd
[..]
-rwxr-xr-x 1 root root 153792 2. Apr 2002 tc
?rws--S--- 16413 3167764512 1075212976 1075238948 2. Feb 2004 telinit
-rwxr-xr-x 1 root root 22312 22. Mär 2002 tune2fs
-rwsr-xr-x 1 root root 14508 21. Jän 2002 unix_chkpwd
-rwxr-xr-x 1 root root 16560 16. Apr 2003 update-grub
-rwxr-xr-x 1 root root 2807 10. Jul 2003 update-modules

This can be found throughout the fs - /lib, /usr/bin, /etc and others are all a mess. Luckily I could get hold of the database files and the web content that my students were working on...

I suspect the crash is related to large files: I had a 2.6 GB file in /tmp (a gzipped partition) that I copied over to another machine, and that night it crashed... It ran fine the last year, though. A similar incident happened to my first machine, where I moved a hd image to an external firewire disk. After deleting it from my machine, I was stranded with a totally broken system that I could only get up again with a boot CD, and after an fs check I still have 16000 files in /lost+found (around 1.5 GB of data...)

The hardware on the machine looks like this:

00:00.0 Host bridge: Intel Corp. 82850 850 (Tehama) Chipset Host Bridge (MCH) (rev 04)
00:01.0 PCI bridge: Intel Corp. 82850 850 (Tehama) Chipset AGP Bridge (rev 04)
00:1e.0 PCI bridge: Intel Corp. 82820 820 (Camino 2) Chipset PCI (rev 04)
00:1f.0 ISA bridge: Intel Corp. 82820 820 (Camino 2) Chipset ISA Bridge (ICH2) (rev 04)
00:1f.1 IDE interface: Intel Corp. 82820 820 (Camino 2) Chipset IDE U100 (rev 04)
00:1f.2 USB Controller: Intel Corp. 82820 820 (Camino 2) Chipset USB (Hub A) (rev 04)
00:1f.3 SMBus: Intel Corp. 82820 820 (Camino 2) Chipset SMBus (rev 04)
00:1f.4 USB Controller: Intel Corp. 82820 820 (Camino 2) Chipset USB (Hub B) (rev 04)
00:1f.5 Multimedia audio controller: Intel Corp. 82820 820 (Camino 2) Chipset AC'97 Audio Controller (rev 04)
01:00.0 VGA compatible controller: Matrox Graphics, Inc.: Unknown device 0527 (rev 03)
02:04.0 USB Controller: NEC Corporation USB (rev 41)
02:04.1 USB Controller: NEC Corporation USB (rev 41)
02:04.2 USB Controller: NEC Corporation: Unknown device 00e0 (rev 02)
02:0b.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 78)
02:0c.0 FireWire (IEEE 1394): VIA Technologies, Inc. OHCI Compliant IEEE 1394 Host Controller (rev 46)

[vogl /net/home2/vogl]$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping : 7
cpu MHz : 2405.505
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
bogomips : 4797.23

[vogl /net/home2/vogl]$ cat /proc/interrupts
           CPU0
  0: 23797604 IO-APIC-edge timer
  1: 126073 IO-APIC-edge keyboard
  2: 0 XT-PIC cascade
  4: 2 IO-APIC-edge serial
  8: 4 IO-APIC-edge rtc
 14: 355574 IO-APIC-edge ide0
 15: 5 IO-APIC-edge ide1
 16: 0 IO-APIC-level Matrox Graphics, Inc. MGA Parhelia AGP
 17: 3984 IO-APIC-level Intel 82801BA-ICH2
 19: 18062239 IO-APIC-level usb-uhci
 20: 169 IO-APIC-level ohci1394
 22: 0 IO-APIC-level acpi
 23: 10021486 IO-APIC-level eth0, ehci_hcd, usb-uhci
NMI: 0
LOC: 23796986
ERR: 0
MIS: 0

[root@wally /]# uptime
10:02:55 up 23 days, 23:48, 4 users, load average: 0.00, 0.00, 0.00

[root@wally /]# uname -a
Linux wally 2.4.23 #1 Tue Dec 23 09:44:22 CET 2003 i686 unknown

[root@wally /]# mount
/dev/ide/host0/bus0/target0/lun0/part1 on / type ext3 (rw,errors=remount-ro)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw)
/dev/ide/host0/bus0/target0/lun0/part5 on /media type vfat (rw,noexec,nodev,gid=2000,umask=002)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
automount(pid1324) on /data/net type autofs (rw,fd=5,pgrp=1324,minproto=2,maxproto=4)
automount(pid1348) on /net type autofs (rw,fd=5,pgrp=1348,minproto=2,maxproto=4)
automount(pid1319) on /var/autofs/misc type autofs (rw,fd=5,pgrp=1319,minproto=2,maxproto=4)

Is there anything I can do before formatting the disk to shed some light on this?

Simon

-- 
_______________________________________________________________________
Dr. Simon Vogl
Institut für Pervasive Computing, Johannes Kepler Universität Linz
Altenberger Straße 69, A-4040 Linz, Austria
Tel: +43 732 2468-8517, Fax: +43 732 2468-8426
mailto: vogl@soft.uni-linz.ac.at, http://www.soft.uni-linz.ac.at/
Simon Vogl wrote:
| [root@wally /]# ls -l sbin/
| ls: sbin/ipfwadm: Eingabe-/Ausgabefehler
| ls: sbin/depmod.modutils: Eingabe-/Ausgabefehler

[better use "export LANG=C" next time]

These are I/O errors. Are there no messages in the logs?

Christian.
-- 
BOFH excuse #164: root rot
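
In case it helps, here is a minimal sketch of the two suggestions above: rerun the listing under the untranslated C locale, and search the kernel messages for lines that usually accompany disk trouble. The helper names c_locale and disk_errors are made up for illustration, and the grep patterns are only a guess at typical 2.4-era message text (ext3, journal, "I/O error", hdX:), not an exhaustive filter.

```shell
# Run a single command under the "C" locale, so error text comes out
# untranslated (e.g. "Input/output error" instead of a localized string).
# c_locale is a hypothetical helper name.
c_locale() {
    LC_ALL=C LANG=C "$@"
}

# Filter log text on stdin down to lines that often accompany disk or
# filesystem trouble. The patterns are assumptions, adjust as needed.
disk_errors() {
    grep -i -E 'ext3|journal|I/O error|hd[a-z]:'
}

# Usage (paths are examples only):
#   c_locale ls -l /sbin
#   dmesg | disk_errors
#   disk_errors < /var/log/messages
```

Since the root filesystem is already remounted read-only, syslog may have stopped writing to disk, so the dmesg output is probably the more useful of the two.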