Hello all,

Yesterday, one of our customer's two-box cluster servers went down after an I/O error during the backup procedure. When we turned the cluster on again, the 600M database file had been lost; it had disappeared.

Now I have one of the disks (the other was used to rebuild the database server) and need to know how I can recover any piece of data from it.

After the crash we mounted the disk's partition read-only just to see the damage. We tried debugfs, but it didn't find any deleted files. The partition where the data was had only four files: two database files and two backup files. Now there are just two files, a test database and its backup file. The production database has disappeared.

Kernel: 2.2.19
Ext3: 0.0.7a
DRBD: 0.5.8

Please give me some help! Thanks in advance!

-- 
João Alfredo G. Batista <joaoalf@dotx.com.br> or <jagbdotx@yahoo.com.br>
* dotX Consultoria, Serviços e Conectividade
* http://www.dotx.com.br
* Departamento de Desenvolvimento

On Wed, Mar 20, 2002 at 11:12:59AM -0300, João Alfredo wrote:

> Yesterday, one of our customer's two-box cluster servers went down after an
> I/O error during the backup procedure. When we turned the cluster on
> again, the 600M database file had been lost; it had disappeared.

What happened? Did the system perform a replay of the journal?

> After the crash we mounted the disk's partition read-only just to see the
> damage. We tried debugfs, but it didn't find any deleted files. The

Was the partition recovered during boot time?

> partition where the data was had only four files: two database files and
> two backup files. Now there are just two files, a test database and its
> backup file. The production database has disappeared.
>
> Kernel: 2.2.19
> Ext3: 0.0.7a

Man, UPDATE! Ext3 on 2.2.x isn't the smartest of ideas.

-- 
Ralf Hildebrandt (Im Auftrag des Referat V A) Ralf.Hildebrandt@charite.de
Charite Campus Virchow-Klinikum Tel. +49 (0)30-450 570-155
Referat V A - Kommunikationsnetze - Fax. +49 (0)30-450 570-916
Why you can't find your system administrators: The Grey Wall(tm) has
fallen on them and no one has noticed their absence.
[clunk,clunk,help!,anyone?]

Hi,

On Wed, Mar 20, 2002 at 11:12:59AM -0300, João Alfredo wrote:

> After the crash we mounted the disk's partition read-only just to see the
> damage. We tried debugfs, but it didn't find any deleted files. The
> partition where the data was had only four files: two database files and
> two backup files. Now there are just two files, a test database and its
> backup file. The production database has disappeared.

I've never heard of _any_ ext3 failure modes which could result in silently disappearing files like that.

What sort of IO errors did you have? Have you tested DRBD for reliability in the presence of IO failures? Are the previous backups OK?

--Stephen