Hi!

A few weeks ago we upgraded 9 large webservers from ext2 to ext3.
Since then we've seen very strange behavior on several of the
machines. Permissions of files are repeatedly changed at random
occasions. Several times, ownership of files has been totally
mangled. Several users have logged in to discover that all their
files suddenly are owned by another user! On two of these
occasions the machine froze and rebooted when we started to chown
the files back to their real owner.

We currently see no other solution than to "downgrade" back to
ext2, and prepare for a change to ReiserFS or another journaling
fs later this spring.

But first I wanted to check with you: is this a known problem?
Maybe there's even a solution to it?

This is our current configuration:
Intel 2 x PIII (SMP) w/ 1 Gb memory
Mylex AcceleRAID 352 PCI RAID Controller (DAC 960)
3 x 36 Gb IBM disks. RAID 5 thus gives us approx. 70 Gb
kernel 2.2.19 + ext3-0.0.6b
Slackware 4.0 (libc5)
e2fsprogs-1.20.WIP.sct-20010216

There are no fs related kernel error messages, nothing to
indicate that the kernel knows that something is wrong. I'd be
happy to provide you with more information if needed.

Best regards,
/Johan Ekenberg - Sweden
Hi,

On Wed, May 09, 2001 at 08:50:34PM +0200, Johan Ekenberg wrote:
>
> A few weeks ago we upgraded 9 large webservers from ext2 to ext3.
> Since then we've seen very strange behavior on several of the
> machines. Permissions of files are repeatedly changed at random
> occasions. Several times, ownership of files has been totally
> mangled. Several users have logged in to discover that all their
> files suddenly are owned by another user! On two of these
> occasions the machine froze and rebooted when we started to chown
> the files back to their real owner.

OK, a few questions:

Did you see any _other_ fs effects? Contents of the files changing,
for example? I can't imagine any possible fs corruption which would
touch permissions but leave the rest of the inode data intact.

Did you have any unplanned recoveries?

> There are no fs related kernel error messages, nothing to
> indicate that the kernel knows that something is wrong. I'd be
> happy to provide you with more information if needed.

If you have any more, then yes, please. I assume you're using
hardware raid only, no soft raid?

I'm not sure where to start, here --- this has never been reported
before and I don't know of anything in the VFS that could be so
specific in its corruption.

Cheers,
Stephen
> This is a WAG, but:
>
> 1. Did you build ext3 as a module?
> 2. If so, are you running on a kernel that was built before
>    applying the ext3 patch and configuring?

No. 2.2.19 was patched, configured and built from scratch with ext3
support.

/Johan
> There are so many "layers" involved in analyzing this sort of problem,
> there must be a reasonable, systematic approach to rule out things that
> may be going on at other "layers."
>
> Other potential variables, which may or may not apply to you, this is
> just speculation: NIS weirdness (or LDAP), NFS UID mapping problems,
> trojan executables or buffer overrun exploits. If you are using NIS or
> LDAP for anything, I would suspect that as a strong potential
> source of

No. No LDAP, no NFS, no NIS.

> Attempted RPC exploits against an NFS server could probably cause
> mysterious UID and permission changes. Someone could be trying to give
> SUID root to an executable or script file through a buffer overrun. I

We have no RPC services running at all.

/Johan
About a month ago I posted the message referenced below. There were
a lot of kind and helpful responses, but in the end it seemed nobody
knew what to make of it.

Since then, all our problems have gradually disappeared.
Automagically. Since we have made no site-wide changes that could
explain the problem, apart from switching to ext3, I suggest these
conclusions:

a) The problems had something to do with going from ext2 to ext3.
   All occurred *at the actual time of the upgrade* but were
   discovered gradually, making it *seem* like ongoing corruption.

or

b) The problems had nothing to do with ext3 but were, as someone
   suggested, caused by buffer overflows due to external attacks.
   We can find no trace of this type of attack, but that doesn't
   prove they didn't happen.

or

c) The cause is something mysterious and exciting yet to be
   discovered.

or

d) any combination of a), b) and c) above.

Anyway, I'd like to thank everyone for your help and time. I'm happy
we didn't downgrade back to ext2. Ext3 rocks. Really.

Best regards,
/Johan Ekenberg

> Hi!
>
> A few weeks ago we upgraded 9 large webservers from ext2 to ext3.
> Since then we've seen very strange behavior on several of the
> machines. Permissions of files are repeatedly changed at random
> occasions. Several times, ownership of files has been totally
> mangled. Several users have logged in to discover that all their
> files suddenly are owned by another user! On two of these
> occasions the machine froze and rebooted when we started to chown
> the files back to their real owner.
>
> We currently see no other solution than to "downgrade" back to
> ext2, and prepare for a change to ReiserFS or another journaling
> fs later this spring.
>
> But first I wanted to check with you: is this a known problem?
> Maybe there's even a solution to it?
>
> This is our current configuration:
> Intel 2 x PIII (SMP) w/ 1 Gb memory
> Mylex AcceleRAID 352 PCI RAID Controller (DAC 960)
> 3 x 36 Gb IBM disks. RAID 5 thus gives us approx. 70 Gb
> kernel 2.2.19 + ext3-0.0.6b
> Slackware 4.0 (libc5)
> e2fsprogs-1.20.WIP.sct-20010216
>
> There are no fs related kernel error messages, nothing to
> indicate that the kernel knows that something is wrong. I'd be
> happy to provide you with more information if needed.
>
> Best regards,
> /Johan Ekenberg - Sweden
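
One way to tell conclusion a) apart from ongoing corruption, and to
answer the earlier question about file contents changing, would be to
record owner, group, mode and a content hash for every file and
compare two runs taken some time apart. What follows is only a
minimal sketch in Python, not something that was used in this thread;
the function names snapshot and diff_snapshots are made up for
illustration, and it assumes the tree is readable by the user running
it.

```python
import hashlib
import os
import stat


def snapshot(root):
    """Map each regular file under root to (uid, gid, mode bits, sha256)."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
                if not stat.S_ISREG(st.st_mode):
                    continue  # skip symlinks, devices, sockets
                digest = hashlib.sha256()
                with open(path, "rb") as handle:
                    for chunk in iter(lambda: handle.read(1 << 20), b""):
                        digest.update(chunk)
            except OSError:
                continue  # file vanished or is unreadable
            rel = os.path.relpath(path, root)
            result[rel] = (
                st.st_uid,
                st.st_gid,
                stat.S_IMODE(st.st_mode),
                digest.hexdigest(),
            )
    return result


def diff_snapshots(old, new):
    """Return {path: [changed field names]} for files in both snapshots."""
    fields = ("uid", "gid", "mode", "content")
    changes = {}
    for path, before in old.items():
        after = new.get(path)
        if after is None:
            continue  # file was removed; not reported here
        changed = [f for f, a, b in zip(fields, before, after) if a != b]
        if changed:
            changes[path] = changed
    return changes
```

If a second snapshot taken days later shows no new uid, gid or mode
differences, that would point towards damage done once, at upgrade
time. Entries listing "uid" or "mode" without "content" would match
the symptom described in the original report.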