Hi,

Should sqlite users who are paranoid about losing data
when hard resets occur be setting the barrier=1 mount
option with ext3?

The situation is that we think SQLite has written data
to a series of 4K blocks in a file and then called
fsync() on the file descriptor. After this a hard reset
occurs. Upon recovery it seems like one of the 4K blocks
has been zeroed. The others are all fine.

Happens every now and again under stress testing.

System is using data=journaled, but not barrier=1.

Should users also be setting barrier=1 for extra robustness
in the face of hard resets?

Thanks,
Dan.

On 07/13/2010 09:47 AM, Dan Kennedy wrote:
> Hi,
>
> Should sqlite users who are paranoid about losing data
> when hard resets occur be setting the barrier=1 mount
> option with ext3?
>
> The situation is that we think SQLite has written data
> to a series of 4K blocks in a file and then called
> fsync() on the file descriptor. After this a hard reset
> occurs. Upon recovery it seems like one of the 4K blocks
> has been zeroed. The others are all fine.
>
> Happens every now and again under stress testing.
>
> System is using data=journaled, but not barrier=1.
>
> Should users also be setting barrier=1 for extra robustness
> in the face of hard resets?
>
> Thanks,
> Dan.
>

Hi Dan,

If you do not use barriers, your storage device could very well
lose data if it loses power.

There is no easy answer, you need to understand the type and
configuration of your storage.

For a local SAS/S-ATA drive, you should have barriers enabled when
the write cache is enabled (check that with hdparm for example on
S-ATA). Note that you could also be safe by disabling the write
cache and leaving barriers off as well.

If you have a non-volatile write cache (for example on an external,
enterprise class array), you can safely mount without barriers.

Regards,

Ric

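As a rough illustration of the check Ric describes, the sketch below tests whether a comma-separated mount option string contains barrier=1. The helper name has_mount_option is hypothetical, and the sketch assumes the filesystem reports the option literally as "barrier=1" in its option string (as in /proc/mounts); verify that on your own kernel before relying on it.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the comma-separated option string `opts` contains
 * exactly the option `want` (for example "barrier=1").
 * Hypothetical helper for illustration only.
 */
bool has_mount_option(const char *opts, const char *want)
{
    size_t want_len = strlen(want);
    const char *p = opts;

    while (*p != '\0') {
        /* Find the end of the current option token. */
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        /* Compare whole tokens so "barrier=10" does not match. */
        if (len == want_len && strncmp(p, want, len) == 0)
            return true;

        if (end == NULL)
            break;
        p = end + 1;
    }
    return false;
}
```

On a real system the option string could come from the mnt_opts field returned by getmntent() when reading /proc/mounts. This only covers the mount option; the drive's write cache setting still has to be checked separately (with hdparm, as mentioned above).
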
On 07/13/2010 08:47 AM, Dan Kennedy wrote:
> Hi,
>
> Should sqlite users who are paranoid about losing data
> when hard resets occur be setting the barrier=1 mount
> option with ext3?

barriers should be enabled whenever you wish to ensure a consistent
filesystem post-powerloss, and you have write caches on your drives
which may reorder or lose data when power is lost.

Whether your resets drop power to drive caches, I dunno.

> The situation is that we think SQLite has written data
> to a series of 4K blocks in a file and then called
> fsync() on the file descriptor. After this a hard reset
> occurs. Upon recovery it seems like one of the 4K blocks
> has been zeroed. The others are all fine.

See ext3_sync_file:

        /*
         * In case we didn't commit a transaction, we have to flush
         * disk caches manually so that data really is on persistent
         * storage
         */
        if (needs_barrier)
                blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL,
                                BLKDEV_IFL_WAIT);

so w/o barriers you are not flushing the drive cache and that data
will be lost.

> Happens every now and again under stress testing.
>
> System is using data=journaled, but not barrier=1.
>
> Should users also be setting barrier=1 for extra robustness
> in the face of hard resets?

s/extra// - but yes.

-Eric

> Thanks,
> Dan.
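
For reference, the userspace side of the pattern Dan describes (write a series of 4K blocks, then call fsync() on the descriptor) might look roughly like the sketch below. The function name write_blocks_and_sync and the fill pattern are made up for illustration; this is not SQLite's actual code. Note that a successful fsync() here only means the kernel handed the data to the device; per the ext3_sync_file excerpt above, without barriers the drive's volatile write cache is not flushed.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_BLOCK_SIZE 4096

/*
 * Write `nblocks` 4K blocks to `path`, then fsync() the descriptor.
 * Returns 0 on success, -1 on any error.
 * Hypothetical helper for illustration; EINTR handling is omitted.
 */
int write_blocks_and_sync(const char *path, int nblocks)
{
    char block[DEMO_BLOCK_SIZE];
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    for (int i = 0; i < nblocks; i++) {
        size_t done = 0;

        /* Fill each block with a recognisable, non-zero pattern. */
        memset(block, 'A' + (i % 26), sizeof block);

        /* write() may be short, so loop until the block is out. */
        while (done < sizeof block) {
            ssize_t n = write(fd, block + done, sizeof block - done);
            if (n <= 0) {
                close(fd);
                return -1;
            }
            done += (size_t)n;
        }
    }

    /* Ask the kernel to push the file's data to the storage device. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Filling blocks with a non-zero pattern makes a zeroed block easy to spot after a reset, which matches the symptom reported in this thread.
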