Something else that you might want to do is count the number of journal commits that are taking place, via a command like this:

	perf stat -e jbd2:jbd2_start_commit -a sleep 3600

This will count the number of jbd2 commits that are executed in 3600 seconds --- i.e., an hour.

If you are running some workload which is constantly calling fsync(2), that will be forcing journal commits, and those turn into cache flush commands that force all state to stable storage. Now, if you are using CF cards that aren't guaranteed to have power-loss protection (hint: even most consumer-grade SSDs do not have power-loss protection --- you have to pay $$$ for enterprise-grade SLC SSDs to have power-loss protection --- and I'm guessing most CF cards are so cheap that they won't make guarantees that all of their flash metadata are saved to stable store on a power loss event), the fact that you are constantly using fsync(2) may not be providing you with the protection you want after a power loss event.

Which might not be a problem if you have a handset with a non-removable eMMC device and a non-removable battery that can't fly out when you drop the phone, but for devices which can easily have an unplanned power failure, it may very well be the case that you're going to be badly burned across a power fail event anyway.

So the next question I would ask you is whether you care about unplanned power failures. If so, you probably want to test your CF cards to make sure they actually will do the right thing across a power failure --- and if they don't, you may need to replace your CF card provider.

If you don't care (because you don't have a removable battery, and the CF card is permanently sealed inside your device, for example), then you might want to consider disabling barriers so you're no longer forcing synchronous cache flush commands to be sent to your CF card. This trades off power failure safety versus increased performance and decreased card wear --- but if you don't need power failure safety, then it might be a good tradeoff.

And if you *do* need power fail protection, then it's a good thing to test whether your hardware will actually provide it, so you don't find out the hard way that you're paying the cost of decreased performance and increased card wear, but you didn't get power fail protection *anyway* because of hardware limitations.

Cheers,

					- Ted
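To get a rough feel for what a workload that calls fsync(2) after every write costs on a given card, something like the following Python sketch may help. It is only an illustration, not part of the original message: the function name `time_fsync_writes`, the record size, and the record count are all made up for this example. It appends small records to a file, calls fsync after each one, and reports how many fsync calls were issued and how long they took.

```python
import os
import time


def time_fsync_writes(path, count=100, payload=b"x" * 4096):
    """Append `count` records to `path`, calling fsync after each one.

    Returns (number_of_fsync_calls, elapsed_seconds).
    The name and defaults are hypothetical, chosen for illustration.
    """
    fsyncs = 0
    start = time.monotonic()
    with open(path, "ab") as f:
        for _ in range(count):
            f.write(payload)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to push it to the device
            fsyncs += 1
    return fsyncs, time.monotonic() - start
```

Calling `time_fsync_writes("/mnt/cf/fsync-test.dat")` (the path is a placeholder) while the perf command above is running should show the jbd2 commit count rising along with the number of fsync calls, assuming the file lives on the ext3/ext4 filesystem being measured.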
Hello.

On 11/10/14 21:19, Theodore Ts'o wrote:
> If you are running some workload which is constantly calling fsync(2),
> that will be forcing journal commits, and those turn into cache flush
> commands that force all state to stable storage. Now, if you are
> using CF cards that aren't guaranteed to have power-loss protection
> (hint: even most consumer grade SSD's do not have power loss
> protection --- you have to pay $$$ for enterprise-grade SLC SSD's to
> have power loss protection --- and I'm guessing most CF cards are so
> cheap that they won't make guarantees that all of their flash metadata
> are saved to stable store on a power loss event) the fact that you are
> constantly using fsync(2) may not be providing you with the protection
> you want after a power loss event.

This got me worried!

How can we test if a device really stores all the data safely after a barrier and sudden power loss? Is there a tool for that?

I am thinking something along the lines of a tool that does writes with some barriers in between and then I unplug the device and run the same tool but in a "check mode" that tells me if the requested data before the barrier is really there. Something sysadmin friendly or maybe even user friendly, but not too hard to use.

Thanks for your insight!

-- 
Ivan Baldo - ibaldo@adinet.com.uy - http://ibaldo.codigolibre.net/
From Montevideo, Uruguay, at the south of South America.
Freelance programmer and GNU/Linux system administrator, hire me!
Alternatives: ibaldo@codigolibre.net - http://go.to/ibaldo
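The write-then-check idea described above could be sketched roughly as follows. This is a hypothetical illustration, not an existing tool: the names `write_records`, `check_records` and `make_payload`, the 512-byte record layout, and the use of fsync as the "barrier" are all assumptions made for the example. The writer reports each sequence number only after fsync has returned; those acknowledgements have to be recorded somewhere that survives the power cut (another machine over ssh or a serial console, for instance), since anything stored on the device under test may be lost too.

```python
import os
import struct
import zlib

HEADER = struct.Struct(">QI")  # sequence number, CRC32 of the payload
PAYLOAD_SIZE = 500             # arbitrary; gives 512-byte records


def make_payload(seq):
    """Deterministic payload, so the checker can regenerate it from seq."""
    return (b"%020d" % seq) * (PAYLOAD_SIZE // 20)


def _print_ack(seq):
    print(seq, flush=True)     # unbuffered, so a remote log sees it at once


def write_records(path, count, on_ack=_print_ack):
    """Write `count` records, fsync after each, then report it via on_ack."""
    with open(path, "wb") as f:
        # Also sync the directory so the new file's entry is durable (Linux).
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
        for seq in range(count):
            payload = make_payload(seq)
            f.write(HEADER.pack(seq, zlib.crc32(payload)) + payload)
            f.flush()
            os.fsync(f.fileno())  # only after this returns is seq "acked"
            on_ack(seq)


def check_records(path, last_acked):
    """Count intact records from the start of the file.

    Returns (ok, valid_count); ok is True when every record up to and
    including `last_acked` is present and uncorrupted.
    """
    record_size = HEADER.size + PAYLOAD_SIZE
    valid = 0
    try:
        with open(path, "rb") as f:
            while True:
                raw = f.read(record_size)
                if len(raw) < record_size:
                    break
                seq, crc = HEADER.unpack_from(raw)
                payload = raw[HEADER.size:]
                if seq != valid or zlib.crc32(payload) != crc \
                        or payload != make_payload(seq):
                    break
                valid += 1
    except FileNotFoundError:
        pass                   # file vanished entirely: zero valid records
    return valid > last_acked, valid
```

A run would look like this: start `write_records("/mnt/cf/powertest.dat", 100000)` with its output logged on another machine, pull the power while it is writing, boot again, and call `check_records("/mnt/cf/powertest.dat", last_acked)` with the last number that reached the log. The path is a placeholder. A failed check suggests that acknowledged data was lost; a passed check on one run proves little, so the cycle would need repeating many times.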
dunno about any special tools, but misusing a MySQL database could be a good check for this.

unplug/reset your device while inserts into the db are ongoing (don't forget to use InnoDB for the tables), then boot it up again and take a look into the MySQL log. there's a good chance that InnoDB gets wrecked...

sure, this is not perfect, but it could be an impressive test if it ends like i think.

make sure your MySQL instance is configured to be "safe":

http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_flush_method
http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

and enable binlogs + sync binlogs

or in other words: make it as slow as possible :p

On Sun, Oct 12, 2014 at 4:07 PM, Ivan Baldo <ibaldo@adinet.com.uy> wrote:
> This got me worried!
> How can we test if a device really stores all the data safely after a
> barrier and sudden power loss?
> Is there a tool for that?
> I am thinking something along the lines of a tool that does writes with
> some barriers in between and then I unplug the device and run the same tool
> but in a "check mode" that tells me if the requested data before the barrier
> is really there.

-- 
Sent from the Delta quadrant using Borg technology!