Nguyen Xuan. Hai
2015-Dec-03 08:19 UTC
[Ocfs2-users] Auto reboot when running fio benchmarking
Hi all,

I'm benchmarking an OCFS2 file system on LVM using the fio tool. In some scenarios, the computer reboots automatically after fio has been running for a few minutes. This happens only with the OCFS2 file system (there is no problem with ext3).

We tried to work around it by adding the "--debug" option to the fio command (example: fio RandWR-ASync-IOdepth1-FixFileSize --output=RandWR-ASync-IOdepth1-FixFileSize.out --debug=io). With this option, some scenarios run successfully without rebooting, but others still fail.

We also tried upgrading the Linux kernel from 3.4.34 to 3.10.65 (after referring to https://oss.oracle.com/pipermail/ocfs2-users/2014-February/006130.html). Again, some scenarios run successfully without rebooting, but others still fail.

This is the content of the file "RandWR-ASync-IOdepth1-FixFileSize":

[global]
directory=/mnt/fio4G
filename=fio_data
invalidate=1
ioengine=libaio
direct=1
;ramp_time=30
iodepth=1

[RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=512
size=4g
numjobs=1
group_reporting

[RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=4k
size=4g
numjobs=1
group_reporting

[RandWR-64k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=64k
size=4g
numjobs=1
group_reporting

[RandWR-1m-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=1m
size=4g
numjobs=1
group_reporting

[RandWR-512-ASync-Depth1-Thread4-Grp-1G-Fix]
new_group
rw=randwrite
bs=512
size=1g
numjobs=4
group_reporting

[RandWR-4k-ASync-Depth1-Thread4-Grp-1G-Fix]
new_group
rw=randwrite
bs=4k
size=1g
numjobs=4
group_reporting

[RandWR-64k-ASync-Depth1-Thread4-Grp-1G-Fix]
new_group
rw=randwrite
bs=64k
size=1g
numjobs=4
group_reporting

[RandWR-1m-ASync-Depth1-Thread4-Grp-1G-Fix]
new_group
rw=randwrite
bs=1m
size=1g
numjobs=4
group_reporting

Could you help me find out the reason?
Thanks and Best regards,

--
Nguyen Xuan Hai (Mr)
Toshiba Software Development (Vietnam) Co.,Ltd
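Editor's note: the job file in the message above is a cross product of four block sizes and two thread configurations (1 job writing 4g, or 4 jobs writing 1g each). The following is a minimal Python sketch that regenerates the same sections, which may help when narrowing down which scenario triggers the reboot. The names GLOBAL_OPTIONS, THREAD_CONFIGS, BLOCK_SIZES and build_job_file are hypothetical helpers chosen for illustration; they are not part of fio or of the original post.

```python
from itertools import product

# Global options copied from the job file quoted in the message.
GLOBAL_OPTIONS = [
    "directory=/mnt/fio4G",
    "filename=fio_data",
    "invalidate=1",
    "ioengine=libaio",
    "direct=1",
    ";ramp_time=30",  # commented out in the original job file
    "iodepth=1",
]

# (numjobs, per-job size, grouping tag) as they appear in the original job names.
THREAD_CONFIGS = [(1, "4g", "NoGrp"), (4, "1g", "Grp")]
BLOCK_SIZES = ["512", "4k", "64k", "1m"]


def build_job_file():
    """Return the job file text: one [global] section plus eight job sections."""
    lines = ["[global]"] + GLOBAL_OPTIONS + [""]
    for (numjobs, size, grp), bs in product(THREAD_CONFIGS, BLOCK_SIZES):
        name = f"RandWR-{bs}-ASync-Depth1-Thread{numjobs}-{grp}-{size.upper()}-Fix"
        lines += [
            f"[{name}]",
            "new_group",
            "rw=randwrite",
            f"bs={bs}",
            f"size={size}",
            f"numjobs={numjobs}",
            "group_reporting",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_job_file())
```

Trimming BLOCK_SIZES or THREAD_CONFIGS to a single entry yields a job file with one scenario per run, so a failing combination can be isolated.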
Hi,

On Thu, Dec 03, 2015 at 03:19:52PM +0700, Nguyen Xuan. Hai wrote:
> Hi all,
>
> I'm performing benchmarking on OCFS2 file system on LVM using fio tool.
> There are some scenarios that when we run it, after few minutes,
> computer will reboot automatically. These scenarios are related to
> OCFS2 file system only (there is no problem with ext3).
>
> We tried to fix by adding option "--debug" in fio command
> (example: fio RandWR-ASync-IOdepth1-FixFileSize
> --output=RandWR-ASync-IOdepth1-FixFileSize.out --debug=io).
> Some scenarios can run successfully without rebooting. But there are
> still some scenarios cannot run successfully.

Sorry for the late reply. Could you provide more information, such as:

1. Which cluster stack are you using, o2cb or pcmk? With pcmk, an ocfs2 RA monitor timeout will trigger fencing, i.e. a reboot. I have not experienced reboots when using o2cb, and I am wondering whether o2cb has a similar fencing mechanism. A kernel panic may also cause a reboot in some cases.
2. Which scenarios led to the reboot?
3. All logs: kernel logs, plus pacemaker logs if you use pcmk.

Thanks,
Eric
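Editor's note: two of the questions in the reply above (which cluster stack is active, and whether a kernel panic would lead to an automatic reboot) can often be answered from the running system. The sketch below is a minimal Python example under two assumptions: that the ocfs2 module exposes the active stack in /sys/fs/ocfs2/cluster_stack, and that the kernel.panic sysctl is readable at /proc/sys/kernel/panic. The function names are hypothetical, and the script does not replace collecting the kernel and pacemaker logs requested in the reply.

```python
from pathlib import Path


def read_text_or_none(path):
    """Return the stripped contents of a small text file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_reboot_hints(
    cluster_stack_path="/sys/fs/ocfs2/cluster_stack",  # assumed location
    panic_path="/proc/sys/kernel/panic",  # kernel.panic sysctl
):
    """Gather the cluster stack name and the panic-reboot timeout, if available."""
    stack = read_text_or_none(cluster_stack_path)
    panic = read_text_or_none(panic_path)
    # kernel.panic is an integer: 0 means no automatic reboot after a panic,
    # a positive value is the delay in seconds, a negative value reboots at once.
    is_int = panic is not None and panic.lstrip("-").isdigit()
    return {
        "cluster_stack": stack,
        "panic_reboot_seconds": int(panic) if is_int else None,
    }


if __name__ == "__main__":
    for key, value in collect_reboot_hints().items():
        print(f"{key}: {value}")
```

A value of None means the file could not be read, for example because the ocfs2 module is not loaded on that node.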