Hi,

On Thu, Dec 03, 2015 at 03:19:52PM +0700, Nguyen Xuan. Hai wrote:
> Hi all,
>
> I'm performing benchmarking on OCFS2 file system on LVM using fio tool.
> There are some scenarios that when we run it, after few minutes,
> computer will reboot automatically. These scenarios are related to
> OCFS2 file system only (there is no problem with ext3).
>
> We tried to fix by adding option "--debug" in fio command
> (example: fio RandWR-ASync-IOdepth1-FixFileSize
> --output=RandWR-ASync-IOdepth1-FixFileSize.out --debug=io).
> Some scenarios can run successfully without rebooting. But there are
> still some scenarios cannot run successfully.

Sorry for the late reply. Could you provide more information? Such as:

1. Which cluster stack were you using, o2cb or pcmk? If pcmk, an ocfs2 RA monitor timeout
   will trigger fencing, i.e. a reboot. I have not experienced rebooting when using o2cb, and am
   wondering if o2cb has a similar fencing mechanism. A kernel panic may also cause a
   reboot sometimes.
2. What scenarios led to the reboot?
3. All logs: kernel logs, and pacemaker logs if pcmk.

Thanks,
Eric

>
> We tried to upgrade Linux kernel from 3.4.34 to 3.10.65 (after
> referred to link: https://oss.oracle.com/pipermail/ocfs2-users/2014-February/006130.html).
> Some scenarios can run successfully without rebooting. But there are
> still some scenarios cannot run successfully.
>
> This is content of file "RandWR-ASync-IOdepth1-FixFileSize":
> [global]
> directory=/mnt/fio4G
> filename=fio_data
> invalidate=1
> ioengine=libaio
> direct=1
> ;ramp_time=30
> iodepth=1
>
> [RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
> new_group
> rw=randwrite
> bs=512
> size=4g
> numjobs=1
> group_reporting
>
> [RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
> new_group
> rw=randwrite
> bs=4k
> size=4g
> numjobs=1
> group_reporting
>
> [RandWR-64k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
> new_group
> rw=randwrite
> bs=64k
> size=4g
> numjobs=1
> group_reporting
>
> [RandWR-1m-ASync-Depth1-Thread1-NoGrp-4G-Fix]
> new_group
> rw=randwrite
> bs=1m
> size=4g
> numjobs=1
> group_reporting
>
> [RandWR-512-ASync-Depth1-Thread4-Grp-1G-Fix]
> new_group
> rw=randwrite
> bs=512
> size=1g
> numjobs=4
> group_reporting
>
> [RandWR-4k-ASync-Depth1-Thread4-Grp-1G-Fix]
> new_group
> rw=randwrite
> bs=4k
> size=1g
> numjobs=4
> group_reporting
>
> [RandWR-64k-ASync-Depth1-Thread4-Grp-1G-Fix]
> new_group
> rw=randwrite
> bs=64k
> size=1g
> numjobs=4
> group_reporting
>
> [RandWR-1m-ASync-Depth1-Thread4-Grp-1G-Fix]
> new_group
> rw=randwrite
> bs=1m
> size=1g
> numjobs=4
> group_reporting
>
> Could you help me find out the reason?
>
> Thanks and Best regards,
>
> --
> ====================================================================
> Nguyen Xuan Hai (Mr)
>
> Toshiba Software Development (Vietnam) Co.,Ltd
>
> ====================================================================
>
> --
> This mail was scanned by BitDefender
> For more information please visit http://www.bitdefender.com
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-users
Nguyen Xuan. Hai
2015-Dec-15 09:20 UTC
[Ocfs2-users] Auto reboot when running fio benchmarking
Hi Eric,

1. I am using the o2cb cluster stack.
2. The scenarios that led to the reboot: random writes with a fixed file size.
   This is an example of these scenarios:

[global]
directory=/mnt/fio4G
filename=fio_data
invalidate=1
ioengine=libaio
direct=1
;ramp_time=30
iodepth=1

[RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=512
size=4g
numjobs=1
group_reporting

[RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=4k
size=4g
numjobs=1
group_reporting

[RandWR-64k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=64k
size=4g
numjobs=1
group_reporting

[RandWR-1m-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=1m
size=4g
numjobs=1
group_reporting

3. I've attached the log files (kernel log, system log, message log).
   Please take a look.

Thank you so much,
Hai Nguyen

On 12/15/2015 4:09 PM, Eric Ren wrote:
> Hi,
>
> On Thu, Dec 03, 2015 at 03:19:52PM +0700, Nguyen Xuan. Hai wrote:
>> Hi all,
>>
>> I'm performing benchmarking on OCFS2 file system on LVM using fio tool.
>> There are some scenarios that when we run it, after few minutes,
>> computer will reboot automatically. These scenarios are related to
>> OCFS2 file system only (there is no problem with ext3).
>>
>> We tried to fix by adding option "--debug" in fio command
>> (example: fio RandWR-ASync-IOdepth1-FixFileSize
>> --output=RandWR-ASync-IOdepth1-FixFileSize.out --debug=io).
>> Some scenarios can run successfully without rebooting. But there are
>> still some scenarios cannot run successfully.
> Sorry for late reply. Could you provide more information? such as
> 1. which cluster stack were you using, o2cb or pcmk? if pcmk, ocfs2 RA monitor timeout
> will trigger fencing - reboot. I have not experienced rebooting when using o2cb, and am
> wondering if o2cb has similar fencing mechanism. Maybe, kernel panic also incurs
> rebooting sometimes.
> 2. What scenarios led to reboot?
> 3. all logs: kernel logs, pacemaker logs if pcmk.
--
====================================================================
Nguyen Xuan Hai (Mr)

Toshiba Software Development (Vietnam) Co.,Ltd

====================================================================

-------------- next part --------------
A non-text attachment was scrubbed...
Name: log_benchmarking.rar
Type: application/octet-stream
Size: 146677 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-users/attachments/20151215/69cfcf86/attachment-0001.obj
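
One way to narrow down question 2 (which scenarios lead to the reboot) might be to run each job section of the job file on its own, so that the last section started before a reboot is easy to identify. The sketch below is an illustration only and is not taken from the thread. It assumes fio's --section and --output options, assumes the job file can be read as INI-style text, and uses hypothetical helper names (job_sections, per_section_commands). It only builds the command lines and does not run fio.

```python
import configparser
import shlex


def job_sections(job_file_text):
    """Return the job section names of a fio job file in order, skipping [global]."""
    parser = configparser.ConfigParser(
        allow_no_value=True,          # flags such as new_group carry no value
        comment_prefixes=(";", "#"),  # ";ramp_time=30" is a commented-out option
        interpolation=None,           # treat values as plain text
    )
    parser.optionxform = str          # keep option names exactly as written
    parser.read_string(job_file_text)
    return [name for name in parser.sections() if name != "global"]


def per_section_commands(job_file, job_file_text):
    """Build one fio command line per job section (assumed options: --section, --output)."""
    commands = []
    for name in job_sections(job_file_text):
        commands.append(
            ["fio", "--section=" + name, "--output=" + name + ".out", job_file]
        )
    return commands


if __name__ == "__main__":
    # Shortened copy of the job file from the thread, for illustration.
    JOB_TEXT = """\
[global]
directory=/mnt/fio4G
filename=fio_data
invalidate=1
ioengine=libaio
direct=1
;ramp_time=30
iodepth=1

[RandWR-512-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=512
size=4g
numjobs=1
group_reporting

[RandWR-4k-ASync-Depth1-Thread1-NoGrp-4G-Fix]
new_group
rw=randwrite
bs=4k
size=4g
numjobs=1
group_reporting
"""
    for cmd in per_section_commands("RandWR-ASync-IOdepth1-FixFileSize", JOB_TEXT):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Running the printed commands one at a time, and noting which one was active when the machine went down, would give a per-scenario answer that can be matched against the kernel log timestamps.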