Stefan Priebe - Profihost AG
2012-Nov-14 13:42 UTC
problem with ceph and btrfs patch: set journal_info in async trans commit worker
Hello list,

I wanted to try out ceph with the latest vanilla kernel 3.7-rc5. I am seeing a massive performance degradation. I see around 22 btrfs-endio-write processes every 10-20 seconds, and they run for a long time while consuming a massive amount of CPU.

So my performance drops from a steady 23,000 IOPS to swinging between 23,000 IOPS and 0; the average is now 2,500 IOPS instead of 23,000.

Git bisect shows me commit e209db7ace281ca347b1ac699bf1fb222eac03fe "Btrfs: set journal_info in async trans commit worker" as the problematic patch.

When I revert this one, everything is fine again.

Is this known?

Greets,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
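For readers unfamiliar with how a bisect narrows a regression down to one commit, here is a minimal sketch of the underlying binary search. It is not git's implementation: the commit names, the `regressed_from` index and the `is_bad` predicate are hypothetical placeholders, and it assumes a linear history in which every commit after the first bad one is also bad. Note that a search like this only finds the commit where the symptom first appears, which is not necessarily the root cause.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history: placeholder names, not real kernel commits.
history = [f"c{i}" for i in range(16)]
regressed_from = 11  # hypothetical index where the slowdown first appears

culprit = first_bad_commit(history, lambda c: int(c[1:]) >= regressed_from)
print(culprit)  # c11
```

In practice the predicate would be a benchmark run on a kernel built from each candidate commit, which is why each bisect step is expensive and halving the range matters.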
Miao Xie
2012-Nov-15 05:18 UTC
Re: problem with ceph and btrfs patch: set journal_info in async trans commit worker
Hi, Stefan

On Wed, 14 Nov 2012 14:42:07 +0100, Stefan Priebe - Profihost AG wrote:
> Hello list,
>
> I wanted to try out ceph with the latest vanilla kernel 3.7-rc5. I am
> seeing a massive performance degradation. I see around 22
> btrfs-endio-write processes every 10-20 seconds, and they run for a long
> time while consuming a massive amount of CPU.
>
> So my performance drops from a steady 23,000 IOPS to swinging between
> 23,000 IOPS and 0; the average is now 2,500 IOPS instead of 23,000.
>
> Git bisect shows me commit e209db7ace281ca347b1ac699bf1fb222eac03fe
> "Btrfs: set journal_info in async trans commit worker" as the
> problematic patch.
>
> When I revert this one, everything is fine again.
>
> Is this known?

Could you try the following patch?

http://marc.info/?l=linux-btrfs&m=135175512030453&w=2

I think the patch

Btrfs: set journal_info in async trans commit worker

is not the real cause of the regression.

I guess it is caused by a bug in the reservation handling. When we join the
same transaction handle more than 2 times, the pointer to the reservation
in the transaction handle is lost, and the statistical data in the
reservation gets corrupted. That then triggers the space flush,
which may block your tasks.

Thanks
Miao

>
> Greets,
> Stefan

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
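The failure mode Miao describes can be illustrated with a toy model. This is not the kernel code: the class, the field names (`use_count`, `block_rsv`, `orig_rsv`) and the single save slot are assumptions made purely for illustration. The sketch only shows how a handle that saves its reservation pointer in one slot on every nested join can restore it after one nested join, but loses it once the same handle is joined a second time.

```python
class Handle:
    """Toy transaction handle; all names here are illustrative assumptions."""

    def __init__(self, reservation):
        self.use_count = 1            # the initial start counts as one use
        self.block_rsv = reservation  # reservation currently in use
        self.orig_rsv = None          # the only slot for a saved pointer


def join(handle):
    """Nested join: stash the current reservation in the single slot."""
    handle.use_count += 1
    handle.orig_rsv = handle.block_rsv  # overwrites any earlier saved pointer
    handle.block_rsv = None
    return handle


def end(handle):
    """End one nesting level and restore from the single slot."""
    handle.use_count -= 1
    if handle.use_count > 0:
        handle.block_rsv = handle.orig_rsv
    return handle


def nested(depth):
    """Start a handle, join it `depth` more times, then unwind every join."""
    h = Handle("rsv")
    for _ in range(depth):
        join(h)
    for _ in range(depth):
        end(h)
    return h.block_rsv


print(nested(1))  # rsv  (two uses of the handle: pointer restored)
print(nested(2))  # None (three uses of the handle: pointer lost)
```

With the pointer gone, any accounting that should have been applied to the original reservation has nowhere correct to go, which fits the corrupted statistics Miao suspects.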
Stefan Priebe - Profihost AG
2012-Nov-15 08:50 UTC
Re: problem with ceph and btrfs patch: set journal_info in async trans commit worker
Hi Miao,

On 15.11.2012 06:18, Miao Xie wrote:
> Hi, Stefan
>
> On Wed, 14 Nov 2012 14:42:07 +0100, Stefan Priebe - Profihost AG wrote:
>> Hello list,
>>
>> I wanted to try out ceph with the latest vanilla kernel 3.7-rc5. I am
>> seeing a massive performance degradation. I see around 22
>> btrfs-endio-write processes every 10-20 seconds, and they run for a long
>> time while consuming a massive amount of CPU.
>>
>> So my performance drops from a steady 23,000 IOPS to swinging between
>> 23,000 IOPS and 0; the average is now 2,500 IOPS instead of 23,000.
>>
>> Git bisect shows me commit e209db7ace281ca347b1ac699bf1fb222eac03fe
>> "Btrfs: set journal_info in async trans commit worker" as the
>> problematic patch.
>>
>> When I revert this one, everything is fine again.
>>
>> Is this known?
>
> Could you try the following patch?
>
> http://marc.info/?l=linux-btrfs&m=135175512030453&w=2
>
> I think the patch
>
> Btrfs: set journal_info in async trans commit worker
>
> is not the real cause of the regression.
>
> I guess it is caused by a bug in the reservation handling. When we join the
> same transaction handle more than 2 times, the pointer to the reservation
> in the transaction handle is lost, and the statistical data in the
> reservation gets corrupted. That then triggers the space flush,
> which may block your tasks.

I applied your whole patch set. It looks a lot better now, but the average is now 5,000 IOPS and not 23,000 as it is when I remove the mentioned commit (e209db7ace281ca347b1ac699bf1fb222eac03fe).

Stefan
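To put the three reported averages side by side, the short sketch below expresses each one as a share of the 23,000 IOPS baseline. The figures are the ones quoted in this thread; the labels and the `relative` helper are only illustrative.

```python
BASELINE_IOPS = 23_000  # average reported with the commit reverted


def relative(iops, baseline=BASELINE_IOPS):
    """Return iops as a fraction of the baseline."""
    return iops / baseline


# Averages as reported in the thread; the labels are illustrative.
reported = {
    "3.7-rc5 unmodified": 2_500,
    "3.7-rc5 with Miao's patch set": 5_000,
    "3.7-rc5 with the commit reverted": 23_000,
}

for label, iops in reported.items():
    print(f"{label}: {iops} IOPS = {relative(iops):.1%} of baseline")
# 3.7-rc5 unmodified: 2500 IOPS = 10.9% of baseline
# 3.7-rc5 with Miao's patch set: 5000 IOPS = 21.7% of baseline
# 3.7-rc5 with the commit reverted: 23000 IOPS = 100.0% of baseline
```

So the patch set roughly doubles the average compared with the unmodified kernel, yet still leaves it at a little over a fifth of what the revert achieves.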