Hi,

on one system I have had a "frozen transaction" for more than 24 hours,
without any IO.
I can't umount the partition, delete a snapshot or write anything.
I tried rebooting the system, but the problem is still present.

Here is the "frozen transaction":

$ ps auxw | grep btrfs | grep D
root      1835  0.0  0.0      0     0 ?        D    Oct08   0:13 [btrfs-cleaner]
root      1836  0.0  0.0      0     0 ?        D    Oct08   0:01 [btrfs-transacti]
root      2633  0.0  0.0      0     0 ?        D    Oct08   0:00 [flush-btrfs-1]

The partition is mounted with these options:

# mount | grep btrfs
/dev/mapper/vg--sofia-backup on /backup type btrfs (rw,noatime,compress-force=zlib,nossd)

The disk is nearly full:

# btrfs fi df /backup/
Data: total=482.68GB, used=480.89GB
System, DUP: total=32.00MB, used=72.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=10.12GB, used=8.82GB

But one of the last actions was the removal of some big subvolumes
(nearly 50GB).

There is no error in the logs. The frozen transaction started under a
3.5* kernel (from GIT), and the system is now running a 3.6.1 kernel
(vanilla).

Is there something I can do to solve this problem?

Thanks,
Olivier
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, Oct 09, 2012 at 09:37:48AM +0200, Olivier Bonvalet wrote:
> on one system I have had a "frozen transaction" for more than 24 hours,
> without any IO.
> I can't umount the partition, delete a snapshot or write anything.
> I tried rebooting the system, but the problem is still present.

The processes could point at the cleaner deadlock, though I'm not
completely sure without looking at the process stacks (/proc/PID/stack).

If the problem persists across reboots, how long after mount does it
take to get to this state? The cleaner usually kicks in after the 30
second transaction commit period, so it should be easy to verify
whether it's immediate or requires some load to get into the dead
state.

> The partition is mounted with these options:
> # mount | grep btrfs
> /dev/mapper/vg--sofia-backup on /backup type btrfs
> (rw,noatime,compress-force=zlib,nossd)

So you don't mount with autodefrag, hmm. The deadlock I had in mind
is more likely with autodefrag but also requires umount.

> The disk is nearly full:
> # btrfs fi df /backup/
> Data: total=482.68GB, used=480.89GB

Quite full.

> System, DUP: total=32.00MB, used=72.00KB
> System: total=4.00MB, used=0.00
> Metadata, DUP: total=10.12GB, used=8.82GB
>
> But one of the last actions was the removal of some big subvolumes
> (nearly 50GB).

Given the amount of free space left, this creates high pressure on data
writes and makes the deadlock more likely.

> There is no error in the logs. The frozen transaction started under a
> 3.5* kernel (from GIT), and the system is now running a 3.6.1 kernel
> (vanilla).
>
> Is there something I can do to solve this problem?

No, there's a patch sent out to fix the deadlocks but it's
unfortunately still unmerged.

david

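The stack check suggested above could look roughly like the sketch below.
It is only an illustration: the function names filter_d_state and
dump_blocked_stacks are made up for this example, it has to run as root,
and it assumes the kernel exposes /proc/PID/stack at all.

```shell
#!/bin/sh
# List tasks in uninterruptible sleep (state D) and print their kernel
# stacks. Hypothetical helper names; needs root to read /proc/PID/stack.

# filter_d_state: read "PID STAT COMM" lines on stdin and print
# "PID COMM" for every task whose state starts with D.
# Only the first word of the command name is kept.
filter_d_state() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}

# dump_blocked_stacks: print the kernel stack of each D-state task,
# or a note when the stack file is missing or unreadable.
dump_blocked_stacks() {
    ps -eo pid=,stat=,comm= | filter_d_state | while read -r pid comm; do
        echo "=== $pid $comm ==="
        if [ -r "/proc/$pid/stack" ]; then
            cat "/proc/$pid/stack"
        else
            echo "(no readable /proc/$pid/stack)"
        fi
    done
}

# Only do real work when asked to, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    dump_blocked_stacks
fi
```

Saved to a file and run with --run shortly after mounting, this would
show whether the cleaner and transaction threads are already blocked
and where they are waiting.
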
Thanks for your reply.

On 09/10/2012 11:52, David Sterba wrote:
> On Tue, Oct 09, 2012 at 09:37:48AM +0200, Olivier Bonvalet wrote:
>> on one system I have had a "frozen transaction" for more than 24 hours,
>> without any IO.
>> I can't umount the partition, delete a snapshot or write anything.
>> I tried rebooting the system, but the problem is still present.
>
> The processes could point at the cleaner deadlock, though I'm not
> completely sure without looking at the process stacks (/proc/PID/stack).

I didn't see any "stack" entry in /proc/$PID/; I will try to find which
kernel option exports it.

> If the problem persists across reboots, how long after mount does it
> take to get to this state? The cleaner usually kicks in after the 30
> second transaction commit period, so it should be easy to verify
> whether it's immediate or requires some load to get into the dead
> state.

The cleaner process enters state D between 30 and 60 seconds after the
reboot. But that cleaner process should not generate a lot of write
accesses, should it?

This time I tried to remount with the space cache enabled, and there is
a lot of read access now. Will the space cache help to find "free
locations"?

>> The partition is mounted with these options:
>> # mount | grep btrfs
>> /dev/mapper/vg--sofia-backup on /backup type btrfs
>> (rw,noatime,compress-force=zlib,nossd)
>
> So you don't mount with autodefrag, hmm. The deadlock I had in mind
> is more likely with autodefrag but also requires umount.
>
>> The disk is nearly full:
>> # btrfs fi df /backup/
>> Data: total=482.68GB, used=480.89GB
>
> Quite full.

Yes, that's the problem.

>> System, DUP: total=32.00MB, used=72.00KB
>> System: total=4.00MB, used=0.00
>> Metadata, DUP: total=10.12GB, used=8.82GB
>>
>> But one of the last actions was the removal of some big subvolumes
>> (nearly 50GB).
>
> Given the amount of free space left, this creates high pressure on data
> writes and makes the deadlock more likely.
>
>> There is no error in the logs. The frozen transaction started under a
>> 3.5* kernel (from GIT), and the system is now running a 3.6.1 kernel
>> (vanilla).
>>
>> Is there something I can do to solve this problem?
>
> No, there's a patch sent out to fix the deadlocks but it's
> unfortunately still unmerged.

I suppose I can't resize the FS without first solving that cleanup
deadlock?

> david

On Tue, Oct 09, 2012 at 12:07:20PM +0200, Olivier Bonvalet wrote:
> I didn't see any "stack" entry in /proc/$PID/; I will try to find which
> kernel option exports it.

CONFIG_STACKTRACE

> > If the problem persists across reboots, how long after mount does it
> > take to get to this state? The cleaner usually kicks in after the 30
> > second transaction commit period, so it should be easy to verify
> > whether it's immediate or requires some load to get into the dead
> > state.
>
> The cleaner process enters state D between 30 and 60 seconds after the
> reboot. But that cleaner process should not generate a lot of write
> accesses, should it?

It needs to update the references, so it does both reads and writes.

> This time I tried to remount with the space cache enabled, and there is
> a lot of read access now. Will the space cache help to find "free
> locations"?

Yes.

As for the reads, the free space cache needs to fill its in-memory
structures; if the disk is almost full, there is also quite a lot of
data to read before that completes. But reads are not the problem.

> I suppose I can't resize the FS without first solving that cleanup
> deadlock?

Probably not, although if you're fast enough and add another device
before the cleaner starts, it could work :)

Other than that, these are the patches that should fix the deadlock:

https://patchwork.kernel.org/patch/1383951/
https://patchwork.kernel.org/patch/1383941/

(it touches vfs and needs recompiling the whole kernel, not just btrfs)

david

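Whether the running kernel was built with CONFIG_STACKTRACE can be
checked against its config file. The sketch below is only an
illustration: config_enabled and find_kernel_config are made-up names,
and the /boot/config-<release> location is a common distribution
convention that is assumed here, not guaranteed. The match is anchored
on purpose, so that similarly named options such as
CONFIG_STACKTRACE_SUPPORT do not count.

```shell
#!/bin/sh
# Check whether a kernel config file enables an option exactly (NAME=y).
# Hypothetical helper names; the config path is an assumption.

# config_enabled NAME FILE: succeed only if FILE has a line "NAME=y".
config_enabled() {
    grep -q "^$1=y\$" "$2"
}

# find_kernel_config: print the config path for the running kernel,
# assuming the /boot/config-<release> convention; fail if unreadable.
find_kernel_config() {
    f="/boot/config-$(uname -r)"
    [ -r "$f" ] && echo "$f"
}

# Only do real work when asked to, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    cfg=$(find_kernel_config) || { echo "no kernel config found" >&2; exit 1; }
    if config_enabled CONFIG_STACKTRACE "$cfg"; then
        echo "CONFIG_STACKTRACE=y: /proc/PID/stack should be available"
    else
        echo "CONFIG_STACKTRACE is not set in $cfg"
    fi
fi
```

If the option is not set, another way to get at the same information is
"echo w > /proc/sysrq-trigger" as root (with magic SysRq enabled), which
asks the kernel to log the stacks of blocked tasks; they can then be
read with dmesg.
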
On 09/10/2012 14:32, David Sterba wrote:
> On Tue, Oct 09, 2012 at 12:07:20PM +0200, Olivier Bonvalet wrote:
>> I didn't see any "stack" entry in /proc/$PID/; I will try to find which
>> kernel option exports it.
>
> CONFIG_STACKTRACE

CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_CC_STACKPROTECTOR=y
# CONFIG_DEBUG_STACK_USAGE is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
# CONFIG_DEBUG_STACKOVERFLOW is not set

I suppose it's CONFIG_DEBUG_STACK_USAGE?

>>> If the problem persists across reboots, how long after mount does it
>>> take to get to this state? The cleaner usually kicks in after the 30
>>> second transaction commit period, so it should be easy to verify
>>> whether it's immediate or requires some load to get into the dead
>>> state.
>>
>> The cleaner process enters state D between 30 and 60 seconds after the
>> reboot. But that cleaner process should not generate a lot of write
>> accesses, should it?
>
> It needs to update the references, so it does both reads and writes.
>
>> This time I tried to remount with the space cache enabled, and there is
>> a lot of read access now. Will the space cache help to find "free
>> locations"?
>
> Yes.
>
> As for the reads, the free space cache needs to fill its in-memory
> structures; if the disk is almost full, there is also quite a lot of
> data to read before that completes. But reads are not the problem.

Well... I don't know if it is related to the space cache, but the
cleanup process is now working: it makes a lot of write requests, and I
now have 30GB of free space. So it will be solved soon.

Any chance that it is related to the space cache feature?

>> I suppose I can't resize the FS without first solving that cleanup
>> deadlock?
>
> Probably not, although if you're fast enough and add another device
> before the cleaner starts, it could work :)

Oh, it's possible: it's a virtualized system, so the device can easily
grow.

> Other than that, these are the patches that should fix the deadlock:
>
> https://patchwork.kernel.org/patch/1383951/
> https://patchwork.kernel.org/patch/1383941/
>
> (it touches vfs and needs recompiling the whole kernel, not just btrfs)

I was starting to patch my kernel before I saw that it's now solved.

Thanks for your answers!
Olivier

On Tue, Oct 09, 2012 at 03:49:01PM +0200, Olivier Bonvalet wrote:
> On 09/10/2012 14:32, David Sterba wrote:
> > On Tue, Oct 09, 2012 at 12:07:20PM +0200, Olivier Bonvalet wrote:
> > > I didn't see any "stack" entry in /proc/$PID/; I will try to find
> > > which kernel option exports it.
> >
> > CONFIG_STACKTRACE
>
> CONFIG_STACKTRACE_SUPPORT=y
> CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
> CONFIG_CC_STACKPROTECTOR=y
> # CONFIG_DEBUG_STACK_USAGE is not set
> CONFIG_USER_STACKTRACE_SUPPORT=y
> # CONFIG_DEBUG_STACKOVERFLOW is not set
>
> I suppose it's CONFIG_DEBUG_STACK_USAGE?

I looked into the sources to see which config option enables the
/proc/./stack output, but I don't know which one it is in menuconfig.

> Well... I don't know if it is related to the space cache, but the
> cleanup process is now working: it makes a lot of write requests, and I
> now have 30GB of free space. So it will be solved soon.

Great, looks like it was able to make some progress and free more space
for further work.

> Any chance that it is related to the space cache feature?

The problem is that if space is tight, it slows down operations that
depend on it. I've seen very slow snapshot cleaning in a similar
situation; free-space was constantly working. So it is related, but
also expected and IMHO unavoidable.

> > Probably not, although if you're fast enough and add another device
> > before the cleaner starts, it could work :)
>
> Oh, it's possible: it's a virtualized system, so the device can easily
> grow.

That makes a difference of course :)

> I was starting to patch my kernel before I saw that it's now solved.

Glad to hear that,
david

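To watch the cleaner make progress, one could track how much room is
left inside the allocated data chunks, using the same "btrfs fi df"
output shown at the start of the thread. The sketch below is only an
illustration: data_slack is a made-up helper name, it assumes that the
total and used values of the Data line carry the same unit suffix, and
it does not account for space on the device that is not yet allocated
to any chunk.

```shell
#!/bin/sh
# Report total - used for the Data line of `btrfs filesystem df` output.
# Hypothetical helper; assumes both values use the same unit suffix.

# data_slack: read the df output on stdin and print e.g. "1.79GB".
# Fails with a message on stderr if the two unit suffixes differ.
data_slack() {
    sed -n 's/^Data[^:]*: total=\([0-9.]*\)\([A-Za-z]*\), used=\([0-9.]*\)\([A-Za-z]*\)$/\1 \2 \3 \4/p' |
    awk '
        $2 != $4 { print "unit mismatch: " $2 " vs " $4 > "/dev/stderr"; exit 1 }
        { printf "%.2f%s\n", $1 - $3, $2 }
    '
}

# Only do real work when asked to, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    # /backup is the mount point from the thread; poll once a minute.
    while :; do
        btrfs filesystem df /backup | data_slack
        sleep 60
    done
fi
```

On a virtualized system where the underlying block device has been
enlarged, "btrfs filesystem resize max /backup" asks btrfs to grow the
filesystem to the new device size, which would show up here only once
new data chunks are allocated.
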
On 09/10/2012 16:07, David Sterba wrote:
> On Tue, Oct 09, 2012 at 03:49:01PM +0200, Olivier Bonvalet wrote:
>> On 09/10/2012 14:32, David Sterba wrote:
>>> On Tue, Oct 09, 2012 at 12:07:20PM +0200, Olivier Bonvalet wrote:
>>>> I didn't see any "stack" entry in /proc/$PID/; I will try to find
>>>> which kernel option exports it.
>>>
>>> CONFIG_STACKTRACE
>>
>> CONFIG_STACKTRACE_SUPPORT=y
>> CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
>> CONFIG_CC_STACKPROTECTOR=y
>> # CONFIG_DEBUG_STACK_USAGE is not set
>> CONFIG_USER_STACKTRACE_SUPPORT=y
>> # CONFIG_DEBUG_STACKOVERFLOW is not set
>>
>> I suppose it's CONFIG_DEBUG_STACK_USAGE?
>
> I looked into the sources to see which config option enables the
> /proc/./stack output, but I don't know which one it is in menuconfig.

Ok, I will search for that, thanks.

>> Well... I don't know if it is related to the space cache, but the
>> cleanup process is now working: it makes a lot of write requests, and I
>> now have 30GB of free space. So it will be solved soon.
>
> Great, looks like it was able to make some progress and free more space
> for further work.
>
>> Any chance that it is related to the space cache feature?
>
> The problem is that if space is tight, it slows down operations that
> depend on it. I've seen very slow snapshot cleaning in a similar
> situation; free-space was constantly working. So it is related, but
> also expected and IMHO unavoidable.

What I didn't understand is that for 24 hours there were nearly 0 IO
requests on that device. The cleanup processes were just «frozen», not
doing anything visible (no CPU and no IO); like a deadlock.

>>> Probably not, although if you're fast enough and add another device
>>> before the cleaner starts, it could work :)
>>
>> Oh, it's possible: it's a virtualized system, so the device can easily
>> grow.
>
> That makes a difference of course :)
>
>> I was starting to patch my kernel before I saw that it's now solved.
>
> Glad to hear that,
> david

On Tue, Oct 09, 2012 at 04:12:54PM +0200, Olivier Bonvalet wrote:
> What I didn't understand is that for 24 hours there were nearly 0 IO
> requests on that device. The cleanup processes were just «frozen», not
> doing anything visible (no CPU and no IO); like a deadlock.

Yep, it was a deadlock, and I've found Roman's report against 3.6-rc5
that has the same symptoms:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg18658.html

The blocked threads are transaction-commit, cleaner and the generic
writeback flusher. It's not the deadlock on the s_umount mutex but one
inside the tree locks, and it's still unresolved.

david

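The symptom described above (cleaner, transaction-commit and flusher
threads all blocked at the same time) can be spotted from "ps auxw"
output like the one in the first message. The sketch below is only an
illustration: matches_deadlock_signature is a made-up name, the thread
names are taken from this thread's ps output and may differ on other
kernels, and a match only shows the same symptoms, it does not prove
that the tasks are deadlocked rather than waiting on slow IO.

```shell
#!/bin/sh
# Check `ps auxw` output for the three blocked btrfs-related threads
# seen in this thread. Hypothetical helper; a heuristic, not a proof.

# matches_deadlock_signature: read `ps auxw` lines on stdin and succeed
# only if the cleaner, transaction and flush threads are all in state D.
# In `ps auxw` output, STAT is the 8th column.
matches_deadlock_signature() {
    awk '
        $8 ~ /^D/ && /\[btrfs-cleaner\]/  { cleaner = 1 }
        $8 ~ /^D/ && /\[btrfs-transacti/  { transaction = 1 }
        $8 ~ /^D/ && /\[flush-btrfs-/     { flusher = 1 }
        END { exit !(cleaner && transaction && flusher) }
    '
}

# Only do real work when asked to, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    if ps auxw | matches_deadlock_signature; then
        echo "cleaner, transaction and flusher threads are all in state D"
    else
        echo "signature not found"
    fi
fi
```

Running the check a few times some seconds apart, together with a look
at the IO counters, would help tell a real hang from a busy cleaner.
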