Hello,

I just upgraded the kernel to 3.11.6, added a new disk, and created a btrfs filesystem on it:

# mkfs.btrfs /dev/sdb
# mount -t btrfs -o compress=lzo,compress-force=lzo /dev/sdb /usr/local/mysql/data

I started copying files from the old disk and then I got 'No space left on device', but there is a lot of space.

# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 2.8T 110G 2.7T 4% /usr/local/mysql/data

# btrfs fi show
Label: none uuid: c0bfcb22-8b7c-4936-afcd-7acdf58f1d6c
Total devices 1 FS bytes used 108.68GB
devid 1 size 2.73TB used 113.04GB path /dev/sdb

# btrfs fi df /usr/local/mysql/data
Data: total=111.01GB, used=108.25GB
System, DUP: total=8.00MB, used=20.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=441.91MB
Metadata: total=8.00MB, used=0.00

I tried the balance from the FAQ:

# btrfs fi balance start -dusage=5 /usr/local/mysql/data
Done, had to relocate 4 out of 117 chunks

But it doesn't help. When I try to copy a file, I still get 'No space left on device'.

What should I do?

Regards,
Igor

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
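A note on reading the numbers above: the "total" values printed by 'btrfs fi df' are the space already allocated to chunks of each type, which is separate from the 2.73TB device size printed by 'btrfs fi show'. The following is a minimal Python sketch, not part of btrfs-progs, that parses output in the format shown in this message and reports how much room is left inside the already-allocated chunks. The function names are made up for illustration, and the units are assumed to be 1024-based.

```python
import re

# Assumption: the "KB/MB/GB/TB" suffixes are powers of 1024.
UNITS = {None: 1, "KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}

# Matches lines such as "Metadata, DUP: total=1.00GB, used=441.91MB".
LINE_RE = re.compile(
    r"^(?P<label>[^:]+): "
    r"total=(?P<total>[\d.]+)(?P<tunit>[KMGT]B)?, "
    r"used=(?P<used>[\d.]+)(?P<uunit>[KMGT]B)?$"
)

# Output copied from the message above.
SAMPLE = """\
Data: total=111.01GB, used=108.25GB
System, DUP: total=8.00MB, used=20.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=441.91MB
Metadata: total=8.00MB, used=0.00
"""


def parse_fi_df(text):
    """Return a list of dicts with 'label', 'total' and 'used' (in bytes)."""
    rows = []
    for line in text.strip().splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a total/used line
        rows.append({
            "label": m["label"],
            "total": float(m["total"]) * UNITS[m["tunit"]],
            "used": float(m["used"]) * UNITS[m["uunit"]],
        })
    return rows


def headroom(rows):
    """Bytes still free inside the allocated chunks, keyed by label."""
    return {r["label"]: r["total"] - r["used"] for r in rows}


if __name__ == "__main__":
    for label, free in headroom(parse_fi_df(SAMPLE)).items():
        print(f"{label}: {free / 2**30:.2f} GiB free in allocated chunks")
```

For the output above this reports a little under 3 GiB of room in the allocated Data chunks. The sketch says nothing about why further allocation would fail; it only makes the allocated-versus-used distinction explicit.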
Some more info. The exact error message is:

cp: writing ‘/usr/local/mysql/data/gbdata/parts_0015.MYI’: No space left on device
cp: failed to extend ‘/usr/local/mysql/data/gbdata/parts_0015.MYI’: No space left on device

The files are between 2.7G and 7.7G each.

On Sat, Oct 26, 2013 at 9:46 PM, Igor M <igork20@gmail.com> wrote:
> I started copying files from the old disk and then I got 'No space left on
> device', but there is a lot of space.
On Oct 26, 2013, at 1:46 PM, Igor M <igork20@gmail.com> wrote:
>
> # mount -t btrfs -o compress=lzo,compress-force=lzo /

Why do you have two compression mount options? You need to pick one of these.

> What to do ?

Are there any kernel messages reported by dmesg at the time the copy starts and fails? What's the exact copy command you're using?

Chris Murphy
On Sat, Oct 26, 2013 at 11:35 PM, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Oct 26, 2013, at 1:46 PM, Igor M <igork20@gmail.com> wrote:
>>
>> # mount -t btrfs -o compress=lzo,compress-force=lzo /
>
> Why do you have two compression mount options? You need to pick one of these.

I removed one. I thought both were needed.

>> What to do ?
>
> Are there any kernel messages reported by dmesg at the time the copy starts and fails? What's the exact copy command you're using?

No messages, only these when mounting:

device fsid c0bfcb22-8b7c-4936-afcd-7acdf58f1d6c devid 1 transid 622 /dev/sdb
btrfs: force lzo compression
btrfs: disk space caching is enabled

I even added the enospc_debug mount option; still no messages.

I'm using a simple cp command:

# cp -a /mnt/old_hd/data/gbdata/* /usr/local/mysql/data/gbdata/

or, for one file:

# cp -a /mnt/old_hd/data/gbdata/parts_0016.MYD /usr/local/mysql/data/gbdata/
cp: writing ‘/usr/local/mysql/data/gbdata/parts_0016.MYD’: No space left on device
cp: failed to extend ‘/usr/local/mysql/data/gbdata/parts_0016.MYD’: No space left on device

I get the same error if I copy with another tool, for example Midnight Commander.
On Oct 26, 2013, at 3:53 PM, Igor M <igork20@gmail.com> wrote:
>
> I even added enospc_debug mount option, still no messages.

If it were kernel enospc, you should have messages in dmesg.

What version of btrfs progs when making the btrfs volume?

> cp: failed to extend ‘/usr/local/mysql/data/gbdata/parts_0016.MYD’: No
> space left on device

Reboot with kernel parameter ignore_loglevel and retry the copy, and see if you now have anything in dmesg at the time of the copy.

> It's the same error if I try to copy for ex. with midnight commander.

I have the same kernel version, the same mount options, and use the same cp -a on a 5.3GB file and cannot reproduce your results.

Chris Murphy
On Sun, Oct 27, 2013 at 12:17 AM, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Oct 26, 2013, at 3:53 PM, Igor M <igork20@gmail.com> wrote:
>
>> I even added enospc_debug mount option, still no messages.
>
> If it were kernel enospc, you should have messages in dmesg.
>
> What version of btrfs progs when making the btrfs volume?
>
>> cp: failed to extend ‘/usr/local/mysql/data/gbdata/parts_0016.MYD’: No
>> space left on device
>
> Reboot with kernel parameter ignore_loglevel and retry the copy, and see if you now have anything in dmesg at the time of the copy.

Some more info. These files are MySQL database files. The tables contain only text, so they compress well.
But if I copy an incompressible file, for example a video, then everything is OK: there is no error message and the copy is successful.
I also tried mounting without compression (without compress-force=lzo), and in that case no error is reported.
So it seems to be something related to compression?

I'll try rebooting with the ignore_loglevel parameter.

>> It's the same error if I try to copy for ex. with midnight commander.
>
> I have the same kernel version, the same mount options, and use the same cp -a on a 5.3GB file and cannot reproduce your results.
>
> Chris Murphy
On Sun, Oct 27, 2013 at 12:22 AM, Igor M <igork20@gmail.com> wrote:
> On Sun, Oct 27, 2013 at 12:17 AM, Chris Murphy <lists@colorremedies.com> wrote:
>> Reboot with kernel parameter ignore_loglevel and retry the copy, and see if you now have anything in dmesg at the time of the copy.
>
> I'll try rebooting with the ignore_loglevel parameter.

Still no messages. The parameter seems to be active, as /sys/module/printk/parameters/ignore_loglevel is Y, but there are no messages in the log files or in dmesg. Maybe I need to turn on some kernel debugging option and recompile the kernel?
Also, I should mention that about 230G+ of data was copied before this error started to occur.
I didn't see this question before. btrfs-progs was compiled today from git.

# btrfs version
Btrfs v0.20-rc1-358-g194aa4a
> Still no messages. Parameter seems to be active as
> /sys/module/printk/parameters/ignore_loglevel is Y, but there are no
> messages in log files or dmesg. Maybe I need to turn on some kernel
> debugging option and recompile kernel ?
> Also I should mention that cca 230G+ data was copied before this error
> started to occur.

I think I saw a similar issue before.

Can you try using rsync with the "--bwlimit XY" option to copy the files?

The option limits the speed, in kB per second, at which the file is copied; it works even when the source and destination files are both on the local machine.

--
Tomasz Chmielewski
http://wpkg.org
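To illustrate what a bandwidth limit does to a local copy, here is a minimal Python sketch of a throttled copy loop. It is not how rsync implements --bwlimit; it only shows the general idea of pausing after each chunk so that the average rate stays near a target. The function name and its parameters are invented for this example.

```python
import io
import time


def throttled_copy(src, dst, limit_kib_per_s, chunk_size=64 * 1024,
                   sleep=time.sleep, clock=time.monotonic):
    """Copy src to dst, pausing so the average rate stays near the limit.

    src and dst are binary file objects. sleep and clock can be replaced
    to make the function easy to exercise without real delays.
    """
    limit = limit_kib_per_s * 1024  # target rate in bytes per second
    start = clock()
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        # How long the copy should have taken so far at the target rate.
        expected = copied / limit
        elapsed = clock() - start
        if expected > elapsed:
            sleep(expected - elapsed)
    return copied


if __name__ == "__main__":
    source = io.BytesIO(b"x" * (128 * 1024))
    target = io.BytesIO()
    # 128 KiB at 1024 KiB/s should take roughly an eighth of a second.
    print(throttled_copy(source, target, limit_kib_per_s=1024), "bytes copied")
```

Slowing the writer this way gives the filesystem more time between writes, which is the property the --bwlimit suggestion relies on.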
On Sun, Oct 27, 2013 at 2:00 AM, Tomasz Chmielewski <tch@virtall.com> wrote:
> I think I saw a similar issue before.
>
> Can you try using rsync with "--bwlimit XY" option to copy the files?
>
> The option will limit the speed, in kB, at which the file is being
> copied; it will work even when source and destination files are on a
> local machine.

I also ran strace cp -a ..

...
read(3, "350348f07$0$24520$c3e8da3$fb4835"..., 65536) = 65536
write(4, "350348f07$0$24520$c3e8da3$fb4835"..., 65536) = 65536
read(3, "62.76C52BF412E849CB86D4FF3898B94"..., 65536) = 65536
write(4, "62.76C52BF412E849CB86D4FF3898B94"..., 65536) = -1 ENOSPC (No space left on device)

The last two write calls take a lot longer, and then the last one returns ENOSPC. But if this write is retried, it succeeds.
I tried with Midnight Commander: when the error occurs and I choose Retry, it finishes copying that file, and then the error occurs again on the next file.

With --bwlimit it seems to be better: the lower the speed, the later the error occurs, and if it's slow enough the copy is successful.
But now I'm not sure anymore. I copied a few files with bwlimit, and now suddenly the error doesn't occur anymore, even with no bwlimit.
I'll do some more tests.
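The strace output above shows a write() failing with ENOSPC, and the message reports that retrying the same write succeeds. The following Python sketch mimics that retry behaviour. It is only a way to reproduce or work around the symptom while testing, not a fix, and the function names, retry count, and delay are arbitrary choices made for this example.

```python
import errno
import time


def write_with_retry(write, data, retries=5, delay=1.0, sleep=time.sleep):
    """Call write(data); on ENOSPC wait and try again, up to `retries` times.

    Any other error, or ENOSPC after the last retry, is re-raised.
    """
    attempt = 0
    while True:
        try:
            return write(data)
        except OSError as exc:
            if exc.errno != errno.ENOSPC or attempt >= retries:
                raise
            attempt += 1
            sleep(delay)


def copy_file(src_path, dst_path, chunk_size=64 * 1024, **retry_opts):
    """Copy a file in 64 KiB chunks (as cp does in the trace), retrying ENOSPC."""
    # buffering=0 so that an ENOSPC surfaces at the write() call itself
    # instead of at a later flush of Python's buffer.
    with open(src_path, "rb") as src, open(dst_path, "wb", buffering=0) as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            view = memoryview(chunk)
            while view:
                # An unbuffered write may be partial; keep going until done.
                written = write_with_retry(dst.write, view, **retry_opts)
                view = view[written:]
```

Unlike cp, which aborts on the first ENOSPC, this continues from the failed chunk, which matches what the 'Retry' button in Midnight Commander is described as doing.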
On 2013/10/27 10:50 AM, Igor M wrote:
> Last two write calls take a lot more time, and then last one returns
> ENOSPC. But if this write is retried, then it succeeds.
>
> With --bwlimit it seems to be better, lower the speed later the error
> occurs, and if it's slow enough copy is successful.
> But now I'm not sure anymore. I copied a few files with bwlimit, and
> now suddenly error doesn't occur anymore, even with no bwlimit.
> I'll do some more tests.

This sounds to me like the problem is related to read performance causing a bork. This would explain why bwlimit helps, as well as why cp works the second time around (since it is cached).

--
__________
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
On Sun, Oct 27, 2013 at 9:56 AM, Brendan Hide <brendan@swiftspirit.co.za> wrote:
> This sounds to me like the problem is related to read performance causing a
> bork. This would explain why bwlimit helps, as well as why cp works the
> second time around (since it is cached).

cp doesn't work the second time. cp always fails. If the last write() call is retried, it works; I think this is what Midnight Commander does when you choose 'Retry'. It doesn't copy from the beginning, it continues where it left off.
I also tried copying from a different disk, with the same result.
I made some more tests. The disk is 3TB, and the first 225GB or so is copied without errors.
Then the 'No space left on device' errors begin.
Now, if I use rsync with the '--bwlimit' option, no error occurs. If I choose 'Retry' in Midnight Commander, the copy continues, after a while another error occurs, I choose 'Retry' again, and so on.

I also noticed something else. Just before this error occurs, the write() call takes a lot longer, and I also see that progress stops. If I run 'btrfs fi df ..' at this moment:

Data: total=114.01GB, used=112.00GB
System, DUP: total=8.00MB, used=20.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=452.93MB
Metadata: total=8.00MB, used=0.00

then the error is reported, and 'btrfs fi df ..' again shows:

Data: total=114.01GB, used=113.00GB
System, DUP: total=8.00MB, used=20.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=456.98MB
Metadata: total=8.00MB, used=0.00

Then I choose 'Retry' and it goes on, until another error. Every time, before the error, copying stops and the used Data and Metadata values change.
Maybe it has something to do with allocating metadata. I don't know.
This goes on for about 25G-30G, and after that there are no more errors. After this, 1.3TB was copied without errors. But some of the data was on a rather slow disk, so maybe that's why there were no more errors.

On Sun, Oct 27, 2013 at 9:50 AM, Igor M <igork20@gmail.com> wrote:
> With --bwlimit it seems to be better, lower the speed later the error
> occurs, and if it's slow enough copy is successful.
> But now I'm not sure anymore. I copied a few files with bwlimit, and
> now suddenly error doesn't occur anymore, even with no bwlimit.
> I'll do some more tests.
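The two 'btrfs fi df' snapshots in the message above can be compared mechanically. Below is a minimal Python sketch that computes the change in the "used" values between two such outputs. It assumes the line format shown in this thread and 1024-based units; the names are illustrative only, and the sketch draws no conclusion about the cause.

```python
import re

# Assumption: unit suffixes are powers of 1024; a bare number is bytes.
SCALE = {None: 1, "KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}

# Captures the label and the "used" figure of one output line.
USED_RE = re.compile(
    r"^(?P<label>[^:]+): total=\S+, used=(?P<value>[\d.]+)(?P<unit>[KMGT]B)?$"
)

# Snapshots copied from the message above (taken just before the error
# was reported and just after it).
BEFORE = """\
Data: total=114.01GB, used=112.00GB
System, DUP: total=8.00MB, used=20.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=452.93MB
Metadata: total=8.00MB, used=0.00
"""

AFTER = """\
Data: total=114.01GB, used=113.00GB
System, DUP: total=8.00MB, used=20.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=456.98MB
Metadata: total=8.00MB, used=0.00
"""


def used_bytes(text):
    """Map each label (e.g. 'Metadata, DUP') to its used bytes."""
    result = {}
    for line in text.strip().splitlines():
        m = USED_RE.match(line.strip())
        if m:
            result[m["label"]] = float(m["value"]) * SCALE[m["unit"]]
    return result


def used_delta(before, after):
    """Change in used bytes per label between two snapshots."""
    b, a = used_bytes(before), used_bytes(after)
    return {label: a[label] - b[label] for label in a if label in b}


if __name__ == "__main__":
    for label, delta in used_delta(BEFORE, AFTER).items():
        print(f"{label}: {delta / 2**20:+.2f} MiB")
```

For these two snapshots the only labels that change are Data and Metadata, DUP, while the Data total stays at 114.01GB in both.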
On Oct 27, 2013, at 3:53 PM, Igor M <igork20@gmail.com> wrote:

> I made some more tests. Disk is 3TB, first cca 225GB is copied without errors.
> Then errors 'No space left on device' begins.

Please post the entire dmesg somewhere. pastebin.com is one option.

Chris Murphy
On Oct 27, 2013, at 4:46 PM, Chris Murphy <lists@colorremedies.com> wrote:
>
> Post the full entire dmesg somewhere please. pastebin.com is one option.

And on list or pasted, the output for the disk from:

smartctl -x /dev/sdX

Chris Murphy
On Mon, Oct 28, 2013 at 12:27 AM, Chris Murphy <lists@colorremedies.com> wrote:
>> Post the full entire dmesg somewhere please. pastebin.com is one option.
>
> And on list or pasted, the output for the disk from:
>
> smartctl -x /dev/sdX

dmesg: http://pastebin.com/t2H1QYye
source disk: http://pastebin.com/JqKxkxKr
dest disk: http://pastebin.com/ez9jALS2
On Oct 28, 2013, at 1:40 AM, Igor M <igork20@gmail.com> wrote:
>
> dmesg: http://pastebin.com/t2H1QYye

You've got a warning related to pcie bridge on boot, with a trace that follows. I don't know if this could be related to some problems.

[ 0.325976] ------------[ cut here ]------------
[ 0.326086] WARNING: CPU: 5 PID: 1 at drivers/pci/search.c:46 pci_find_upstream_pcie_bridge+0x50/0x70()
[ 0.326263] Modules linked in:
[ 0.326412] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 3.11.6 #1
[ 0.326520] Hardware name: System manufacturer System Product Name/P8H77-V LE, BIOS 0601 06/06/2012
[ 0.326695] 0000000000000000 0000000000000009 ffffffff81449f86 0000000000000000
[ 0.327032] ffffffff810514b1 ffff88040b031898 ffff88040b031800 ffff88040b031898
[ 0.327367] 0000000080000000 000077ff80000000 ffffffff812191a0 0000000000000004
[ 0.327704] Call Trace:
[ 0.327819] [<ffffffff81449f86>] ? dump_stack+0x41/0x51
[ 0.327931] [<ffffffff810514b1>] ? warn_slowpath_common+0x81/0xb0
[ 0.328040] [<ffffffff812191a0>] ? pci_find_upstream_pcie_bridge+0x50/0x70
[ 0.328151] [<ffffffff813333d3>] ? intel_iommu_add_device+0x43/0x210
[ 0.328261] [<ffffffff81330510>] ? bus_set_iommu+0x60/0x60
[ 0.328370] [<ffffffff8133053c>] ? add_iommu_group+0x2c/0x60
[ 0.328481] [<ffffffff81292a3d>] ? bus_for_each_dev+0x4d/0x80
[ 0.328591] [<ffffffff813304fa>] ? bus_set_iommu+0x4a/0x60
[ 0.328701] [<ffffffff8163bc0d>] ? intel_iommu_init+0xb20/0xc45
[ 0.328812] [<ffffffff81612919>] ? unpack_to_rootfs+0x24b/0x25b
[ 0.328922] [<ffffffff816163c6>] ? pci_iommu_init+0xe/0x37
[ 0.329031] [<ffffffff816163b8>] ? memblock_find_dma_reserve+0x148/0x148
[ 0.329142] [<ffffffff810002f2>] ? do_one_initcall+0x102/0x150
[ 0.329252] [<ffffffff81611e43>] ? kernel_init_freeable+0xfd/0x18e
[ 0.329362] [<ffffffff816117cf>] ? do_early_param+0x83/0x83
[ 0.329471] [<ffffffff81444480>] ? rest_init+0x70/0x70
[ 0.329579] [<ffffffff81444489>] ? kernel_init+0x9/0xe0
[ 0.329688] [<ffffffff8144f36c>] ? ret_from_fork+0x7c/0xb0
[ 0.329797] [<ffffffff81444480>] ? rest_init+0x70/0x70
[ 0.329907] ---[ end trace 0946f959337cff8b ]---

There are also numerous ACPI errors.

ACPI Error: [DSSP] Namespace lookup failure

check this:
http://forums.gentoo.org/viewtopic-t-960476-start-0.html

Anyway, I don't see any read or write failures for any of the drives which is what I was kinda expecting.

> dest disk: http://pastebin.com/ez9jALS2

This is a new drive with only 71 power on hours yet I'm seeing this:

• 0x0009 2 27 Transition from drive PhyRdy to drive PhyNRdy
• 0x000a 2 27 Device-to-host register FISes sent due to a COMRESET

That's unexpected but I don't know that it's related. The dmesg doesn't report any phy issues with the drive. Maybe check syslog or journalctl with a case insensitive search for phy and see if you find anything.

Chris Murphy
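The case-insensitive search for "phy" suggested above can be done with any text tool. As one illustration, here is a minimal Python sketch that filters log lines this way. The function name is made up for this example, and the sample lines are taken from earlier in this thread rather than from a real syslog.

```python
import re

# Case-insensitive match for "phy" anywhere in a line. Note that this
# also matches unrelated words such as "physical".
PHY_RE = re.compile(r"phy", re.IGNORECASE)


def find_phy_lines(lines):
    """Return the lines (without trailing newline) that mention 'phy'."""
    return [line.rstrip("\n") for line in lines if PHY_RE.search(line)]


if __name__ == "__main__":
    # Sample input: one mount message and one SMART line quoted in the thread.
    sample = [
        "btrfs: disk space caching is enabled\n",
        "0x0009 2 27 Transition from drive PhyRdy to drive PhyNRdy\n",
    ]
    for hit in find_phy_lines(sample):
        print(hit)
```

The function accepts any iterable of lines, so an open log file can be passed to it directly.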
On Sun, Oct 27, 2013 at 09:50:37AM +0100, Igor M wrote:
> Also I run strace cp -a ..
> ...
> read(3, "350348f07$0$24520$c3e8da3$fb4835"..., 65536) = 65536
> write(4, "350348f07$0$24520$c3e8da3$fb4835"..., 65536) = 65536
> read(3, "62.76C52BF412E849CB86D4FF3898B94"..., 65536) = 65536
> write(4, "62.76C52BF412E849CB86D4FF3898B94"..., 65536) = -1 ENOSPC (No
> space left on device)
>
> Last two write calls take a lot more time, and then last one returns
> ENOSPC. But if this write is retried, then it succeeds.
> I tried with midnight commander and when error occurs, if I Retry
> operation then it finishes copying this file until error occurs again
> at next file.
>
> With --bwlimit it seems to be better, lower the speed later the error
> occurs, and if it's slow enough copy is successful.
> But now I'm not sure anymore. I copied a few files with bwlimit, and
> now suddenly error doesn't occur anymore, even with no bwlimit.
> I'll do some more tests.

I just sent a patch to the list

[PATCH] Btrfs: make sure the delalloc workers actually flush compressed writes

Can you run this patch and see if it makes a difference for your test? Thanks,

Josef
On Mon, Oct 28, 2013 at 6:57 PM, Chris Murphy <lists@colorremedies.com> wrote:
> You've got a warning related to pcie bridge on boot, with a trace that follows. I don't know if this could be related to some problems.
>
> There are also numerous ACPI errors.
>
> Anyway, I don't see any read or write failures for any of the drives which is what I was kinda expecting.
>
> This is a new drive with only 71 power on hours yet I'm seeing this:
>
> • 0x0009 2 27 Transition from drive PhyRdy to drive PhyNRdy
> • 0x000a 2 27 Device-to-host register FISes sent due to a COMRESET
>
> That's unexpected but I don't know that it's related. The dmesg doesn't report any phy issues with the drive. Maybe check syslog or journalctl with a case insensitive search for phy and see if you find anything.

The drive should be OK. As for the pcie_bridge warning, I'm not sure how to solve it; otherwise this computer is stable.
There is no error if compression is turned off or if a non-compressible file is copied.
I'll try on a different computer and see if the same thing happens.
On Mon, Oct 28, 2013 at 8:05 PM, Josef Bacik <jbacik@fusionio.com> wrote:
> I just sent a patch to the list
>
> [PATCH] Btrfs: make sure the delalloc workers actually flush compressed writes
>
> Can you run this patch and see if it makes a difference for your test? Thanks,
>
> Josef

I'll try with this patch.