On Sat, Nov 28, 2015 at 10:42:28AM -0500, Mikhail T. wrote:
> I was copying /home from an old server (narawntapu) to a new one
> (aldan). The narawntapu:/home is mounted on aldan as /mnt with flags
> ro,intr. On narawntapu /home was simply located on an SSD, but on aldan
> I created a ZFS filesystem for it.

> The copying was started thus:

>    root@aldan:/home (435) cp -Rpn /mnt/* .

> for a while this was proceeding at a decent clip with cp making
> newnfsreq-uests:

>    load: 0.78  cmd: cp 38711 [newnfsreq] 802.84r 1.57u 140.63s 20% 10768k
>    /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
>    ->
>    ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/sent/cur/1219621413.32392.hd8cl:2,S
>    100%
>    load: 1.23  cmd: cp 38711 [newnfsreq] 874.19r 1.66u 154.74s 17% 4576k
>    /mnt/mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
>    ->
>    ./mi/.kde/share/apps/kmail/dimap/.42838394.directory/ML/cur/1219595347.32392.rMDFf:2,S
>    100%

> ZFS on the destination compressing and writing stuff out and the traffic
> between the two ranging from 30 to 50Mb/s (according to systat), but
> then something happened and the cp-process is now hung:

>    load: 0.55  cmd: cp 38711 [fifoor] 1107.67r 2.09u 194.12s 0% 3300k
>    load: 0.50  cmd: cp 38711 [fifoor] 1112.66r 2.09u 194.12s 0% 3300k
>    load: 0.22  cmd: cp 38711 [fifoor] 1642.37r 2.09u 194.12s 0% 3300k

This normally means that the process is opening a fifo for reading and
is waiting for a writer. Although cp -R will normally copy a fifo by
calling mkfifo at the destination, it may open one if a regular file is
replaced with a fifo between the time it reads the directory and it
copies that file. This is not that unlikely if large directory trees are
copied during that time.

On the other hand, cp without -R/-r/-l/-s will always open a fifo.

You can make cp continue by opening the fifo (which you'll need to find
first, for example by checking what has been copied already) for
writing, like : >/path/to/some/fifo. It will be replaced with an empty
file at the destination.

-- 
Jilles Tjoelker
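
The blocking behaviour described above can be reproduced in isolation. The following is a minimal sketch in Python, not anything from cp itself: the names `open_fifo_for_reading`, `demo` and `example_fifo` are made up for illustration, and the half-second pause is only a crude way to give the reader thread time to get stuck in open(). It creates a fifo in a temporary directory, starts a reader that opens it read-only (as a copy of a "regular file" would), and then releases the reader by opening the fifo for writing and closing it, which is what `: >/path/to/some/fifo` does from a shell.

```python
import os
import tempfile
import threading
import time


def open_fifo_for_reading(path, result):
    # Like a copy program opening its source file: O_RDONLY on a fifo
    # blocks in open() until some process opens the fifo for writing.
    fd = os.open(path, os.O_RDONLY)
    try:
        # The writer wrote nothing, so this read sees end-of-file.
        result["data"] = os.read(fd, 4096)
    finally:
        os.close(fd)


def demo():
    workdir = tempfile.mkdtemp()
    fifo = os.path.join(workdir, "example_fifo")  # hypothetical name
    os.mkfifo(fifo)

    result = {}
    reader = threading.Thread(target=open_fifo_for_reading, args=(fifo, result))
    reader.start()

    time.sleep(0.5)              # give the reader time to block
    blocked = reader.is_alive()  # the reader has not finished: no writer yet

    # Equivalent of ": >/path/to/some/fifo": open for writing, write nothing.
    fd = os.open(fifo, os.O_WRONLY)
    os.close(fd)

    reader.join(timeout=5)
    finished = not reader.is_alive()

    os.unlink(fifo)
    os.rmdir(workdir)
    return blocked, finished, result.get("data")


if __name__ == "__main__":
    print(demo())
```

Because the writer closes the fifo without writing anything, the reader gets an immediate end-of-file, which matches the remark that the fifo ends up as an empty file at the destination.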
Jilles Tjoelker wrote:
> On Sat, Nov 28, 2015 at 10:42:28AM -0500, Mikhail T. wrote:
> > [...]
> > then something happened and the cp-process is now hung:
> >
> >    load: 0.55  cmd: cp 38711 [fifoor] 1107.67r 2.09u 194.12s 0% 3300k
> >    load: 0.50  cmd: cp 38711 [fifoor] 1112.66r 2.09u 194.12s 0% 3300k
> >    load: 0.22  cmd: cp 38711 [fifoor] 1642.37r 2.09u 194.12s 0% 3300k
>
> This normally means that the process is opening a fifo for reading and
> is waiting for a writer. Although cp -R will normally copy a fifo by
> calling mkfifo at the destination, it may open one if a regular file is
> replaced with a fifo between the time it reads the directory and it
> copies that file. This is not that unlikely if large directory trees are
> copied during that time.

Oops, thanks. I didn't know that the [fifoor] in these lines is the
WCHAN.

Obviously, you should now ignore everything I said;-)

rick
On 28.11.2015 17:41, Jilles Tjoelker wrote:
> Although cp -R will normally copy a fifo by calling mkfifo at the
> destination, it may open one if a regular file is replaced with a fifo
> between the time it reads the directory and it copies that file.

The sole fifo under /home here was mi/.licq/licq_fifo, created in 2003.
I echoed something into it (on the NFS-client side) and the cp-process
resumed.

I then performed a simple test:

 1. Create a fifo in an NFS-exported directory and try to copy it with
    the -R flag

        mi@narawntapu:/cache/src (792) mkfifo /green/tmp/test
        mi@narawntapu:/cache/src (793) cp -Rpn /green/tmp/test /tmp/
        mi@narawntapu:/cache/src (794) ls -l /tmp/test
        prw-r--r--  1 mi  wheel  0 29 ??? 00:05 /tmp/test

    The above worked fine.

 2. Now, when I try to do the same thing via an NFS mount, I get the
    same hang in fifoor:

        root@aldan:ports/x11/kde4 (475) cp -Rpn /green/tmp/test /tmp/
        load: 0.42  cmd: cp 38299 [fifoor] 1.15r 0.00u 0.00s 0% 1868k

So, the good news is, this is not ZFS' fault. The bad news is, there is
still a bug... Unless, of course, this is some known "feature" of the
NFS...

Compare, for example, how stat(1) describes the same named pipe from
both machines:

Local FS:
    92 74636334 prw-r--r-- 1 mi wheel 0 0 "Nov 29 00:05:51 2015" "Nov 29 00:05:51 2015" "Nov 29 00:05:51 2015" "Nov 29 00:05:51 2015" 16384 0 0 /green/tmp/test

NFS-client:
    973143811 74636334 ?rw-r--r-- 1 mi wheel 4294967295 0 "Nov 29 00:05:51 2015" "Nov 29 00:05:51 2015" "Nov 29 00:05:51 2015" "Dec 31 18:59:59 1969" 16384 0 0 /green/tmp/test

That question-mark in the node-type (instead of the "p") is, I guess,
what confuses cp into trying to read from it instead of creating a fifo.
Should I file a PR? Thank you!

    -mi
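
The guess about the node type can be illustrated with a small sketch of how a recursive copy might choose its action from the file-type bits of st_mode. This is a hypothetical model in Python, not the FreeBSD cp source: `copy_action` and its return strings are invented here, and the mode value used for the NFS case (permission bits with no recognisable type) is an assumption, since the thread only shows that stat(1) printed "?" rather than the raw bits.

```python
import os
import stat
import tempfile


def copy_action(mode, recursive):
    """Hypothetical dispatch on the file-type bits of st_mode."""
    kind = stat.S_IFMT(mode)
    if recursive and kind == stat.S_IFIFO:
        # Recognised as a fifo during a recursive copy: recreate it.
        return "mkfifo"
    if kind == stat.S_IFDIR:
        return "descend" if recursive else "skip"
    # Anything else, including an unrecognised type, is treated as a
    # regular file: open the source and read it.
    return "open-and-read"


def action_for_path(path, recursive=True):
    # lstat() so that the decision is based on the entry itself.
    return copy_action(os.lstat(path).st_mode, recursive)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    fifo = os.path.join(workdir, "test")
    os.mkfifo(fifo)
    print(action_for_path(fifo))        # fifo type bits as seen locally
    print(copy_action(0o644, True))     # assumed: type bits not recognised
    os.unlink(fifo)
    os.rmdir(workdir)
```

Under this model a fifo whose type bits arrive intact is recreated with mkfifo, while the same object reported with an unrecognised type falls through to the open-and-read path, where the open would block as in the [fifoor] lines above.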