Eugene M. Zheganin
2020-May-27 15:11 UTC
zfs receive -s: transfer got interrupted, but no token on the receiving side.
Hello,

I have a ZFS dataset of about 10T actual size (maybe more) that I need to send over a very laggy connection. So I'm sending it from a shell script that retries the send after a short timeout, retrieving the resume token first. Like this:

===Cut==

#!/bin/sh

exitstatus=1
token=`ssh user@server zfs get receive_resume_token data/reference | grep -v SOURCE | awk '{print $3}'`
while ([ $exitstatus -ne 0 ])
do
    zfs send -t $token | ssh -C user@server sudo zfs receive -Fus data/reference
    exitstatus=$?
    echo "Send interrupted/ended, sleeping for 5 secs."
    sleep 5
done

===Cut==

Usually this works flawlessly. But this time, because of the low transfer speed and very laggy connectivity, the send is taking several weeks, and I ran into a problem:

about one retry out of 200 fails in a state where there is no snapshot on the receiving side (only an incomplete dataset, which is definitely smaller than the original one), so the dataset is incomplete, yet there is no token on it. This has already happened twice, each time at a different size.

How can this even be possible?

Receiving side:

[root@playkey-nas:~]# zfs list -t all
NAME             USED  AVAIL  REFER  MOUNTPOINT
data            4,67T  20,8T   128K  /data
data/reference  4,67T  20,8T   128K  /data/reference

(as you can see, there's no snapshot)

Sending side and the snapshot:

[root@san1:/usr/src]# zfs list -t all | grep data/reference@ver2_5917
data/reference@ver2_5917  44,3G      -  7,73T  -

Eugene.
Eugene Grosbein
2020-May-27 22:28 UTC
zfs receive -s: transfer got interrupted, but no token on the receiving side.
27.05.2020 22:11, Eugene M. Zheganin wrote:

> Hello,
>
> I have a ZFS dataset of about 10T actual size (maybe more) that I need to send over a very laggy connection. So I'm sending it from a shell script that retries the send after a short timeout, retrieving the resume token first. Like this:
>
> ===Cut==
>
> #!/bin/sh
>
> exitstatus=1
> token=`ssh user@server zfs get receive_resume_token data/reference | grep -v SOURCE | awk '{print $3}'`
> while ([ $exitstatus -ne 0 ])
> do
>     zfs send -t $token | ssh -C user@server sudo zfs receive -Fus data/reference
>     exitstatus=$?
>     echo "Send interrupted/ended, sleeping for 5 secs."
>     sleep 5
> done
>
> ===Cut==
>
> Usually this works flawlessly. But this time, because of the low transfer speed and very laggy connectivity, the send is taking several weeks, and I ran into a problem:
>
> about one retry out of 200 fails in a state where there is no snapshot on the receiving side (only an incomplete dataset, which is definitely smaller than the original one), so the dataset is incomplete, yet there is no token on it. This has already happened twice, each time at a different size.
>
> How can this even be possible?
>
> Receiving side:
>
> [root@playkey-nas:~]# zfs list -t all
> NAME             USED  AVAIL  REFER  MOUNTPOINT
> data            4,67T  20,8T   128K  /data
> data/reference  4,67T  20,8T   128K  /data/reference
>
> (as you can see, there's no snapshot)
>
> Sending side and the snapshot:
>
> [root@san1:/usr/src]# zfs list -t all | grep data/reference@ver2_5917
> data/reference@ver2_5917  44,3G      -  7,73T  -

The zfs(8) manual page says:

    receive_resume_token
        For filesystems or volumes which have saved partially-completed state from zfs receive -s, this opaque token can be provided to zfs send -t to resume and complete the zfs receive.

Note that /bin/sh does NOT support storing arbitrary opaque data in its variables in unencoded form. Maybe you sometimes get a token with characters that break such code.

Here is how I encode completely opaque and even binary data when I need to store it in a shell variable:

key=$(command_printing_data | b64encode -r -)
# use binary key later:
echo "$key" | b64decode -r | geli attach -pk - $p

This technique works for any kind of data, including special symbols, zero bytes, etc.
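
The encoding approach described above can be tried out without geli or ZFS. The following is a minimal sketch, not the script from the post: it assumes a `base64` utility that accepts `-d` for decoding (as in GNU coreutils or BusyBox); where only the tools from the post are available, `b64encode -r -` and `b64decode -r` play the same role. The helper names `encode_opaque`, `decode_opaque`, and `make_payload` are hypothetical and exist only for this illustration.

```sh
#!/bin/sh
# Sketch: keep arbitrary bytes in a shell variable by storing them as base64.
# Assumption: a `base64` command that decodes with `-d` is installed.

# Read raw bytes on stdin, print them as a single line of base64 text.
encode_opaque() {
    base64 | tr -d '\n'
}

# Read base64 text on stdin, print the original raw bytes.
decode_opaque() {
    base64 -d
}

# Sample payload with a space, shell metacharacters, a NUL byte and
# trailing newlines (octal escapes keep printf portable).
make_payload() {
    printf 'a b*c$d\000e\n\n'
}

# The variable only ever holds plain base64 characters.
key=$(make_payload | encode_opaque)

# Compare hex dumps of the original and the decoded data.
original=$(make_payload | od -An -tx1)
restored=$(printf '%s' "$key" | decode_opaque | od -An -tx1)

if [ "$original" = "$restored" ]; then
    echo "round trip OK"
else
    echo "round trip FAILED"
fi
```

The comparison is done on `od` dumps rather than on the raw bytes because a command substitution strips trailing newlines and cannot carry a NUL byte, which is exactly the limitation the encoding works around. Quoting `"$key"` on expansion keeps the shell from splitting or globbing the stored text.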