I'd like some opinions on a couple of long-standing rsync issues.  My
two oldest, uncommitted patches are:

 - A "no hang" patch that makes sure that the pipe from the receiver
   to the generator can't block with resend requests.

 - The "move files" patch that got changed into a --delete-sent-files
   option.

For each item I have two questions -- do we need to deal with this?
And is the proposed change a good way to implement the change?  Some
comments on each item follow:

Redo-Channel Anti-Hang Patch
============================

I've had a couple different incarnations of this patch because the IO
section is quite complex and there were concerns about memory usage (if
rsync keeps the redo channel clear, it has to note the redo items
somewhere).  My first one kept a buffer of redo items that would expand
only as redo items arrived (Red Hat actually incorporated this one into
their released version of rsync 2.4.6 at some point, so it even got a
lot of testing, unbeknownst to me at the time).  My later patch changed
the no-hang algorithm to keep a flag char for every item in the file
list (to avoid a criticism about having a growing buffer).  I actually
prefer the first patch these days, because it has a lower memory impact
on large file lists.  Anyone have an opinion on this?

More technical comments:  I don't believe that we should use the
existing per-item flags in the flist data since it is shared between
two forked processes, and twiddling data in copy-on-write memory may
well cause a larger memory bloat than just keeping a separate flag
array.  Of course if we eventually switch over to threads, this
copy-on-write pitfall would go away.

Another alternative implementation would be to change the rsync
algorithm to recycle the redo items immediately instead of in a
separate pass.  This would eliminate the need for caching the redo
data.  (Aside:  have the recent checksum-length changes taken into
account the redo pass that tries to use an alternate checksum size
for the resends?)
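To make the first of the two approaches concrete, here is a minimal sketch of a redo buffer that grows only when a redo item actually arrives. It is not the code from either patch: the names redo_queue, redo_queue_push, redo_queue_pop and redo_queue_free are hypothetical, and the doubling growth policy is just an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not rsync's actual code. */
struct redo_queue {
    int *items;    /* file-list indexes waiting for the redo pass */
    size_t count;  /* number of indexes queued so far */
    size_t head;   /* position of the next index to hand out */
    size_t alloc;  /* number of allocated slots */
};

/* Note a file-list index that needs to be resent.  Memory is only
 * obtained when a redo item arrives, so a clean run costs nothing.
 * Returns 0 on success, -1 if memory could not be obtained. */
static int redo_queue_push(struct redo_queue *q, int ndx)
{
    assert(q != NULL);
    if (q->count == q->alloc) {
        size_t new_alloc = q->alloc ? q->alloc * 2 : 16;
        int *p = realloc(q->items, new_alloc * sizeof *p);
        if (p == NULL)
            return -1;
        q->items = p;
        q->alloc = new_alloc;
    }
    q->items[q->count++] = ndx;
    return 0;
}

/* Hand out the next queued index, or -1 once the queue is drained. */
static int redo_queue_pop(struct redo_queue *q)
{
    assert(q != NULL);
    if (q->head == q->count)
        return -1;
    return q->items[q->head++];
}

/* Release the buffer and reset the queue to its empty state. */
static void redo_queue_free(struct redo_queue *q)
{
    assert(q != NULL);
    free(q->items);
    q->items = NULL;
    q->count = q->head = q->alloc = 0;
}
```

The idea is that the reader of the redo pipe pushes each index as soon as it shows up, so the pipe never fills, and the redo pass later pops them in arrival order. The memory cost is proportional to the number of redo items rather than to the length of the file list.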

Move-File Support
=================

I'm currently debating with myself whether I believe that rsync should
get a --delete-sent-files option.  If you think of rsync as a "keeping
things in sync" program, removing a file after transferring it seems to
be a little outside the realm of rsync's purpose.  However, if you think
of rsync as a feature-rich and more bandwidth-efficient copy tool, then
having the ability to move files between machines as well as copy them
seems like an appropriate addition.  I certainly need to be able to move
files between systems at my work, and I haven't seen a better tool for
this than rsync with one of my old move-files patches applied to it.
I'd love to hear what people think about this issue.

Implementing this is interesting.  The best way to do it is for the
receiver to send a success message back to the sender so that it can
remove the source file only when it has been successfully sent.  One
implementation was to use the redo channel for this ack, and that means
that the above no-hang patch would need to be implemented first.  An
untried implementation would be to use the error channel (from the
receiver through the generator to the sender) as a way to send the
sender a "delete item X" message.

An alternate approach that is conceptually simpler is to add a
file-removal pass on the sending side at the end of the transfer, but I
have grown more doubtful over time that this method would properly
handle error conditions in a reasonable manner (since we want to avoid
both erasing a file that didn't get sent and leaving a file unremoved
that did get sent).

Thoughts?

..wayne..

On Thu, May 08, 2003 at 12:56:31PM -0700, Wayne Davison wrote:
> I'd like some opinions on a couple of long-standing rsync issues.  My
> two oldest, uncommitted patches are:
>
>  - A "no hang" patch that makes sure that the pipe from the receiver
>    to the generator can't block with resend requests.
>
>  - The "move files" patch that got changed into a --delete-sent-files
>    option.
>
> For each item I have two questions -- do we need to deal with this?
> And is the proposed change a good way to implement the change?  Some
> comments on each item follow:
>
> Redo-Channel Anti-Hang Patch
> ============================
>
> I've had a couple different incarnations of this patch because the IO
> section is quite complex and there were concerns about memory usage (if
> rsync keeps the redo channel clear, it has to note the redo items
> somewhere).  My first one kept a buffer of redo items that would expand
> only as redo items arrived (Red Hat actually incorporated this one into
> their released version of rsync 2.4.6 at some point, so it even got a
> lot of testing, unbeknownst to me at the time).  My later patch changed
> the no-hang algorithm to keep a flag char for every item in the file
> list (to avoid a criticism about having a growing buffer).  I actually
> prefer the first patch these days, because it has a lower memory impact
> on large file lists.  Anyone have an opinion on this?

If it is used just for the redo the redo buffer makes more sense.
However, i think this has more use than just redo so i lean toward a
transfer-status flags array.

> More technical comments:  I don't believe that we should use the
> existing per-item flags in the flist data since it is shared between
> two forked processes, and twiddling data in copy-on-write memory may
> well cause a larger memory bloat than just keeping a separate flag
> array.
> Of course if we eventually switch over to threads, this
> copy-on-write pitfall would go away.

The current flags field in the flist is a char, is sent as a char, and
has no free bits.  Adding more, non-transmitted, flags would be best
done by adding another field rather than expanding the current one.

As for the copy-on-write issue: I've not looked into it, but are you
sure the fork is done after the flist is fully populated?

> Another alternative implementation would be to change the rsync
> algorithm to recycle the redo items immediately instead of in a
> separate pass.  This would eliminate the need for caching the redo
> data.  (Aside:  have the recent checksum-length changes taken into
> account the redo pass that tries to use an alternate checksum size
> for the resends?)

When starting the redo a phase variable (seemingly in conflict with
other vars of the same name so i haven't yet identified its scope) is
incremented and the global csum_length is set to SUM_LENGTH.
Currently this is detected with

	if (remote_version < 27) {
		s2length = csum_length;
	} else if (csum_length == SUM_LENGTH) {
		s2length = SUM_LENGTH;
	} else {

If i could be sure that the same phase variable that indicates redo
were in scope i'd use that and eliminate csum_length.  Were there a
per-file status to test, that is what i would use.

As is clear, we depend on global vars to determine that we are in the
redo phase.  For performance reasons i'd favour doing the redo as near
to detection as possible, but it couldn't depend on global vars to
identify it as a redo, so we would need to have per-file flags for
that.

SPECULATIVE
While dynamic checksum and block sizes should reduce redo frequency, i
wonder if we shouldn't fall back on whole-file if a redo fails.

--
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:          jw@pegasys.ws

		Remember Cernan and Schmitt

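For reference, the transfer-status flags array favoured in the reply above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the names xfer_status, XFER_REDO and XFER_DONE do not exist in rsync, and the sketch assumes the array is allocated by the one process that writes to it, so that setting a flag never dirties the flist pages shared copy-on-write between the forked processes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits; none of these names come from rsync. */
#define XFER_REDO (1u << 0)  /* file failed verification, resend it */
#define XFER_DONE (1u << 1)  /* file was received and verified */

/* One status byte per file-list entry, kept apart from the flist. */
struct xfer_status {
    unsigned char *flags;
    int count;
};

/* Allocate a zeroed status array.  Returns 0 on success, -1 on failure. */
static int xfer_status_init(struct xfer_status *st, int file_count)
{
    assert(st != NULL && file_count > 0);
    st->flags = calloc((size_t)file_count, 1);
    if (st->flags == NULL)
        return -1;
    st->count = file_count;
    return 0;
}

/* Set one or more flag bits on the file at index ndx. */
static void xfer_status_set(struct xfer_status *st, int ndx, unsigned bits)
{
    assert(st != NULL && ndx >= 0 && ndx < st->count);
    st->flags[ndx] |= (unsigned char)bits;
}

/* Return nonzero if any of the given bits is set on the file at ndx. */
static int xfer_status_test(const struct xfer_status *st, int ndx,
                            unsigned bits)
{
    assert(st != NULL && ndx >= 0 && ndx < st->count);
    return (st->flags[ndx] & bits) != 0;
}

/* Release the array. */
static void xfer_status_free(struct xfer_status *st)
{
    assert(st != NULL);
    free(st->flags);
    st->flags = NULL;
    st->count = 0;
}
```

With something like this, the checksum-length decision could test a per-file XFER_REDO bit instead of inferring the redo phase from a global, which is the dependency the reply objects to. The cost is one byte per file-list entry whether or not any redo ever happens.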
On Thu, May 08, 2003 at 12:56:31PM -0700, Wayne Davison wrote:
> I'd like some opinions on a couple of long-standing rsync issues.  My
> two oldest, uncommitted patches are:
>
>  - A "no hang" patch that makes sure that the pipe from the receiver
>    to the generator can't block with resend requests.
>
>  - The "move files" patch that got changed into a --delete-sent-files
>    option.
>
> For each item I have two questions -- do we need to deal with this?
> And is the proposed change a good way to implement the change?  Some
> comments on each item follow:
>
> Move-File Support
> =================
>
> I'm currently debating with myself whether I believe that rsync should
> get a --delete-sent-files option.  If you think of rsync as a "keeping
> things in sync" program, removing a file after transferring it seems to
> be a little outside the realm of rsync's purpose.  However, if you think
> of rsync as a feature-rich and more bandwidth-efficient copy tool, then
> having the ability to move files between machines as well as copy them
> seems like an appropriate addition.  I certainly need to be able to move
> files between systems at my work, and I haven't seen a better tool for
> this than rsync with one of my old move-files patches applied to it.
> I'd love to hear what people think about this issue.

Reviewing the paragraph below i find i am thinking in type (like
out-loud only with a keyboard).

Well, i think of rsync as a "keep trees in sync" tool modeled on cp.
Where cp's semantics apply they should be used.  If you want a swiss
army chainsaw use perl :)  Using rsync as a network mv utility is
almost contradictory to the sync aspect buuUUuut a network mv is
semi-consistent with the cp model.  I could go either way.

> Implementing this is interesting.  The best way to do it is for the
> receiver to send a success message back to the sender so that it can
> remove the source file only when it has been successfully sent.
> One implementation was to use the redo channel for this ack, and that
> means that the above no-hang patch would need to be implemented first.
> An untried implementation would be to use the error channel (from the
> receiver through the generator to the sender) as a way to send the
> sender a "delete item X" message.
>
> An alternate approach that is conceptually simpler is to add a
> file-removal pass on the sending side at the end of the transfer, but I
> have grown more doubtful over time that this method would properly
> handle error conditions in a reasonable manner (since we want to avoid
> both erasing a file that didn't get sent and leaving a file unremoved
> that did get sent).

Even if a move-file mode weren't implemented i still like the ACK very
much.  One of rsync's greatest weaknesses is the way it handles errors.
We really have no positive confirmation of successful transfers.  We
know an error occurred if the error channel makes a report, but we can
still have reported the file as successfully sent.  With an ACK we
could defer reporting (--verbose) the file as sent until we have the
ACK.  No assumptions.  If the ACK/NACK indicated the type of failure
(mismatched checksums, write-error, EACCES) that would be even better.

What kind of latency has the redo (perhaps renamed status) channel?
This really should be a structured transmission sending file number
and status flags, not a text stream.

--
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:          jw@pegasys.ws

		Remember Cernan and Schmitt
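
As an illustration of what "file number and status flags, not a text stream" might mean on the wire, here is a small sketch of a fixed-size status record. The layout (a 4-byte little-endian file index followed by one result byte) and all names (status_msg, ACK_OK and friends) are assumptions invented for this example. It is not rsync's protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, mirroring the failure types named above. */
enum ack_result {
    ACK_OK = 0,            /* file received and verified */
    ACK_BAD_CHECKSUM = 1,  /* whole-file checksum mismatch */
    ACK_WRITE_ERROR = 2,   /* receiver could not write the file */
    ACK_NO_ACCESS = 3      /* permission denied on the receiving side */
};

#define STATUS_MSG_LEN 5   /* assumed layout: 4 index bytes + 1 result byte */

struct status_msg {
    uint32_t ndx;          /* index of the file in the file list */
    uint8_t result;        /* one of enum ack_result */
};

/* Pack a record into buf using a fixed byte order. */
static void status_msg_encode(const struct status_msg *m,
                              unsigned char buf[STATUS_MSG_LEN])
{
    assert(m != NULL && buf != NULL);
    buf[0] = (unsigned char)(m->ndx & 0xff);
    buf[1] = (unsigned char)((m->ndx >> 8) & 0xff);
    buf[2] = (unsigned char)((m->ndx >> 16) & 0xff);
    buf[3] = (unsigned char)((m->ndx >> 24) & 0xff);
    buf[4] = m->result;
}

/* Unpack a record.  Returns 0 on success, -1 on an unknown result code. */
static int status_msg_decode(const unsigned char buf[STATUS_MSG_LEN],
                             struct status_msg *m)
{
    assert(m != NULL && buf != NULL);
    if (buf[4] > ACK_NO_ACCESS)
        return -1;
    m->ndx = (uint32_t)buf[0]
           | ((uint32_t)buf[1] << 8)
           | ((uint32_t)buf[2] << 16)
           | ((uint32_t)buf[3] << 24);
    m->result = buf[4];
    return 0;
}
```

A sender receiving such a record could hold back the --verbose report for file ndx until the result arrives, and a move-files mode could remove the source only when the result is ACK_OK. A fixed-size record also keeps parsing trivial compared with a text stream.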