I have been trying for quite a while now to understand why is the flist.c:f_name() function implemented using static buffers. Anyone care to comment? The immediate problem is that any call to f_name overrides the previous content (well, obvious). This, combined with the fact that several function calls are made with the result of f_name(file) results in problems handling hardlinks - and possibly other issues as well, but hardlinks is what made me look deeper inside the rsync code. As a side note, the f_name function has some "protection" against this static clobbering: instead of having a single buffer, it has... TEN, and it cycles through them. That's both funny and sad... One obvious problem path is starting from generator.c:generate_files() function. In there we call in a while loop: recv_generator(local_name?local_name:f_name(file),flist,i,f), which calls hlink.c:check_hard_link(), which calls hlink.c:hlink_compare(), which calls flist.c:file_compare(), which calls flist.c:f_name(). Interested parties might want to follow the code, it is easier and more obvious... Anyway, the end result is that in recv_generator() the first argument (fname) is now clobbered by our search through the hlink_list, and depending on what was the last thing we put in there there are various failures happening. The bad part is that they are mostly silent. Most of the bitching is done via "if (verbose > 1)", so with the standard 'rsync -av' there are no indicators that would tell somebody rsync failed to handle a particular file... One way of "unclobbering" this is by forcing another call to f_name after the check_hard_link() call is finished (wither in check_hard_link itself or in recv_generator), but that smells rather nasty... On the other hand, reimplementing f_name to make it sane would require some bigger changes, since f_name() happens to be quite a popular function inside the rsync code... This is bad and needs some fixing... Little hammer or big hammer? Cristian -- ---------------------------------------------------------------------- Cristian Gafton -- gafton@redhat.com -- Red Hat, Inc. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ "There are two kinds of people who never amount to much: those who cannot do what they are told, and those who can do nothing else." --Cyrus Curtis
On Mon, Dec 29, 2003 at 04:55:19PM -0500, Cristian Gafton wrote:> > I have been trying for quite a while now to understand why is the > flist.c:f_name() function implemented using static buffers. Anyone care to > comment? > > The immediate problem is that any call to f_name overrides the previous > content (well, obvious). This, combined with the fact that several > function calls are made with the result of f_name(file) results in > problems handling hardlinks - and possibly other issues as well, but > hardlinks is what made me look deeper inside the rsync code. > > As a side note, the f_name function has some "protection" against this > static clobbering: instead of having a single buffer, it has... TEN, and > it cycles through them. That's both funny and sad... > > One obvious problem path is starting from generator.c:generate_files() > function. In there we call in a while loop: > > recv_generator(local_name?local_name:f_name(file),flist,i,f), which calls > hlink.c:check_hard_link(), which calls > hlink.c:hlink_compare(), which calls > flist.c:file_compare(), which calls > flist.c:f_name(). > > Interested parties might want to follow the code, it is easier and more > obvious... > > Anyway, the end result is that in recv_generator() the first argument > (fname) is now clobbered by our search through the hlink_list, and > depending on what was the last thing we put in there there are various > failures happening. > > The bad part is that they are mostly silent. Most of the bitching is done > via "if (verbose > 1)", so with the standard 'rsync -av' there are no > indicators that would tell somebody rsync failed to handle a particular > file... > > One way of "unclobbering" this is by forcing another call to f_name after > the check_hard_link() call is finished (wither in check_hard_link itself > or in recv_generator), but that smells rather nasty... On the other hand, > reimplementing f_name to make it sane would require some bigger changes, > since f_name() happens to be quite a popular function inside the rsync > code... > > This is bad and needs some fixing... Little hammer or big hammer?How about an even smaller hammer. In recv_generator do a strlcpy of arg1 to an automatic. -- ________________________________________________________________ J.W. Schultz Pegasystems Technologies email address: jw@pegasys.ws Remember Cernan and Schmitt
On Mon, Dec 29, 2003 at 04:55:19PM -0500, Cristian Gafton wrote:> I have been trying for quite a while now to understand why is the > flist.c:f_name() function implemented using static buffers. Anyone > care to comment?It's fairly handy for short-lived stuff, such as a printf() statement that need two f_name() values. However, as you noted, for long-lived stuff, it is an accident waiting to happen.> This is bad and needs some fixing... Little hammer or big hammer?I'm inclined to go with the big hammer for the next release (after 2.6.0). The f_name() function is labeled as one of the biggest hot spots in the code, so optimizing it to allow an expansion into a local buffer should make it both safer and faster (one section of the code currently uses strdup() to protect the f_name() value -- ick). Also, optimizing the string-compare case to avoid copying would probably be a win as well. I've got some code that takes a stab at doing this: http://www.blorf.net/rsync-big-hammer.patch It's been only minimally tested, so use it either as an indicator of what I'm currently considering for post-2.6.0, or feel free to help test it. ..wayne..
On Mon, Dec 29, 2003 at 04:55:19PM -0500, Cristian Gafton wrote:> [...] f_name(file) results in problems handling hardlinks - and > possibly other issues as wellThis is fixed in the CVS version. Feel free to try it out. For those that don't like to use CVS, you can always rsync the latest unpacked version of the CVS source from here: rsync://rsync.samba.org/ftp/unpacked/rsync/ Also, I just re-enabled the building of the nightly tar files (and I produced a tar a few minutes ago to test my script): http://rsync.samba.org/ftp/rsync/nightly/ ..wayne..