Patches welcome, eh, Paul? Upon further (belated) investigation, there are 2 affected places in receiver.c with this error message. Both call write_file(). And write_file is called only in those two places. So that is the appropriate location to patch. Especially since the obvious fix is to use the rewrite code already there for the sparse file writes. -------------------------------------8<------------------------------------- --- fileio.c.orig Fri Jan 25 15:07:34 2002 +++ fileio.c Sat Apr 26 12:16:25 2003 @@ -69,25 +69,28 @@ return len; } int write_file(int f,char *buf,size_t len) { int ret = 0; - if (!sparse_files) { - return write(f,buf,len); - } - while (len>0) { - int len1 = MIN(len, SPARSE_WRITE_SIZE); - int r1 = write_sparse(f, buf, len1); + int r1; + if (sparse_files) { + int len1 = MIN(len, SPARSE_WRITE_SIZE); + r1 = write_sparse(f, buf, len1); + } else { + /* Normal writes are in this loop, too, so that */ + /* retries of partial writes will set errno. */ + r1 = write(f, buf, len); + } if (r1 <= 0) { if (ret > 0) return ret; return r1; } len -= r1; buf += r1; ret += r1; } return ret; -------------------------------------8<------------------------------------- Test of unpatched rsync (bogus error message): write failed on jvejunk2 : Success rsync error: error in file IO (code 11) at receiver.c(243) rsync: connection unexpectedly closed (200 bytes read so far) rsync error: error in rsync protocol data stream (code 12) at io.c(165) Test of patched rsync (correct error message): write failed on jvejunk2 : No space left on device rsync error: error in file IO (code 11) at receiver.c(243) rsync: connection unexpectedly closed (200 bytes read so far) rsync error: error in rsync protocol data stream (code 12) at io.c(165) -- John Van Essen <vanes002@umn.edu> On Thu, 7 Nov 2002 12:37:09, Green, Paul <Paul.Green@stratus.com> wrote:>John Van Essen [mailto:vanes002@umn.edu] wrote, in partial: >> In receiver.c there are three places with code similar to: >> >> if (fd != -1 && write_file(fd,data,i) != i) { >> rprintf(FERROR,"write failed on %s : %s\n",fname,strerror(errno)); >> exit_cleanup(RERR_FILEIO); >> } >> >> A partial write due to running out of room (as opposed to a write >> that fails to write any data at all) apparently does not set an >> error code (errno stays 0). This is on RedHat 7.2 btw. >[snip] >> So the rprintf lines need to be changed to something like this: >> >> rprintf(FERROR,"write failed on %s : %s\n",fname, >> strerror(errno ? errno : ENOSPC)); > >I disagree. If you read the POSIX standard, you will see that the error >code is returned on the NEXT call to write the data. POSIX requires that the >"short write" not return any error. (It can't, because it has to set the >return value to -1 to indicate the error, but it also has to set the return >value to the number of bytes written in the short write; hence, two calls >are needed). So, there are more problems with this code than just the way >it handles errors. > >See paragraph 6.4.2.1 of the POSIX-1996 standard. To quote the relevant >paragraph here: > >"If a write() requests that more bytes be written than there is room for >(for example, the physical end of a medium), only as many bytes as there is >room for shall be written. For example, suppose there is space for 20 bytes >more in a file before reaching a limit. A write of 512 bytes would return >20. The next write of a nonzero number of bytes would give a failure return >(except as noted below)." > >The exception has to do with getting interrupted by a signal and is not >interesting here. > >"Upon successful completion, write() shall return an integer indicating the >number of bytes actually written. Otherwise, it shall return a value of -1 >and set errno to indicate the error." > >"[ENOSPC] There is no free space remaining on the device containing the >file." > >So, RedHat is correct in not setting errno in this case. And you are >correct that ENOSPC is the proper error code for this case, but you only >know it is out of space via delphic knowledge. rsync can't possibly know >that. I would not want to see rsync assume it knows the proper error code; >that's the job of the OS. Instead, rsync should be recoded to continue >attempting to write data until get it gets a nonzero error code, not until >it gets a short write. > >Patches welcome. > >PG >-- >Paul Green, Senior Technical Consultant, Stratus Technologies. >Voice: +1 (978) 461-7557; FAX: +1 (978) 461-3610 >Speaking from Stratus not for Stratus
On Sat, Apr 26, 2003 at 03:09:29PM -0500, John Van Essen wrote:> Patches welcome, eh, Paul? > > Upon further (belated) investigation, there are 2 affected places > in receiver.c with this error message. Both call write_file(). > And write_file is called only in those two places. So that is the > appropriate location to patch. Especially since the obvious fix is > to use the rewrite code already there for the sparse file writes.This looks correct to me. On partial write we need to retry. The callers to write_file only test for completeness. I'd drop the added comments or put a generic "loop until all is written or error" at the top of the loop. The inline comments make the code harder to read.> -------------------------------------8<------------------------------------- > --- fileio.c.orig Fri Jan 25 15:07:34 2002 > +++ fileio.c Sat Apr 26 12:16:25 2003 > @@ -69,25 +69,28 @@ > return len; > } > > > > int write_file(int f,char *buf,size_t len) > { > int ret = 0; > > - if (!sparse_files) { > - return write(f,buf,len); > - } > - > while (len>0) { > - int len1 = MIN(len, SPARSE_WRITE_SIZE); > - int r1 = write_sparse(f, buf, len1); > + int r1; > + if (sparse_files) { > + int len1 = MIN(len, SPARSE_WRITE_SIZE); > + r1 = write_sparse(f, buf, len1); > + } else { > + /* Normal writes are in this loop, too, so that */ > + /* retries of partial writes will set errno. */ > + r1 = write(f, buf, len); > + } > if (r1 <= 0) { > if (ret > 0) return ret; > return r1; > } > len -= r1; > buf += r1; > ret += r1; > } > return ret; > -------------------------------------8<------------------------------------- > > Test of unpatched rsync (bogus error message): > > write failed on jvejunk2 : Success > rsync error: error in file IO (code 11) at receiver.c(243) > rsync: connection unexpectedly closed (200 bytes read so far) > rsync error: error in rsync protocol data stream (code 12) at io.c(165) > > > Test of patched rsync (correct error message): > > write failed on jvejunk2 : No space left on device > rsync error: error in file IO (code 11) at receiver.c(243) > rsync: connection unexpectedly closed (200 bytes read so far) > rsync error: error in rsync protocol data stream (code 12) at io.c(165) > > -- > John Van Essen <vanes002@umn.edu> > > > > > On Thu, 7 Nov 2002 12:37:09, Green, Paul <Paul.Green@stratus.com> wrote: > >John Van Essen [mailto:vanes002@umn.edu] wrote, in partial: > >> In receiver.c there are three places with code similar to: > >> > >> if (fd != -1 && write_file(fd,data,i) != i) { > >> rprintf(FERROR,"write failed on %s : %s\n",fname,strerror(errno)); > >> exit_cleanup(RERR_FILEIO); > >> } > >> > >> A partial write due to running out of room (as opposed to a write > >> that fails to write any data at all) apparently does not set an > >> error code (errno stays 0). This is on RedHat 7.2 btw. > >[snip] > >> So the rprintf lines need to be changed to something like this: > >> > >> rprintf(FERROR,"write failed on %s : %s\n",fname, > >> strerror(errno ? errno : ENOSPC)); > > > >I disagree. If you read the POSIX standard, you will see that the error > >code is returned on the NEXT call to write the data. POSIX requires that the > >"short write" not return any error. (It can't, because it has to set the > >return value to -1 to indicate the error, but it also has to set the return > >value to the number of bytes written in the short write; hence, two calls > >are needed). So, there are more problems with this code than just the way > >it handles errors. > > > >See paragraph 6.4.2.1 of the POSIX-1996 standard. To quote the relevant > >paragraph here: > > > >"If a write() requests that more bytes be written than there is room for > >(for example, the physical end of a medium), only as many bytes as there is > >room for shall be written. For example, suppose there is space for 20 bytes > >more in a file before reaching a limit. A write of 512 bytes would return > >20. The next write of a nonzero number of bytes would give a failure return > >(except as noted below)." > > > >The exception has to do with getting interrupted by a signal and is not > >interesting here. > > > >"Upon successful completion, write() shall return an integer indicating the > >number of bytes actually written. Otherwise, it shall return a value of -1 > >and set errno to indicate the error." > > > >"[ENOSPC] There is no free space remaining on the device containing the > >file." > > > >So, RedHat is correct in not setting errno in this case. And you are > >correct that ENOSPC is the proper error code for this case, but you only > >know it is out of space via delphic knowledge. rsync can't possibly know > >that. I would not want to see rsync assume it knows the proper error code; > >that's the job of the OS. Instead, rsync should be recoded to continue > >attempting to write data until get it gets a nonzero error code, not until > >it gets a short write. > > > >Patches welcome. > > > >PG > >-- > >Paul Green, Senior Technical Consultant, Stratus Technologies. > >Voice: +1 (978) 461-7557; FAX: +1 (978) 461-3610 > >Speaking from Stratus not for Stratus > > -- > To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync > Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html >-- ________________________________________________________________ J.W. Schultz Pegasystems Technologies email address: jw@pegasys.ws Remember Cernan and Schmitt
JW, (Ya know, I've been on this list for over a year and I still have no idea what your first name is. Anyway...) I moved my comment above the loop. Here's the (final) patch. If you want to change the comment before applying it, feel free to do so. --- fileio.c.orig Fri Jan 25 15:07:34 2002 +++ fileio.c Sun Apr 27 12:29:17 2003 @@ -69,25 +69,28 @@ return len; } int write_file(int f,char *buf,size_t len) { int ret = 0; - if (!sparse_files) { - return write(f,buf,len); - } - + /* Normal writes are in this loop, too, so that retries + of partial writes will eventually set errno. */ while (len>0) { - int len1 = MIN(len, SPARSE_WRITE_SIZE); - int r1 = write_sparse(f, buf, len1); + int r1; + if (sparse_files) { + int len1 = MIN(len, SPARSE_WRITE_SIZE); + r1 = write_sparse(f, buf, len1); + } else { + r1 = write(f, buf, len); + } if (r1 <= 0) { if (ret > 0) return ret; return r1; } len -= r1; buf += r1; ret += r1; } return ret; -- John Van Essen <vanes002@umn.edu> On Sat, 26 Apr 2003, jw schultz <jw@pegasys.ws> wrote:>On Sat, Apr 26, 2003 at 03:09:29PM -0500, John Van Essen wrote: >> Patches welcome, eh, Paul? >> >> Upon further (belated) investigation, there are 2 affected places >> in receiver.c with this error message. Both call write_file(). >> And write_file is called only in those two places. So that is the >> appropriate location to patch. Especially since the obvious fix is >> to use the rewrite code already there for the sparse file writes. > >This looks correct to me. On partial write we need to >retry. The callers to write_file only test for >completeness. > >I'd drop the added comments or put a generic >"loop until all is written or error" at the top of the loop. >The inline comments make the code harder to read. > >> -------------------------------------8<------------------------------------- >> --- fileio.c.orig Fri Jan 25 15:07:34 2002 >> +++ fileio.c Sat Apr 26 12:16:25 2003 >> @@ -69,25 +69,28 @@ >> return len; >> } >> >> >> >> int write_file(int f,char *buf,size_t len) >> { >> int ret = 0; >> >> - if (!sparse_files) { >> - return write(f,buf,len); >> - } >> - >> while (len>0) { >> - int len1 = MIN(len, SPARSE_WRITE_SIZE); >> - int r1 = write_sparse(f, buf, len1); >> + int r1; >> + if (sparse_files) { >> + int len1 = MIN(len, SPARSE_WRITE_SIZE); >> + r1 = write_sparse(f, buf, len1); >> + } else { >> + /* Normal writes are in this loop, too, so that */ >> + /* retries of partial writes will set errno. */ >> + r1 = write(f, buf, len); >> + } >> if (r1 <= 0) { >> if (ret > 0) return ret; >> return r1; >> } >> len -= r1; >> buf += r1; >> ret += r1; >> } >> return ret; >> -------------------------------------8<------------------------------------- >> >> Test of unpatched rsync (bogus error message): >> >> write failed on jvejunk2 : Success >> rsync error: error in file IO (code 11) at receiver.c(243) >> rsync: connection unexpectedly closed (200 bytes read so far) >> rsync error: error in rsync protocol data stream (code 12) at io.c(165) >> >> >> Test of patched rsync (correct error message): >> >> write failed on jvejunk2 : No space left on device >> rsync error: error in file IO (code 11) at receiver.c(243) >> rsync: connection unexpectedly closed (200 bytes read so far) >> rsync error: error in rsync protocol data stream (code 12) at io.c(165) >> >> -- >> John Van Essen <vanes002@umn.edu> >-- >________________________________________________________________ > J.W. Schultz Pegasystems Technologies > email address: jw@pegasys.ws > > Remember Cernan and Schmitt
On Sun, 27 Apr 2003, jw schultz <jw@pegasys.ws> wrote:>On Sun, Apr 27, 2003 at 02:40:39PM -0500, John Van Essen wrote: >> JW, >> >> (Ya know, I've been on this list for over a year and I still have >> no idea what your first name is. Anyway...) > >To imply decete is not humourous. The only amusement i've >found in people who disbelieve me when told my name is their >clumsy efforts to hide their curiosity and incapacity to >explain it. It is like watching someone attempting to >confirm that someone else isn't wearing underpants. If J.W. >(In email i leave out the punctuation) is good enough for >the DMV and the IRS it should be good enough for you.No deceit was implied and no disrespect was intended. I would have put a smiley (:D) at the end of the parenthesized sentence, but emoticons don't seem to be part of the culture of this list so I (regrettably) left it out. :-( Please accept my apologies for my awkward attempt to get to know you a little better. I will be more circumspect in the future. -- John Van Essen <vanes002@umn.edu>
Committed with block comment for whole function. On Sat, Apr 26, 2003 at 03:09:29PM -0500, John Van Essen wrote:> Upon further (belated) investigation, there are 2 affected places > in receiver.c with this error message. Both call write_file(). > And write_file is called only in those two places. So that is the > appropriate location to patch. Especially since the obvious fix is > to use the rewrite code already there for the sparse file writes. > > -------------------------------------8<------------------------------------- > --- fileio.c.orig Fri Jan 25 15:07:34 2002 > +++ fileio.c Sat Apr 26 12:16:25 2003 > @@ -69,25 +69,28 @@ > return len; > } > > > > int write_file(int f,char *buf,size_t len) > { > int ret = 0; > > - if (!sparse_files) { > - return write(f,buf,len); > - } > - > while (len>0) { > - int len1 = MIN(len, SPARSE_WRITE_SIZE); > - int r1 = write_sparse(f, buf, len1); > + int r1; > + if (sparse_files) { > + int len1 = MIN(len, SPARSE_WRITE_SIZE); > + r1 = write_sparse(f, buf, len1); > + } else { > + /* Normal writes are in this loop, too, so that */ > + /* retries of partial writes will set errno. */ > + r1 = write(f, buf, len); > + } > if (r1 <= 0) { > if (ret > 0) return ret; > return r1; > } > len -= r1; > buf += r1; > ret += r1; > } > return ret;
Apparently Analagous Threads
- [PATCH] allow to change the block size used to handle sparse files
- reducing memmoves
- DO NOT REPLY [Bug 3925] New: rsync is unable to sync large (approx 4G) sparse files
- AW: samba ldap and pam without -with-ldapsam option
- problem syncing files on RAM in embedded device