John R. LoVerso
2003-May-20 23:49 UTC
patch for better handling of write failures (disk full)
I've been having problems trying to sync two small partitions (128MB) that may be nearly full. If rsync gets a write error (such as the one caused by filling up a partition) during a sync without "-T", it stops with this error:

  rsync: writefd_unbuffered failed to write 4 bytes: phase "unknown": Broken pipe
  rsync error: error in rsync protocol data stream (code 12) at io.c(515)

That's because the code in receive_data() just calls exit_cleanup() upon a write error, which bombs out the receiver. For whatever reason, the sender doesn't correctly pick this up, and it in turn aborts.

People have "worked around" this problem by using "-T", putting the received file in temp space (assuming it's big enough). That allows the receive to complete without error, but then finish_transfer() will get an error in copy_file() when copying from the temp space to the destination (which may presumably still fill up). At least this tells you about the error and doesn't blow up, but it always leaves the partially transferred file behind. (In my case, if it fills the small partition, I don't want the partially transferred file around.)

Here are two patches ("that work for me, YMMV"):

  receiver.c: upon a write error, just discard the rest of the current file transfer and keep working.
  rsync.c: if using -T and not --partial, remove the partial result.

Perhaps the changes in receive_data() could specifically target just ENOSPC, on the assumption that any other write error is fatal.
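To make that ENOSPC-only idea concrete, here is a minimal sketch of how the decision might be factored out. It is not part of the patches in this message and is untested against rsync itself; the names on_write_error() and enum write_action are made up for illustration. The caller is assumed to have saved errno right after the failed write_file() call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch; not rsync identifiers. */
enum write_action {
	WRITE_ACTION_DISCARD = 0,	/* drop this file, keep the transfer going */
	WRITE_ACTION_FATAL = 1		/* caller should bail out, e.g. via exit_cleanup() */
};

/*
 * Decide what to do after a failed write, given the saved errno.
 * On ENOSPC: mark the file as discarded and stop writing to it
 * (fd = -1), which is what the receiver.c patch does for every error.
 * On any other error: leave the state untouched and report it as fatal.
 */
static enum write_action on_write_error(int err, int *fd, int *discard)
{
	assert(fd != NULL && discard != NULL);

	if (err == ENOSPC) {
		*discard = 1;
		*fd = -1;
		return WRITE_ACTION_DISCARD;
	}
	return WRITE_ACTION_FATAL;
}
```

In receive_data(), each of the three write-failure branches would then call something like on_write_error(errno, &fd, &discard) after the rprintf(), and call exit_cleanup(RERR_FILEIO) only when it returns WRITE_ACTION_FATAL.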
I'm also using John Van Essen's write_file() patch from:

  http://lists.samba.org/pipermail/rsync/2003-April/010511.html

diff -Nru a/rsync/receiver.c b/rsync/receiver.c
--- a/rsync/receiver.c	Tue May 20 08:56:43 2003
+++ b/rsync/receiver.c	Tue May 20 08:56:43 2003
@@ -214,6 +214,7 @@
 	static char file_sum1[MD4_SUM_LENGTH];
 	static char file_sum2[MD4_SUM_LENGTH];
 	char *map=NULL;
+	int discard = 0;
 
 	count = read_int(f_in);
 	n = read_int(f_in);
@@ -240,7 +241,9 @@
 
 			if (fd != -1 && write_file(fd,data,i) != i) {
 				rprintf(FERROR,"write failed on %s : %s\n",fname,strerror(errno));
-				exit_cleanup(RERR_FILEIO);
+				discard = 1;
+				fd = -1;
+				// exit_cleanup(RERR_FILEIO);
 			}
 			offset += i;
 			continue;
@@ -268,7 +271,9 @@
 		if (fd != -1 && write_file(fd,map,len) != (int) len) {
 			rprintf(FERROR,"write failed on %s : %s\n",
 				fname,strerror(errno));
-			exit_cleanup(RERR_FILEIO);
+			discard = 1;
+			fd = -1;
+			// exit_cleanup(RERR_FILEIO);
 		}
 		offset += len;
 	}
@@ -278,7 +283,9 @@
 	if (fd != -1 && offset > 0 && sparse_end(fd) != 0) {
 		rprintf(FERROR,"write failed on %s : %s\n",
 			fname,strerror(errno));
-		exit_cleanup(RERR_FILEIO);
+		discard = 1;
+		fd = -1;
+		// exit_cleanup(RERR_FILEIO);
 	}
 
 	sum_end(file_sum1);
@@ -293,6 +300,8 @@
 			return 0;
 		}
 	}
+	if (discard)
+		return 2;
 	return 1;
 }
 
@@ -458,6 +467,16 @@
 			close(fd1);
 		}
 		close(fd2);
+
+		/*
+		 * This means a write error occured, and the file is discarded
+		 */
+		if (recv_ok == 2) {
+			if (verbose > 2)
+				rprintf(FINFO,"discarding %s\n",fname);
+			do_unlink(fnametmp);
+			cleanup_disable();
+		} else {
 
 		if (verbose > 2)
 			rprintf(FINFO,"renaming %s to %s\n",fnametmp,fname);
@@ -476,6 +495,7 @@
 				write_int(f_gen,i);
 			}
 		}
+		}
 	}
 
 	if (delete_after) {
diff -Nru a/rsync/rsync.c b/rsync/rsync.c
--- a/rsync/rsync.c	Tue May 20 08:56:43 2003
+++ b/rsync/rsync.c	Tue May 20 08:56:43 2003
@@ -243,8 +243,14 @@
 			/* rename failed on cross-filesystem link.
 			   Copy the file instead. */
 			if (copy_file(fnametmp,fname, file->mode & INITACCESSPERMS)) {
-				rprintf(FERROR,"copy %s -> %s : %s\n",
+				int err = errno;
+				extern int keep_partial;
+				rprintf(FERROR,"error copy %s -> %s : %s\n",
 					fnametmp,fname,strerror(errno));
+				/* remove partial result if disk full */
+				if (err == ENOSPC && !keep_partial) {
+					(void)unlink(fname);
+				}
 			} else {
 				set_perms(fname,file,NULL,0);
 			}

John