Hi all,
I need to back up about 50 files, the size of which won't exceed 5 MB,
every 10 to 15 minutes to four remote machines.
The backup command is written in a shell script, which is executed by
the scheduling program with the system() function. The scheduling
program is implemented in C++.
The command is as follows:
rsync -az /home/admin/service/* admin@10.249.49.101:/news_hot_data
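For readers following along, here is a minimal sketch of how a C++
scheduler might run such a script with system() and decode the result.
The scheduler's real code is not shown in this thread, so the function
name run_command is hypothetical, and so is any script path passed to it.

```cpp
#include <cstdlib>
#include <string>
#include <sys/wait.h>

// Hypothetical helper (not from the original scheduler): runs a shell
// command with system() and returns its exit code. Returns -1 if the
// shell could not be started or the command was killed by a signal.
int run_command(const std::string& command) {
    int status = std::system(command.c_str());
    if (status == -1) {
        return -1;                      // fork/exec of the shell failed
    }
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);     // normal exit: report the exit code
    }
    return -1;                          // terminated by a signal
}
```

A call such as run_command("/path/to/backup.sh") (path is a placeholder)
blocks until the script returns, so a hung rsync also hangs the caller.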
*At first, this works just fine.* But after about one or two days,
rsync stops at some point and the whole backup process hangs.
The following is the output with the -vv option when the backup stops:
opening connection using: ssh -l admin 10.249.49.101 rsync --server -vvlogDtprze.isf . /news_hot_data
I used strace to trace the system calls, and I found that select() was
called again and again, never ending until the program was killed with
Ctrl+C.
The following is the output:
......
18477 select(1027, [255 1024], [], NULL, NULL) = 1 (in [1024])
18477 read(1024, "\36\0\0\0", 16384) = 4
18477 select(1027, [255 1024], [255], NULL, NULL) = 1 (out [255])
18477 write(255, "]\306\304\2315\r\346\314\26]\2\275\350|X\305X\216\361\"\301}\t\34\213\357GPS\360\214\370"..., 48) = 48
18477 select(1027, [255 1024], [], NULL, NULL) = 1 (in [255])
18477 read(255, "l\210\377V\20\270\0270\276@\363N\366\n!\311\211\312\206\216\25\3\1\323\375\370\24\0174lM\312"..., 8192) = 48
18477 select(1027, [255 1024], [1025], NULL, NULL) = 1 (out [1025])
18477 write(1025, "\35\0\0\0\335\226\333L", 8) = 8
18477 select(1027, [255 1024], [], NULL, NULL <unfinished ...>
18476 <... select resumed> ) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
18476 select(1026, [], [], NULL, {60, 0}) = 0 (Timeout)
......
I read the Linux manual page for select() and understand that it is
used to wait for something to become ready, but I don't know what
exactly it is waiting for, or why it never becomes ready.
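As a reading aid for the trace: in strace's notation, a call like
select(1027, [255 1024], [], NULL, NULL) waits with no timeout until
descriptor 255 or 1024 is readable, while select(1026, [], [], NULL,
{60, 0}) passes empty sets with a 60-second timeout, so it can only
return 0 (Timeout). The sketch below shows the same call for a single
descriptor. The name wait_readable is hypothetical, and the guard is
there because an fd_set is only defined for descriptors below
FD_SETSIZE.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: waits up to timeout_sec seconds for fd to become
// readable. Returns 1 if readable, 0 on timeout, -1 on error, which is
// the same convention select() itself uses for a single descriptor.
int wait_readable(int fd, int timeout_sec) {
    // select() cannot represent descriptors at or above FD_SETSIZE.
    if (fd < 0 || fd >= FD_SETSIZE) {
        return -1;
    }
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    timeval tv;
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    // First argument is the highest descriptor in any set, plus one.
    return select(fd + 1, &readfds, nullptr, nullptr, &tv);
}
```

On an empty pipe this returns 0 once the timeout expires, and it returns
1 as soon as the other end has written something.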
*What confuses me even more is that when I run the script from a
terminal, it still works!*
I have been stuck on this for a few days and have tried to find the
reason with Google, but failed.
Thanks,
James Li

Thanks for the help from Mr. K S Braunsdorf. This problem was caused by
an open pipe in the C++ program that should have been closed.

Thanks,
James Li

2010/11/11 jian li <lijian06nju at gmail.com>
> The back up command is written in a shell script file and was
> executed by the scheduling program with system() function. The scheduling
> program is implemented with c++.
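
The thread does not show how the pipe was left open, so the following is
only a sketch of one way to avoid that kind of leak, not the fix that
was actually applied. It assumes the scheduler creates pipes itself; the
class name ScopedPipe is hypothetical. Both ends are closed when the
object goes out of scope, and both are marked close-on-exec so that a
shell started through system() does not inherit them. One plausible
reading of the trace, not confirmed in the thread, is that leaked
descriptors pushed new descriptor numbers up to 1024 and 1025, which is
at or above the usual FD_SETSIZE of 1024 on Linux, and that a fresh
terminal session works because it does not carry those inherited
descriptors.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical RAII wrapper (not from the original program): owns both
// ends of a pipe and closes whichever ends are still open on destruction.
class ScopedPipe {
public:
    ScopedPipe() {
        fds_[0] = fds_[1] = -1;
        if (pipe(fds_) != 0) {
            fds_[0] = fds_[1] = -1;     // creation failed; ok() reports false
            return;
        }
        // Close-on-exec: children that exec (e.g. via system()) do not
        // inherit these descriptors.
        fcntl(fds_[0], F_SETFD, FD_CLOEXEC);
        fcntl(fds_[1], F_SETFD, FD_CLOEXEC);
    }
    ~ScopedPipe() {
        close_end(0);
        close_end(1);
    }
    ScopedPipe(const ScopedPipe&) = delete;
    ScopedPipe& operator=(const ScopedPipe&) = delete;

    bool ok() const { return fds_[0] >= 0; }
    int read_fd() const { return fds_[0]; }
    int write_fd() const { return fds_[1]; }

    // Closes one end (0 = read, 1 = write) if it is still open.
    void close_end(int i) {
        if (fds_[i] >= 0) {
            close(fds_[i]);
            fds_[i] = -1;
        }
    }

private:
    int fds_[2];
};
```

With a wrapper like this, creating a pipe on every scheduling cycle
reuses the same low descriptor numbers instead of climbing toward the
process limit.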