Joerg Schneider
2006-Nov-03 13:24 UTC
a strange behavior on a small memory system with tun0
Hello I am using openssh as vpn on a router with freewrt.org and rsync to sync my local data to a remote server. Everything works fine for me so far. The only problem is that the channel-1 (tun0) goes down on the router after 50-200MB transfered via tun0 or after 5-20 hours only with a ping to the server . Channel-0 is still working. Tcpdump shows that a ping from the router to the server via tun0 reaches the server but the replay from Server doesn't the router. The tunnel goes unidirectional. So far I think that must be e memory out problem on the router. What i did try till now is: -I replaced the router from Linksys 16MB ram to Asus 32MB ram (24MB free), no change. -I added 512MB swap via external usb harddrive (swap isn't even used when tun0 goes down!), no change. -I switched from openssh 4.3p2 to 4.4p1 and chiper=blowfish, no difference. -I switched from openwrt.org to freewrt.org, no difference. -I tried the HPNPatch (www.psc.edu/networking/projects/hpn-ssh) and tested with HPNBufferSize=512kb, NoChiper=yes on server and router, not better. -I tried with openvpn, works fine but i would prefer openssh for my backup solution. -I changed for testing in channels.c: debug2("channel %d: rcvd adjust %u", id, adjust); to debug2("channel %d: rcvd adjust %u is_now: %u", id, adjust, c->remote_window); debug on router shows: .. .. .. Nov 3 12:26:51 (none) user.info : debug2: channel 1: rcvd adjust 1049336 is_now: 705152^M Nov 3 12:26:59 (none) user.info : debug2: channel 1: rcvd adjust 1048663 is_now: 698457^M Nov 3 12:27:01 (none) user.info : debug2: channel 1: window 108432 sent adjust 108524^M Nov 3 12:27:06 (none) user.info : debug2: channel 1: rcvd adjust 1049535 is_now: 693233^M Nov 3 12:27:14 (none) user.info : debug2: channel 1: rcvd adjust 1048580 is_now: 697384^M Nov 3 12:27:22 (none) user.info : debug2: channel 1: rcvd adjust 1049529 is_now: 695099^M Nov 3 12:28:30 (none) user.info : debug1: client_input_channel_req: channel 1 rtype keepalive at openssh.com reply 1^M ............ tun0 (channel-1) is unidirectional now and I close channel-0 manualy for testing: ............ Nov 3 12:28:33 (none) user.info : debug2: channel 0: rcvd ext data 22^M Nov 3 12:28:33 (none) user.info : netmask: Unknown host Nov 3 12:28:33 (none) user.info : debug2: channel 0: written 22 to efd 6^M Nov 3 12:28:33 (none) user.info : debug1: client_input_channel_req: channel 0 rtype exit-status reply 0^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: rcvd eof^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: output open -> drain^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: rcvd close^M Nov 3 12:28:33 (none) user.info : debug3: channel 0: will not send data after close^M Nov 3 12:28:33 (none) user.info : debug3: channel 0: will not send data after close^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: obuf empty^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: close_write^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: output drain -> closed^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: almost dead^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: gc: notify user^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: gc: user detached^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: send close^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: is dead^M Nov 3 12:28:33 (none) user.info : debug2: channel 0: garbage collecting^M Nov 3 12:28:33 (none) user.info : debug1: channel 0: free: client-session, nchannels 2^M Nov 3 12:28:33 (none) user.info : debug3: channel 0: status: The following connections are open:^M Nov 3 12:28:33 (none) user.info : #0 client-session (t4 r0 i3/0 o3/0 fd -1/-1 cfd -1)^M Nov 3 12:28:33 (none) user.info : #1 tun (t4 r1 i0/0 o0/0 fd 7/7 cfd -1)^M Nov 3 12:31:19 (none) user.info : debug1: client_input_channel_req: channel 1 rtype keepalive at openssh.com reply 1^M Nov 3 12:31:34 (none) user.info : debug1: client_input_channel_req: channel 1 rtype keepalive at openssh.com reply 1^M Nov 3 12:31:49 (none) user.info : debug1: client_input_channel_req: channel 1 rtype keepalive at openssh.com reply 1^M Nov 3 12:32:04 (none) user.info : debug1: client_input_channel_req: channel 1 rtype keepalive at openssh.com reply 1^M Nov 3 12:32:19 (none) user.info : debug1: client_input_channel_req: channel 1 rtype keepalive at openssh.com reply 1^M .. .. .. no reply from server..... top: Mem: 17476K used, 12760K free, 0K shrd, 0K buff, 2708K cached Load average: 0.00, 0.00, 0.07 (State: S=sleeping R=running, W=waiting) PID USER STATUS RSS PPID %CPU %MEM COMMAND 11303 root R 412 11294 2.9 1.3 top 252 root S 616 1 0.0 2.0 ssh 11293 root S 572 185 0.0 1.8 dropbear 11294 root S 472 11293 0.0 1.5 sh 3462 root S 420 1 0.0 1.3 hbc_sync.sh 11292 root S 412 3462 0.0 1.3 rsync 3383 root S 388 1 0.0 1.2 crond 84 root S 352 1 0.0 1.1 syslogd 73 root S 340 1 0.0 1.1 logger 1 root S 320 0 0.0 1.0 init 185 root S 296 1 0.0 0.9 dropbear 200 root S 296 1 0.0 0.9 httpd 87 root S 288 1 0.0 0.9 klogd 77 root S 288 1 0.0 0.9 init 4 root SW 0 1 0.0 0.0 kswapd 3 root SWN 0 1 0.0 0.0 ksoftirqd_CPU0 158 root SW 0 1 0.0 0.0 usb-storage-0 95 root SW 0 1 0.0 0.0 khubd 5 root SW 0 1 0.0 0.0 bdflush 2 root SW 0 1 0.0 0.0 keventd 160 root SW 0 1 0.0 0.0 scsi_eh_0 37 root SWN 0 1 0.0 0.0 jffs2_gcd_mtd2 6 root SW 0 1 0.0 0.0 kupdated 7 root SW 0 1 0.0 0.0 cifsoplockd 10 root SW 0 1 0.0 0.0 mtdblockd My router's logfile (<-- means message from server via channel-0): Sat Jan 1 01:00:08 CET 2000 --> client start up... Sat Jan 1 01:00:08 CET 2000 --> storing server key Sat Jan 1 01:00:08 CET 2000 --> storing remote-config Sat Jan 1 01:00:08 CET 2000 --> storing DNS config Sat Jan 1 01:00:08 CET 2000 --> creating partition_key Sat Jan 1 01:00:08 CET 2000 --> starting tunnel tun0 ssh -vvv -w $TUN_ID:$TUN_ID root@$SERVER_IP -i /etc/ssh/id_rsa -o Cipher=blowfish -o Compression=no -o HPNBufferSize=512 -f /root/$REMOTE_CMD >> /tmp/hbc_log Sat Jan 1 01:00:15 CET 2000 --> ping to server Sat Jan 1 01:00:15 CET 2000 --> PING 172.27.0.1 (172.27.0.1): 56 data bytes 84 bytes from 172.27.0.1: icmp_seq=0 ttl=64 time=32.8 ms --- 172.27.0.1 ping statistics --- 1 packets transmitted, 1 packets received, 0% packet loss round-trip min/avg/max = 32.8/32.8/32.8 ms Sat Jan 1 01:00:15 CET 2000 --> storing crontabs Sat Jan 1 01:00:15 CET 2000 --> restarting cron Sat Jan 1 01:00:15 CET 2000 --> getting systime from server 172.27.0.1 Sat Jan 1 01:00:15 CET 2000 --> creating rsync_key Fri Nov 3 11:47:00 CET 2006 --> creating rsync_exclude Fri Nov 3 11:47:00 CET 2006 --> activating swap Fri Nov 3 11:47:00 CET 2006 --> swap is activated ( 249976 kb)! Fri Nov 3 11:47:00 CET 2006 --> client is ready and waiting for cron_start! Fri Nov 3 11:46:59 CET 2006 <-- Unlock partition --- key slot 0 unlocked. Fri Nov 3 11:47:01 CET 2006 <-- Server route to 192.168.3.0 255.255.255.0 via tun0 Fri Nov 3 11:47:01 CET 2006 <-- Server is ready for user: backup0 Fri Nov 3 11:52:12 CET 2006 <-- VPN is alive --- Fri Nov 3 11:56:38 CET 2006 --> rsync start up...! Fri Nov 3 11:56:38 CET 2006 --> free memory : 21004 kb Fri Nov 3 11:56:38 CET 2006 --> free swap : 249976 kb Fri Nov 3 11:56:38 CET 2006 --> tun0 statistic: RX bytes:68808 (67.1 KiB) TX bytes:69034 (67.4 KiB) Fri Nov 3 11:56:38 CET 2006 --> eth0 statistic: RX bytes:242241 (236.5 KiB) TX bytes:192526 (188.0 KiB) Fri Nov 3 11:56:38 CET 2006 --> mounting CIFS 1 ... Fri Nov 3 11:56:38 CET 2006 --> mount.cifs //192.168.3.65/admin /tmp/mnt/SAMBA Fri Nov 3 11:56:38 CET 2006 --> rsync CIFS_PATH1-1 Fri Nov 3 11:56:38 CET 2006 --> rsync -az --delete /tmp/mnt/SAMBA/samba backup0 at 172.27.0.1::backup0 Fri Nov 3 11:56:54 CET 2006 --> free memory : 15908 kb Fri Nov 3 11:56:54 CET 2006 --> free swap : 249976 kb Fri Nov 3 11:56:55 CET 2006 --> tun0 statistic: RX bytes:96641 (94.3 KiB) TX bytes:512489 (500.4 KiB) Fri Nov 3 11:56:55 CET 2006 --> eth0 statistic: RX bytes:3029525 (2.8 MiB) TX bytes:844158 (824.3 KiB) Fri Nov 3 12:02:44 CET 2006 <-- VPN is alive --- rsync 29736 backup0 7u REG 253,5 0 3604516 /home/backup0/daily.0/profiles/Anwendungsdaten/Thunderbird/Profiles/uemq90yi.default/Mail/Local Folders/.Inbox.jgORXx Fri Nov 3 12:08:02 CET 2006 <-- VPN is alive --- rsync 29736 backup0 7u REG 253,5 0 3604522 /home/backup0/daily.0/profiles/Anwendungsdaten/Thunderbird/Profiles/uemq90yi.default/Mail/Local Folders/.Trash.F8Nsbv Fri Nov 3 12:12:41 CET 2006 --> free memory : 5012 kb Fri Nov 3 12:12:41 CET 2006 --> free swap : 248832 kb Fri Nov 3 12:12:41 CET 2006 --> tun0 statistic: RX bytes:817137 (797.9 KiB) TX bytes:9740160 (9.2 MiB) Fri Nov 3 12:12:41 CET 2006 --> eth0 statistic: RX bytes:592136574 (564.7 MiB) TX bytes:37319118 (35.5 MiB) Fri Nov 3 12:12:41 CET 2006 --> unmounting CIFS 1 ... Fri Nov 3 12:12:42 CET 2006 --> mounting CIFS 2 ... Fri Nov 3 12:12:42 CET 2006 --> mount.cifs //192.168.3.65/home_admin /tmp/mnt/HOME Fri Nov 3 12:12:42 CET 2006 --> rsync CIFS_PATH2-1 Fri Nov 3 12:12:42 CET 2006 --> rsync -az --delete /tmp/mnt/HOME backup0 at 172.27.0.1::backup0 Fri Nov 3 12:13:23 CET 2006 <-- VPN is alive --- rsync 2255 backup0 7u REG 253,5 7340032 3342346 /home/backup0/daily.0/HOME/.backup1.tar.eoRhST Fri Nov 3 12:18:55 CET 2006 <-- VPN is alive --- rsync 2255 backup0 7u REG 253,5 37486592 3342358 /home/backup0/daily.0/HOME/.backup2.tar.jqlgVH Fri Nov 3 12:24:29 CET 2006 <-- VPN is alive --- rsync 2255 backup0 7u REG 253,5 81002496 3342358 /home/backup0/daily.0/HOME/.backup2.tar.jqlgVH Fri Nov 3 12:27:33 CET 2006 <-- !!! VPN is offline (count:1) !!! Fri Nov 3 12:27:35 CET 2006 <-- !!! VPN is offline (count:2) !!! Fri Nov 3 12:27:37 CET 2006 <-- !!! VPN is offline (count:3) !!! Fri Nov 3 12:27:39 CET 2006 <-- !!! VPN is offline (count:4) !!! Fri Nov 3 12:27:41 CET 2006 <-- !!! VPN is offline (count:5) !!! Fri Nov 3 12:27:43 CET 2006 <-- !!! VPN is offline (count:6) !!! Fri Nov 3 12:27:45 CET 2006 <-- !!! VPN is offline (count:7) !!! Fri Nov 3 12:27:47 CET 2006 <-- !!! VPN is offline (count:8) !!! Fri Nov 3 12:27:49 CET 2006 <-- !!! VPN is offline (count:9) !!! Fri Nov 3 12:27:51 CET 2006 <-- !!! VPN is offline (count:10) !!! Client backup0 is down! Fri Nov 3 12:28:33 CET 2006 <-- killing process id: 2137 for user: backup0 Fri Nov 3 12:28:33 CET 2006 <-- killing process id: 2255 for user: backup0 Fri Nov 3 12:28:33 CET 2006 <-- Server shutdown for user: backup0 completed! Fri Nov 3 12:30:01 CET 2006 --> free memory : 5154 kb Fri Nov 3 12:30:01 CET 2006 --> free swap : 248832 kb Fri Nov 3 12:30:01 CET 2006 --> tun0 statistic: RX bytes:2653208 (2.5 MiB) TX bytes:123763582 (118.0 MiB) Fri Nov 3 12:30:01 CET 2006 --> eth0 statistic: RX bytes:732883585 (698.9 MiB) TX bytes:170591978 (162.6 MiB) The server is gentoo on AMD with 1024kb RAM and only for my backups at my ISP. I don't know what to do next....any Idea? Many thanks in advanced Joerg Schneider