Hi,

I have two ftp servers. Each has one interface on each of the same two external lines, connected through the same routers. Both do passive ftp fine on all four external IPs involved. One does active ftp fine on both lines. The other fails at active ftp on just one line. I'm trying to figure out what's wrong with that one server on that one line, but so far I can't find the logic of it.

The ip rules and routing tables on the two servers are identical. Both servers run precisely the same iptables rules, and nothing pertinent is being logged as dropped. Even with the firewall dropped entirely the problem remains, so iptables has nothing to do with it.

Both run ProFTPd with precisely the same configuration file. I switched the problem server to Pure-FTPd as a test, and the behavior was the same. No error shows up in the ftp daemon logs, even at the highest debug level. It makes no difference which ftp client is run, or whether a firewall is up at the client end. The problem is not intermittent; it is there 100% of the time.

There are some minor differences in the kernels: the problem system is running 2.4.20 and has some features as modules that the non-problem system, with a 2.4.21 kernel, has compiled directly into the kernel. It's also possible that some kernel switch is set differently between the two. Is there anything there to look for that could account for the difference in behavior?
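One way to check for a differently set kernel switch is to save the output of `sysctl -a` on each machine and compare the two files. The following is only a minimal sketch of that comparison: the function names and the sample values are hypothetical and are not taken from either server.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines, as printed by `sysctl -a`, into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank lines and anything that is not a setting
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(good, bad):
    """Return {key: (good_value, bad_value)} for every key that differs.

    A key that is missing on one side is reported with None for that side.
    """
    diffs = {}
    for key in sorted(set(good) | set(bad)):
        if good.get(key) != bad.get(key):
            diffs[key] = (good.get(key), bad.get(key))
    return diffs


# Hypothetical sample dumps, invented for illustration only.
GOOD_SAMPLE = """\
net.ipv4.ip_forward = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.conf.eth2.rp_filter = 0
"""
BAD_SAMPLE = """\
net.ipv4.ip_forward = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.conf.eth2.rp_filter = 1
net.ipv4.conf.eth2.log_martians = 0
"""

for key, (good_value, bad_value) in diff_settings(
    parse_sysctl(GOOD_SAMPLE), parse_sysctl(BAD_SAMPLE)
).items():
    print(f"{key}: working={good_value} failing={bad_value}")
```

With real dumps, the sample strings would be replaced by the contents of the two saved files. Per-interface keys are worth a close look here, since the failure is limited to one interface.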
When I run tcpdump on server and client it looks like this for an "ls" following login on the failing active ftp connection:

# tcpdump -i eth2 src or dst port ftp-data
tcpdump: listening on eth2
14:23:19.900936 [eth2 ip].ftp-data > [remote ip].33050: S 982153112:2982153112(0) win 5840 <mss 1460,sackOK,timestamp 61754850 ,nop,wscale 0> (DF)
14:23:19.962487 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46588689 61754850,nop,wscale 0> (DF)
14:23:23.364417 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46592089 61754850,nop,wscale 0> (DF)
14:23:25.901169 [eth2 ip].ftp-data > [remote ip].33050: S 982153112:2982153112(0) win 5840 <mss 1460,sackOK,timestamp 61755450 ,nop,wscale 0> (DF)
14:23:25.938500 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46594668 61754850,nop,wscale 0> (DF)
14:23:29.358543 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46598089 61754850,nop,wscale 0> (DF)
14:23:37.901654 [eth2 ip].ftp-data > [remote ip].33050: S 982153112:2982153112(0) win 5840 <mss 1460,sackOK,timestamp 61756650 ,nop,wscale 0> (DF)
14:23:37.989714 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46606666 61754850,nop,wscale 0> (DF)
14:23:41.559774 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46610289 61754850,nop,wscale 0> (DF)
14:24:01.902622 [eth2 ip].ftp-data > [remote ip].33050: S 982153112:2982153112(0) win 5840 <mss 1460,sackOK,timestamp 61759050 ,nop,wscale 0> (DF)
14:24:01.942078 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46630673 61754850,nop,wscale 0> (DF)
14:24:05.560151 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46634289 61754850,nop,wscale 0> (DF)
14:24:49.904552 [eth2 ip].ftp-data > [remote ip].33050: S 982153112:2982153112(0) win 5840 <mss 1460,sackOK,timestamp 61763850 ,nop,wscale 0> (DF)
14:24:49.948785 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46678675 61754850,nop,wscale 0> (DF)
14:24:53.764910 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46682489 61754850,nop,wscale 0> (DF)

442 packets received by filter
0 packets dropped by kernel

For a successful one:

# tcpdump -i eth1 src or dst port ftp-data
tcpdump: listening on eth1
14:28:45.894017 [eth1 ip].ftp-data > [remote ip].33052: S 3335994396:3335994396(0) win 5840 <mss 1460,sackOK,timestamp 61787448 0,nop,wscale 0> (DF)
14:28:45.937242 [remote ip].33052 > [eth1 ip].ftp-data: S 411111188:411111188(0) ack 3335994397 win 5792 <mss 1460,sackOK,timestamp 146914689 61787448,nop,wscale 0> (DF)
14:28:45.937366 [eth1 ip].ftp-data > [remote ip].33052: . ack 1 win 5840 <nop,nop,timestamp 61787452 146914689> (DF)
14:28:45.954203 [eth1 ip].ftp-data > [remote ip].33052: P 1:999(998) ack 1 win 5840 <nop,nop,timestamp 61787454 146914689> (DF)
14:28:45.954390 [eth1 ip].ftp-data > [remote ip].33052: . 999:2447(1448) ack 1 win 5840 <nop,nop,timestamp 61787454 146914689> (DF)
14:28:45.954482 [eth1 ip].ftp-data > [remote ip].33052: FP 2447:3498(1051) ack 1 win 5840 <nop,nop,timestamp 61787454 146914689> (DF)
14:28:46.071125 [remote ip].33052 > [eth1 ip].ftp-data: . ack 999 win 6986 <nop,nop,timestamp 146914824 61787454> (DF) [tos 0x8]
14:28:46.102387 [remote ip].33052 > [eth1 ip].ftp-data: . ack 2447 win 10136 <nop,nop,timestamp 146914853 61787454> (DF) [tos 0x8]
14:28:46.126538 [remote ip].33052 > [eth1 ip].ftp-data: F 1:1(0) ack 3499 win 13032 <nop,nop,timestamp 146914875 61787454> (DF) [tos 0x8]
14:28:46.126568 [eth1 ip].ftp-data > [remote ip].33052: . ack 2 win 5840 <nop,nop,timestamp 61787471 146914875> (DF)

597 packets received by filter
0 packets dropped by kernel

For the other machine, on the line that fails on the first machine:

# tcpdump -i eth2 src or dst port ftp-data
tcpdump: listening on eth2
14:36:02.825949 [eth2 ip].ftp-data > [remote ip].33057: S 151358595:151358595(0) win 5840 <mss 1460,sackOK,timestamp 249200916 0,nop,wscale 0> (DF)
14:36:02.862518 [remote ip].33057 > [eth2 ip].ftp-data: S 868593832:868593832(0) ack 151358596 win 5792 <mss 1460,sackOK,timestamp 147351489 249200916,nop,wscale 0> (DF)
14:36:02.862612 [eth2 ip].ftp-data > [remote ip].33057: . ack 1 win 5840 <nop,nop,timestamp 249200919 147351489> (DF)
14:36:02.886757 [eth2 ip].ftp-data > [remote ip].33057: P 1:997(996) ack 1 win 5840 <nop,nop,timestamp 249200922 147351489> (DF)
14:36:02.887043 [eth2 ip].ftp-data > [remote ip].33057: . 997:2445(1448) ack 1 win 5840 <nop,nop,timestamp 249200922 147351489> (DF)
14:36:02.887100 [eth2 ip].ftp-data > [remote ip].33057: FP 2445:2678(233) ack 1 win 5840 <nop,nop,timestamp 249200922 147351489> (DF)
14:36:02.955549 [remote ip].33057 > [eth2 ip].ftp-data: . ack 997 win 6972 <nop,nop,timestamp 147351584 249200922> (DF)
14:36:02.985489 [remote ip].33057 > [eth2 ip].ftp-data: . ack 2445 win 10136 <nop,nop,timestamp 147351614 249200922> (DF)
14:36:02.994463 [remote ip].33057 > [eth2 ip].ftp-data: F 1:1(0) ack 2679 win 10136 <nop,nop,timestamp 147351622 249200922> (DF)
14:36:02.994501 [eth2 ip].ftp-data > [remote ip].33057: . ack 2 win 5840 <nop,nop,timestamp 249200932 147351622> (DF)

183 packets received by filter
0 packets dropped by kernel

From the client the failing instance looks the same as from the server:

# tcpdump -i eth0 src or dst port ftp-data
tcpdump: listening on eth0
14:45:44.710894 [server].ftp-data > [client].33059: S 100058036:100058036(0) win 5840 <mss 1460,sackOK,timestamp 61889323 0,nop,wscale 0> (DF)
14:45:44.710978 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147933307 61889323,nop,wscale 0> (DF)
14:45:48.910611 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147937507 61889323,nop,wscale 0> (DF)
14:45:50.692647 [server].ftp-data > [client].33059: S 100058036:100058036(0) win 5840 <mss 1460,sackOK,timestamp 61889923 0,nop,wscale 0> (DF)
14:45:50.692706 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147939288 61889323,nop,wscale 0> (DF)
14:45:54.911425 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147943507 61889323,nop,wscale 0> (DF)
14:46:02.699413 [server].ftp-data > [client].33059: S 100058036:100058036(0) win 5840 <mss 1460,sackOK,timestamp 61891123 0,nop,wscale 0> (DF)
14:46:02.699487 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147951294 61889323,nop,wscale 0> (DF)
14:46:06.913038 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147955507 61889323,nop,wscale 0> (DF)
14:46:26.676678 [server].ftp-data > [client].33059: S 100058036:100058036(0) win 5840 <mss 1460,sackOK,timestamp 61893523 0,nop,wscale 0> (DF)
14:46:26.676738 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147975268 61889323,nop,wscale 0> (DF)
14:46:31.116248 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 147979707 61889323,nop,wscale 0> (DF)
14:47:14.696089 [server].ftp-data > [client].33059: S 100058036:100058036(0) win 5840 <mss 1460,sackOK,timestamp 61898323 0,nop,wscale 0> (DF)
14:47:14.696143 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 148023273 61889323,nop,wscale 0> (DF)
14:47:19.331232 [client].33059 > [server].ftp-data: S 1480641926:1480641926(0) ack 100058037 win 5792 <mss 1460,sackOK,timestamp 148027907 61889323,nop,wscale 0> (DF)

And a successful instance:

14:51:10.285392 [server].ftp-data > [client].33061: S 465767684:465767684(0) win 5840 <mss 1460,sackOK,timestamp 61921873 0,nop,wscale 0> (DF)
14:51:10.285476 [client].33061 > [server].ftp-data: S 1835411337:1835411337(0) ack 465767685 win 5792 <mss 1460,sackOK,timestamp 148258803 61921873,nop,wscale 0> (DF)
14:51:10.315404 [server].ftp-data > [client].33061: . ack 1 win 5840 <nop,nop,timestamp 61921876 148258803> (DF)
14:51:10.361253 [server].ftp-data > [client].33061: P 1:999(998) ack 1 win 5840 <nop,nop,timestamp 61921877 148258803> (DF)
14:51:10.361289 [client].33061 > [server].ftp-data: . ack 999 win 6986 <nop,nop,timestamp 148258878 61921877> (DF) [tos 0x8]
14:51:10.391621 [server].ftp-data > [client].33061: . 999:2447(1448) ack 1 win 5840 <nop,nop,timestamp 61921877 148258803> (DF)
14:51:10.391682 [client].33061 > [server].ftp-data: . ack 2447 win 10136 <nop,nop,timestamp 148258909 61921877> (DF) [tos 0x8]
14:51:10.412376 [server].ftp-data > [client].33061: FP 2447:3498(1051) ack 1 win 5840 <nop,nop,timestamp 61921877 148258803> (DF)
14:51:10.412633 [client].33061 > [server].ftp-data: F 1:1(0) ack 3499 win 13032 <nop,nop,timestamp 148258930 61921877> (DF) [tos 0x8]
14:51:10.449436 [server].ftp-data > [client].33061: . ack 2 win 5840 <nop,nop,timestamp 61921888 148258930> (DF)

What I can make out is that where it fails, the ftp server opens the data connection with a SYN from port 20 to a high port on the client, and the client answers with a SYN-ACK, but the server never sends the final ACK, even though tcpdump shows the SYN-ACK arriving on the server's interface. So the client retransmits its SYN-ACK, the server retransmits its SYN, and they get into a dance of failed negotiation. In the instances that work, the server completes the handshake and goes ahead and sends the "ls" data as requested.

So it's not the daemon (the behavior is the same with two different ftp daemons, and the daemons work fine on the other line, and on both lines on the second system), it's not the line (the other system handles both lines well for this), it's not the routing (both systems have the same rules and routing tables), and it's not the client (the problem is the same no matter which client software or OS).

Any guesses as to what it _is_ will be most appreciated.

Whit
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
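The pattern described above (SYN out, SYN-ACK back, no final ACK) can be picked out of a saved tcpdump text log mechanically. The following is a rough sketch only: it assumes the old-style one-line-per-packet tcpdump format shown in this post, and the regular expression, function name, and status strings are my own inventions, not part of any tool.

```python
import re

# time, source, destination, flags, remainder of the line
LINE_RE = re.compile(
    r"^(?P<time>\S+) (?P<src>.+?) > (?P<dst>.+?): (?P<flags>\S+) (?P<rest>.*)$"
)


def handshake_status(lines):
    """Classify each connection seen in tcpdump text output.

    Returns a dict mapping (endpoint_a, endpoint_b), sorted, to one of
    'established', 'stalled after SYN-ACK', 'SYN unanswered', 'no SYN seen'.
    """
    conns = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # header and summary lines do not look like packets
        src, dst = match.group("src"), match.group("dst")
        flags, rest = match.group("flags"), match.group("rest")
        key = tuple(sorted((src, dst)))
        state = conns.setdefault(
            key, {"initiator": None, "syn": 0, "synack": 0, "established": False}
        )
        has_ack = re.search(r"\back \d+", rest) is not None
        if "S" in flags and not has_ack:
            # Plain SYN: the sender is opening the connection.
            if state["initiator"] is None:
                state["initiator"] = src
            state["syn"] += 1
        elif "S" in flags and has_ack:
            # SYN-ACK from the other side.
            state["synack"] += 1
        elif src == state["initiator"] and state["synack"] > 0:
            # Any non-SYN packet from the initiator after a SYN-ACK means
            # the handshake completed.
            state["established"] = True

    result = {}
    for key, state in conns.items():
        if state["established"]:
            result[key] = "established"
        elif state["synack"] > 0:
            result[key] = "stalled after SYN-ACK"
        elif state["syn"] > 0:
            result[key] = "SYN unanswered"
        else:
            result[key] = "no SYN seen"
    return result


# The first packets of the failing and the working server-side traces above.
SAMPLE = """\
14:23:19.900936 [eth2 ip].ftp-data > [remote ip].33050: S 982153112:2982153112(0) win 5840 <mss 1460,sackOK,timestamp 61754850 ,nop,wscale 0> (DF)
14:23:19.962487 [remote ip].33050 > [eth2 ip].ftp-data: S 8335459:78335459(0) ack 2982153113 win 5792 <mss 1460,sackOK,timestamp 46588689 61754850,nop,wscale 0> (DF)
14:28:45.894017 [eth1 ip].ftp-data > [remote ip].33052: S 3335994396:3335994396(0) win 5840 <mss 1460,sackOK,timestamp 61787448 0,nop,wscale 0> (DF)
14:28:45.937242 [remote ip].33052 > [eth1 ip].ftp-data: S 411111188:411111188(0) ack 3335994397 win 5792 <mss 1460,sackOK,timestamp 146914689 61787448,nop,wscale 0> (DF)
14:28:45.937366 [eth1 ip].ftp-data > [remote ip].33052: . ack 1 win 5840 <nop,nop,timestamp 61787452 146914689> (DF)
"""

for endpoints, status in handshake_status(SAMPLE.splitlines()).items():
    print(endpoints, status)
```

A connection reported as stalled even though the SYN-ACK shows up in the server-side capture suggests the packet reaches the interface but is discarded before the TCP layer processes it, which is the behavior seen on eth2 of the problem machine.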
Whit Blauvelt
2003-Sep-20 17:43 UTC
Re: active ftp - failing on just one of two interfaces
Meta question: Can someone please suggest the right forum in which to seek an answer to my question (in the post this is a reply to)?

I know it's not at the ftp daemon level, and I haven't gotten an answer in the appropriate forums there. It's _probably_ not a routing issue (since the same rules and routes work fine for a second server). It _may_ be a kernel issue (but why would the kernel work fine for the one line and not the other?). It's _not_ a firewall issue (dropping the firewalls has no effect). And it's a puzzle I need to solve.

Thanks,
Whit