Hi all,

I'm running a 6.2-RC1 box (cvsup'd today) that has two broadcom nics. One is an internal network (nfs) and the other is external.

PF has this rule for all traffic on the private net:

[root@archive /home/jails]# pfctl -sr|grep bge1
pass in quick on bge1 inet from 192.168.1.0/24 to any
pass out quick on bge1 inet from any to 192.168.1.0/24

No state since these are "quick" and symmetrical.

Doing something like "ls /usr/ports" will just hang until interrupted. Using tcp for nfs makes it workable, but very slow.

If I disable pf (pfctl -d), both types of mounts work, and speed is excellent. I also just found that if I remove the "scrub in all" statement and change it to "scrub in on bge0", things are fine.

Any idea what's going on? The tcpdump output confuses me (see "bad cksum!"), so I'm posting some snippets here.

Looking at tcpdump, things look a bit odd. 192.168.1.111 is the nfs client (6.2-RC1), 192.168.1.100 is the nfs server (4.11):

[root@archive /home/spork]# tcpdump -i bge1 -v
tcpdump: listening on bge1, link-type EN10MB (Ethernet), capture size 96 bytes
00:59:16.269659 IP (tos 0x0, ttl 64, id 5395, offset 0, flags [none], proto: UDP (17), length: 132, bad cksum 0 (->e132)!)
    192.168.1.111.1861387036 > 192.168.1.100.nfs: 104 access [|nfs]

bad checksum before even hitting the wire??

00:59:16.269920 IP (tos 0x0, ttl 64, id 46705, offset 0, flags [none], proto: UDP (17), length: 148)
    192.168.1.100.nfs > 192.168.1.111.1861387036: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]

We get a reply (dir is mode 755)

00:59:16.270010 IP (tos 0x0, ttl 64, id 5396, offset 0, flags [none], proto: UDP (17), length: 132, bad cksum 0 (->e131)!)
    192.168.1.111.1861387037 > 192.168.1.100.nfs: 104 access [|nfs]

Again, bad checksum FROM nfs client to server...

00:59:16.270211 IP (tos 0x0, ttl 64, id 58236, offset 0, flags [none], proto: UDP (17), length: 148)
    192.168.1.100.nfs > 192.168.1.111.1861387037: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
00:59:16.270306 IP (tos 0x0, ttl 64, id 5397, offset 0, flags [none], proto: UDP (17), length: 132, bad cksum 0 (->e130)!)
    192.168.1.111.1861387038 > 192.168.1.100.nfs: 104 access [|nfs]

Now to confuse things further, if I disable pf (pfctl -d), speeds are great, but I still get these bad checksum errors:

01:04:21.498293 IP (tos 0x0, ttl 64, id 5482, offset 0, flags [none], proto: UDP (17), length: 132, bad cksum 0 (->e0db)!)
    192.168.1.111.1861387048 > 192.168.1.100.nfs: 104 access [|nfs]
01:04:21.498607 IP (tos 0x0, ttl 64, id 16228, offset 0, flags [none], proto: UDP (17), length: 148)
    192.168.1.100.nfs > 192.168.1.111.1861387048: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
01:04:21.498675 IP (tos 0x0, ttl 64, id 5483, offset 0, flags [none], proto: UDP (17), length: 132, bad cksum 0 (->e0da)!)
    192.168.1.111.1861387049 > 192.168.1.100.nfs: 104 access [|nfs]
01:04:21.498900 IP (tos 0x0, ttl 64, id 13349, offset 0, flags [none], proto: UDP (17), length: 148)
    192.168.1.100.nfs > 192.168.1.111.1861387049: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
01:04:21.498924 IP (tos 0x0, ttl 64, id 5484, offset 0, flags [none], proto: UDP (17), length: 132, bad cksum 0 (->e0d9)!)
    192.168.1.111.1861387050 > 192.168.1.100.nfs: 104 access [|nfs]
01:04:21.499195 IP (tos 0x0, ttl 64, id 34907, offset 0, flags [none], proto: UDP (17), length: 148)
    192.168.1.100.nfs > 192.168.1.111.1861387050: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]

On Wed, 13 Dec 2006, Charles Sprickman wrote:

> Hi all,
>
> I'm running a 6.2-RC1 box (cvsup'd today) that has two broadcom nics. One is
> an internal network (nfs) and the other is external.
>
> PF has this rule for all traffic on the private net:
>
> [root@archive /home/jails]# pfctl -sr|grep bge1
> pass in quick on bge1 inet from 192.168.1.0/24 to any
> pass out quick on bge1 inet from any to 192.168.1.0/24
>
> No state since these are "quick" and symmetrical.
>
> Doing something like "ls /usr/ports" will just hang until interrupted. Using
> tcp for nfs makes it workable, but very slow.
>
> If I disable pf (pfctl -d), both types of mounts work, and speed is
> excellent. I also just found that if I remove the "scrub in all" statement
> and change it to "scrub in on bge0", things are fine.

I believe it's a bad idea to run NFS traffic through scrub unless you use the "no-df" option with it. I just don't scrub my internal network traffic at all. I got this from "man pf.conf":

    scrub has the following options:

    no-df
        Clears the dont-fragment bit from a matching IP packet. Some
        operating systems are known to generate fragmented packets with
        the dont-fragment bit set. This is particularly true with NFS.
        Scrub will drop such fragmented dont-fragment packets unless
        no-df is specified.
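
For illustration, a minimal sketch of trying that option out, assuming the ruleset uses the "scrub in all" statement mentioned in the original post. The scratch file path is hypothetical. pfctl's -n flag only parses the file and does not load it, so a typo shows up without touching the running ruleset.

```shell
# Hypothetical scratch file, used only to syntax-check the candidate rule.
conf=/tmp/pf-scrub-test.conf

# The original "scrub in all" with no-df added, so fragmented packets that
# carry the dont-fragment bit are no longer dropped by scrub.
cat > "$conf" <<'EOF'
scrub in all no-df
EOF

# Parse only (-n), do not load. Skipped on machines without pfctl.
if command -v pfctl >/dev/null 2>&1; then
    pfctl -nf "$conf"
fi
```

If the parse is clean, the same line can replace the existing scrub statement in the real pf.conf. The alternative described in the message above is to not scrub the internal interface at all.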

On Wednesday 13 December 2006 07:10, Charles Sprickman wrote:

> Hi all,
>
> I'm running a 6.2-RC1 box (cvsup'd today) that has two broadcom nics.
> One is an internal network (nfs) and the other is external.
>
> PF has this rule for all traffic on the private net:
>
> [root@archive /home/jails]# pfctl -sr|grep bge1
> pass in quick on bge1 inet from 192.168.1.0/24 to any
> pass out quick on bge1 inet from any to 192.168.1.0/24
>
> No state since these are "quick" and symmetrical.
>
> Doing something like "ls /usr/ports" will just hang until interrupted.
> Using tcp for nfs makes it workable, but very slow.
>
> If I disable pf (pfctl -d), both types of mounts work, and speed is
> excellent. I also just found that if I remove the "scrub in all"
> statement and change it to "scrub in on bge0", things are fine.
>
> Any idea what's going on? The tcpdump output confuses me (see "bad
> cksum!"), so I'm posting some snippets here.

As Luke already pointed out, "no-df" on the scrub rule should help. As for the "bad cksum!" - this is a symptom of checksumming done in hardware. "ifconfig bge1 -rxcsum -txcsum" should get rid of them.

--
Best regards,
Max Laier
mlaier@freebsd.org | ICQ #67774661
http://pf4freebsd.love2party.net/ | mlaier@EFnet

> I'm running a 6.2-RC1 box (cvsup'd today) that has two broadcom nics. One
> is an internal network (nfs) and the other is external.
...
> Doing something like "ls /usr/ports" will just hang until interrupted.
> Using tcp for nfs makes it workable, but very slow.

Oddly enough I hit precisely this problem last night - with a cvsup from a few days ago. I have tried adding the 'no-df' flag to the scrub rules, but this did not help much. What I ended up doing was this:

    scrub in on bge0 proto tcp fragment reassemble random-id

so that I am not scrubbing UDP traffic. This works fine.

-pete.

> As Luke already pointed out, "no-df" on the scrub rule should help. As
> for the "bad cksum!" - this is a symptom of checksumming done in
> hardware. ifconfig bge1 -rxcsum -txcsum should get rid of them.

I am a bit concerned by this - we use a lot of bge interfaces, and I have hardware checksumming enabled on all of them. Are they known to produce bad checksums?

-pete.

On Wednesday 13 December 2006 12:05, Pete French wrote:

> > As Luke already pointed out, "no-df" on the scrub rule should help.
> > As for the "bad cksum!" - this is a symptom of checksumming done
> > in hardware. ifconfig bge1 -rxcsum -txcsum should get rid of
> > them.
>
> I am a bit concerned by this - we use a lot of bge interfaces, and I
> have hardware checksumming enabled on all of them. Are they known to
> produce bad checksums?

You are misunderstanding. The problem is simply that the bpf device sees bad checksums as it sees the packet before the hardware has calculated it. On the receiver the checksum will be correct.

--
Best regards,
Max Laier

> You are misunderstanding. The problem is simply that the bpf device sees
> bad checksums as it sees the packet before the hardware has calculated
> it. On the receiver the checksum will be correct.

Ah, gotcha. That makes perfect sense now.

-pete.
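
To make the explanation above concrete, here is a minimal sketch of the standard IPv4 header checksum (ones' complement sum of the 16-bit header words) applied to the first request in the original capture. It assumes a plain 20-byte header with no IP options, which is why the first word is 4500. The function name ipv4_header_checksum is made up for this example. With the checksum field still zero, as bpf sees it on transmit, the result is the value tcpdump says it expected, which fits the NIC filling the field in afterwards.

```shell
# Compute the IPv4 header checksum from the header's 16-bit words, given
# as hex strings. The checksum field itself must be passed as 0000.
ipv4_header_checksum() {
    sum=0
    for word in "$@"; do
        sum=$(( sum + 0x$word ))
    done
    # Fold any carries back into the low 16 bits (ones' complement add).
    while [ $(( sum >> 16 )) -ne 0 ]; do
        sum=$(( (sum & 0xFFFF) + (sum >> 16) ))
    done
    # The checksum is the ones' complement of the folded sum.
    printf '%04x\n' $(( ~sum & 0xFFFF ))
}

# First request in the capture: tos 0x0, length 132 (0084), id 5395 (1513),
# no flags/offset, ttl 64 + proto 17 (4011), checksum field 0000,
# 192.168.1.111 (c0a8 016f) -> 192.168.1.100 (c0a8 0164).
ipv4_header_checksum 4500 0084 1513 0000 4011 0000 c0a8 016f c0a8 0164
# prints e132, matching "bad cksum 0 (->e132)!" in the tcpdump output
```

The same calculation with id 5482 (156a) gives e0db, the value shown in the capture taken with pf disabled, so the "bad" checksums are the same in both cases and independent of pf.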