I need some help here, this is not a single case, I get this on several
machines, this is releng_6, recent, but an old problem getting ugly

first I get this kind of event in messages, independent of whether it is
client mode, hostap or adhoc

Dec 28 16:50:53 ap1-cds kernel: ath0: discard oversize frame (ether type 5e4 flags 3 len 1522 > max 1514)
Dec 28 16:51:01 ap1-cds kernel: ath0: discard oversize frame (ether type 5e4 flags 3 len 1522 > max 1514)
Dec 28 16:58:16 ap1-cds kernel: ath0: device timeout
... timeout event repeats

I really do not know what this event means (ether type 5e4), to my
understanding it is vague in the source, so I am lost here

{
I get continuously:

kernel: ath0: link state changed to DOWN
kernel: ath0: link state changed to UP

when WL client but it recovers when the AP comes back to normal
so wl-cli mode is not the issue
}

but when the machine is running hostap the link state up/down events do not
come up but transmission is interrupted, or better, goes slow and then stops,
and stops forever until cold reboot, no chance to get this card back,
not even by unloading ath and reloading the driver (that was a try but I use it
compiled into the kernel)
this is not related to any WEP settings or any rate, this problem comes up
with either rate_sample or rate_onoe

this is not related to the "tx stopped" problem (OACTIVE) and it is not
related to any [TX|RX]BUF value (whatever it is set to)

this problem is not a single case and not hardware related, here I mean MB,
CPU, memory, but it is related in a certain way to the ath drv: same machine,
but with wi0 (prism card) it does NOT happen this way

I have had this problem since 6.0 and would be glad if somebody could convince
Mr. Sam L. to attend to this since it is a serious issue: any FreeBSD releng_6
has this problem but releng_5 does not

depending on the amount of traffic I get this every hour (when 2-3Mbit/s or
more) or several times a day (when 1-2Mbit/s)

it gets worse when I have more than one ath card installed

ath stats:

70777 data frames received
71551 data frames transmit
420 tx frames with an alternate rate
10821 long on-chip tx retries
260 tx failed 'cuz too many retries
11M current transmit rate
10489 tx management frames
1 tx frames discarded prior to association
786 tx frames with no ack marked
80516 tx frames with short preamble
54395 rx failed 'cuz of bad CRC
146438 rx failed 'cuz of PHY err
    145013 CCK timing
    1425 CCK restart
5295 beacons transmitted
19 periodic calibrations
42 rssi of last ack
31 avg recv rssi
-98 rx noise floor
572 cabq frames transmitted
11 cabq xmit overflowed beacon interval
1525 switched default/rx antenna
Antenna profile:
[1] tx 41285 rx 4

ifconfig

ath0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        ether 00:13:46:8b:f1:86
        media: IEEE 802.11 Wireless Ethernet DS/11Mbps mode 11b <hostap>
        status: associated
        ssid omegasul channel 1 (2412) bssid 00:13:46:8b:f1:86
        authmode OPEN privacy ON deftxkey 1
        wepkey 1:40-bit
        wepkey 2:40-bit
        wepkey 3:40-bit
        wepkey 4:40-bit powersavemode OFF powersavesleep 100 txpowmax 36
        txpower 63 rtsthreshold 2346 mcastrate 1 fragthreshold 2346 bmiss 7
        -pureg protmode CTS -wme burst ssid HIDE -apbridge dtimperiod 1
        bintval 100

--
thanks
João

This message was scanned by the e-mail system and can be considered safe.
Service provided by Datacenter Matik https://datacenter.matik.com.br
Check this message:
http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031216.html

Run "/usr/src/tools/tools/net80211/wlandebug/wlandebug -i ath0 power" and see
if one of the hosts on your wlan has powersaving turned on.

"Stops forever" was not one of my symptoms though, so your issue may be
different...

On Thu, 28 Dec 2006, JoaoBR wrote:
> [...]
JoaoBR wrote:
> 572 cabq frames transmitted
> 11 cabq xmit overflowed beacon interval
>
> media: IEEE 802.11 Wireless Ethernet DS/11Mbps mode 11b <hostap>

So one other thing came to mind. If your ap is operating in 11b and you have
many multicast frames queued up for power save stations then they can
effectively saturate the network if they are being transmitted at a low tx
rate (which they would be). This can effectively DoS your wireless network
because the frames are burst immediately following the beacon. The driver
limits the burst interval so it does not overflow into the next beacon but
it's allowed to fill all available time to the next beacon frame (something
I've considered changing for just the reason I described). This has always
been an issue. You might try rate limiting these frames or just hack the
driver to violate the spec and not buffer them for tx after the beacon (to
see if your problem goes away).

Further, if you have a machine with a crappy pci bus (such as the non-4801
soekris boards) it's entirely possible that you are hogging the bus with
these long transmits such that other problems are occurring. I do not
recommend building ap products out of such equipment. (No disrespect to the
4501, et al; they just had substandard pci bus operation.)

	Sam
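To put rough numbers on the burst described above, here is a back-of-envelope sketch of how few full-size multicast frames fit between two beacons at low 11b rates. The `mcastrate 1` and `bintval 100` values come from the ifconfig output in the original post. Everything else is an assumption made for illustration: long preamble timing, a 1500-byte payload, no inter-frame gaps, no WEP overhead, and the helper names themselves. It is not how the ath driver computes its limit.

```python
# Rough airtime estimate for multicast frames burst after a beacon in 802.11b.
# Assumptions (not from the thread): long PLCP preamble + header = 192 us,
# 24-byte MAC header + 4-byte FCS, 8 bytes of LLC/SNAP, no inter-frame gaps,
# no WEP IV/ICV overhead. Real airtime is therefore somewhat higher.

TU_US = 1024               # one 802.11 time unit, in microseconds
PLCP_LONG_US = 192         # long preamble (144 us) + PLCP header (48 us)
MAC_OVERHEAD_BYTES = 28    # 24-byte MAC header + 4-byte FCS
LLC_SNAP_BYTES = 8


def frame_airtime_us(payload_bytes, rate_mbps):
    """Approximate on-air time of one data frame, in microseconds."""
    bits = (payload_bytes + LLC_SNAP_BYTES + MAC_OVERHEAD_BYTES) * 8
    return PLCP_LONG_US + bits / rate_mbps


def frames_per_beacon_interval(payload_bytes, rate_mbps, bintval_tu):
    """How many such frames fit back to back in one beacon interval."""
    interval_us = bintval_tu * TU_US
    return int(interval_us // frame_airtime_us(payload_bytes, rate_mbps))


if __name__ == "__main__":
    # bintval 100 is taken from the ifconfig output in the thread.
    for rate in (1, 2, 5.5, 11):
        n = frames_per_beacon_interval(1500, rate, 100)
        print(f"{rate:>4} Mbit/s: about {n} full-size frames per beacon interval")
```

Under these assumptions a 1500-byte frame at 1 Mbit/s occupies about 12.5 ms, so roughly eight of them take up the whole 102.4 ms beacon interval and leave no airtime for anything else.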
JoaoBR wrote:
> I need some help here, this is not a single case, I get this on several
> machines, this is releng_6, recent, but an old problem getting ugly
>
> first I get this kind of event in messages, independent of whether it is
> client mode, hostap or adhoc
>
> Dec 28 16:50:53 ap1-cds kernel: ath0: discard oversize frame (ether type 5e4
> flags 3 len 1522 > max 1514)
> Dec 28 16:51:01 ap1-cds kernel: ath0: discard oversize frame (ether type 5e4
> flags 3 len 1522 > max 1514)
> Dec 28 16:58:16 ap1-cds kernel: ath0: device timeout
> ... timeout event repeats
>
> I really do not know what this event means (ether type 5e4), to my
> understanding it is vague in the source, so I am lost here

Seems pretty clear: it's the type field extracted from the ethernet header of
the oversized packet. A quick check of sys/net/ethernet.h shows no such
ETHERTYPE defined. So something in your network is transmitting packets that
are either being rx'd incorrectly or, more likely, corrupted in transit.

> {
> I get continuously:
>
> kernel: ath0: link state changed to DOWN
> kernel: ath0: link state changed to UP
>
> when WL client but it recovers when the AP comes back to normal
> so wl-cli mode is not the issue
> }

Sorry, this is hard to understand. You are saying that when you see packets
discarded on the ap the client stations lose their association to the ap?
You've said nothing about your environment but I'd guess you've got some
heavy interference like a microwave oven operating.

> but when the machine is running hostap the link state up/down events do not
> come up but transmission is interrupted, or better, goes slow and stops
> then - and stops forever until cold reboot, no chance to get this card back,
> not even unload ath and reload the driver (that was a try but I use it
> compiled into the kernel)
> this is not related to any WEP settings or any rate, this problem is coming up
> with either rate-sample or rate_onoe
>
> this is not related to the "tx stopped" problem (OACTIVE) and it is not
> related to any [TX|RX]BUF value (whatever it is set to)
>
> this problem is not a single case and not hardware related, here I mean MB,
> CPU, memory but is related in a certain way to the ath drv - same machine,
> but wi0 (prism card) and it does NOT happen this way
>
> I am with this problem since 6.0 and would be glad if somebody could convince
> Mr. Sam L. to attend this since it is a serious issue - any FreeBSD releng_6
> has this problem but releng_5 does not

Well "Mr. Sam L" has other things to do that are more important to him. If
you want help I can try to provide it but this is not exactly a problem one
can diagnose from afar. I suggest you sniff traffic from a separate station
and try to identify what is going on in the network when you see this event
occur. It would also help to do the obvious things like swap ath cards.
You've also said nothing about your environment such as the mac+phy revs for
the card and the computer this is operating in.

> depending on the amount of traffic I get this any hour ( when 2-3Mbit/s or
> more) or several times a day (when 1-2Mbit/s)
>
> it gets worse when I have more than one ath card installed

Sounds like you've got radio/antenna issues that manifest themselves as noise
that drives the radios into silence.
Diagnosing something like that may require tools like a spectrum analyzer.

> ath stats:
>
> 70777 data frames received
> 71551 data frames transmit
> 420 tx frames with an alternate rate
> 10821 long on-chip tx retries
> 260 tx failed 'cuz too many retries
> 11M current transmit rate
> 10489 tx management frames
> 1 tx frames discarded prior to association
> 786 tx frames with no ack marked
> 80516 tx frames with short preamble
> 54395 rx failed 'cuz of bad CRC
> 146438 rx failed 'cuz of PHY err
> 145013 CCK timing
> 1425 CCK restart
> 5295 beacons transmitted
> 19 periodic calibrations
> 42 rssi of last ack
> 31 avg recv rssi
> -98 rx noise floor
> 572 cabq frames transmitted
> 11 cabq xmit overflowed beacon interval

This should not happen. You have stations in power save mode in your bss and
the transmission of queued multicast frames overflowed the interval following
the beacon frame. This should be handled (I explicitly tested it) but you
might want to observe if this occurs when you have problems.

> 1525 switched default/rx antenna
> Antenna profile:
> [1] tx 41285 rx 4

This makes no sense; you rx'd 4 frames total? That's inconsistent with the
"data frames received" counter and makes me question whether these numbers
are meaningful.

> ifconfig
>
> ath0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
> ether 00:13:46:8b:f1:86
> media: IEEE 802.11 Wireless Ethernet DS/11Mbps mode 11b <hostap>
> status: associated
> ssid omegasul channel 1 (2412) bssid 00:13:46:8b:f1:86
> authmode OPEN privacy ON deftxkey 1
> wepkey 1:40-bit
> wepkey 2:40-bit
> wepkey 3:40-bit
> wepkey 4:40-bit powersavemode OFF powersavesleep 100 txpowmax 36
> txpower 63 rtsthreshold 2346 mcastrate 1 fragthreshold 2346 bmiss 7
> -pureg protmode CTS -wme burst ssid HIDE -apbridge dtimperiod 1
> bintval 100

Unfortunately you've not provided critical info like the mac+phy of the card
and the platform (e.g. is this a soekris box).
As I said I can try to _HELP_ you but I cannot fix your problem. You need to
diagnose what is happening.

	Sam
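On the "ether type 5e4" question raised earlier in the thread, the following minimal sketch pulls the fields out of the kernel message and classifies the type value. The regular expression and function names are made up for illustration, and treating the `flags` field as hex is an assumption about the driver's format string. The boundaries are the standard Ethernet ones: values up to 1500 are 802.3 length fields and values from 0x0600 (1536) upward are EtherTypes.

```python
import re

# Hypothetical helper for reading the "discard oversize frame" message.
# Assumption: "ether type" and "flags" are printed in hex, "len" and "max"
# in decimal, matching the sample lines quoted in the thread.
LOG_RE = re.compile(
    r"discard oversize frame \(ether type ([0-9a-f]+) flags ([0-9a-f]+) "
    r"len (\d+) > max (\d+)\)"
)

MAX_8023_LENGTH = 1500    # largest value that is an 802.3 length field
MIN_ETHERTYPE = 0x0600    # 1536, smallest value that is an EtherType


def parse_oversize_line(line):
    """Return the fields of an oversize-frame message, or None if no match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    return {
        "ether_type": int(m.group(1), 16),
        "flags": int(m.group(2), 16),
        "len": int(m.group(3)),
        "max": int(m.group(4)),
    }


def classify_type_field(value):
    """Classify the 16-bit type/length field of an Ethernet header."""
    if value <= MAX_8023_LENGTH:
        return "802.3 length"
    if value >= MIN_ETHERTYPE:
        return "EtherType"
    return "undefined"


if __name__ == "__main__":
    line = ("Dec 28 16:50:53 ap1-cds kernel: ath0: discard oversize frame "
            "(ether type 5e4 flags 3 len 1522 > max 1514)")
    fields = parse_oversize_line(line)
    print(fields, classify_type_field(fields["ether_type"]))
```

0x5e4 is 1508, which falls in the gap between the two ranges. That is consistent with no matching ETHERTYPE being defined, though by itself it does not show where the frame was damaged.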