Please note: This project is no longer active. The website is kept online for historic purposes only.
If you're looking for a Linux driver for your Atheros WLAN device, you should continue here.

Ticket #1581 (closed defect: fixed)

Opened 12 years ago

Last modified 12 years ago

Interface locks up for several minutes after heavy traffic: ath_mgtstart: discard, no xmit buf

Reported by: david@boreham.org
Assigned to:
Priority: major
Milestone: version 0.9.5
Component: madwifi: other
Version: trunk
Keywords:
Cc:
Patch is attached: 0
Pending:

Description

I'm running r2709 on a Gateworks Avila board (IXP425) acting as an 802.11b AP, with a set of OpenWRT patches applied on top. Generally everything works, except when I perform a throughput test: if I run two otherwise unthrottled TCP flows (one in each direction) between the AP and a client, the interface seems to lock up after some time.

After many more minutes, the interface appears to recover on its own (no reboot), and after that things work again.
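The report does not say which tool generated the traffic. As a rough way to recreate the load pattern described above (two unthrottled TCP flows, one in each direction), the following is a minimal sketch, not the reporter's actual test. The function names `sink`, `blast`, `serve` and `run_client` are hypothetical, and the port and duration are left to the caller.

```python
import socket
import threading
import time


def sink(conn, received):
    """Read and discard data until the peer closes; append the byte count."""
    total = 0
    while True:
        chunk = conn.recv(65536)
        if not chunk:
            break
        total += len(chunk)
    received.append(total)


def blast(conn, seconds):
    """Send data as fast as possible for `seconds`, then half-close."""
    payload = b"\0" * 65536
    deadline = time.monotonic() + seconds
    sent = 0
    while time.monotonic() < deadline:
        conn.sendall(payload)
        sent += len(payload)
    conn.shutdown(socket.SHUT_WR)  # signals end-of-stream to the sink
    return sent


def serve(listener, seconds, received):
    """Accept two connections: receive on the first, send on the second."""
    up, _ = listener.accept()      # flow 1: client -> server
    down, _ = listener.accept()    # flow 2: server -> client
    reader = threading.Thread(target=sink, args=(up, received))
    reader.start()
    blast(down, seconds)
    reader.join()
    up.close()
    down.close()


def run_client(host, port, seconds):
    """Open both flows and return (bytes sent, bytes received)."""
    up = socket.create_connection((host, port))
    down = socket.create_connection((host, port))
    received = []
    reader = threading.Thread(target=sink, args=(down, received))
    reader.start()
    sent = blast(up, seconds)
    reader.join()
    up.close()
    down.close()
    return sent, received[0]
```

In this sketch one side (for example a host behind the AP) would call `serve` on a listening socket while the wireless client calls `run_client`, so that both directions are saturated at the same time.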

I see these messages in the kernel log:

ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
ath_mgtstart: discard, no xmit buf
NETDEV WATCHDOG: wifi1: transmit timed out
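To compare runs, it can help to count how many discard messages precede a watchdog timeout. The following is a minimal sketch for tallying a saved kernel log; the helper name `summarize_kernel_log` is hypothetical, and it does nothing more than substring matching on the two messages quoted above.

```python
from collections import Counter

# Message texts copied from the kernel log excerpt in this ticket.
DISCARD_MSG = "ath_mgtstart: discard, no xmit buf"
WATCHDOG_MSG = "NETDEV WATCHDOG"


def summarize_kernel_log(lines):
    """Count discard messages and collect watchdog timeout lines.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    watchdog_lines = []
    for line in lines:
        if DISCARD_MSG in line:
            counts["discard"] += 1
        elif WATCHDOG_MSG in line:
            counts["watchdog"] += 1
            watchdog_lines.append(line.strip())
    return counts, watchdog_lines
```

The function accepts any iterable of lines, so an open file containing saved `dmesg` output can be passed in directly.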

Here are the boot dmesg messages (taken from another board with the same hardware and software, since the log on the affected board has wrapped):

wlan: 0.8.4.2 (svn r2708)
ath_hal: module license 'Proprietary' taints kernel.
ath_hal: 0.9.30.13 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, RF2413, RF5413, RF2133, REGOPS_FUNC)
ath_rate_minstrel: Minstrel automatic rate control algorithm 1.2 (svn r2708)
ath_rate_minstrel: look around rate set to 10%
ath_rate_minstrel: EWMA rolloff level set to 75%
ath_rate_minstrel: max segment size in the mrr set to 6000 us
wlan: mac acl policy registered
ath_pci: 0.9.4.5 (svn r2708)
PCI: enabling device 0000:00:02.0 (0340 -> 0342)
ath_pci: switching rfkill capability off
ath_pci: switching per-packet transmit power control off
wifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
wifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: turboA rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: turboG rates: 6Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: H/W encryption support: WEP AES AES_CCM TKIP
wifi0: mac 10.5 phy 6.1 radio 6.3
wifi0: Use hw queue 1 for WME_AC_BE traffic
wifi0: Use hw queue 0 for WME_AC_BK traffic
wifi0: Use hw queue 2 for WME_AC_VI traffic
wifi0: Use hw queue 3 for WME_AC_VO traffic
wifi0: Use hw queue 8 for CAB traffic
wifi0: Use hw queue 9 for beacons
wifi0: Atheros 5212: mem=0x48000000, irq=27

Change History

10/06/07 14:50:50 changed by strasak@bubakov.net

See ticket #1272, which may or may not be fixed; I haven't tested it thoroughly enough. You could try setting /proc/sys/vm/min_free_kbytes to, for example, 10000 and see whether it happens again. If it does, #1272 was probably not fixed by nbd's memleak fix.
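The suggested experiment amounts to reading the current value of /proc/sys/vm/min_free_kbytes and writing a larger one. A minimal sketch, assuming root privileges on the target; the function names are hypothetical, and the `path` parameter exists only so the helpers can be tried against an ordinary file.

```python
from pathlib import Path

# Standard Linux location of the tunable mentioned in the comment above.
MIN_FREE_KBYTES = Path("/proc/sys/vm/min_free_kbytes")


def read_min_free_kbytes(path=MIN_FREE_KBYTES):
    """Return the current value in kilobytes."""
    return int(Path(path).read_text().strip())


def set_min_free_kbytes(value_kb, path=MIN_FREE_KBYTES):
    """Write a new value and return the previous one.

    Writing to the real /proc file requires root.
    """
    old = read_min_free_kbytes(path)
    Path(path).write_text(f"{value_kb}\n")
    return old
```

Calling `set_min_free_kbytes(10000)` applies the value suggested above and returns the old setting so it can be restored later. A value written this way does not persist across a reboot.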

10/08/07 22:09:28 changed by david@boreham.org

Well...since the fix was checked in at r2597, and I am running r2709, it seems safe to say that it is not fixed, or that I have encountered another similar problem.

It just happened again, under lower load this time.

Is there anything I can do to fix this, or to reproduce it under controlled conditions so someone else can fix it?

10/08/07 23:17:32 changed by mentor

Does this occur when OpenWRT patches are removed? Has there ever been a release that has not had this error?

10/11/07 10:01:41 changed by strasak@bubakov.net

In my experience this problem has been here all along; see my experiments on mipsel described in #1272. As I have written, this is probably not a memleak, nor the fault of madwifi alone, but a more complex problem: the memory allocation mechanism is probably not robust enough to deal with some corner cases properly.

About OpenWRT: I tried both patched and vanilla madwifi, and both behave similarly on mipsel and very memory-restricted systems. With nbd's NAPI patch it happens less frequently than with vanilla madwifi; at least it did in the past, I haven't tested it very intensively recently.

David, did you try setting min_free_kbytes?

02/08/08 06:38:54 changed by nbd

  • status changed from new to closed.
  • resolution set to fixed.

fixed in r3346

02/08/08 09:09:53 changed by mrenzmann

  • milestone set to version 0.9.5.