
Ticket #969 (closed defect: fixed)

Opened 13 years ago

Last modified 13 years ago

IEEE80211_MLME_DISASSOC & IEEE80211_MLME_DEAUTH drops nodes on all vaps causing crash

Reported by: tharvey
Assigned to:
Priority: major
Milestone: version 0.9.3
Component: madwifi: 802.11 stack
Version: trunk
Keywords:
Cc:
Patch is attached: 1
Pending:

Description

In a multi-vap scenario (for example, a STA or WDS-REPEATER vap on the same device as an AP vap), an IEEE80211_MLME_DISASSOC or IEEE80211_MLME_DEAUTH request (for example, from hostapd upon startup) calls ieee80211_node_leave for every node in the node table, regardless of which vap the node is associated with.

Instead, ieee80211_node_leave should only be called for nodes that are on the specified vap. (Note that the same misbehavior of iterating over all nodes regardless of vap may also be present elsewhere in madwifi.)

This can cause a number of issues, including:

  • driver crash when nodes get dropped that shouldn't be (see test scenario below)

To demonstrate a crash caused by this bug, the following startup script can be used with madwifi-1754:

modprobe wlan
modprobe ath_hal
modprobe ath_rate_onoe
modprobe wlan_scan_sta
modprobe wlan_scan_ap
modprobe ath_pci autocreate=none

# create a wds and ap vap
wlanconfig ath create wlandev wifi0 wlanmode wds
iwpriv ath0 wds_add 00:15:6d:50:03:29
wlanconfig ath create wlandev wifi0 wlanmode ap

ifconfig ath0 192.168.3.1 up
ifconfig ath1 up

# create a simple hostapd config file and launch hostapd on the ap vap
cat << EOF > /var/config/hostapd-ath1.conf
interface=ath1
driver=madwifi
EOF
hostapd /var/config/hostapd-ath1.conf &

# cause a packet to go out the ath0 vap
# (crashes driver because the node created for ath0 from the wds_add is freed when it shouldn't have been)
ping 192.168.3.2

console:

wlan: 0.8.4.2 (svn r1754)
ath_hal: 0.9.18.0 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413, REGOPS_FUNC)
ath_rate_onoe: 1.0 (svn r1754)
ath_pci: 0.9.4.5 (svn r1754)
PCI: enabling device 0000:00:02.0 (0340 -> 0342)
wifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: turboA rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: H/W encryption support: WEP AES AES_CCM TKIP
wifi0: mac 5.9 phy 4.3 radio 3.6
wifi0: Use hw queue 1 for WME_AC_BE traffic
wifi0: Use hw queue 0 for WME_AC_BK traffic
wifi0: Use hw queue 2 for WME_AC_VI traffic
wifi0: Use hw queue 3 for WME_AC_VO traffic
wifi0: Use hw queue 8 for CAB traffic
wifi0: Use hw queue 9 for beacons
wifi0: Atheros 5212: mem=0x48000000, irq=27
ath0
ath0: Added WDS MAC: 00:15:6d:50:03:29
ath1
PING 192.168.3.2 (192.168.3.2): 56 data bytes
Configuration file: /var/config/hostapd-ath1.conf
Using interface ath1 with hwaddr 00:15:6d:50:00:a8 and ssid ''
Unable to handle kernel NULL pointer dereference at virtual address 00000174
pgd = c2e64000
[00000174] *pgd=02c8a031, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1]
Modules linked in: ath_pci wlan_scan_ap wlan_scan_sta ath_rate_onoe ath_hal wlan nfs lockd sunrpc bridge ixp400_eth rtc_ds1672 eeprom ixp400 jffs2 zlib_inflate zlib_deflate ixp4xx_gpio
CPU: 0
PC is at memcpy+0x114/0x330
LR is at ieee80211_encap+0x92c/0xf90 [wlan]
pc : [<c00ca334>]    lr : [<bf13dadc>]    Tainted: P
sp : c31e1cfc  ip : 00000003  fp : c31e1dbc
r10: c3cdf440  r9 : c33f1400  r8 : c2e03c80
r7 : c0d9a800  r6 : c2c86260  r5 : c2e03c80  r4 : 00000000
r3 : 00000000  r2 : 00000002  r1 : 00000174  r0 : c3cdf450
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: 39FF  Table: 02E64000  DAC: 00000015
Process cat (pid: 871, stack limit = 0xc31e0194)
Stack: (0xc31e1cfc to 0xc31e2000)
1ce0:                                                                c3cdf450
1d00: 00000000 bf13dadc 00000001 c31e1d54 00000004 00000000 00000000 00000000
1d20: 00000001 00000000 00000018 00000000 00000024 00000018 00000000 c2c33200
1d40: c31e1dd0 c33f1400 c31e1d54 c001d9d0 bf0cfbcc c31e1d74 c31e1d64 c0028450
1d60: c00283d8 c01c97b0 c0191f38 00000002 00000002 00000003 00000000 c31e1da0
1d80: c31e1d8c ffffffff ffff0015 6d5000a8 0806e92c 60000013 c2dd4060 c2c86260
1da0: c2c86000 c2e03c80 c33f1400 c33f1400 c31e1e08 c31e1dc0 bf1a00a4 bf13d1bc
1dc0: 0000aaf8 c31e1dd4 c2c876e4 c2c86260 c31e1dec 00000000 c2dd4060 c2dd4060
1de0: c2dd1ee0 c2c86000 00000000 c2e03c80 c2e03cb0 c0a80302 c0a80301 c31e1e28
1e00: c31e1e0c c011eebc bf19f914 c2c86000 c2e03c80 00000000 c33f1400 c31e1e44
1e20: c31e1e2c c01107dc c011edd8 c3cd9260 c3cdf452 c2e03c80 c31e1e54 c31e1e48
1e40: bf13c548 c01106e8 c31e1e78 c31e1e58 bf13c48c bf13c504 c3cd9000 c2e03c80
1e60: 00000000 c2e03e60 c3cd9000 c31e1e94 c31e1e7c c0110850 bf13c2d0 c0385a20
1e80: c0ca7b00 00000001 c31e1eb4 c31e1e98 c0150044 c01106e8 c2e03e60 c3cd9000
1ea0: c0a80302 c2e03c80 c31e1ed4 c31e1eb8 c0150098 c014ffe0 c0a80301 00000000
1ec0: c3cd90d4 00000000 c31e1f14 c31e1ed8 c014f914 c015005c c0a80301 00000000
1ee0: c3cd90d4 00000000 00000000 c0385a20 c2e03e60 ffff98f7 00000000 c31e1f34
1f00: c01d003c c01cf634 c31e1f30 c31e1f18 c01172fc c014f768 c31e0000 00000100
1f20: c0117090 c31e1f68 c31e1f34 c00433c0 c011709c c31e1f34 c31e1f34 00000020
1f40: 00000011 c01cf3c8 c01d0ed8 0000000a 000050f5 c31e0000 401d2000 c31e1f88
1f60: c31e1f6c c003ea90 c0043244 c31e1fb0 0000001f 00000020 400dc7dc c31e1f98
1f80: c31e1f8c c003ec4c c003ea3c c31e1fac c31e1f9c c001ddc4 c003ec0c ffffffff
1fa0: 00000000 c31e1fb0 c001cb60 c001dd6c 400d7edc 00000000 00000064 400d1000
1fc0: 00053888 bef4dd84 401311d4 400dc7dc 000050f5 00000f20 401d2000 bef4dd80
1fe0: 400dc7dc bef4dd34 400184c0 401a9f54 80000010 ffffffff fffdfbff efffffff
Backtrace:
[<bf13d1b0>] (ieee80211_encap+0x0/0xf90 [wlan]) from [<bf1a00a4>] (ath_hardstart+0x79c/0xaac [ath_pci])
[<bf19f908>] (ath_hardstart+0x0/0xaac [ath_pci]) from [<c011eebc>] (qdisc_restart+0xf0/0x1d8)
[<c011edcc>] (qdisc_restart+0x0/0x1d8) from [<c01107dc>] (dev_queue_xmit+0x100/0x230)
 r7 = C33F1400  r6 = 00000000  r5 = C2E03C80  r4 = C2C86000
[<c01106dc>] (dev_queue_xmit+0x0/0x230) from [<bf13c548>] (ieee80211_parent_queue_xmit+0x50/0x58 [wlan])
 r6 = C2E03C80  r5 = C3CDF452  r4 = C3CD9260
[<bf13c4f8>] (ieee80211_parent_queue_xmit+0x0/0x58 [wlan]) from [<bf13c48c>] (ieee80211_hardstart+0x1c8/0x234 [wlan])
[<bf13c2c4>] (ieee80211_hardstart+0x0/0x234 [wlan]) from [<c0110850>] (dev_queue_xmit+0x174/0x230)
 r8 = C3CD9000  r7 = C2E03E60  r6 = 00000000  r5 = C2E03C80
 r4 = C3CD9000
[<c01106dc>] (dev_queue_xmit+0x0/0x230) from [<c0150044>] (arp_xmit+0x70/0x7c)
 r6 = 00000001  r5 = C0CA7B00  r4 = C0385A20
[<c014ffd4>] (arp_xmit+0x0/0x7c) from [<c0150098>] (arp_send+0x48/0x4c)
[<c0150050>] (arp_send+0x0/0x4c) from [<c014f914>] (arp_solicit+0x1b8/0x1d4)
[<c014f75c>] (arp_solicit+0x0/0x1d4) from [<c01172fc>] (neigh_timer_handler+0x26c/0x300)
[<c0117090>] (neigh_timer_handler+0x0/0x300) from [<c00433c0>] (run_timer_softirq+0x188/0x1f8)
 r6 = C0117090  r5 = 00000100  r4 = C31E0000
[<c0043238>] (run_timer_softirq+0x0/0x1f8) from [<c003ea90>] (__do_softirq+0x60/0xdc)
[<c003ea30>] (__do_softirq+0x0/0xdc) from [<c003ec4c>] (irq_exit+0x4c/0x54)
 r7 = 400DC7DC  r6 = 00000020  r5 = 0000001F  r4 = C31E1FB0
[<c003ec00>] (irq_exit+0x0/0x54) from [<c001ddc4>] (asm_do_IRQ+0x64/0x74)
[<c001dd60>] (asm_do_IRQ+0x0/0x74) from [<c001cb60>] (__irq_usr+0x40/0x80)
 r4 = FFFFFFFF
Code: e211c003 0affffc4 e3c11003 e35c0002 (e491e004)
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!

Attachments

madwifi-domlme.patch (2.5 kB) - added by tharvey on 10/19/06 20:40:12.
patch to fix issue
madwifi-domlme-revised.patch (2.2 kB) - added by mrenzmann on 12/08/06 18:09:21.
Revised version of the provided patch.

Change History

10/19/06 20:40:12 changed by tharvey

  • attachment madwifi-domlme.patch added.

patch to fix issue

10/19/06 20:50:52 changed by tharvey

Note that even though dropping all nodes was inappropriate, it should not have caused the driver to crash. I have not found that issue yet, but it has something to do with the WDS node not being removed from the node table properly.

10/20/06 06:26:11 changed by mrenzmann

  • version set to trunk.
  • milestone set to version 0.9.3.

Thanks for the patch. However, we cannot commit the patch unless you have signed it off, so it would be great if you could do that soon.

10/21/06 02:16:50 changed by tharvey

Sorry about that - let's see if I understand how to sign off:

Signed-off-by: Tim Harvey <tim_harvey@yahoo.com>

10/21/06 09:31:18 changed by kelmo

You understand correctly. You can sign off in ticket comments, email (or other communication), or in the header of the patch if you like.

11/22/06 10:17:56 changed by kelmo

  • status changed from new to closed.
  • resolution set to fixed.

Applied to r1819

11/22/06 23:58:44 changed by tobiasoed@hotmail.com

If I weren't such a moron I would have followed up here instead of opening Ticket #1020 (defect).

Tim, can you check if it still works for you with my proposed modifications?

Tobias

11/23/06 09:36:36 changed by kelmo

  • status changed from closed to reopened.
  • resolution deleted.

11/23/06 13:08:06 changed by tobiasoed@hotmail.com

As asked by kelmo in #1020, I follow up here with a description of the problems I have with r1819.

I'm running kernel 2.6.19-rc6 on a fedora 6 machine. The only non vanilla driver I use is madwifi and my setup is real simple: one physical device with a single station, without WEP/WPA.

With release 1819 my box hangs: not even sysrq works, nothing in the logs. I can trigger a hang by unscrewing the antenna.

I think (haven't verified) that it enters an infinite loop in ieee80211_iterate_dev_nodes because of the change

-               if (ni->ni_scangen != gen) {
[snip]
+               if (ni->ni_scangen) {

That's why I reverted this part of Tim's commit.

The other part I reverted addresses the clearly wrong ieee80211_free_node(NULL) call introduced by Tim's patch. Only reverting this is not enough to resolve my hangs.

Tobias

12/08/06 18:09:21 changed by mrenzmann

  • attachment madwifi-domlme-revised.patch added.

Revised version of the provided patch.

12/08/06 18:11:26 changed by mrenzmann

From what I understand the revised patch should implement the current state of the discussion.

Tobias, Tim, could you two please verify that the patch works for you?

12/09/06 01:08:40 changed by tobiasoed@hotmail.com

Your revised patch corresponds to what I did with Tim's work, so it works for me. I tested it on top of 1821.

Tobias

02/06/07 16:34:23 changed by mentor

  • status changed from reopened to closed.
  • resolution set to fixed.

Committed as above, r2083. Closing...