Please note: This project is no longer active. The website is kept online for historic purposes only.
If you're looking for a Linux driver for your Atheros WLAN device, you should continue here.

Ticket #1143 (assigned defect)

Opened 15 years ago

Last modified 15 years ago

Oops: On driver_unregister cleanup path

Reported by: proski Assigned to: proski (accepted)
Priority: major Milestone: version 0.9.4
Component: madwifi: other Version: trunk
Keywords: Cc:
Patch is attached: 0 Pending:

Description (Last modified by mrenzmann)

This script causes various oopses with the current madwifi and current Linux from wireless-dev:

modprobe ath_pci autocreate=ap
ifconfig ath0 up
wlanconfig ath1 create wlandev wifi0 wlanmode wds
ifconfig ath1 up
iwconfig ath1 ap 00:01:02:03:04:05

iwconfig
ifconfig ath1 down
ifconfig ath0 down
rmmod ath_pci

The first time it was the "VAP not stopped" bug; the second time, an attempt to take a spinlock in freed memory. The "6b" pattern in the registers is a clear sign that freed memory is involved.
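To illustrate the "6b" observation: with slab debugging enabled, the kernel overwrites freed objects with the byte POISON_FREE (0x6b, from include/linux/poison.h). The second oops shows 6b6b6b6b6b6b6b73 in RBX, RDI and R13, which is a word of poison bytes plus 8. That is consistent with a pointer read from a freed object and then offset to reach a lock member. The sketch below is only an illustration of that check; the helper name and the offset bound are hypothetical and are not part of the kernel or madwifi.

```c
#include <stdbool.h>
#include <stdint.h>

/* POISON_FREE from the kernel's include/linux/poison.h: the byte that
 * slab debugging writes over an object once it has been freed. */
#define POISON_FREE 0x6bU

/* Hypothetical helper (not part of the kernel or madwifi): returns true
 * if `value` equals a word of POISON_FREE bytes plus a small member
 * offset below `max_offset`, i.e. it looks like a pointer that was
 * loaded from freed memory and then had a field offset added. */
static bool looks_like_freed_pointer(uint64_t value, uint64_t max_offset)
{
    /* Replicate the poison byte into all eight bytes of the word. */
    const uint64_t poison_word = 0x0101010101010101ULL * POISON_FREE;

    return value >= poison_word && value - poison_word < max_offset;
}
```

Called with the RBX value from the second oops and an assumed bound of 64 bytes, the helper reports a match, while an ordinary kernel address such as the R14 value ffff81001b8e0108 does not match.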

It's an x86_64 kernel with most debugging options enabled. The MAC address is made up.

Swapping the line that sets the "ap" address with the previous line, which sets the IP address on ath1, makes the problem disappear, but that may just be a change in timing. The script is run on a serial console, so the iwconfig output may provide the delay needed to trigger the bug.

First oops:

VAP not stopped<0>------------[ cut here ]------------
kernel BUG at /home/proski/src/madwifi/ath/if_ath.c:1216!
invalid opcode: 0000 [1] 
CPU 0 
Modules linked in: ath_pci wlan_scan_ap ath_rate_sample wlan ath_hal(P)
Pid: 6512, comm: rmmod Tainted: P      2.6.20-rc6 #20
RIP: 0010:[<ffffffff8806f1d8>]
[<ffffffff8806f1d8>] :ath_pci:ath_vap_delete+0x48/0x350
RSP: 0018:ffff81001d355d18  EFLAGS: 00010296
RAX: 0000000000000012 RBX: 0000000000000004 RCX: ffffffff805d7688
RDX: ffff81001b291100 RSI: 0000000000000001 RDI: ffffffff805d7640
RBP: ffff81001d355d48 R08: ffffffff80679978 R09: 0000000000000000
R10: ffff81001d355c38 R11: 0000000000000246 R12: ffff81001dbc8000
R13: ffff81001c135520 R14: ffff81001dbc8520 R15: ffff81001dbc8000
FS:  00002b2a12635240(0000) GS:ffffffff8061b000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b90d575b0a0 CR3: 000000001d988000 CR4: 00000000000006e0
Process rmmod (pid: 6512, threadinfo ffff81001d354000, task ffff81001b291100)
Stack:  ffff81001b5c0000 ffff81001dbc8520 ffff81001dbc8000 ffff81001b5c0000
 ffffffff8807f548 ffffffff8807f660 ffff81001d355d68 ffffffff88037ea6
 ffff81001dbc8520 ffff81001dbc8520 ffff81001d355da8 ffffffff8806dbd1
Call Trace:
 [<ffffffff88037ea6>] :wlan:ieee80211_ifdetach+0x26/0x80
 [<ffffffff8806dbd1>] :ath_pci:ath_detach+0x81/0x110
 [<ffffffff804b5625>] wait_for_completion+0xd5/0xe0
 [<ffffffff880778be>] :ath_pci:ath_pci_remove+0x2e/0xa0
 [<ffffffff80354d2f>] pci_device_remove+0x2f/0x60
 [<ffffffff803d8553>] __device_release_driver+0x93/0xb0
 [<ffffffff803d8bb3>] driver_detach+0xe3/0x130
 [<ffffffff803d7fe3>] bus_remove_driver+0x83/0xb0
 [<ffffffff803d8c45>] driver_unregister+0x15/0x30
 [<ffffffff80354f55>] pci_unregister_driver+0x25/0x80
 [<ffffffff88077ce5>] :ath_pci:exit_ath_pci+0x15/0x2c
 [<ffffffff80250b4b>] sys_delete_module+0x1ab/0x1f0
 [<ffffffff804b7840>] trace_hardirqs_on_thunk+0x35/0x37
 [<ffffffff80209b1e>] system_call+0x7e/0x83

Second oops:

general protection fault: 0000 [1] 
CPU 0 
Modules linked in: ath_pci wlan_scan_ap ath_rate_sample wlan ath_hal(P)
Pid: 3857, comm: ifconfig Tainted: P      2.6.20-rc6 #20
RIP: 0010:[<ffffffff8034a81e>]  [<ffffffff8034a81e>] _raw_spin_lock+0x1e/0x130
RSP: 0018:ffff81001dc99be8  EFLAGS: 00010086
RAX: ffff81001d500080 RBX: 6b6b6b6b6b6b6b73 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b73
RBP: ffff81001dc99c08 R08: 0000000000000002 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000286
R13: 6b6b6b6b6b6b6b73 R14: ffff81001b8e0108 R15: ffff81001b8f0520
FS:  00002abed66833b0(0000) GS:ffffffff8061b000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000006b9fb0 CR3: 000000001d75c000 CR4: 00000000000006e0
Process ifconfig (pid: 3857, threadinfo ffff81001dc98000, task ffff81001d500080)
Stack:  6b6b6b6b6b6b6b73 0000000000000286 6b6b6b6b6b6b6b73 ffff81001b8e0108
 ffff81001dc99c28 ffffffff804b82ee 6b6b6b6b6b6b6b6b ffff81001da4d000
 ffff81001dc99c58 ffffffff88041322 ffff81001b8f2128 0000000000000001
Call Trace:
 [<ffffffff804b82ee>] _spin_lock_irqsave+0x3e/0x50
 [<ffffffff88041322>] :wlan:ieee80211_free_node+0x32/0x90
 [<ffffffff8806ac2a>] :ath_pci:ath_tx_draintxq+0x16a/0x1b0
 [<ffffffff80227b90>] default_wake_function+0x0/0x10
 [<ffffffff8806ada4>] :ath_pci:ath_draintxq+0x134/0x160
 [<ffffffff8806b30e>] :ath_pci:ath_stop_locked+0xde/0x1c0
 [<ffffffff8806b45e>] :ath_pci:ath_stop+0x6e/0x90
 [<ffffffff80460d62>] dev_close+0x62/0x90
 [<ffffffff88048c6e>] :wlan:ieee80211_stop+0xae/0x110
 [<ffffffff80460d62>] dev_close+0x62/0x90
 [<ffffffff8046017d>] dev_change_flags+0x6d/0x150
 [<ffffffff8049c48c>] devinet_ioctl+0x30c/0x730
 [<ffffffff8049cb9c>] inet_ioctl+0x4c/0x70
 [<ffffffff80455180>] sock_ioctl+0x210/0x240
 [<ffffffff804b819b>] _spin_unlock_irq+0x2b/0x40
 [<ffffffff8028deab>] do_ioctl+0x1b/0x60
 [<ffffffff8028e151>] vfs_ioctl+0x261/0x280
 [<ffffffff8028e1ba>] sys_ioctl+0x4a/0x80
 [<ffffffff80209b1e>] system_call+0x7e/0x83

Attachments

madtest (232 bytes) - added by proski on 02/23/07 09:37:49.
the script that triggers the problem

Change History

02/08/07 00:38:23 changed by mentor

I believe this is because the ieee80211_stop_running() function does not work correctly. It is checking the IFF_RUNNING flag to determine if it should stop a VAP, which I believe to be incorrect.
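A minimal sketch of the distinction mentioned above, assuming a VAP can be administratively up (IFF_UP) without being operational (IFF_RUNNING). The flag values are those of the kernel's <linux/if.h>; the struct and function are hypothetical stand-ins and do not reproduce madwifi's ieee80211_stop_running(). Whether ath1 is really in that state in the reported scenario is not confirmed by this ticket.

```c
#include <stdbool.h>
#include <stddef.h>

/* Interface flag values as defined in the kernel's <linux/if.h>. */
#define IFF_UP      0x1   /* administratively up (ifconfig ... up) */
#define IFF_RUNNING 0x40  /* operational resources allocated       */

/* Hypothetical stand-in for a VAP's net_device; not madwifi's struct. */
struct toy_vap {
    const char *name;
    unsigned int flags;
};

/* Stop every VAP whose flags contain `test_flag`; return how many
 * are still administratively up afterwards. */
static size_t stop_vaps(struct toy_vap *vaps, size_t n, unsigned int test_flag)
{
    size_t still_up = 0;

    for (size_t i = 0; i < n; i++) {
        if (vaps[i].flags & test_flag)
            vaps[i].flags &= ~(IFF_UP | IFF_RUNNING);
        if (vaps[i].flags & IFF_UP)
            still_up++;
    }
    return still_up;
}
```

With one VAP that is up and running and one that is only up, testing IFF_RUNNING leaves the second VAP up, while testing IFF_UP stops both. A VAP left up in this way would match the "VAP not stopped" check seen in the first oops, if the assumption holds.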

02/08/07 06:58:53 changed by mrenzmann

  • description changed.
  • reporter changed from mentor to proski.

The report originally was posted by proski to madwifi-devel.

02/12/07 06:44:36 changed by mentor

Hmmm... OK, I give up, I can't see it.

02/12/07 07:09:36 changed by mentor

Does the plain iwconfig do anything in that script? Is this problem reproducible?

I'm now guessing this is a bug in the scanning state machine.

02/12/07 07:58:05 changed by proski

  • status changed from new to assigned.
  • owner set to proski.

The problem is 100% reproducible for me. The role of iwconfig is probably just a delay. I haven't tried many modifications to the script, but moving "iwconfig ath1 ap 00:01:02:03:04:05" one line up (i.e. before ath1 is brought up) avoids the problem.

02/21/07 04:01:59 changed by mentor

Any progress on this?

02/23/07 09:36:28 changed by proski

The "VAP not stopped" problem is gone. I'm not sure why; it used to happen every time. The spinlock problem in ieee80211_free_node() under ath_draintxq() is still present, but it takes some persistence to reproduce. I think I could trigger it on the tenth attempt or so.

02/23/07 09:37:49 changed by proski

  • attachment madtest added.

the script that triggers the problem

02/26/07 02:45:39 changed by mentor

Was the 'vap not stopped' bug's disappearance related to a change of (trunk) revision?

03/01/07 02:12:47 changed by anonymous

It seems that fixing this bug takes a ridiculously long time...

03/01/07 03:32:40 changed by mentor

If the former problem is gone, I'd be tempted to bump this issue to the next release, especially as the second problem looks reference-counting related.
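For readers unfamiliar with this class of bug, here is a generic sketch of how an unbalanced reference count leads to a release on freed memory. It is an illustration only: the toy_node type and its functions are hypothetical, and nothing in this ticket confirms that madwifi's node handling fails in exactly this way.

```c
#include <stdbool.h>

/* Hypothetical reference-counted object; not madwifi's ieee80211_node. */
struct toy_node {
    int refcnt;
    bool freed;   /* stands in for the memory having been released */
};

/* Take one reference on behalf of a new holder. */
static void toy_node_ref(struct toy_node *n)
{
    n->refcnt++;
}

/* Drop one reference. Returns 0 on success, or -1 if the node was
 * already freed, which is where real code would touch poisoned memory. */
static int toy_node_unref(struct toy_node *n)
{
    if (n->freed)
        return -1;
    if (--n->refcnt == 0)
        n->freed = true;   /* real code would free the node here */
    return 0;
}
```

If a holder, for example a queued transmit buffer, keeps a pointer without calling toy_node_ref(), the owner's final toy_node_unref() frees the node, and the holder's later release returns -1. When every holder takes its own reference, the same sequence completes without error.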

03/06/07 08:00:43 changed by dyqith

Tested the script on a Core2Duo x86_64 laptop running kernel 2.6.19-1.2911.6.4.fc6 (Fedora Core 6).

Cannot reproduce the oops/panics.

Will try to get a wireless-dev kernel and test on that.

03/06/07 09:12:38 changed by dyqith

Tried against linville's wireless-dev git tree. Laptop: x86_64, using an Atheros 5212 card.

Cannot reproduce the oops/panics.

Can you share your .config file so I can reproduce your kernel build?

03/06/07 20:36:31 changed by mrenzmann

  • milestone changed from version 0.9.3 to version 0.9.4.

Rescheduling this ticket for release 0.9.4.