Please note: This project is no longer active. The website is kept online for historic purposes only.
If you're looking for a Linux driver for your Atheros WLAN device, you should continue here.

Ticket #162 (assigned defect)

Opened 14 years ago

Last modified 11 years ago

failed assertion in rate-sample, race condition bringing interfaces up

Reported by: mark@glines.org Assigned to: proski (accepted)
Priority: major Milestone: version 0.9.x - progressive release candidate phase
Component: madwifi: driver Version: trunk
Keywords: Cc: mark@glines.org
Patch is attached: 1 Pending:

Description (Last modified by mrenzmann)

There's a KASSERT() at madwifi-svn-r1326/ath_rate/sample/sample.c:366. The assertion doesn't fail all that often, but it does happen, and seems to be triggered when a packet is transmitted too soon after the interface is brought up. I saw this by bridging an access point interface with an ethernet that had a lot of noisy Windows broadcasts on it. I found it easier to reproduce by simply ping-flooding an 802.11 station from a host on the ethernet side, and then bringing the interface down and back up again quickly. But in most real-world scenarios, I don't think this race condition will happen very often.

I think the proper way to fix this is to defer packet delivery until all the data structures are properly set up and ready to handle it. I don't know enough about the network stack to know how to do that, so I can only give you an ugly workaround (change the ath_rate_findrate() API to return an int, -1 on failure, 0 on success), not a real fix.
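To make the workaround described above concrete, here is a minimal sketch of a rate lookup that reports failure through its return value instead of asserting. It is not the attached patch: the struct and the names sketch_node, sketch_findrate and sketch_tx_start are hypothetical, and the real ath_rate_findrate() takes different arguments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the sample plugin's per-node state. */
struct sketch_node {
    int num_rates;     /* stays 0 until the rate state has been set up */
    uint8_t rates[8];  /* rate codes, valid for indices < num_rates */
};

/*
 * Returns 0 and fills *rix on success, -1 if the node's rate state is not
 * ready yet. Assumption: the caller is able to handle a failed lookup.
 */
int sketch_findrate(const struct sketch_node *sn, uint8_t *rix)
{
    if (sn == NULL || rix == NULL || sn->num_rates <= 0)
        return -1;
    *rix = 0;  /* placeholder choice: always the first table entry */
    return 0;
}

/*
 * Hypothetical caller: drops the frame (returns -1) when no rate is
 * available, otherwise returns the rate index it would transmit with.
 */
int sketch_tx_start(const struct sketch_node *sn)
{
    uint8_t rix;

    if (sketch_findrate(sn, &rix) != 0)
        return -1;  /* frame dropped, no kernel BUG */
    return (int)rix;
}
```

With this shape, a frame that arrives before the rate table is populated is simply dropped, which hides the race rather than fixing it.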

I'm seeing this issue on armv5b. I have not turned on all debugging, simply because the serial console can't keep up with it.

[  517.620000] bridge0: port 2(ath0) entering disabled state
[  517.640000] bridge0: port 2(ath0) entering learning state
[  517.650000] bridge0: topology change detected, propagating
[  517.650000] bridge0: port 2(ath0) entering forwarding state
[  517.670000] ndx is -1<2>kernel BUG at /home/paranoid/madwifi-svn-r1326/ath_rate/sample/sample.c:366!
[  517.680000] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  517.690000] pgd = c0590000
[  517.690000] [00000000] *pgd=00774031, *pte=00000000, *ppte=00000000
[  517.690000] Internal error: Oops: 817 [#1]
[  517.690000] Modules linked in: iptable_mangle sch_sfq sch_htb ipt_REJECT bridge tun iptable_filter iptable_nat ip_tables bonding e100 hostap wlan_scan_ap ath_pci ath_rate_sample wlan ath_hal hdlc syncppp lapb ixp400_eth ixp400
[  517.690000] CPU: 0
[  517.690000] PC is at __bug+0x40/0x54
[  517.690000] LR is at 0x1
[  517.690000] pc : [<c0022774>]    lr : [<00000001>]    Tainted: P
[  517.690000] sp : c18d7a20  ip : 60000093  fp : c18d7a30
[  517.690000] r10: 00000000  r9 : c1d7a220  r8 : ffffffff
[  517.690000] r7 : c1dd12dc  r6 : c1dd12dc  r5 : ffffffff  r4 : 00000000
[  517.690000] r3 : 00000000  r2 : 00000000  r1 : c02173a4  r0 : 00000001
[  517.690000] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  Segment user
[  517.690000] Control: 39FF  Table: 00590000  DAC: 00000015
[  517.690000] Process ifconfig (pid: 11114, stack limit = 0xc18d6194)
[  517.690000] Stack: (0xc18d7a20 to 0xc18d8000)
[  517.690000] 7a20: ffffffff c18d7a90 c18d7a34 bf1b2b74 c0022744 c0132080 c0131ea0 c0fd1380
[  517.690000] 7a40: c1e9b9c0 c1e9b9c0 c18d7a90 c18d7a58 c0027d5c 00000000 00000001 c1ea2240
[  517.690000] 7a60: 00000000 c1dd1000 00000018 bf183fb8 ffc028c0 c18d6000 c1d7a220 c1da8c60
[  517.690000] 7a80: c1dd1000 c18d7b48 c18d7a94 bf1baf9c bf1b268c c18d7b1f c18d7b18 c18d7b1e
[  517.690000] 7aa0: 00000000 00000000 00000000 00000001 00000000 00000018 00000000 00000036
[  517.690000] 7ac0: 00000018 00000000 c1d7a220 c18d7bb0 c1dd1000 00000000 00000000 00000000
[  517.690000] 7ae0: 00000000 00000001 c1e9c820 00000000 000000ff 00000052 00000018 ffffffff
[  517.690000] 7b00: 00000000 c1d98000 c1e8f220 c1d7a220 c0fd1380 c1d7a000 00000002 08060000
[  517.690000] 7b20: 00000000 c1d7a220 c18d6000 c18d6000 c1da8c60 c1dd1000 c0fd1380 c18d7be8
[  517.690000] 7b40: c18d7b4c bf1c1ff0 bf1baa74 00000000 bf135bcc c18d7b74 c18d7b64 c0028c58
[  517.690000] 7b60: c0028c14 c0215940 c18d7b90 c18d7b78 c001ebbc c0028be8 c0215940 0000001f
[  517.690000] 7b80: 00040000 c1d86cd4 c1c11268 c1c11368 c003892c c1d7a220 00000000 ffffffff
[  517.690000] 7ba0: 0000001f 00040000 c1d7b6b8 c1d7a000 00000001 00000000 c1da8c60 c1da8c60
[  517.690000] 7bc0: c1afcaa0 c1d7a000 c1e9b9c0 00000000 c1d7a024 c1ea0012 c1d7a000 c18d7c0c
[  517.690000] 7be0: c18d7bec c014477c bf1c10d4 c1d7a000 c1e9b9c0 00000000 00000000 c1d7a220
[  517.690000] 7c00: c18d7c28 c18d7c10 c0137860 c01446a0 c1e9b9c0 c1e8f220 c1dd1000 c18d7c54
[  517.690000] 7c20: c18d7c2c bf193934 c0137774 c1c11000 c1e8f000 c1e9b9c0 00000000 c1e8f024
[  517.690000] 7c40: 00000040 0000ca37 c18d7c78 c18d7c58 c014477c bf193590 c1e8f000 c1e9b9c0
[  517.690000] 7c60: 00000000 c1afc7a0 bf217f68 c18d7c94 c18d7c7c c0137860 c01446a0 c1e9b9c0
[  517.690000] 7c80: c1e9b9c0 c1f89220 c18d7cb0 c18d7c98 bf217e58 c0137774 c0311030 00000000
[  517.690000] 7ca0: c0225b08 c18d7cd0 c18d7cb4 bf217ed0 bf217dcc 00000000 00000331 c18d7cf4
[  517.690000] 7cc0: c1e9b9c0 c18d7cf4 c18d7cd4 bf217fe4 bf217e74 c1f89220 c1afc6a0 bf217f68
[  517.690000] 7ce0: c1e9b9c0 c1f89218 c18d7d18 c18d7cf8 bf2181d8 bf217f74 c1f89220 c1e9b9c0
[  517.690000] 7d00: c1ea0012 00000000 c0224c28 c18d7d38 c18d7d1c bf218ce0 bf2180fc c1ea0012
[  517.690000] 7d20: c1afc8a0 c18d7d68 00000001 c18d7d64 c18d7d3c bf218ec4 bf218bd0 c18d7d48
[  517.690000] 7d40: bf161dec bf1501a4 c1e9b9c0 c1afc8a0 00000000 c0224ef4 c18d7d8c c18d7d68
[  517.690000] 7d60: c0137eb4 bf218cf4 c1e9b9c0 c1a83000 c0224c50 00000000 c18d7dbc c0224c28
[  517.690000] 7d80: c18d7db8 c18d7d90 c013809c c0137d38 c0224d4c c0224c50 c0224c28 0000ca37
[  517.690000] 7da0: c01d92c0 00000000 c18d7ec8 c18d7de0 c18d7dbc c0138218 c0137ff8 0000012c
[  517.690000] 7dc0: 00000005 c021b4c0 c18d6000 00000009 c021b480 c18d7e04 c18d7de4 c00387d8
[  517.690000] 7de0: c0138194 60000013 c1e8f000 00000001 00000000 be89fdac c18d7e18 c18d7e08
[  517.690000] 7e00: c0038890 c0038788 c18d6000 c18d7e2c c18d7e1c c0038900 c0038858 c1f89220
[  517.690000] 7e20: c18d7e44 c18d7e30 bf219a38 c00388a8 bf21fa34 c1e8f000 c18d7e60 c18d7e48
[  517.690000] 7e40: c0040a84 bf219900 c1e8f000 00000000 00001102 c18d7e78 c18d7e64 c01371e4
[  517.690000] 7e60: c0040a58 c1e8f000 00001043 c18d7e98 c18d7e7c c0138808 c0137138 00000000
[  517.690000] 7e80: ffffff9d c0f10460 c1e8f000 c18d7f00 c18d7e9c c0177c78 c01387b0 00000000
[  517.690000] 7ea0: c0f1046c 00008914 10430000 00000029 00000028 0000000c 61746830 00000000
[  517.690000] 7ec0: 00000000 00000000 10430000 00000029 00000028 0000000c 00008914 be89fdac
[  517.690000] 7ee0: be89fdac be89fdac 00000000 c18d6000 00000000 c18d7f18 c18d7f04 c017900c
[  517.690000] 7f00: c017798c c1f40900 00008914 c18d7f3c c18d7f1c c012e0f8 c0178f60 c1f40900
[  517.690000] 7f20: ffffffe7 be89fdac 00000003 00000000 c18d7f58 c18d7f40 c0079494 c012de98
[  517.690000] 7f40: c1f40900 c1f40900 be89fdac c18d7f84 c18d7f5c c0079784 c0079464 c18d7fb0
[  517.690000] 7f60: 00000017 c1f40900 fffffff7 00008914 00000036 c001de44 c18d7fa4 c18d7f88
[  517.690000] 7f80: c00797e4 c00794ec 00000000 be89fdac 0004c788 0005ab5c 00000000 c18d7fa8
[  517.690000] 7fa0: c001dcc0 c00797b0 be89fdac c002507c 00000003 00008914 be89fdac be89fdac
[  517.690000] 7fc0: be89fdac 0004c788 0005ab5c 00000003 00000004 be89fedc 00000000 be89fc9c
[  517.690000] 7fe0: be89fca0 be89fc7c 4008b078 4008afe8 20000010 00000003 00000000 00000000
[  517.690000] Backtrace:
[  517.690000] [<c0022738>] (__bug+0x4/0x54) from [<bf1b2b74>] (ath_rate_findrate+0x4f4/0x55c [ath_rate_sample])
[  517.690000]  r4 = FFFFFFFF
[  517.690000] [<bf1b2680>] (ath_rate_findrate+0x0/0x55c [ath_rate_sample]) from [<bf1baf9c>] (ath_tx_start+0x534/0x13d4 [ath_pci])
[  517.690000] [<bf1baa68>] (ath_tx_start+0x0/0x13d4 [ath_pci]) from [<bf1c1ff0>] (ath_hardstart+0xf28/0x10e0 [ath_pci])
[  517.690000] [<bf1c10c8>] (ath_hardstart+0x0/0x10e0 [ath_pci]) from [<c014477c>] (qdisc_restart+0xe8/0x1d4)
[  517.690000] [<c0144694>] (qdisc_restart+0x0/0x1d4) from [<c0137860>] (dev_queue_xmit+0xf8/0x220)
[  517.690000]  r8 = C1D7A220  r7 = 00000000  r6 = 00000000  r5 = C1E9B9C0
[  517.690000]  r4 = C1D7A000
[  517.690000] [<c0137768>] (dev_queue_xmit+0x0/0x220) from [<bf193934>] (ieee80211_hardstart+0x3b0/0x428 [wlan])
[  517.690000]  r6 = C1DD1000  r5 = C1E8F220  r4 = C1E9B9C0
[  517.690000] [<bf193584>] (ieee80211_hardstart+0x0/0x428 [wlan]) from [<c014477c>] (qdisc_restart+0xe8/0x1d4)
[  517.690000] [<c0144694>] (qdisc_restart+0x0/0x1d4) from [<c0137860>] (dev_queue_xmit+0xf8/0x220)
[  517.690000]  r8 = BF217F68  r7 = C1AFC7A0  r6 = 00000000  r5 = C1E9B9C0
[  517.690000]  r4 = C1E8F000
[  517.690000] [<c0137768>] (dev_queue_xmit+0x0/0x220) from [<bf217e58>] (br_dev_queue_push_xmit+0x98/0xa8 [bridge])
[  517.690000]  r6 = C1F89220  r5 = C1E9B9C0  r4 = C1E9B9C0
[  517.690000] [<bf217dc0>] (br_dev_queue_push_xmit+0x0/0xa8 [bridge]) from [<bf217ed0>] (br_forward_finish+0x68/0x7c [bridge])
[  517.690000]  r4 = C0225B08
[  517.690000] [<bf217e68>] (br_forward_finish+0x0/0x7c [bridge]) from [<bf217fe4>] (__br_forward+0x7c/0x8c [bridge])
[  517.690000] [<bf217f68>] (__br_forward+0x0/0x8c [bridge]) from [<bf2181d8>] (br_flood+0xec/0x134 [bridge])
[  517.690000]  r4 = C1F89218
[  517.690000] [<bf2180f0>] (br_flood+0x4/0x134 [bridge]) from [<bf218ce0>] (br_handle_frame_finish+0x11c/0x124 [bridge])
[  517.690000]  r8 = C0224C28  r7 = 00000000  r6 = C1EA0012  r5 = C1E9B9C0
[  517.690000]  r4 = C1F89220
[  517.690000] [<bf218bc4>] (br_handle_frame_finish+0x0/0x124 [bridge]) from [<bf218ec4>] (br_handle_frame+0x1dc/0x244 [bridge])
[  517.690000]  r7 = 00000001  r6 = C18D7D68  r5 = C1AFC8A0  r4 = C1EA0012
[  517.690000] [<bf218ce8>] (br_handle_frame+0x0/0x244 [bridge]) from [<c0137eb4>] (netif_receive_skb+0x188/0x2c0)
[  517.690000]  r6 = C0224EF4  r5 = 00000000  r4 = C1AFC8A0
[  517.690000] [<c0137d2c>] (netif_receive_skb+0x0/0x2c0) from [<c013809c>] (process_backlog+0xb0/0x19c)
[  517.690000]  r8 = C0224C28  r7 = C18D7DBC  r6 = 00000000  r5 = C0224C50
[  517.690000]  r4 = C1A83000
[  517.690000] [<c0137fec>] (process_backlog+0x0/0x19c) from [<c0138218>] (net_rx_action+0x90/0x170)
[  517.690000] [<c0138188>] (net_rx_action+0x0/0x170) from [<c00387d8>] (__do_softirq+0x5c/0xd0)
[  517.690000]  r8 = C021B480  r7 = 00000009  r6 = C18D6000  r5 = C021B4C0
[  517.690000]  r4 = 00000005
[  517.690000] [<c003877c>] (__do_softirq+0x0/0xd0) from [<c0038890>] (do_softirq+0x44/0x50)
[  517.690000]  r8 = BE89FDAC  r7 = 00000000  r6 = 00000001  r5 = C1E8F000
[  517.690000]  r4 = 60000013
[  517.690000] [<c003884c>] (do_softirq+0x0/0x50) from [<c0038900>] (local_bh_enable+0x64/0x84)
[  517.690000]  r4 = C18D6000
[  517.690000] [<c003889c>] (local_bh_enable+0x0/0x84) from [<bf219a38>] (br_device_event+0x144/0x14c [bridge])
[  517.690000]  r4 = C1F89220
[  517.690000] [<bf2198f4>] (br_device_event+0x0/0x14c [bridge]) from [<c0040a84>] (notifier_call_chain+0x38/0x50)
[  517.690000]  r5 = C1E8F000  r4 = BF21FA34
[  517.690000] [<c0040a4c>] (notifier_call_chain+0x0/0x50) from [<c01371e4>] (dev_open+0xb8/0xc8)
[  517.690000]  r6 = 00001102  r5 = 00000000  r4 = C1E8F000
[  517.690000] [<c013712c>] (dev_open+0x0/0xc8) from [<c0138808>] (dev_change_flags+0x64/0x124)
[  517.690000]  r5 = 00001043  r4 = C1E8F000
[  517.690000] [<c01387a4>] (dev_change_flags+0x0/0x124) from [<c0177c78>] (devinet_ioctl+0x2f8/0x6f0)
[  517.690000]  r7 = C1E8F000  r6 = C0F10460  r5 = FFFFFF9D  r4 = 00000000
[  517.690000] [<c0177980>] (devinet_ioctl+0x0/0x6f0) from [<c017900c>] (inet_ioctl+0xb8/0x104)
[  517.690000] [<c0178f54>] (inet_ioctl+0x0/0x104) from [<c012e0f8>] (sock_ioctl+0x26c/0x2a4)
[  517.690000]  r5 = 00008914  r4 = C1F40900
[  517.690000] [<c012de8c>] (sock_ioctl+0x0/0x2a4) from [<c0079494>] (do_ioctl+0x3c/0x88)
[  517.690000]  r8 = 00000000  r7 = 00000003  r6 = BE89FDAC  r5 = FFFFFFE7
[  517.690000]  r4 = C1F40900
[  517.690000] [<c0079458>] (do_ioctl+0x0/0x88) from [<c0079784>] (vfs_ioctl+0x2a4/0x2c4)
[  517.690000]  r6 = BE89FDAC  r5 = C1F40900  r4 = C1F40900
[  517.690000] [<c00794e0>] (vfs_ioctl+0x0/0x2c4) from [<c00797e4>] (sys_ioctl+0x40/0x5c)
[  517.690000]  r8 = C001DE44  r7 = 00000036  r6 = 00008914  r5 = FFFFFFF7
[  517.690000]  r4 = C1F40900
[  517.690000] [<c00797a4>] (sys_ioctl+0x0/0x5c) from [<c001dcc0>] (ret_fast_syscall+0x0/0x2c)
[  517.690000]  r6 = 0005AB5C  r5 = 0004C788  r4 = BE89FDAC
[  517.690000] Code: 1b004760 e59f0014 eb00475e e3a03000 (e5833000)
[  517.690000]  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!

Attachments

rate_sample_race.diff (2.8 kB) - added by mark@glines.org on 11/18/05 18:53:31.
ugly workaround
sample-plugin-kassert.diff (428 bytes) - added by Mark Glines <mark@glines.org> on 02/09/06 18:14:37.
here's another possible fix
madwifi-ng-r1427-20060202_no_rates_fixup.patch (3.0 kB) - added by Mark Glines <mark@glines.org> on 02/09/06 18:16:27.
Dan's patch
madwifi_old-no_rates_fixup.patch (2.3 kB) - added by Mark Glines <mark@glines.org> on 02/09/06 18:17:30.
Dan's madwifi-old patch

Change History

11/18/05 18:53:31 changed by mark@glines.org

  • attachment rate_sample_race.diff added.

ugly workaround

11/19/05 14:15:11 changed by anonymous

I think this is an old problem with madwifi; it sometimes happens with old madwifi versions (not ng) as well. We are using a similar fix, but instead of returning an error from the TX function we use the lowest rate, as for broadcast frames.

11/20/05 09:51:24 changed by mrenzmann

  • description changed.

11/21/05 18:39:40 changed by Mark Glines <mark@glines.org>

I've reproduced this in the same way, with both madwifi-ng and madwifi-old.

11/29/05 18:52:13 changed by mrenzmann

  • milestone set to version 1.0.0 - first stable release.

11/29/05 19:12:18 changed by mrenzmann

  • version set to trunk.

Setting version to trunk although it is valid for -old as well. Unlikely to get fixed in -old.

02/08/06 11:01:51 changed by dan@adelix.com

  • priority changed from minor to blocker.
  • patch_attached changed.

This bug is still happening, I have tried all three rate control algs. on the very latest madwifi-old and madwifi-ng Revision: 1427 snapshots, and both crash about every 2 hours on our test system, i.e. not just when interfaces are brought up, but about every two hours of normal operation.

This is a very serious bug, and as it's happening regardless of rate control module, I suspect the problem is a more major architectural problem than a specific bug in the rate control modules. Needs sorting ASAP IMVHO.

Dan...

02/08/06 15:18:37 changed by Mark Glines <mark@glines.org>

  • priority changed from blocker to minor.

This message is a response to dan@adelix.com:

Hi Dan,

Huh?

You have tried all three rate control algorithms, and they all say "kernel BUG at /blah/blah/ath_rate/sample/sample.c:366"? If so, then you haven't really switched rate control algorithms at all... this bug is specific to ath_rate_sample, and the above message could only have come from ath_rate_sample. You may have rebuilt the driver with a different ATH_RATE setting, but perhaps you didn't move the old driver out of the way, or perhaps you didn't unload the driver modules fully before trying the new one?

If your message is different, then it is not the same crash as the one documented in this ticket.

From what I've seen, this bug is *only* reproducible when the link is brought down and back up. It does not occur during normal operation; I've run the sample plugin for days and days without problem. And it *only* occurs with the sample plugin; I have also done lots of testing with the onoe plugin, and have had no issues. The onoe plugin certainly doesn't emit any errors talking about ath_rate/sample/sample.c.

From the symptoms you describe, I think you must be seeing a different bug than this one. Maybe you should add a new ticket, and post a crash dump there?

Mark

02/08/06 16:42:06 changed by Mark Glines <mark@glines.org>

More info from Dan (over email):

Sorry, I am being dumb here. The onoe and amrr modules never load,
they always bomb out on loading the module with "no rate table"
KASSERT bugs. So, I'm only talking about sample.c I guess..

I am patching the rate select alg, so that if ath_rate_findrate fails,
then it selects the multicast rate instead.

I'll let you know how I get on...

Dan...

02/09/06 18:14:37 changed by Mark Glines <mark@glines.org>

  • attachment sample-plugin-kassert.diff added.

here's another possible fix

02/09/06 18:15:39 changed by Mark Glines <mark@glines.org>

Dan sent me some more info...

Hi,

The patch you sent will not solve the particular problem I'm seeing,
as ndx is -1, but sn->num_rates is 0, meaning setting ndx to 0 will
probably cause a null pointer exception later on when the code tries
to access data in the sn->rates[] vector.

I have come up with a patch which fixes the problem. It's not a very
nice fix, more of a kludge really, but it keeps the driver running,
and I only see ath_rate_findrate fail about once every hour or so
anyway.
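Dan's point can be illustrated with a small sketch: clamping ndx to 0 is only safe when the rate table actually has entries. The struct and the function sketch_safe_ndx below are hypothetical and are not taken from either patch; they only show the guard on num_rates that has to come before any index fix-up.

```c
#include <stddef.h>

#define SKETCH_MAX_RATES 8

/* Hypothetical stand-in for the sample plugin's per-node rate table. */
struct sketch_sample_node {
    int num_rates;                /* 0 means the table is empty */
    int rates[SKETCH_MAX_RATES];  /* valid for indices < num_rates */
};

/*
 * Returns an index that is safe to use with sn->rates[], or -1 when there
 * is nothing to index (for example num_rates == 0).
 */
int sketch_safe_ndx(const struct sketch_sample_node *sn, int ndx)
{
    /* Check the table first: with no rates, no index is valid, not even 0. */
    if (sn == NULL || sn->num_rates <= 0 || sn->num_rates > SKETCH_MAX_RATES)
        return -1;
    /* Only now is it reasonable to clamp a bad index to the first entry. */
    if (ndx < 0 || ndx >= sn->num_rates)
        return 0;
    return ndx;
}
```

A caller would still need some policy for the -1 case, such as dropping the frame or using a fixed fallback rate.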

02/09/06 18:16:27 changed by Mark Glines <mark@glines.org>

  • attachment madwifi-ng-r1427-20060202_no_rates_fixup.patch added.

Dan's patch

02/09/06 18:17:06 changed by Mark Glines <mark@glines.org>

Hi,

Further to my last post, here's a similar patch for the madwifi-old
branch, where I also saw the bug appear about once every hour of
operation. I.e. the ath_rate_findrate in sample.c sometimes failed
because sn->num_rates is 0.

Regards, Dan...

02/09/06 18:17:30 changed by Mark Glines <mark@glines.org>

  • attachment madwifi_old-no_rates_fixup.patch added.

Dan's madwifi-old patch

02/09/06 19:49:30 changed by mrenzmann

  • patch_attached set to 1.

10/15/06 21:41:11 changed by mrenzmann

  • milestone changed from version 1.0.0 - first stable release to version 0.9.3.

Formatting of the patch should be adjusted a little, and ath_rate_findrate most probably needs to be changed from void to int for amrr and onoe as well.

Anyway, the patch still seems to be needed and should be committed soon. Probably needs minor adjustment to apply to the current codebase. Anyone up for this?

10/18/06 20:16:52 changed by mark@glines.org

Hi, mrenzmann!

I'm not even sure what the right solution is. It's been a while since I've looked at this, and my memory isn't what it used to be :) Since I'm so clue-deprived, I guess I'd better start asking questions.

The way I see it, there are a few possible ways to fix this:

1. change the API, allow ath_rate_findrate to return an error if it can't find its state

2. use a multicast rate if state hasn't been set up, as suggested by dan@adelix.com

3. figure out why the state wasn't set up yet, and fix that problem instead

My first patch, rate_sample_race.diff, implements (1). It isn't complete, the API change needs to be applied to the other rate plugins as well. I'm happy to perform this cleanup, if that's what you want.

I think my second patch, sample-plugin-kassert.diff, implements (2). Rate number 0 is the multicast rate, right? I'm unsure of this. In any case, with the added conditional in this patch, the KASSERT below is now only really checking for an invalid sn->num_rates value, so it can probably be simplified a bit.

The patch provided by Dan seems to do a little of both... it changes the API so ath_rate_findrate can return an error, and patches the caller to use the multicast rate. I don't really understand this patch, but I'm willing to figure it out. Reading this patch brings up another question: it looks like there was already some code to handle the case where "rix" got set to 0xff. Is this an already-existing way for the rate plugin to report an error? Is the KASSERT in the sample plugin there to prevent this from ever happening? How does this change tie into that stuff?

What I'd really love to see is a fix for (3), because I'm afraid other race conditions elsewhere may also exist. (It begs the question, why are packets coming in from the rest of the kernel if we aren't ready for them yet?) But I wouldn't know where to start, with that.

So, are you asking for a cleaned up patch for (1), or for (2)? What did you think about Dan's stuff? My first patch just returned -EIO and dropped the packet; do you think sending with the multicast rate is better? I'm happy to issue a new patch, I just need some guidance.

Mark
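The difference between options (1) and (2) above can be sketched as a caller-side policy. This is only an illustration: the names are hypothetical, and treating 0xff as the "no rate found" value is an assumption based on the question about rix raised in the message, not a confirmed driver convention.

```c
#include <stdint.h>

/* Assumption: 0xff marks "rate control could not pick a rate". */
#define SKETCH_RIX_INVALID 0xff

/* Hypothetical policies matching options (1) and (2). */
enum sketch_policy {
    SKETCH_DROP,      /* option (1): report an error and drop the frame */
    SKETCH_USE_MCAST  /* option (2): send at the multicast rate instead */
};

/*
 * Returns the rate index to transmit with, or -1 if the frame should be
 * dropped. mcast_rix is whatever index the caller uses for multicast.
 */
int sketch_resolve_rix(uint8_t rix, uint8_t mcast_rix,
                       enum sketch_policy policy)
{
    if (rix != SKETCH_RIX_INVALID)
        return rix;        /* rate control produced a usable index */
    if (policy == SKETCH_USE_MCAST)
        return mcast_rix;  /* keep the frame, use the fallback rate */
    return -1;             /* drop the frame */
}
```

Neither policy addresses option (3); both only decide what happens to a frame that arrives before the rate state exists.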

10/19/06 13:33:15 changed by mrenzmann

I currently have no idea which of the described paths would be the best. Hence I would be glad if some of the other devs could throw in their thoughts here :)

12/08/06 18:14:21 changed by mrenzmann

No thoughts on which of the sketched paths would be the best to take?

02/06/07 15:47:15 changed by mrenzmann

  • milestone changed from version 0.9.3 to version 0.9.x - progressive release candidate phase.

Postponed for now. Patch does not apply cleanly, and there are open questions that need to be discussed first (which probably should happen on madwifi-devel).

07/25/07 05:49:59 changed by dyqith

This doesn't seem to be an issue any more. This code snippet from ath/if_ath.c "fixes" this issue.

			/* Ratecontrol sometimes returns invalid rate index */
			if (rix != 0xff)
				an->an_prevdatarix = rix;
			else
				rix = an->an_prevdatarix;

So far, I don't see a fix for Mark's #3. Will have to take a look.

06/20/08 19:49:30 changed by proski

  • priority changed from minor to major.
  • status changed from new to assigned.
  • owner set to proski.

Ticket #1709 seems to describe the same problem for minstrel.