Please note: This project is no longer active. The website is kept online for historic purposes only.
If you're looking for a Linux driver for your Atheros WLAN device, you should continue here.

Ticket #1404 (new defect)

Opened 12 years ago

Last modified 12 years ago

[patch] transmit data dropped during 1 second after roam to new AP

Reported by: anonymous Assigned to:
Priority: major Milestone: version 0.9.x - progressive release candidate phase
Component: madwifi: driver Version: v0.9.3
Keywords: Cc:
Patch is attached: 1 Pending:

Description

I am testing the roaming of a wireless client between two access points, using the madwifi-0.9.3 driver, wpa_supplicant-0.5.7 and Linux kernel 2.6.16.
I enabled automatic roaming in madwifi as suggested in the madwifi-devel thread "automatic roaming" (2007-03-20) and in #697.
Roaming from AP-1 to AP-2 or vice versa always takes several seconds. The cause is that wpa_supplicant has to retry authentication, because transmit data is dropped for almost exactly 1 second after association with the new AP.
I simplified the test scenario by reverting to WEP encryption, and even to open mode without WEP, and used ping -i 0.2 <ipaddr> to track down the transmit data loss. I observed 5 pings getting lost when roaming from AP-1 to AP-2 or vice versa.
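
To relate the ping count to the length of the outage, here is a small arithmetic sketch. The function name pings_lost and its parameters are made up for illustration, and it assumes pings are sent at exact multiples of the interval, which real ping only approximates.

```c
#include <assert.h>

/* Hypothetical helper: number of pings sent at t = 0, interval, 2*interval, ...
 * that fall into the outage window [start_ms, start_ms + len_ms).
 * All arguments are in milliseconds and must be non-negative. */
long pings_lost(long interval_ms, long start_ms, long len_ms)
{
    long first, end;

    assert(interval_ms > 0 && start_ms >= 0 && len_ms >= 0);

    /* Index of the first ping at or after the start of the outage. */
    first = (start_ms + interval_ms - 1) / interval_ms;
    /* Index of the first ping at or after the end of the outage. */
    end = (start_ms + len_ms + interval_ms - 1) / interval_ms;

    return end - first;
}
```

With a 200 ms interval, a 1000 ms outage covers 5 pings in this model, which is consistent with the 5 lost pings observed above.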

In the madwifi-devel list other people have also noticed the 1 second drop of transmit data:

  • 2007-03-20 15:57 Re: automatic roaming
  • 2007-04-05 09:58 1000ms delay after roaming

I finally found that the problem is caused by the madwifi driver and the Linux kernel (2.6.16) acting together:

  1. Madwifi does netif_carrier_off() in ieee80211_notify_node_leave() when it leaves AP-1 before it roams to AP-2.
  2. After association with AP-2 madwifi does netif_carrier_on() in ieee80211_notify_node_join(), usually only a few milliseconds after netif_carrier_off().
    netif_carrier_off() and netif_carrier_on() both call linkwatch_fire_event() in linux-source-2.6.16/net/core/link_watch.c.
  3. In link_watch.c the rate of linkwatch events is limited to one per second to prevent a storm of messages on the netlink socket.
  4. Due to this limitation the reactivation of the wireless interface intended by netif_carrier_on() is delayed by 1 second and messages sent during that time are discarded, although messages received by madwifi are passed to wpa_supplicant.
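
The effect of steps 1 to 4 can be illustrated with a simplified userspace model of a limiter that processes at most one event per second. This is only a sketch and not the code from link_watch.c: the names link_limiter and link_limiter_fire are hypothetical, and times are plain millisecond counters instead of jiffies.

```c
#include <assert.h>

/* Assumed limit, taken from the description above: one event per second. */
#define LINK_EVENT_INTERVAL_MS 1000

/* Hypothetical model state, not a kernel structure. */
struct link_limiter {
    long next_allowed_ms;   /* earliest time the next event may be processed */
};

/* Returns the time (in ms) at which an event fired at now_ms is processed. */
long link_limiter_fire(struct link_limiter *lw, long now_ms)
{
    long processed_ms = now_ms;

    assert(lw);

    /* An event arriving too soon after the previous one is deferred. */
    if (now_ms < lw->next_allowed_ms)
        processed_ms = lw->next_allowed_ms;

    lw->next_allowed_ms = processed_ms + LINK_EVENT_INTERVAL_MS;
    return processed_ms;
}
```

In this model, a carrier-off event at t = 0 ms is processed immediately, while the carrier-on event at t = 5 ms is deferred to t = 1000 ms. The interface therefore stays inactive for about one second even though the new association was ready after a few milliseconds.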

Furthermore the wireless event about lost association with AP-1 causes needless trouble in wpa_supplicant.

IMHO, when madwifi roams to a new AP, ieee80211_notify_node_leave() should not report the lost association with the old AP if association with the new AP succeeds within milliseconds.
The attached patch achieves that:
ieee80211_notify_node_leave() delays the leave notification for the old AP, in the hope that ieee80211_notify_node_join() cancels the delay timer on association with the new AP. If association does not succeed within 100 milliseconds, the notification is still sent to inform the kernel and applications.
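
The idea can be sketched as a small userspace model. This is not the attached patch, which presumably uses a kernel timer: the names roam_notifier, roam_node_leave, roam_node_join and roam_poll are hypothetical, and the timer is replaced by an explicit poll with a millisecond clock.

```c
#include <assert.h>
#include <stdbool.h>

/* Delay taken from the description above. */
#define LEAVE_DELAY_MS 100

/* Hypothetical model state; the counters stand in for sent notifications. */
struct roam_notifier {
    bool leave_pending;       /* a leave notification is waiting */
    long leave_deadline_ms;   /* when the pending leave must be sent */
    int  leave_sent;          /* number of leave notifications sent */
    int  join_sent;           /* number of join notifications sent */
};

/* Leaving the old AP: do not notify yet, only arm the delay. */
void roam_node_leave(struct roam_notifier *n, long now_ms)
{
    assert(n);
    n->leave_pending = true;
    n->leave_deadline_ms = now_ms + LEAVE_DELAY_MS;
}

/* Stand-in for the timer callback: send the leave once the delay expired. */
void roam_poll(struct roam_notifier *n, long now_ms)
{
    assert(n);
    if (n->leave_pending && now_ms >= n->leave_deadline_ms) {
        n->leave_pending = false;
        n->leave_sent++;
    }
}

/* Joining the new AP: cancel a still-pending leave, then notify the join. */
void roam_node_join(struct roam_notifier *n, long now_ms)
{
    roam_poll(n, now_ms);       /* an already expired leave is sent first */
    n->leave_pending = false;   /* otherwise the leave is suppressed */
    n->join_sent++;
}
```

In this model, a leave at t = 0 ms followed by a join at t = 5 ms produces only the join notification. If the join arrives after the 100 ms delay has expired, both the leave and the join are reported.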

Attachments

patch-txdrop1sec.diff (5.1 kB) - added by t.schulz@zetesind.com on 06/22/07 22:24:19.

Change History

06/22/07 22:24:19 changed by t.schulz@zetesind.com

  • attachment patch-txdrop1sec.diff added.

06/22/07 22:29:25 changed by t.schulz@zetesind.com

The attached patch delays the notification about leaving the old AP in order to suppress it if association with the new AP succeeds in less than 100 milliseconds. The patch is relative to madwifi release 0.9.3.

Signed-off-by: Thomas Schulz, zetesIND GmbH <t.schulz@zetesind.com>

(follow-up: ↓ 3 ) 06/22/07 22:31:25 changed by mentor

I disagree. The notify message for the AP MAC address should be sent if it changes. If Linux is breaking things by throttling messages, then it should not do that.

(in reply to: ↑ 2 ) 06/22/07 23:16:45 changed by t.schulz@zetesind.com

Replying to mentor:

I disagree. The notify message for the AP MAC address should be sent if it changes. If Linux is breaking things by throttling messages, then it should not do that.

The notify message for the new AP MAC address is of course neither delayed nor discarded. Only the notify message with the all-zero MAC address about leaving the old AP is delayed and normally discarded.

If Linux is breaking things by throttling messages, then it should not do that.

I agree. The Linux kernel people (Herbert Xu) seem to be working on this, see http://lkml.org/lkml/2007/5/8/181

My patch improves madwifi's behavior on unpatched, released Linux kernels.
I tested it successfully: normally only the notify message for the new AP MAC address appears, no transmit data is dropped and wpa_supplicant's authentication succeeds within 70..250 milliseconds. If association to the new AP takes longer to succeed (IEEE80211_TRANS_WAIT 5 seconds), then both the zero MACaddr and eventually the new AP MACaddr notify messages appear.
Other wireless drivers, e.g. orinoco and airo, show the behavior achieved by my patch: only one notify message with the new AP MACaddr.

06/27/07 12:29:36 changed by mrenzmann

  • milestone set to version 0.9.x - progressive release candidate phase.

I'm not into that stuff, but the explanation sounds reasonable, so I vote for taking it in for v0.9.4. But as I don't know what the others think about it, I'm scheduling it for 0.9.x instead for now.

07/10/07 13:34:21 changed by anonymous

After applying the patch, could you tell me the parameters I must configure to get the quick re-association? I'm trying to test it but I'm not getting it so quick.

Thanks