Please note: This project is no longer active. The website is kept online for historic purposes only.
If you're looking for a Linux driver for your Atheros WLAN device, you should continue here.

Ticket #698 (closed defect: fixed)

Opened 13 years ago

Last modified 13 years ago

madwifi-ng not associating with some access-points

Reported by: wollenberg _at_ web _dot_ de
Assigned to: mrenzmann
Priority: major
Milestone: version 0.9.3
Component: madwifi: driver
Version: trunk
Keywords: roamabout associate
Cc:
Patch is attached: 1
Pending:

Description

I noticed that madwifi-ng (all versions; I tried up to r1645) does not associate with some access points at my university. It seems that all non-working APs are "Enterasys RoamAbout" devices (fairly old models, 802.11b). All other APs used here are Cisco models, and madwifi-ng connects to them without any problems. This bug did not exist in madwifi(-old): using the old version I can connect to all APs, even the RoamAbout APs.

My card is a TPLINK TL-WN510G. The university's WLAN uses hidden SSID (only on some APs) and no encryption.

Symptoms when trying to connect to the university's WLAN at a place where only RoamAbout APs are in range:

#iwconfig ath0 essid wlan

#iwlist ath0 scan
ath0      Scan completed :
          Cell 01 - Address: 00:E0:63:50:20:3A
                    ESSID:"wlan"
                    Mode:Master
                    Frequency:2.437 GHz (Channel 6)
                    Quality=28/94  Signal level=-67 dBm  Noise level=-95 dBm
                    Encryption key:off
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s
                    Extra:bcn_int=100
          Cell 02 - Address: 00:E0:63:50:1D:E6
                    ESSID:"wlan"
                    Mode:Master
                    Frequency:2.417 GHz (Channel 2)
                    Quality=23/94  Signal level=-72 dBm  Noise level=-95 dBm
                    Encryption key:off
                    Bit Rates:1 Mb/s; 2 Mb/s; 5.5 Mb/s; 11 Mb/s
                    Extra:bcn_int=100

The card/driver starts "searching"; both LEDs on my card blink alternately. After some seconds both LEDs blink synchronously. iwconfig then reports:

#iwconfig ath0
ath0      IEEE 802.11g  ESSID:"wlan"  
          Mode:Managed  Frequency:2.437 GHz  Access Point: Not-Associated   
          Bit Rate:1 Mb/s   Tx-Power:19 dBm   Sensitivity=0/3  
          Retry:off   RTS thr:off   Fragment thr:off
          Encryption key:off
          Power Management:off
          Link Quality=25/94  Signal level=-70 dBm  Noise level=-95 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

It looks like the card is associated but no AP-address is reported. After 7-8 seconds the card starts scanning again. This process repeats forever. Notice that "Access Point" reads "Not-Associated" but there is a link-quality and a signal-level...

With madwifi(-old) it was possible to use the raw device (athXraw) to see "what's going on in the air", i.e. which frames are transmitted and received by the driver. Using one VAP in sta mode and another VAP in monitor mode, I don't see any frames on the monitor VAP during scanning. I can, however, supply a pcap file with some beacon frames from a RoamAbout AP.

Attachments

madwifi-agere-patch (2.3 kB) - added by Till Wollenberg (wollenberg _at_ web _dot_ de) on 06/23/06 12:51:11.
allows association with broken access points
agere-assoc-698.diff (2.0 kB) - added by kelmo on 09/13/06 12:20:23.
proposed commit

Change History

06/16/06 18:31:14 changed by T.M.Williams@cs.bham.ac.uk

My university has these access points too; they were installed about 4 years ago and I can report exactly the same problem. The new driver refuses to associate with them, but works fine on other networks with different access points.

If I backdate my setup to use the old driver then all access points work correctly.

I have a DLINK DWL-G650 wireless card. We use hidden SSIDs and WEP encryption.

06/20/06 01:09:23 changed by harrier

I'm finding something similar with long-distance links at 2.4 GHz. When the signal is strong enough, it sometimes gets a MAC address from the AP, but it sits on the channel for about 10 seconds and then starts scanning again. The access point is a WRAP board with a CM9 running StarOS. The client is a WRAP running a hand-rolled Linux with madwifi 0.9.0. It's the same on 5 km, 20 km and 50 km links (I tried all three). It only seems to work with two boards on the bench beside each other.

06/20/06 20:29:04 changed by roland (newsletter at digitalhalo dot de)

Same Problem over here ( r1645 ). Current Madwifi drivers cannot associate with the access point in my University. (Works fine with every other AP)

The University uses unencrypted w-lan.

W-lan works fine with Madwifi-old and madwifi-ng till r1408 there, but not with r1410 (tried those after reading Ticket #428)

06/20/06 23:27:38 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

After some investigation I found out that the most probable reason for this bug is the chipset used in RoamAbout access points. It's an Agere/Orinoco chipset which sends out malformed management frames. The madwifi-ng driver introduced checks for those malformed frames in r1409. The malformed frames are discarded shortly after reception (in net80211/ieee80211_input.c), so the 802.11 state machine does not "see" the positive answer of the access point in the association phase.

Disabling those checks is not really an option since without them malformed (probably "prepared"...) frames would cause illegal memory-access on kernel-level. The only way out seems to be some special checks for Agere/Orinoco-tags to handle the illegal length-parameters.

This bug is also discussed in tickets #353 and #428 (the latter is more recent but does not provide a solution either).
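To illustrate the kind of check described in this comment, here is a minimal sketch in C of a strict information-element (IE) walker. It is not the code from net80211/ieee80211_input.c; the function name count_ies_strict and its return convention are assumptions made for illustration. The only fact it relies on is the standard 802.11 IE layout (one byte ID, one byte length, then the payload).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk the tagged information elements (IEs) in the body of an 802.11
 * management frame. Each IE is laid out as: 1 byte ID, 1 byte length,
 * then `length` bytes of payload.
 *
 * Returns the number of IEs found, or -1 if any IE claims more payload
 * than the buffer holds (a caller would then discard the frame).
 * count_ies_strict is a hypothetical name, not a madwifi function.
 */
int count_ies_strict(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        /* Need at least the two header bytes (ID and length). */
        if (len - off < 2)
            return -1;

        size_t ie_len = buf[off + 1];

        /* The declared payload must fit into what is left. */
        if (ie_len > len - off - 2)
            return -1;

        off += 2 + ie_len;
        count++;
    }
    return count;
}
```

Called on an element list whose last IE declares 6 payload bytes but carries only 5, this returns -1, which corresponds to the frame being dropped before the state machine ever sees it.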

06/21/06 05:29:27 changed by mrenzmann

  • reporter changed from wollenberg@web.de to wollenberg _at_ web _dot_ de.

06/23/06 12:49:19 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

I made a quick&dirty patch which allows association with broken Agere-based access-points. It checks for Agere-elements in beacons and association-responses. When such an element is found parsing of the frame is canceled but the frame is not discarded.
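The behaviour described here could look roughly like the following sketch. This is not the attached patch; parse_ies_tolerant, the result enum and the macro names are hypothetical. The element IDs 128 and 129 are the ones identified as Agere-specific further down in this ticket.

```c
#include <stddef.h>
#include <stdint.h>

/* Element IDs reported as Agere-specific in this ticket. */
#define IE_AGERE_128 128
#define IE_AGERE_129 129

enum ie_parse_result {
    IE_PARSE_OK,       /* every element was well formed             */
    IE_PARSE_STOPPED,  /* Agere element seen: stop, keep the frame  */
    IE_PARSE_DISCARD   /* malformed element: caller drops the frame */
};

/*
 * Hypothetical helper: walks the IEs in buf and stores in *parsed the
 * number of bytes that were consumed before the walk ended.
 */
enum ie_parse_result parse_ies_tolerant(const uint8_t *buf, size_t len,
                                        size_t *parsed)
{
    size_t off = 0;
    enum ie_parse_result res = IE_PARSE_OK;

    while (off < len) {
        uint8_t id = buf[off];

        /* Do not trust anything from an Agere element onwards, but
         * keep what has been parsed so far. */
        if (id == IE_AGERE_128 || id == IE_AGERE_129) {
            res = IE_PARSE_STOPPED;
            break;
        }

        /* Any other element must have a full header and payload. */
        if (len - off < 2 || (size_t)buf[off + 1] > len - off - 2) {
            res = IE_PARSE_DISCARD;
            break;
        }

        off += 2 + (size_t)buf[off + 1];
    }

    *parsed = off;
    return res;
}
```

The point of the three-way result is that the caller can still reject frames that are malformed before any Agere element appears, while accepting a beacon or association response whose only defect is the vendor element.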

06/23/06 12:51:11 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

  • attachment madwifi-agere-patch added.

allows association with broken access points

06/23/06 12:52:22 changed by mrenzmann

  • patch_attached set to 1.

06/23/06 12:52:51 changed by mrenzmann

  • version set to trunk.
  • milestone set to version 0.9.x - progressive release candidate phase.

06/25/06 12:23:44 changed by anonymous

Thanks for the patch, I'll see if it works on Monday.

06/25/06 14:08:13 changed by harrier

Works for me, on access points from 3 feet to 30 miles. It didn't before the patch. Thanks Till.

06/26/06 14:26:25 changed by roland (newsletter at digitalhalo dot de)

Works :)

06/26/06 14:33:27 changed by mrenzmann

The patch should be modified as suggested by Pavel on madwifi-users. Rough summary: unknown IEs should be skipped when possible, and cause an error only in case of a corrupted or manipulated frame.

06/26/06 15:00:27 changed by christian_perone@yahoo.com.br

I'm experiencing this problem too. According to the IEEE OUI list the AP is an Agere (MAC 00-02-2D); however, I'm not sure whether the AP is using a MAC filter or whether the problem is the association. iwconfig shows the link quality, but the AP remains 00:00:00:00:00:00. I will test this patch and post feedback here.

06/27/06 00:07:07 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

The problem when skipping those unknown IEs is that we cannot rely at all on the given element length. My Orinoco Gold card (firmware 8.72, AP-firmware T1085800) reports correct length (6) for IE 128 but the given length for IE 129 is one byte short.

The RoamAbout APs report length 6 for IE 128 while sending only 5 bytes. IE 129 is not present.
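The two observations can be reproduced with a small walker that trusts the declared lengths but never reads past the buffer. This is only a sketch: walk_trusting_lengths and struct ie_walk_report are made-up names, and it assumes the vendor element is the last one in the frame.

```c
#include <stddef.h>
#include <stdint.h>

/* What goes wrong when the declared IE lengths are taken at face value. */
struct ie_walk_report {
    size_t overrun;   /* bytes the last element claims beyond the buffer */
    size_t leftover;  /* trailing bytes too few to hold an ID + length   */
};

/* Hypothetical helper: skip element by element using the length byte. */
struct ie_walk_report walk_trusting_lengths(const uint8_t *buf, size_t len)
{
    struct ie_walk_report r = { 0, 0 };
    size_t off = 0;

    while (off < len) {
        size_t remaining = len - off;

        if (remaining < 2) {              /* no room for ID + length */
            r.leftover = remaining;
            break;
        }

        size_t ie_len = buf[off + 1];

        if (ie_len > remaining - 2) {     /* claims more than is present */
            r.overrun = ie_len - (remaining - 2);
            break;
        }

        off += 2 + ie_len;
    }
    return r;
}
```

With a RoamAbout-style element (ID 128, declared length 6, 5 payload bytes) the report shows an overrun of 1. With an element whose declared length is one byte short, as described for IE 129, one byte is left over after the last skip. If further elements followed, either error would misalign every subsequent ID and length byte, which is why skipping by the given length is unreliable here.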

06/27/06 10:34:02 changed by mrenzmann

@Till: Well, IMO we should not work around implementation bugs that don't apply to all devices MadWifi-driven cards might talk to. The correct way would be to implement proper handling for protocol-conform stations, and provide a way to enable work-arounds for known bugs in other stations. If needs be the user can decide to enable the work-around.

06/27/06 12:17:29 changed by T.M.Williams@cs.bham.ac.uk

Just tried the patch with 0.9.1, seems to have done the trick !

06/27/06 13:11:58 changed by christian_perone aatt yahoo dot com dot br

Well, I have applied the patch and now it's OK; association is fast even with a low signal. I'm using r1650 with a D-Link G520.

07/07/06 19:53:07 changed by espy@pepper.com

A few people here at Pepper ran into this bug over the past week. In one case, the access point was an original Apple Airport and the other was a more recent 'WiFlyer' access point.

The patch when applied to 0.9.1 allows connection to both problem access points.

I understand the reluctance to add device-specific work-arounds to the driver, but asking an end-user to decide whether or not to enable such work-arounds doesn't seem to be the right solution either.

If I get a chance, I'll take a look at the HostAP driver and see how it handles bad IEs and report back.

07/31/06 19:23:39 changed by osch0001@umn.edu

This patch is still required for Madwifi 0.9.2 to connect to the access points at my university.

08/01/06 09:28:46 changed by kelmo

Let's hope someone can find a clean solution for this, so that these association issues can be improved for the 0.9.3 release.

08/01/06 18:07:20 changed by weinberg@astro.umass.edu

Thanks! Works for me (ThinkPad T41p with Atheros 5212 connecting to a vintage Lucent AP with an Orinoco Silver inside [don't know the firmware version]), patched onto svn r1696.

The main reason I experimented with the latest svn was the hope that this might be fixed.

I hope and plead that a fix or work-around will make it into the main release.

08/05/06 11:44:55 changed by Tninkpad X60

Hello.

I tried the current madwifi svn driver (with madwifi-agere-patch) on a ThinkPad X60.

I set the AP to stealth mode and ran two tests.


TEST:0 ThinkPad 11a/b/g Wireless LAN Mini Express Adapter Chipset: Atheros AR5006EX Integrated Mac Processor and Radio Chip: Atheros 5424

Result: In stealth mode, it can't connect to the AP. In non-stealth mode, it connects to the AP fine (but the card runs too hot). I think power management is not working.



TEST:1 I-O Data Device, Inc. Unknown device d0216:00.0 Ethernet controller: Atheros AR5212 802.11abg NIC (rev 01)

Result: In stealth mode, it connects to the AP fine. In non-stealth mode, it connects to the AP fine (and does not run too hot). I think power management is working.


09/02/06 09:57:57 changed by anonymous

The patch also works for me using a Linksys BEFW11S4 with firmware version 8.10.1.

Wireless card:

13:00.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)

I am using r1705.

Thanks Till!

09/02/06 10:17:07 changed by anonymous

Ah, I forgot to mention, I had to comment out the printk's in the patch. My /var/log/messages was growing by about 1 kilobyte per second with the printk's.

Did anyone else have this problem with the patch?

09/02/06 16:24:01 changed by weinberg@astro.umass.edu

Yes, I also commented out the printk statements to avoid log bloat.

09/12/06 23:00:43 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

Although I understand the point that it feels somewhat "unclean" to include such workarounds in a clean driver I fully agree with espy@pepper.com that it's better to include the workaround than leaving such decisions to the end-user.

Of course the printk-calls should be removed from the patch. Maybe someone can add a new patch since I currently don't have the possibility to do so.

09/13/06 05:58:42 changed by mrenzmann

  • status changed from new to assigned.
  • owner set to mrenzmann.
  • milestone changed from version 0.9.x - progressive release candidate phase to version 0.9.3.

The patch should go into the next release, without the printk. In case someone is up to commit this patch, please contact me before, as I would like to modify it slightly before it goes in.

09/13/06 12:18:02 changed by kelmo

@ Till, can you please Sign off on it, so that it can be applied to svn trunk?

I am ready and waiting to commit the patch with modifications as approved by Mike.

09/13/06 12:20:23 changed by kelmo

  • attachment agere-assoc-698.diff added.

proposed commit

09/14/06 02:12:19 changed by mentor

Do firmwares that fix this problem exist for all Agere access points? Is it not easy to update the firmware of the Agere AP?

09/14/06 20:15:28 changed by anonymous

The patch works with Ubuntu 6.06, kernel 2.6.15-26, madwifi-0.9.2, a Ubiquiti Networks 300mw SuperRange Cardbus a/b/g card and an Apple AirPort base station (Graphite, Lucent/Agere). Thank you Till!

09/14/06 23:13:23 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

Signed-off-by: Till Wollenberg <wollenberg (at) web (dot) de>

09/15/06 03:15:40 changed by kelmo

Thanks Till, applied in r1712.

Will leave this ticket open for discussion for a short period, as this commit may solicit a response from somebody.

Kel.

09/16/06 16:20:27 changed by xes

Even with this patch, my Atheros card cannot associate with an Alcatel Speedtouch 570. With the version of the patch that logs to syslog there are a lot of messages about wrong packets, but connecting is impossible.

09/16/06 16:33:43 changed by mrenzmann

Can you please paste an excerpt of these log messages you mention? Don't forget to enclose them with {{{ and }}}. Thanks.

09/16/06 22:18:51 changed by xes

I was talking about the first patch (madwifi-agere-patch) with madwifi svn1704. In this case, a lot of "*hack* Agere-element in beacon found" messages appear in syslog, but association is impossible (the access point's ESSID is detected correctly, the card stays on the access point's channel for 10 seconds, then a frequency scan starts). With svn1713 nothing appears in syslog (the printk("*hack* Agere-element in beacon found\n") has been removed), but the problem is the same. The distro is Fedora Core 5 with kernel 2.6.17-2174 and a roper freelan PCI card. The firmware of the access point is the latest available. Maybe this device also uses other proprietary elements besides 128 and 129? Is there a way to find out?

09/21/06 19:05:21 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

Maybe you can sniff the communication between your card and the AP on the 802.11 layer. When I tried back in June using two VAPs on one card (one VAP in sta mode and one in monitor mode) it didn't work. Maybe this issue is fixed now. Back then I used a second WLAN card in a second notebook to sniff the association process.

If you can capture the association process, try to create a "clean" dump, i.e. without any other wireless traffic. You can use tcpdump or Wireshark for capturing. Then compress the resulting file and attach it here or mail it to me and I'll see if I can figure out the reason.

10/17/06 11:14:30 changed by anonymous

Applied patch (agere-assoc-698.diff), still won't associate with access points at my uni. Beacon capture available from http://www.doc.ic.ac.uk/~pcc03/tmp/hux144-3

10/19/06 10:06:47 changed by Till Wollenberg (wollenberg _at_ web _dot_ de)

anonymous: is 00:0b:7d:... your WLAN card and 00:60:1d:... the AP you're trying to connect to? If so, it looks like you are out of range because there are many retransmissions. Actually your AP has a Lucent chipset but does not suffer from the bug described in this ticket (only IE 128 is present and correct length 6 is reported).

12/08/06 16:04:19 changed by mrenzmann

  • status changed from assigned to closed.
  • resolution set to fixed.

The provided patch for the originally reported problem has been committed by kelmo in r1712. The reporters of the three comments regarding association problems which came in after the commit was announced in here did not yet reply with the requested information.

I therefore close this ticket as fixed. In this case, I suggest that a new ticket should be opened when a similar issue is spotted by users.

12/15/06 18:48:39 changed by

Hi, I am trying to associate Atheros-based clients with a prism2-based firmware AP (linux-wlan-ng-0.2.5 driver). But every time, this is what I see in the iwconfig output:

ath0      IEEE 802.11g  ESSID:"xthru2"
          Mode:Managed  Frequency:2.417 GHz  Access Point: 00:00:00:00:00:00
          Bit Rate:1 Mb/s   Tx-Power:16 dBm   Sensitivity=0/3
          Retry:off   RTS thr:off   Fragment thr:off
          Encryption key:off
          Power Management:off
          Link Quality=63/94  Signal level=-32 dBm  Noise level=-95 dBm
          Rx invalid nwid:3343  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

The client is able to see the AP in its scan list, but is unable to associate. The problem exists only with the firmware AP; it works fine for all other APs. Any clues? Thanks in advance.

12/19/06 17:23:21 changed by mrenzmann

As suggested in my last comment, this ticket has been closed. Please open a new ticket for your issue, providing the necessary information to evaluate it. Thanks.