
Ticket #1499 (new defect)

Opened 14 years ago

Last modified 13 years ago

NMI watchdog triggered

Reported by: greg.sieranski@quoininc.com
Assigned to:
Priority: major
Milestone:
Component: madwifi: driver
Version: trunk
Keywords:
Cc:
Patch is attached: 0
Pending:

Description (Last modified by mrenzmann)

kernel: Uhhuh. NMI received for unknown reason b1 on CPU 0.
Message from syslogd@ at Wed Aug  8 10:51:09 2007 ...
kernel: You have some hardware problem, likely on the PCI bus.
Message from syslogd@ at Wed Aug  8 10:51:09 2007 ...
kernel: Dazed and confused, but trying to continue

I am running a ThinkPad T60p with an Atheros Communications, Inc. AR5418 802.11a/b/g/n Wireless PCI Express Adapter (rev 01) under Fedora 7.

Change History

08/09/07 21:42:21 changed by mrenzmann

  • description changed.

08/16/07 01:35:40 changed by mentor

  • summary changed from "madwifi-ng-current 07-Aug-2007 Causes Kernal to crash" to "NMI watchdog triggered".

08/19/07 03:55:57 changed by nolan@peaceworks.ca

FWIW, I'm getting the same thing with the same machine/chip, but running Ubuntu Feisty. Happy to help if I can; please send email if there's anything I can provide that would help track down the source of the problem.

08/21/07 14:18:26 changed by siml@szene1.at

I have the same problem, and I think the same hardware: "03:00.0 Network controller: Atheros Communications, Inc. Unknown device 0024 (rev 01)", but I get "a1": Uhhuh. NMI received for unknown reason a1 on CPU 0. You have some hardware problem, likely on the PCI bus.

If I use ndiswrapper, no error occurs. I'm using the newest svn source of the madwifi drivers. Also, if I unload the modules (ath_pci, ath_hal, wlan and so on) and reload them, I can't find any wireless networks; I always need to reboot the machine. By the way, my connection also breaks sometimes with ndiswrapper, but in that case reloading the module helps.

ng siml

08/25/07 19:29:57 changed by GChriss@psu.edu

Hi,

Same problem as described above. The card normally works just fine, but the following error occurs intermittently (once every two or three days), necessitating a reboot. My best guess is that the card receives a garbled frame and dies. I have also noticed a message akin to "Rx buffer overrun" while rebooting.

Linux hostname 2.6.21-1.3228.fc7 #1 SMP Tue Jun 12 14:56:37 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

03:00.0 Network controller [0280]: Atheros Communications, Inc. AR5418 802.11a/b/g/n Wireless PCI Express Adapter [168c:0024] (rev 01)

Message from syslogd@ at Fri Aug 24 22:09:23 2007 ...
hostname kernel: Uhhuh. NMI received for unknown reason b0. [sometimes a0, a1, b1, etc...]
Message from syslogd@ at Fri Aug 24 22:09:23 2007 ...
hostname kernel: You have some hardware problem, likely on the PCI bus.
Message from syslogd@ at Fri Aug 24 22:09:23 2007 ...
hostname kernel: Dazed and confused, but trying to continue

Thanks, George
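
Several reports in this ticket show different reason codes (a0, a1, b0, b1). The sketch below decodes such a code bit by bit, assuming the two hex digits are the contents of the PC/AT "System Control Port B" (I/O port 0x61), which is where x86 Linux kernels of this era are generally understood to read the NMI reason. The bit names follow the classic PC/AT layout; the function names are hypothetical and are not part of madwifi or the kernel.

```python
# Minimal sketch: decode the "unknown reason XX" byte from the NMI message.
# Assumption: XX is the value of I/O port 0x61 (PC/AT System Control Port B).
NMI_REASON_BITS = {
    0x80: "SERR# / parity error",
    0x40: "IOCHK# (I/O channel check)",
    0x20: "timer 2 output (status only)",
    0x10: "refresh toggle (status only, flips continuously)",
    0x08: "IOCHK# reporting disabled",
    0x04: "SERR# reporting disabled",
    0x02: "speaker data enable",
    0x01: "timer 2 gate",
}

# Only the top two bits describe an error source.
ERROR_MASK = 0xC0


def decode_nmi_reason(hex_reason):
    """Return the names of all bits set in a reason code such as 'b1'."""
    value = int(hex_reason, 16)
    return [name for bit, name in NMI_REASON_BITS.items() if value & bit]


def error_bits(hex_reason):
    """Return only the error-source bits of a reason code."""
    return int(hex_reason, 16) & ERROR_MASK


if __name__ == "__main__":
    for reason in ("a0", "a1", "b0", "b1"):
        print(reason, hex(error_bits(reason)), decode_nmi_reason(reason))
```

Under that assumption, all four codes seen in this ticket carry the same error bit (0x80) and differ only in status bits that are unrelated to the error, which would suggest the reports describe one underlying condition rather than four.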

08/29/07 16:19:40 changed by jlquinn@optonline.net

I get the same kind of "dazed and confused" messages on Debian testing as well. This happens with kernels as far back as 2.6.18; I haven't tested further back. The only symptom is the kernel complaint about the NMI. Otherwise, the system appears to behave normally, except that perhaps one time in 10 or 20, the machine freezes on waking up from suspend.

09/04/07 04:17:56 changed by Brian Cunnie

I see the same problem on my ThinkPad T60p every day or two: the error messages below, followed by loss of networking. I have suspended and resumed approximately 20 times and have yet to see an error related to suspend/resume, though I do take my network card down and up immediately after every resume.

Sep  3 18:41:43 tutu kernel: Uhhuh. NMI received for unknown reason b0.
Sep  3 18:41:43 tutu kernel: You have some hardware problem, likely on the PCI bus.
Sep  3 18:41:43 tutu kernel: Dazed and confused, but trying to continue
Sep  3 18:41:44 tutu kernel: wifi0: rx FIFO overrun; resetting
Sep  3 18:42:15 tutu last message repeated 43 times
Sep  3 18:42:52 tutu last message repeated 50 times

I'm running Fedora 7. Uname -a:

Linux tutu.nono.com 2.6.22.4-tutu #1 SMP Wed Aug 22 16:31:32 PDT 2007 x86_64 x86_64 x86_64 GNU/Linux

Here are the messages upon insertion:

Sep  3 18:45:45 tutu kernel: wlan: 0.8.4.2 (svn r2665)
Sep  3 18:45:45 tutu kernel: ath_pci: 0.9.4.5 (svn r2665)
Sep  3 18:45:45 tutu kernel: ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 17 (level, low) -> IRQ 17
Sep  3 18:45:45 tutu kernel: ath_pci: switching rfkill capability off
Sep  3 18:45:45 tutu kernel: rtc_cmos 00:07: rtc core: registered rtc_cmos as rtc0
Sep  3 18:45:45 tutu kernel: rtc0: alarms up to one month, y3k
Sep  3 18:45:45 tutu kernel: ath_rate_sample: 1.2 (svn r2665)
Sep  3 18:45:45 tutu kernel: ath_pci: switching per-packet transmit power control off

lspci | grep Atheros

03:00.0 Network controller: Atheros Communications, Inc. AR5418 802.11a/b/g/n Wireless PCI Express Adapter (rev 01)

Thanks,

--Brian
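
For anyone comparing logs across the reports in this ticket, here is a minimal sketch that counts NMI reason codes and "rx FIFO overrun" events in syslog-style lines, including syslog's "last message repeated N times" compression. The function name `summarize` and the regular expressions are illustrative assumptions based only on the log lines quoted in this ticket; other syslog formats may need different patterns.

```python
import re
from collections import Counter

# Patterns derived from the log excerpts quoted in this ticket (assumption:
# other systems format these messages the same way).
NMI_RE = re.compile(r"NMI received for unknown reason ([0-9a-f]{2})")
OVERRUN_RE = re.compile(r"rx FIFO overrun")
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def summarize(lines):
    """Return (Counter of NMI reason codes, total FIFO overrun count)."""
    reasons = Counter()
    overruns = 0
    last_was_overrun = False
    for line in lines:
        nmi = NMI_RE.search(line)
        if nmi:
            reasons[nmi.group(1)] += 1
            last_was_overrun = False
            continue
        if OVERRUN_RE.search(line):
            overruns += 1
            last_was_overrun = True
            continue
        repeat = REPEAT_RE.search(line)
        if repeat and last_was_overrun:
            # syslog collapsed N further copies of the overrun message
            overruns += int(repeat.group(1))
            continue
        last_was_overrun = False
    return reasons, overruns


SAMPLE = [
    "Sep  3 18:41:43 tutu kernel: Uhhuh. NMI received for unknown reason b0.",
    "Sep  3 18:41:43 tutu kernel: You have some hardware problem, likely on the PCI bus.",
    "Sep  3 18:41:43 tutu kernel: Dazed and confused, but trying to continue",
    "Sep  3 18:41:44 tutu kernel: wifi0: rx FIFO overrun; resetting",
    "Sep  3 18:42:15 tutu last message repeated 43 times",
    "Sep  3 18:42:52 tutu last message repeated 50 times",
]

if __name__ == "__main__":
    reasons, overruns = summarize(SAMPLE)
    print(dict(reasons), overruns)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize(open("/var/log/messages"))`; the path is an assumption and varies by distribution.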

11/16/07 07:20:19 changed by heath.parker@gmail.com

I'm also running into the same or a similar issue with a Ubiquiti SRC Atheros AR5213 CardBus card in an IBM/Lenovo T60p. I'm currently running Ubuntu 7.10 (it also did this with 7.04). I received the NMI errors once while running Kismet in Ubuntu 7.04, and it hard froze the system another time. So I went ahead and did a dist-upgrade to 7.10; now it consistently hard freezes the system. I have to power cycle it; the one time I tried removing the card, the system rebooted. It seems to be related to monitor mode: every time I use Kismet, it seems to trigger the problem.

The important parts of my current uname output: linux 2.6.22-14-generic #1 SMP i686 GNU/Linux

Version of ath_hal: 0.9.18.0
Version of ath_pci: 0.9.4.5 (0.9.3.2)
Version of wlan: 0.8.4.2 (0.9.3.2)

If it matters at all, I'm also running the ipw3945 driver for the onboard Intel miniPCI, Version 1.2.2mp.ubuntu1

11/16/07 07:27:10 changed by GChriss@psu.edu

I believe this is the same problem as ticket 1017.

Running '/sbin/iwpriv ath0 bgscan 0' as root is the only workaround that works for me.

Thanks, George

12/02/07 21:52:24 changed by sean@mess.org

I don't think it is the same. At least, on my T60 the driver fails to initialise and I never get a network interface. This is, at the time of writing, the latest version of wireless-2.6 in git.

PCI: 03:00.0 0200: 168c:1014 (rev 01)

dmesg:

[   31.337302] ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 17 (level, low) -> IRQ 21
[   31.337440] PCI: Setting latency timer of device 0000:03:00.0 to 64
[   31.356971] Uhhuh. NMI received for unknown reason a1 on CPU 0.
[   31.357037] You have some hardware problem, likely on the PCI bus.
[   31.357103] Dazed and confused, but trying to continue
[   32.394675] ath5k_hw_nic_wakeup: failed to resume the MAC Chip
[   32.394752] ACPI: PCI interrupt for device 0000:03:00.0 disabled

12/10/07 16:55:54 changed by jimf@aweber.com

Same issue. Once it starts, /var/log/messages gets filled with "wifi0: rx FIFO overrun; resetting" messages, and attempting to bring the device down/up results in the message "ADDRCONF(NETDEV_UP): ath0: link is not ready." Only a reboot seems to help.

Seems that unloading/reloading the modules should do the job, but it doesn't. What else happens during a reboot that could be fixing this? Presumably, whatever it is can be invoked manually. Still not ideal, but certainly better than a full reboot.

12/11/07 00:13:32 changed by Steve

Hi there,

I have the same problem on the same hardware as sean@mess.org. T60p on boot up has the following messages:

ACPI: PCI Interrupt 0000:03:00.0[A] -> GSI 17 (level, low) -> IRQ 22
PCI: Setting latency timer of device 0000:03:00.0 to 64
Uhhuh. NMI received for unknown reason a1 on CPU 0.
You have some hardware problem, likely on the PCI bus.
Dazed and confused, but trying to continue
ath5k_hw_nic_wakeup: failed to resume the MAC Chip
ACPI: PCI interrupt for device 0000:03:00.0 disabled
ath5k_pci: probe of 0000:03:00.0 failed with error -5

Please help!

12/28/07 07:21:06 changed by wendschh@alumni.princeton.edu

Yes. I have the same problem as well. Is there any workaround that works without having to reboot the computer? Also, I found something potentially interesting by looking at my kern.log:

Does anyone know what this means:

Dec 27 18:56:43 localhost kernel: [13641.904000] Uhhuh. NMI received for unknown reason b0 on CPU 0.

Dec 27 18:56:43 localhost kernel: [13641.904000] You have some hardware problem, likely on the PCI bus.

Dec 27 18:56:43 localhost kernel: [13641.904000] Dazed and confused, but trying to continue

Dec 27 18:56:46 localhost kernel: [13644.560000] wifi0: rx FIFO overrun; resetting

The "FIFO overrun" message keeps appearing once every few seconds, but somewhere down the line, this pops up:

Dec 27 19:26:57 localhost kernel: [15455.724000] ADDRCONF(NETDEV_CHANGE): ath0: link becomes ready

Dec 27 19:27:05 localhost kernel: [15463.684000] ADDRCONF(NETDEV_UP): ath0: link is not ready

Dec 27 19:27:07 localhost kernel: [15465.208000] ath_rate_sample: no rates for 00:19:7e:52:18:44?

Dec 27 19:27:12 localhost kernel: [15470.444000] ADDRCONF(NETDEV_UP): ath0: link is not ready

No clue if that helps at all.

Please help with this!

Dooma

12/28/07 11:38:23 changed by anonymous

Please see Bug 1017.

02/22/08 22:12:06 changed by ondra@blami.net

Hello, I'm running an IBM ThinkPad X60s with kernel 2.6.24.2 and I'm receiving exactly the same NMI message (sometimes it's reason b1, sometimes a1). But I don't have an Atheros-based wifi card inside; mine is an Intel ipw3945. I've found some other people across the internet experiencing the same behavior without Atheros hardware. Maybe this is not a madwifi problem, but only related to network adapters somehow. I'm still googling around ...

04/01/08 23:12:23 changed by anonymous

Same problem with FC8

Message from syslogd@laptop at Apr 2 00:05:26 ...

kernel: Uhhuh. NMI received for unknown reason a0.

Message from syslogd@laptop at Apr 2 00:05:26 ...

kernel: You have some hardware problem, likely on the PCI bus.

Message from syslogd@laptop at Apr 2 00:05:26 ...

kernel: Dazed and confused, but trying to continue

08/10/08 20:34:58 changed by anonymous

Same problem with Fedora 9 on a ThinkPad T60.

I have only noticed it when using encrypted (e.g. WEP) wifi, and never when the wifi is unencrypted.

09/10/08 19:57:41 changed by Kegeruneku

I'm getting the same error randomly (b1 or a1) on a server with a Mini PCI Atheros device (AR2413, RF5212) used as an AP.

Parameters set at boot are:

bgscan 0 bintval 500

It goes on with stuck beacons and no connectivity.