Please note: This project is no longer active. The website is kept online for historic purposes only.
If you're looking for a Linux driver for your Atheros WLAN device, you should continue here.

Ticket #1405 (new defect)

Opened 14 years ago

Last modified 14 years ago

Kernel OOPS on insmod of ath_pci.o on IXP425 (gateworks gw2348)

Reported by: apace@ibahn.com
Assigned to:
Priority: major
Milestone:
Component: madwifi: HAL
Version: trunk
Keywords: ixp425 pci oops
Cc:
Patch is attached: 1
Pending:

Description

Kernel: 2.4.31
Hardware: Wistron CM9 (5213)
Revision: 2509

The crash below occurs when attempting to insmod ath_pci.o. I tracked the faulting address down as follows:

ath_pci_probe calls ioremap(phymem, pci_resource_len...). ioremap returns 0x4bff0000, which is a valid address in the PCI space but not in virtual memory: 0x48000000 -- 0x4bffffff is reserved as the PCI space.

This value is assigned to dev->priv->aps_sc.sc_iobase. sc_iobase is passed into ath_hal_attach. Perhaps the HAL uses this address directly as a virtual address?
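To make the address reasoning concrete, here is a minimal sketch in plain C (userspace, not driver code) that checks whether an address falls inside the PCI window quoted above. The helper name and macros are hypothetical and exist only for illustration; the window bounds are the ones given in this report.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI memory window as quoted in this ticket: 0x48000000 -- 0x4bffffff. */
#define PCI_WINDOW_START 0x48000000u
#define PCI_WINDOW_END   0x4bffffffu

/* Hypothetical helper: true if addr lies inside the reserved PCI window. */
static bool in_pci_window(uint32_t addr)
{
    return addr >= PCI_WINDOW_START && addr <= PCI_WINDOW_END;
}
```

Both the value returned by ioremap (0x4bff0000) and the faulting address in the oops below (0x4bff8004, which equals that base plus 0x8004) satisfy this check, while a kernel virtual address such as 0xc0310000 does not.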

Crash follows:

# insmod ath_hal
Using /lib/modules/ath_hal.o
ath_hal: 0.9.30.13 (AR5210, AR5211, AR5212, AR5416, RF5111, RF5112, RF2413, RF5413, RF2133, REGOPS_FUNC)
# insmod wlan
Using /lib/modules/wlan.o
wlan: 0.8.4.2 (svn r2509)
# insmod ath_pci
Using /lib/modules/ath_pci.o
ath_pci: 0.9.4.5 (svn r2509)
pci_register_driver c5211d7c adding node 00000000
dev c02e3c00 irq 0 name PCI device 8086:8500
pci_dev_driver for c02e3c00
driver is 00000000
cycling to 6
pci_announce_device c5211d7c, c02e3c00
cycling
dev c02d2000 irq 26 name PCI device 168c:0013
pci_dev_driver for c02d2000
driver is 00000000
cycling to 6
pci_announce_device c5211d7c, c02d2000
id -987686020
probing
Unable to handle kernel paging request at virtual address 4bff8004
pgd = c18f0000
[4bff8004] *pgd=00000000, *pmd = 00000000
Internal error: Oops: f5
CPU: 0
pc : [<c519727c>]    lr : [<c51aaae0>]    Not tainted
sp : c18f7de8  ip : 00000000  fp : c18f7e08
r10: c18f7e94  r9 : c031a160  r8 : c0310000
r7 : 00000000  r6 : 00004004  r5 : 00000023  r4 : c0310000
r3 : 4bff0000  r2 : ff00c020  r1 : 00008004  r0 : c0310000
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  Segment user
Control: 39FF  Table: 018F0000  DAC: 00000015
Process insmod (pid: 66, stack limit = 0xc18f6368)
Stack: (0xc18f7de8 to 0xc18f8000)
7de0:                   00000014 c51aaae0 00000000 c0310000 00000001 c18f7e24
7e00: c18f7e0c c51aabfc c51aaa4c 00000013 c0310000 c031a160 c18f7e60 c18f7e28
7e20: c51a762c c51aabd0 4bff0000 c18f7e94 00000055 c01c4498 00000013 00000000
7e40: c031a160 c031a160 c031a000 c031a160 4bff0000 c18f7e74 c18f7e64 c5197764
7e60: c51a75c8 c18f7e94 00000007 c18f7e78 c519707c c5197668 c18f7e94 c51fd188
7e80: c18f7e94 c18f7ea6 c02d2000 c02d2000 00000001 0000001a c1699d80 c51ff708
7ea0: c031a000 c02d2000 c031a160 00000013 4bff0000 c5211b7c 4bff0000 00000007
7ec0: c520ed88 c031a000 00000000 c5211b08 c5211b7c c5211d7c c02d2000 00000000
7ee0: c1699d20 00000007 c5214000 c00db1f8 c02d2000 c5211d7c 00000000 0015ae50
7f00: c00db29c ffffffea c51fd000 c019d4c0 c520f004 c00287d4 c1300000 c1300000
7f20: c1302000 00000060 c51d4000 c51fd060 000152a8 00000000 00000000 00000000
7f40: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7f60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7f80: 00000000 00000048 00128858 0015ae50 00000080 c001a684 c18f6000 000152a8
7fa0: 00000000 c001a4c0 00128858 0015ae50 00900080 00128660 0015ae50 00000000
7fc0: 00000048 00128858 0015ae50 c51fd000 00128660 00000002 000152a8 0005c190
7fe0: bfffdbc8 bfffdbbc 0003f360 4010c010 60000010 00900080 33cc33cc 33cc33cc
Backtrace:
Function entered at [<c51aaa40>] from [<c51aabfc>]
 r6 = 00000001  r5 = C0310000  r4 = 00000000
Function entered at [<c51aabc4>] from [<c51a762c>]
 r6 = C031A160  r5 = C0310000  r4 = 00000013
Function entered at [<c51a75bc>] from [<c5197764>]
Function entered at [<c519765c>] from [<c519707c>]
Backtrace aborted due to bad frame pointer <00000007>
Code: 13e00000 059d0000 ea000002 e5903014 (e7933001)
Segmentation fault

Attachments

madwifi-reg-access-io-badger.diff (2.0 kB) - added by mentor on 07/17/07 03:04:28.

Change History

06/23/07 02:56:16 changed by mentor

ARM uses memory mapped IO.

A log with debugging symbols or run through ksymoops might be more helpful.

Is this a new problem?

06/23/07 03:01:21 changed by mentor

  • priority changed from minor to major.

06/23/07 05:06:29 changed by apace@ibahn.com

ARM does use memory mapped IO, but the IXP425 is ... 'funny' with its PCI space. There is no transparent memory access to the PCI mem region (although you can still get away with direct access if the memory range is pre-fetchable, which the wireless card's is not). In the IXP425 implementation of ioremap, a physical address is returned for any ioremap call within the PCI space. Thus, when the PCI range for the card is passed to ioremap, the caller gets the same value right back. What should happen on this architecture is that the memory is accessed through write(bwl) or read(bwl). My fear is that the HAL code is using the address directly. On Monday, when I can get at my development box again, I will try to get a ksymoops for you.

My question would be: I understand that others have used madwifi on ixp425 boards. Thus, it seems that either other cards are pre-fetchable, or something changed in the HAL. I notice that there was a commit to xscale-be-elf.hal.o.uu a month ago. Is it possible/likely that something changed in the access here?

This is a new problem to me, but this is also my first time using madwifi.

06/23/07 09:13:15 changed by rozteck@interia.pl

Hmm. I have been using madwifi on the IXP425 for about two years and it works. I have used different HALs and didn't observe any problems like the one described above. The last revision I tried is r2449, so it uses the same HAL as the revision you mentioned. If the IXP425 were broken in the HAL, I wouldn't be using madwifi anymore. There were some patches for the OS_REG_READ and OS_REG_WRITE functions uploaded here. Maybe try to find and apply them; that might help.

06/26/07 19:54:25 changed by anonymous

OK, so here is the end result.

I apologize for not having the crash broken out earlier. I am working with a tiny ramdisk and don't customarily have the tools included to run ksymoops. After getting the tools set up, I discovered that the crash was not in the HAL as I had supposed (sorry about the false lead, it was as far as I could track with printk()). The crash actually occurred as a result of the _OS_REG_READ and _OS_REG_WRITE macros. Armed with this new information, I went back and found that this issue has already been hashed out many times in previous bugs (particularly #1049).

So, I went in and did a little research. Since this issue has been gone over so many times, I'll just throw out the results here for discussion.

The current trunk version of ah_os.h defines OS_REG_WRITE and OS_REG_READ for big endian systems as a (read/write)l if the value is between 0x4000 and 0x5000, and a raw_(read/write)l otherwise. On the IXP425 platform, this is a problem.

In arch-ixp425/io.h readl is defined to return _raw_readl if the address is within virtual memory, and returns ixp425_pci_read otherwise. The problem with the macros is that they bypass this (necessary) bit of code for the IXP425 and jump straight to _raw_readl. _raw_readl simply casts the pointer as a volatile, and reads it. This cannot be done on the ixp425 for pci addresses for the reason mentioned in my previous comment.
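The difference between the two paths can be sketched as a small userspace model. This is not the kernel code: the function and enum names are hypothetical, and the test "not within virtual memory" is approximated here by "inside the PCI window 0x48000000 -- 0x4bffffff", which is an assumption; the actual check in arch-ixp425/io.h may differ.

```c
#include <stdint.h>

#define PCI_WINDOW_START 0x48000000u
#define PCI_WINDOW_END   0x4bffffffu

/* Which accessor a 32-bit read would be routed to. */
enum read_path { PATH_RAW_DEREF, PATH_PCI_INDIRECT };

/*
 * Hypothetical model of the readl dispatch described above.
 * Assumption: PCI-window membership stands in for the real
 * "is this a virtual address?" test.
 */
static enum read_path readl_dispatch(uint32_t addr)
{
    if (addr >= PCI_WINDOW_START && addr <= PCI_WINDOW_END)
        return PATH_PCI_INDIRECT;   /* would go through ixp425_pci_read */
    return PATH_RAW_DEREF;          /* would go through _raw_readl */
}

/* Model of a macro that skips the dispatch and always dereferences. */
static enum read_path raw_readl_only(uint32_t addr)
{
    (void)addr;
    return PATH_RAW_DEREF;
}
```

For the faulting address 0x4bff8004, the dispatching version picks the indirect PCI path, whereas the raw-only version dereferences the address directly, which is consistent with the paging fault in the oops.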

So, the short version of the above is that all addresses in the OS_REG functions need to be run through (write/read)l rather than raw_(write/read)l. Making this change fixes the crash for me.

Now, I am assuming that the defined ranges (4000-5000) were put into these two macros to work around some other bug, perhaps not involving the IXP425. So, assuming that this check is needed for some other big-endian processor, my patch would be to do a separate macro section for CONFIG_CPU_XSCALE which simply defines OS_REG_* as (read/write)l.

Discuss?

06/26/07 19:54:52 changed by apace@ibahn.com

Previous post by me (forgot to put email in box)

06/27/07 22:52:38 changed by apace@ibahn.com

Further information:

I found that using readl and writel for OS_REG_WRITE and OS_REG_READ fixed the crash. However, the card would not function. Upon ifconfig up, a continuous 'hardware error:resetting' message would scroll.

Reverting OS_REG_WRITE and OS_REG_READ to the versions found in r1449 has fixed the issue. The IXP425 requires that all PCI reads and writes be done using readl/writel, and that accesses outside of the defined range (4000-5000) be byteswapped. I can put a patch together that essentially restores r1449's special case for the IXP425, if that is preferred.
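The rule stated above can be sketched in plain C as follows. This is an illustration only, not the r1449 macro or the attached patch: every name is hypothetical, the bus accessor is passed in as a function pointer so the example runs in userspace, and treating the range as half-open (0x4000 inclusive, 0x5000 exclusive) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register range that is NOT byteswapped (boundary handling assumed). */
#define SWAP_EXEMPT_START 0x4000u
#define SWAP_EXEMPT_END   0x5000u

/* Reverse the byte order of a 32-bit value. */
static uint32_t swab32_sketch(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Stand-in for readl so the sketch can run outside the kernel. */
static uint32_t fake_readl(uint32_t reg)
{
    (void)reg;
    return 0x11223344u;
}

/*
 * Every read goes through the bus accessor (readl on the real system);
 * values from registers outside the exempt range are byteswapped.
 */
static uint32_t os_reg_read_sketch(uint32_t (*bus_read)(uint32_t), uint32_t reg)
{
    uint32_t v;

    assert(bus_read != NULL);
    v = bus_read(reg);
    if (reg >= SWAP_EXEMPT_START && reg < SWAP_EXEMPT_END)
        return v;
    return swab32_sketch(v);
}
```

With the stand-in accessor, a read of register 0x4004 returns the value unchanged, while a read of register 0x8004 returns it with the bytes reversed. A write helper would mirror this: swap first when outside the range, then hand the value to writel.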

07/12/07 08:49:46 changed by anonymous

quote

Reverting OS_REG_WRITE and OS_REG_READ to the versions found in r1449 has fixed the issue. The IXP425 requires that all PCI reads and writes be done using readl/writel, and that accesses outside of the defined range (4000-5000) be byteswapped. I can put a patch together that essentially restores r1449's special case for the IXP425, if that is preferred.

Could you please attach this patch here?

07/16/07 16:25:01 changed by mentor

I've updated the register access code to use the new primitives when it can, and I've hacked the old code. Testing on 2.6 and 2.4 kernels on big endian systems would be appreciated.

07/17/07 03:04:28 changed by mentor

  • attachment madwifi-reg-access-io-badger.diff added.

07/17/07 03:06:14 changed by mentor

  • patch_attached set to 1.

08/13/07 09:10:38 changed by anonymous

Line 180 of madwifi-reg-access-io-badger.diff needs a "\" at the end of the line.

08/13/07 22:35:49 changed by mentor

08/17/07 00:22:37 changed by mentor

The question is... does it actually work?

09/28/07 06:32:56 changed by david@boreham.org

I'm experiencing kernel panics on a Gateworks board (IXP425) running r2568 with a bunch of patches (OpenWrt 7.07), but NOT this patch. I'm wondering whether applying this patch will do me any good. I don't see crashes upon insmod, though, only when clients associate to my AP. In fact, I can achieve a working 802.11a association in the lab between two boards and pass traffic just fine. So I'm not sure whether I'm seeing the same problem or not.