Madwifi Regression Ideas
Trust, but verify -- Quality is built into software; however, making it easy to verify changes and identify problems goes a long way toward helping people "do the right thing". This page describes [g2]'s thoughts on MadWifi regression testing. Just as Trac and Subversion enable developers to focus more on development and less on distractions, a regression test suite should help as well. Hopefully, this effort will serve as a catalyst for building a robust MadWifi regression test suite. Other ideas are welcome; just chat with me in #madwifi.
Here are the goals I'm aiming for:
- Verifying high-quality MadWifi drivers on Linux (and possibly other POSIX platforms later)
- Easier developer testing
- Testing drivers on all architectures (I'll be testing XScale LE and BE)
- Fully automated testing with report/web generation
- Eventually, comprehensive driver testing
- Single test or suite execution
Strategy and Terminology Kick-off Phase
There is currently no regression suite for MadWifi. This wiki page serves as public notice of my attempt to create one.
Note: These tests and this strategy should work well on other hardware. Other OSes and 2.4.x kernels should work too; I just have no motivation or desire to cover them. My efforts will go into building a robust suite for Linux 2.6.16+ on XScale hardware.
Let's begin with a discussion of the elements of the regression test system.
- Build system to build the kernel and MadWifi drivers
- First MadWifi device under test (DUT-1)
- Test machine and test scripts
In the simplest case, all of this resides on the same machine: a PC or laptop that builds the kernel and MadWifi drivers, contains a wifi card, and runs the test scripts.
Even in this configuration, it's possible to test some features. For instance, a WDS interface could be established and packets sent out. Capturing packets on the master interface could verify that they are correct WDS frames. Likewise, beacons could be tested in AP mode.
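The beacon check above can be sketched with MadWifi's wlanconfig plus tcpdump. This is a hedged sketch: the wifi0 base device name, the VAP handling, and the beacon count are assumptions; when no hardware is present, the script only prints what it would run.

```shell
#!/bin/sh
WIFIDEV=${WIFIDEV:-wifi0}               # base device name (assumption)
BEACON_FILTER='type mgt subtype beacon' # tcpdump 802.11 beacon filter
NUM_BEACONS=5

if [ -e "/sys/class/net/$WIFIDEV" ]; then
    # Create a monitor-mode VAP next to the AP VAP and grab some beacons.
    MONVAP=$(wlanconfig ath create wlandev "$WIFIDEV" wlanmode monitor)
    ifconfig "$MONVAP" up
    tcpdump -i "$MONVAP" -c "$NUM_BEACONS" -w beacons.pcap "$BEACON_FILTER"
    wlanconfig "$MONVAP" destroy
else
    # No hardware present: dry run, just show the parameters.
    echo "would capture $NUM_BEACONS frames matching: $BEACON_FILTER"
fi
```

The captured beacons.pcap can then be inspected offline to confirm SSID, channel, and beacon interval.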
N.B. There is an opportunity to run many of the driver tests under emulation. This is ignored for now.
Full Testbed Components
- Second MadWifi device under test (DUT-2)
- Wireless monitoring and capture system running Kismet (Monitor)
- Dedicated test driver machine (Driver) -- note that initially the Monitor and Driver will be the same machine.
- Optional web site for reporting results
Components not addressed
- Multiple interface cards on the DUTs
- Testing via emulation (no hw available)
- Signal attenuation (fading of RF signals)
The Strategy discussion is about the full testbed configuration, which contains two MadWifi DUTs (Devices Under Test), a dedicated wireless monitor running Kismet, and a test master (running on the monitor for now). Initially, only one set of radio links will be verified. The XScale DUTs can hold 3 or 4 wireless MiniPCI cards, but only one is being tested for now.
- Running Madwifi on Linux 2.6.X (I'm starting with 2.6.16)
- DUTs are running SSH
- DUTs have a MadWifi-capable card
- DUTs have wireless-tools
- Monitor runs kismet-svn
- Test machine has out-of-band (Ethernet or serial) communication to the DUTs
- Test machine has Internet connectivity
- Test machine has at least the following software:
- Some DB back end (this needs a little investigation; SQLite3 looks like a good starting place)
- Some logging mechanism (TBD)
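As a starting point for the DB back end, here is a minimal SQLite3 schema sketch. The table and column names are my assumptions for illustration, not an agreed format.

```shell
#!/bin/sh
# Sketch of a results schema using the sqlite3 command-line tool.
DB=results.db
sqlite3 "$DB" <<'EOF'
CREATE TABLE IF NOT EXISTS test_run (
    id      INTEGER PRIMARY KEY,
    svn_rev INTEGER,                       -- MadWifi SVN revision under test
    kernel  TEXT,                          -- e.g. 2.6.16
    arch    TEXT,                          -- e.g. xscale-le, xscale-be
    started TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS test_result (
    run_id    INTEGER REFERENCES test_run(id),
    test_name TEXT,                        -- e.g. 'ap-assoc', 'wds-beacon'
    passed    INTEGER,                     -- 1 = pass, 0 = fail
    log_file  TEXT                         -- path to captured logs/pcap
);
EOF
# Record one (hypothetical) run and show it landed:
sqlite3 "$DB" "INSERT INTO test_run (svn_rev, kernel, arch) VALUES (1520, '2.6.16', 'xscale-le');"
sqlite3 "$DB" "SELECT count(*) FROM test_run;"
```

Keeping one row per run and one row per test keeps cross-release queries (e.g. "when did wds-beacon start failing?") trivial.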
The test machine polls the MadWifi SVN repository every 15/30/60 minutes, looking for a new revision. When one is found:
- The Driver builds and installs the latest revision on the DUTs
- The DUTs are rebooted to start each run from a clean state (for now)
- The regression suite is started by running individual tests whose results are aggregated.
All packets over the wireless link are captured for all tests.
Initially, the tests start out as simple AP/Client or ad-hoc tests. As the drivers are verified and stabilized, the tests can become much more extensive, e.g. testing all the bitrates, all the channels, all the features.
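The simplest AP/Client test might look like the following sketch. The commands to run on the DUTs are shown as comments because they need real hardware; the interface names, SSID, channel, and IP addresses are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch: DUT-1 runs an AP VAP, DUT-2 a station VAP, then the station
# pings the AP. Parameters below are illustrative assumptions.
SSID=madwifi-regress
CHANNEL=36          # 802.11a channel, matching the A-band focus
AP_IP=192.168.100.1
STA_IP=192.168.100.2

# On DUT-1 (driven over the out-of-band ssh link):
#   VAP=$(wlanconfig ath create wlandev wifi0 wlanmode ap)
#   iwconfig "$VAP" essid "$SSID" channel "$CHANNEL"
#   ifconfig "$VAP" "$AP_IP" up
# On DUT-2:
#   VAP=$(wlanconfig ath create wlandev wifi0 wlanmode sta)
#   iwconfig "$VAP" essid "$SSID"
#   ifconfig "$VAP" "$STA_IP" up
#   ping -c 5 "$AP_IP" && echo PASS || echo FAIL
echo "AP/STA test parameters: ssid=$SSID channel=$CHANNEL"
```

The same skeleton extends naturally to ad-hoc mode by creating both VAPs with wlanmode adhoc.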
A cron job checks the trunk revision every 30 minutes:

  svn info http://madwifi-project.org/svn/madwifi/trunk > version.txt
  grep "Revision:" version.txt | sed -e 's/Revision: //' > version
  cat version
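Building on that pipeline, the cron job only needs to act when the revision changes. Here is a sketch with canned svn output so the logic is self-contained; parse_rev and the last_tested_rev file are hypothetical names.

```shell
#!/bin/sh
# Extract the revision number from 'svn info' output, as above.
parse_rev() {
    grep "Revision:" | sed -e 's/Revision: //'
}

LAST=$(cat last_tested_rev 2>/dev/null || echo 0)
# In the real cron job this would be:
#   svn info http://madwifi-project.org/svn/madwifi/trunk | parse_rev
# Here we feed canned output so the sketch runs anywhere:
CURRENT=$(printf 'Path: trunk\nRevision: 1520\n' | parse_rev)

if [ "$CURRENT" != "$LAST" ]; then
    echo "new revision $CURRENT (last tested: $LAST), starting build"
    echo "$CURRENT" > last_tested_rev   # remember it for the next poll
else
    echo "revision $CURRENT already tested"
fi
```

Persisting the last tested revision is what keeps the poll idempotent, so the 15/30/60-minute interval can be tuned freely.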
Build system testing
Use buildroot (http://buildroot.uclibc.org/) to create toolchains for several architectures. No need to build anything for the target, not even busybox. uClibc is needed to compile the userspace tools. Use a fixed version of uClibc, not a snapshot.
Build against different kernel versions, possibly against different configurations. Try integrating madwifi into the kernel. This is especially important for 2.4 kernels, where unresolved symbols are only detected at runtime.
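The build matrix could be driven by a small loop like this sketch. The kernel versions, toolchain prefixes, and the use of KERNELPATH/ARCH/CROSS_COMPILE are assumptions to adapt to the actual tree; the make invocation is commented out so the skeleton runs anywhere.

```shell
#!/bin/sh
# Sketch of a cross-compile matrix for the madwifi tree.
KERNELS="2.6.16 2.6.17"
TOOLCHAINS="arm-linux- armeb-linux-"   # XScale LE and BE, from buildroot

BUILDS=0
for k in $KERNELS; do
    for tc in $TOOLCHAINS; do
        echo "building madwifi against linux-$k with ${tc}gcc"
        # Real invocation would be something like (paths are assumptions):
        #   make -C madwifi-trunk KERNELPATH=/src/linux-$k \
        #        ARCH=arm CROSS_COMPILE=$tc > build-$k-$tc.log 2>&1 \
        #        || echo "FAIL: $k/$tc"
        BUILDS=$((BUILDS + 1))
    done
done
echo "matrix size: $BUILDS builds"
```

Each cell of the matrix gets its own log file, which feeds directly into the compile-time error reporting discussed below.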
Questions on Design and Goals
What measures are you expecting to quantify & record?
- Compile and OS related
- Compile time errors
- dmesg & /var/log/messages
- Syslog Error and Panic Traps
- Performance and Measures
- Transmit Power/RSSI
- CPU utilization
- Packets per second
- MB per second
- Association times
- Dropouts/scanning & other intermittencies
- nwid & crowded channel interactions
- number and type of VAPS on a radio
- Changing wlanconfig/iwconfig/ifconfig settings
- Required time to delay to avoid HAL & radio errors, race conditions
- Instruction/invocation Order
- VAP modes
- verify wpa_supplicant configuration of various authentication and encryption options (none, WEP, WPA, WPA2, PSK, EAP)
- this requires either dedicated APs for each configuration, or else a way to dynamically change AP configuration
- could either use hostapd for AP, or else standard access points
- Multiple radios
- Multiple VAPS
- Crypto VAP + non-crypto VAP
- Kismet logs
- aireplay (etc.) logs
- ethereal logs
- Package up all logs & post to madwifi-project.org, another site, email them to developer list...?
- Script a synthesis report: Compile & Ran OK, or not?
- Text reports +/vs HTML
- Report SVN rev, OS, HW - what essential components make for a good cross release/cross hardware analysis
- Basic compile & run success/failure
- Success/Failure of modes
- Stats on packet contents & rates
- Offer ssh/X11 access to the test bed for selected developers to refine the suite & analysis
- Improve radio, CPU & OS selections to broaden the scope of the test suite.
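Several of the measures listed above (association times in particular) can be scripted with wireless-tools alone. A minimal sketch: assoc_time is a hypothetical helper, and matching on "Not-Associated" is an assumption about iwconfig's output, which varies between wireless-tools versions.

```shell
#!/bin/sh
# Estimate association time by polling iwconfig until the station
# reports an AP BSSID, with a 30-second give-up.
assoc_time() {
    ifc=$1
    start=$(date +%s)
    # "Not-Associated" matching is an assumption about iwconfig output.
    while iwconfig "$ifc" 2>/dev/null | grep -q "Not-Associated"; do
        sleep 1
        if [ $(( $(date +%s) - start )) -ge 30 ]; then
            echo timeout
            return 1
        fi
    done
    echo $(( $(date +%s) - start ))
}

# Usage on a DUT, right after bringing the station VAP up:
#   t=$(assoc_time ath0) && echo "associated in ${t}s"
```

Logging the returned value per run into the results DB would make association-time regressions visible across SVN revisions.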
Response to Questions on Design and Goals
First, THANKS for the questions, I was hoping to generate some discussion on the topics.
Here's the what and why on things to quantify and record. Many moons ago I was part of a very large,
multi-billion-dollar cellular deployment. One of my tasks on the program was working on a multi-million-dollar automated test rig. We had 120 phones, all set up for fully automated testing. There were 6 cell sites, a PSTN telephone switch, the whole nine yards. This included hardware for measuring voice signals and the ability to programmatically attenuate the RF path to trigger handoffs between multiple cell sites in the test rig. The test rig worked very well. Sure, there were lots of issues, but on the whole it proved very useful, both in finding problems before deployment and in reproducing critical bugs found in the field.
The Loft boxes Giant Shoulder, Inc. sells are well suited for commercial wireless deployment. I want the software and firmware to be of good quality. Initially, I'm looking to deploy point-to-point backhaul distribution. A wireless test kit is another product. So I see the regression suite project as something I've got a vested interest in developing. It's also something that benefits many people.
Regression Testing from 20,000 feet
So the regression testing overview breaks down into many areas:
- Madwifi-ng compile time and build issues including:
- Multiple architectures
- Multiple kernels
- Multiple toolchains
- Clean compiles
- Code coverage
- Unit test cases
- All the above for support tools like kismet, ssh, etc ....
- Madwifi run-time issues including:
- This includes testing many of the different areas you mentioned above
- Performance and Tuning issues
- System issues such as multiple units
Phases of regression testing
Here's one breakdown of possible phases:
- Phase 0 (current phase) -- There's nothing so anything is an improvement
- Phase 1 (basic support works for a known set of features) -- This covers the 20% of the features that are used 80% of the time. Here we've got a decent set of tests built up from Phase 0, so that things like Master mode, Managed mode, and maybe things like WDS and ad-hoc just work. Then the focus can be on either feature-set testing or performance.
- Phase 2a (interoperability testing, basic) -- Perform all the tests from Phase 1 with other, non-Atheros equipment (IPW2xxx, Prism) and other drivers.
- Phase 2b (pretty full support many radios) -- This phase covers that other 80% that's used 20% of the time. These are less mainstream uses.
- Phase 2c -- Interoperability testing with the Phase 2b feature set.
My view of quality levels is the following:
- Software compiles and links
- Software has been known to function properly once
- Software usually functions properly
- Software usually functions properly and is stable
- Software almost never fails
- Software almost never fails, and MTBF (mean time between failures) is measured in years or significant portions of years
IMHO, we are between the second and third bullets with madwifi.
More Phase 0 Details
Kismet-stable is running and working. Kismet-newcore is built, but untested. An attempt to migrate to kismet-newcore will happen in the next month.
A little more testing on kismet-stable is needed to set up the config file to match the wifi channel being tested and to name the .dump file appropriately. Tri-band (A/B/G) cards are in my three devices. All configs can be tested, but the A band will be the focus. The two reasons for me to focus on the A band are:
- I'm looking to deploy in the A band.
- There are 0 broadcasters in the A band around the test site and 29 APs in the B/G band.
By using the A band, I'll eliminate the interference issue. By using the B/G bands, link quality under interference can be tested.
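For the Phase 0 Kismet setup, locking the monitor to the channel under test might look like the following kismet.conf fragment. The option names are from kismet-stable and should be checked against the installed version; the values are assumptions.

```
# kismet.conf fragment (kismet-stable) -- verify option names against
# the installed version; values here are illustrative assumptions.
source=madwifi_ag,wifi0,dut-monitor
channelhop=false          # lock to the channel under test
sourcechannels=wifi0:36   # the A-band channel being exercised
logtemplate=%n-%d-%i.%l   # name .dump files by network/date/index
```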
I think SQLite3 and a Python or web front end can collate and display the test results.
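Collating results into the synthesis report mentioned above could be as simple as a couple of sqlite3 queries. This sketch uses demo data; the test_result schema is an assumption.

```shell
#!/bin/sh
# Sketch: produce a plain-text "Compile & Ran OK, or not?" report from
# a hypothetical test_result table.
DB=report-demo.db
sqlite3 "$DB" "CREATE TABLE test_result (test_name TEXT, passed INTEGER);"
sqlite3 "$DB" "INSERT INTO test_result VALUES ('ap-assoc', 1), ('wds-beacon', 0);"

# One line per test:
sqlite3 "$DB" "SELECT test_name || ': ' ||
               CASE passed WHEN 1 THEN 'PASS' ELSE 'FAIL' END
               FROM test_result;"
# Overall verdict: PASS only if every test passed:
sqlite3 "$DB" "SELECT CASE WHEN min(passed) = 1
               THEN 'OVERALL: PASS' ELSE 'OVERALL: FAIL' END
               FROM test_result;"
```

The same queries, wrapped in a trivial CGI or Python script, give the text and HTML report variants discussed earlier.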
There's no shortage of work to be done. I'm sure interested individuals or organizations will show up as the ball gets rolling to voice their opinion on things.
Last thoughts for the wiki update pass
As your response pointed out, determining what to record and quantify is important. Initially, I'm selecting what I think are the best-of-breed components for the task at hand while trying to minimize reinventing the wheel. A key element will be having running scripts and a clear strategy laid out, so it's easy for others to duplicate and contribute. I think the "metadata" for the testing is very important; that includes both the testing framework and the results.
Cheers, and thanks again to one and all.