March 21, 2005
The State of Betas, Releases, and the Public (AKA Xounds and FM Betas)
Please see the bottom of this entry to find out how you, the reader of this blog entry, can help us increase the size of the testing pool.

People often complain about some hardware/software Apple just released and how "everyone" is experiencing the same problems and how dare they have such horribly horrible Quality Assurance. It isn't an issue of having or not having a bad QA period. It's more a problem of having a very limited number of testers for the stuff. Often the tester pool consists of developers and pro users with very specific (and often common) setups.

For software, the testers are often either upgrading from a previous beta release, or upgrading from a "stock" previous release that is very gently used. This doesn't apply to all testers, of course; some will have backups of their genuinely used configurations and revert to those, then upgrade on each seed. But either way, there is only so much other software such a person can possibly have installed alongside the software being tested. And there are only so many actions (in so many permutations) they are ever likely to perform, because they are pro users. They know what they're doing. And once they test a certain set of actions that isn't plainly obvious and it has no problems, they are unlikely to ever check it again, even though some future fix to another part of the program may break that unlikely action. Then there's also the chance that every single one of these people has the same preference set and something only breaks when it is unset, or vice versa.

Then when the software actually gets released to the public, people (en masse) find out that a specific action doesn't work in some bizarre circumstances (often circumstances that they think are common) and blame the testers, even when there was no way that any tester would have ever been in similar circumstances. Evar!
Or testers might have seen the problem but attributed it to another component that was also distributed with the software, or to some other thing they were testing (also very possible, especially if the other component really does make the problem worse).

For hardware, it's a different issue altogether. Hardware testers (as far as I know) are an even smaller group of even more "professional" users. Note: in both contexts, pro does not equal intelligent; it means the user has a lot of things set up and knows how to do anything they wish to get done on the computer, even if the methods they use are just plain stupid.

Often hardware testers will get hardware in prototype form. Some things may not be screwed in as tightly as they would be in the final stages. The hardware may be assembled locally instead of at a normal full-service assembly or production plant (like in China). The final design might not be ready. Some components may be in different locations than in the final build. Things like that. And although the testers may test every possible permutation during the design, they can't test for things that change in the production build.

To make up some examples... In the prototype phase of a PowerBook, the AirPort chips might be to the left of the Bluetooth chips with no problem. In the production build they might be reversed (say they're on freestanding PCBs of their own and are connected to the motherboard via a cable, so changing positions doesn't require redesigning the circuitry) and might not cause any problem unless the speakers were at full volume, you were playing a network game, you used Salling Clicker for your Bluetooth phone, and you just happened to get a message via AOL's official AIM client, at which point the sound from it wiped out all the network activity. This behaviour may be very uncommon during testing, but extremely common once the PowerBook is released to the public.
Alternatively, the people that make the PowerBook prototypes "on site" might not screw everything in all the way, either because a machine does it and has an automatic stopping point (so it doesn't strip the screw), because the human just isn't all that strong, or just because the components change so often that it's a waste of time. So users testing the prototype might never see white spots on their laptop, just because there was never enough pressure put on the sides to keep the LCD attached to the case. But when it gets to China, with their super strong humans or machines that don't stop as soon as the on-site ones do, people immediately complain about white spots, color variations along the edge, or other symptoms of a case putting too much pressure on an LCD.

Testers also have trouble testing for problems that only come up after many months at a certain temperature. For example, the Panasonic AG-DV2000's FireWire port would fail, without exception, after about a year of usage in a high temperature environment. The chip would just burn out. The high school I went to (in Phoenix) had about 12 of the units, in use at least 4 hours a day (up to 8 hours on some days). And more than once the air conditioner had to be turned off in the "winter" and the doors shut, because the mics we used would pick up the loud noise from the A/C and the "boiler" room was right outside the door. All the AG-DV2000s we had failed and had to be exchanged under warranty. They eventually just gave us a direct number to call after 3 or so such failures. I can't blame the testers for this failure. I just don't think they could have ever been in these specific circumstances since, IIRC, Final Cut Pro had just been released and there was little reason for testers to beat on the FW port as much as we had.

The point of all this is that you just cannot fairly blame the QA period, the testers, or the developers for all the problems that software or hardware may experience.
Many times there is just no way they would ever, "in a million years", have had the same circumstances that caused a problem "everyone" has. Also remember that people with no problems often don't say so, so you only read about those that are complaining about the issue.

Now, about Xounds and FruitMenu. They both suffered from such a problem. The Xounds daemon would crash if a particular key was not available in the preferences, relaunch, then crash again, causing really bizarre problems such as a -1 error when emptying the trash, some weird "unable to save library" messages in iTunes (these will not lead to corruption of the files or file system), and some crashiness in Office applications. The problem was that testers would immediately check out the new widget in the Xounds preference pane, and that would immediately create the missing key. So testers would never have this crash.

FruitMenu has some more deep-down issues. The main one being that it wouldn't work on 10.2.8. Apple likes to use constant string exports for some APIs and sometimes, although the value would work on 10.2.8, it just isn't available as a constant symbol there. This was reported during the testing phase and was fixed. However, somewhere along the way it broke again. Since I don't believe any testers run 10.2.8, we had to test it in house, and Apple doesn't exactly make it easy to test software against multiple OS X versions without a huge number of machines.

The other huge issue with FM was that, because the number one complaint about FM was that it slowed down application launches, 3.3 would defer building of the Apple menu until some event occurred. Problem was, the event that caused the build in most cases was the first key press during the launch of an application. For applications without normal key input, it'd occur when an application was quit using the Command-Q keyboard shortcut.
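To make the deferral concrete, here is a minimal sketch in Python. It is not FruitMenu's actual code, which isn't shown in this entry; `DeferredMenu`, `handle_event`, and the event names are all hypothetical, made up for illustration. It only shows the general idea: launch does no menu work, so whichever event happens to arrive first ends up paying for the whole build.

```python
import time


class DeferredMenu:
    """Hypothetical stand-in for a menu whose contents are expensive to build."""

    def __init__(self, item_names, cost_per_item=0.0):
        self._item_names = list(item_names)
        self._cost_per_item = cost_per_item  # simulated seconds of work per item
        self._items = None                   # None means "not built yet"
        self.built_during = None             # which event paid for the build

    def _build(self, event):
        built = []
        for name in self._item_names:
            time.sleep(self._cost_per_item)  # pretend to scan a folder or hotkey
            built.append(name)
        self._items = built
        self.built_during = event

    def handle_event(self, event):
        """Called for every event; only the first one triggers the build."""
        if self._items is None:
            self._build(event)
        return self._items


# Creating the menu (think: application launch) does no build work at all.
menu = DeferredMenu(["About This Mac", "Recent Items"])
assert menu.built_during is None

# The first event to arrive, here a key press, pays the entire cost.
menu.handle_event("key_down")
menu.handle_event("menu_open")
print(menu.built_during)  # key_down
```

With a couple of items and no simulated cost, the build is invisible. Raise `cost_per_item` or the number of items and the same code stalls on that first key press, even though the key press has nothing to do with the menu.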
On my G5, the delay was negligible because I don't have very many items in my Apple Menu and didn't have the hotkey prescan depth set high. Both of these meant it took no time at all to build the Apple menu. Conversely, on my AlBook the delay was HUGE, but since I was testing a "future version of OS X", I just attributed it to that and to the huge memory usage of Safari. I thought it was happening because, when I was quitting, the host application would have to page in huge amounts of memory that Safari had forced to be paged out. I know now that was a silly excuse.

So I present you with the following FM and Xounds betas. As far as I know, aside from the version numbers, these things are good. They are not to be posted to any update service such as Version Tracker or MacUpdate. However, if you have a friend/lover/child (or one that's all 3) that is experiencing these problems, please feel free to point them here.

We do apologize for any inconvenience/problems/psychotic episodes Xounds or FM may have caused. We take full responsibility (not like we can blame someone else for this one even if we were like that) and we hope you'll forgive and forget about this recent bugginess (especially the Xounds one).
Please test these on 10.2.8 if you have it!

I should note that the only time I support public betas is when the software is extremely close to release and is mostly feature complete (like before adding features that may make it difficult to test the other new things). I don't like public betas that are done with a castrated version of the software, or software that's just released to figure out what features people would want in it. Version updates are for implementing the features people request in the original release of the software.
Comments
Wow, I had noticed the lag and hadn't thought to attribute it to a haxie! -- SirG3
Posted by: SirG3 on March 21, 2005 4:14 PM

One suggestion for testing is to keep (or buy on eBay) a relatively older machine. Something top-of-the-line when OS X came out, and create a lot of partitions with different OS versions to test against. It requires a lot of rebooting, but doesn't require a lot of machines. Or, if you're really daring, boot from Linux and run all of these versions of OS X in Mac-On-Linux and skip the rebooting. ;)
Posted by: Joshua Ochs on March 21, 2005 9:15 PM

Re: Mr. Ochs, Won't work, love. Having an older machine just to test software on means it is barely used and isn't likely to have software set up similar to actual users. Josh, why did you post that link? It's linked to in the article. Or did you have some point in that?
Posted by: Rosyna on March 21, 2005 9:19 PM

Oops. Sorry; didn't see that. (I promise I read your article; but at the time I didn't realize you meant customers when you referred to Apple's Quality Assurance team)
Posted by: Josh Zerin on March 22, 2005 6:11 PM

Josh, yeah, I figured as much. I was just in a "playful" mood. Like a cat.
Posted by: Rosyna on March 22, 2005 6:31 PM

I'm testing FruitMenu 3.3.1b1, but it still seems to be slow to pre-scan "Hot Keys".
Posted by: Eiichi on March 22, 2005 7:14 PM

As soon as I log in, I press "Hot Keys" to launch an application. Ah, I see. That's not exactly a bug. It's just that the fake menu hasn't been built yet. If you launch apps as soon as you log in, it might just be better to make them login items ;) BTW, a future version of OS X will resolve this issue.
Posted by: Rosyna on March 23, 2005 1:13 AM

I assign "Hot Keys" to many applications no need to make them login items, and FruitMenu 3.2.1 works perfectly. I don't know... your approach to a bug you actually *saw* sounds pretty unprofessional to me. And building the menu when a key is first pressed - what kind of a strategy is that? Why don't you set up a Carbon Timer to build it gradually one line at a time instead of locking up the application? Gee, I think I'll look elsewhere for my GUI enhancements...
Posted by: Jake on March 26, 2005 12:44 PM

What good would setting up a Carbon timer be? Then you'd have the chance of the user opening a menu that doesn't have all the items they need in it (a partially built menu). A partially built menu is completely useless. Thus, building a temp menu when a key is first pressed is the correct strategy.
Posted by: Rosyna on March 26, 2005 2:39 PM

Rosyna: obviously on the rare occasion that the menu is opened before it's ready, you'd have to finish the job on the spot, synchronously. But it's much better for the user to have to wait a small amount of time for the menu to open *when they're opening the menu*, than to have to wait for that in the middle of doing something else.
Posted by: Jake on April 3, 2005 2:24 AM

