Andrew

Complexity of systems and prevalence of failure

Computers and software are immensely complex engineering feats. Windows Server 2003 was written by about 5,000 developers working on over 50 million lines of code.1 Intel's "Prescott" microprocessor contains 330 million transistors in an area the size of a fingernail.2 Given this complexity, it is not surprising that computers fail in the ways everyone has experienced.

While popular awareness of hardware bugs is limited, there have been a number of issues in consumer-grade microprocessors. Two of Intel’s most famous bugs were the FDIV bug of 1994 and the F00F bug of 1997.3 The former caused the processor to return incorrect results for a particular class of floating-point division operations, with effects felt in industries ranging from Wall Street to scientific research. The latter caused the microprocessor to halt, freezing the system and, on a network server, any systems relying upon that server.
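
The FDIV flaw is commonly demonstrated with a specific division that flawed Pentiums computed incorrectly. A minimal sketch in Python, using the widely reported test values (the snippet is an illustration, not part of Intel's errata):

    # Classic FDIV check: 4195835/3145727 should be ~1.333820449,
    # but flawed Pentiums returned ~1.333739069.
    x, y = 4195835.0, 3145727.0
    remainder = x - (x / y) * y
    print(remainder)  # ~0 on a correct FPU; 256 on a flawed Pentium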

Software problems, though often less serious, are better known. One bug with serious repercussions causes computers running Windows 95 or Windows 98 to hang after 49.7 days of continuous operation.4 The LAX air traffic radio communications system relies on a monthly manual reboot as a workaround for this bug.5 A similar bug exists in Windows 2000: a timer message that prompts a program to update itself can stop being delivered after the computer has been running for 497 days, causing the program to cease functioning.6 As this is a server-oriented operating system, it is reasonable to assume that more than a single program is affected by this bug.
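
Both intervals are exactly what a wrapping 32-bit tick counter would produce: 2^32 milliseconds is about 49.7 days, and 2^32 ten-millisecond ticks is about 497 days. A minimal sketch of the arithmetic and the failure mode (illustrative, not taken from the cited knowledge-base articles):

    MS_PER_DAY = 1000 * 60 * 60 * 24

    print(2**32 / MS_PER_DAY)       # ~49.7 days at one tick per millisecond
    print(2**32 * 10 / MS_PER_DAY)  # ~497 days at one tick per 10 ms

    # Naive elapsed-time code breaks at the wrap: "now" appears to precede "start".
    start = 2**32 - 5            # 5 ticks before rollover
    now = (start + 10) % 2**32   # 10 ticks later, after rollover
    print(now - start)           # -4294967286 instead of 10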

Computer users have come to expect that computers are naturally prone to failure. We accept that a computer is an imperfect tool and work around the problems we encounter. Users will in fact make excuses for bad program designs rather than blame the programmers and design engineers.7 One reason users frequently don’t begrudge a computer its faults is that we often use computers for tasks we could easily do without them: the proper functioning of the computer is essentially unimportant because we can do the work by hand, or we can force the computer to give us the desired results despite what the computer “thinks” is correct.

What happens when we rely on computers for more important tasks? A software error in General Electric’s XA/21 system contributed to the scope of the 14 Aug. 2003 blackout, which affected much of the northeastern United States and eastern Canada.8 When a software error affects so many people, should someone be held liable?

Example: a consumer-grade software tool

At first blush it seems obvious that an individual or corporation which produces faulty software should be held liable for the errors. We have product liability laws, and computer systems, hardware and software alike, are just another product. In fact, the IRS ruled in 1985 that authors of tax software may be liable for advice given to taxpayers.9 Now that the industry has matured, however, the IRS no longer addresses the issue of liability for errors in tax preparation software. The IRS’s current guidelines for e-file providers specify that the software used to file tax returns must pass a test which verifies, in part, that “returns have few validation or math errors.”10 One popular consumer-grade tax software package, TaxCut 2003, includes a standard license exempting the publisher from any liability and making clear that using the software does not make Block (the publisher of TaxCut 2003) the consumer’s tax preparer.11 This kind of license is ubiquitous in consumer software, and it purports to free the publisher from any legal liability.

This kind of end-user license may not be enforceable due to an implied warranty of merchantability.12 Despite the IRS’s requirement of “few math errors,” one would expect tax-preparation software to do math correctly. Yet there is very little case law explicitly addressing the liability of software vendors. Because of this legal ambiguity, large software vendors are advocating the adoption of the Uniform Computer Information Transactions Act (UCITA), which grew out of a proposed Article 2B of the Uniform Commercial Code. UCITA, if adopted by the individual states, would free software vendors from the responsibility to guarantee that their software works.13

Liability for negligence

How do we determine who should be liable for software errors? In United States v. Carroll Towing Co., Judge Learned Hand stated a formula to test whether a company has met its duty to see that its product does not cause harm. The formula is a function of three variables: P, the probability of failure; L, the gravity of the resulting injury given a failure; and B, the burden of adequate precautions. If B is smaller than the product of P and L (B < PL), then the company should be found negligent.14
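
The Hand test is simple arithmetic. A minimal sketch with hypothetical numbers chosen purely for illustration (none of these values come from the case or from industry data):

    def negligent(p, loss, burden):
        """Learned Hand test: liability attaches when the burden of
        precaution is less than the expected harm (probability times loss)."""
        return burden < p * loss

    # Hypothetical defect: a 1-in-10,000 chance of causing $5,000,000 in harm,
    # preventable with $100,000 of additional testing.
    print(negligent(p=1e-4, loss=5_000_000, burden=100_000))  # False: $100,000 > $500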

What is the probability of failure? Few programmers would claim to have written a program which is free of bugs. One of these bold few is Dr. Donald Knuth, author of (amongst other things) the TeX typesetting software. Dr. Knuth said in the preface to his book TeX: The Program that he “believe[s] that the final bug in TeX was discovered and removed on November 27, 1985.” And yet bugs were still being uncovered, though very infrequently, ten years later.15 It is impossible to completely test any reasonably complex system. At over 50 million lines of code (for Microsoft Windows) or 330 million transistors (for an Intel microprocessor), existing computer software and hardware is clearly too complex to be completely tested.
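
To make “too complex to be completely tested” concrete, exhaustively testing even one function that takes two 32-bit arguments means 2^64 cases. A back-of-the-envelope sketch (the throughput figure is an assumption for illustration):

    SECONDS_PER_YEAR = 60 * 60 * 24 * 365

    cases = 2**64                     # all input pairs for two 32-bit arguments
    tests_per_second = 1_000_000_000  # an optimistic billion tests per second
    print(cases / tests_per_second / SECONDS_PER_YEAR)  # ~585 years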

The more interesting question is the probability of a failure which causes harm. This leads to Judge Hand’s second variable: what is the gravity of the resulting injury given a failure? While a physical product is normally intended for a single purpose, software is often used for purposes the original vendor never imagined. For example, Lotus 1-2-3 spreadsheet software has been used for nuclear reactor core simulation and the design of a lunar base.16 It is impossible to completely test any reasonably complex software program, and harder still to test it in usage situations for which it was not originally designed. The manufacturer of a wrench would not be held liable for an injury stemming from the use of that wrench as a hammer or screwdriver. Likewise, it is difficult to ascribe liability to a software vendor when software sees such varied application during, and after, its intended lifetime. A good example is the Y2K “crisis,” which resulted from software vendors using two digits to represent a four-digit year. While we look back and say it was stupid to write a program in 1975 that could not handle the year 2000, we can easily imagine that the software developers of 1975 never thought their programs would still be in use 25 years later.
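
The mechanics of the Y2K defect are easy to reproduce. A minimal sketch (illustrative; it is not drawn from any particular 1970s program):

    # Years stored as two digits: 1975 -> 75, 2000 -> 00.
    def age_in_year(birth_yy, current_yy):
        # Naive arithmetic silently assumes both years are 19xx.
        return current_yy - birth_yy

    print(age_in_year(75, 99))  # 24: correct through 1999
    print(age_in_year(75, 0))   # -75: the failure mode in 2000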

Burden of software testing

What is the burden of adequate precaution as it applies to testing computer software and hardware? Testing clearly adds cost to the development cycle in terms of total time spent working on a product. Is the incentive to produce software high enough to absorb the cost of adequate testing? Consider, at one end of the spectrum, the open-source free software project. What would be the consequences if the developers of “Open Tax Solver” (an actual open-source tax software project) were held liable for program errors in the same way that a professional tax advisor is liable for incorrect advice?

Software developers generally create software because they have identified a problem they want to solve. As Eric Raymond wrote, “Every good work of software starts by scratching a developer's personal itch.”17 The programmer had a problem to solve and wrote a program to solve it. Having already done the work, and not having intended to profit from it, the programmer releases the code for other people to use.18 Because the developer has minimal incentive to release software for others to use, the warranty expressed by most open-source licenses can be reduced to “if it breaks, you get to keep both pieces.”

For “Open Tax Solver,” then, we can assume that the burden of adequate precaution is unreasonably high for the software developers. No one is paying for the software; thus, no one should hold the developers responsible for faults in it. Put another way, “you get what you pay for.” Given that the marginal cost of software is zero, however, at what price point can we say that a commercial developer should be liable for defects in the software?

Incentives for quality assurance testing

The purpose of holding manufacturers liable for negligence is to give them an incentive to produce products of sufficient quality that they do not pose an unreasonable threat to users. But product liability generally applies to manufacturing defects as opposed to design defects. Software flaws almost never result from manufacturing defects; a manufacturing defect, such as an error in the CD replication process, is more likely to make a program not function at all than to function incorrectly. Hardware manufacturing defects, such as “bad memory” or a defective connection, generally produce unpredictable errors. The design process cannot account for all possible errors. Quality assurance testing cannot trace all possible execution routes through a software program with all possible inputs. And the designer of a hardware or software product often cannot predict all of the execution environments in which the product will be used. Interactions between software programs or hardware components often reveal bugs that were not revealed in the normal course of testing.
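
The impossibility of tracing every execution route is combinatorial: each independent branch doubles the number of paths through a program. A sketch of the growth (assuming fully independent branches, an idealization):

    # n independent if-statements in sequence yield 2**n execution paths.
    for branches in (10, 40, 64):
        print(branches, "branches:", 2**branches, "paths")
    # 10 -> 1,024 paths; 40 -> ~1.1e12; 64 -> ~1.8e19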

Companies and individuals have a dual incentive in producing quality software: the first is to produce the software at all, the second to produce a quality product. A balance between features and bugs exists from the very beginning of the development cycle. Fixing every bug, or even deciding exactly what counts as a bug, would prevent most products from ever shipping. Consumers pick products based upon a balance between features and usability. The markets do a good job of sorting out which products are of sufficient quality to be usable.

Future scenarios

The future holds even more uncertainty regarding product liability for computerized products, if for no other reason than our increasing reliance on computers in everyday life. Some producers of computerized technologies are being exempted from defect liability: the SAFETY Act of 2002 (Support Antiterrorism by Fostering Effective Technologies) provides liability protection for sellers of “Qualified Anti-Terrorism Technologies.”19 Other producers, however, are being informed of their liabilities up front: the U.S. Food and Drug Administration has published guidelines for computerized systems used in clinical trials20 and has informed manufacturers of medical devices that they are liable for embedded computer errors with regard to the Year 2000 problem.21

The prevalence of embedded computer systems causes even more problems for the field of liability for computer flaws and security exploitations. Previously, manufacturers would use custom-designed hardware and software systems to control consumer products ranging from blenders to automobiles. Now it is often cheaper to control the product with a general-purpose microprocessor and an off-the-shelf operating system such as Embedded Linux or Windows CE. Who is liable when a flaw or security exploit compromises these products? The flaw in General Electric’s XA/21 system which contributed to the northeast blackout was a race condition, a class of bug well documented even in widely used components such as the Linux kernel.22 The SAFETY Act protects sellers of qualified anti-terrorism technologies, and the FDA regards a medical device as a single unit regardless of the origin of its embedded hardware or software. In these cases the maker of the final product, be it an anti-terrorism technology or a medical device, is held liable (or not liable) for the final product.
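
A race condition of the kind cited above is easy to reproduce in miniature: two threads read, modify, and write shared state without synchronization, and updates are lost depending on timing. A minimal sketch of the bug class (an illustration; it is not the XA/21 code):

    import threading

    counter = 0

    def increment(n):
        global counter
        for _ in range(n):
            tmp = counter      # read
            counter = tmp + 1  # write: another thread may have written in between

    threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(counter)  # frequently less than 200000: updates lost to the race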

Endnotes

1 Thurrott, Paul. “SuperSite for Windows.” 20 Nov. 2004 <http://www.winsupersite.com/reviews/winserver2k3_gold2.asp>

2 Intel Corporation. “Intel Unveils World’s Most Advanced Chip-Making Process.” 20 Nov. 2004 <http://www.intel.com/pressroom/archive/releases/20020813tech.htm>

3 Collins, Robert R. “Intel Errata.” Dr. Dobb’s Journal. 20 Nov. 2004 <http://x86.ddj.com/errata/errataseries.htm>

4 Microsoft Corporation. “Computer Hangs after 49.7 Days.” 20 Nov. 2004 <http://support.microsoft.com/kb/q216641>

5 Dallman, John. “49.7 day ‘overloaded with data’ in Los Angeles.” Risks Digest 23.54. 20 Nov. 2004 <http://catless.ncl.ac.uk/Risks/23.54.html#subj10.1>

6 Microsoft Corporation. “WM_TIMER Messages May Stop Being Delivered to Programs in Windows 2000.” 20 Nov. 2004 <http://support.microsoft.com/kb/q322913>

7 Norman, Donald A. The Psychology of Everyday Things. New York: Basic Books (Harper Collins), 1988. pp. 34-36.

8 Poulsen, Kevin L. “Software bug contributed to blackout.” Risks Digest 23.18. 20 Nov. 2004 <http://catless.ncl.ac.uk/Risks/23.18.html#subj1>

9 Schauble, Paul. “Software developer’s liability.” Risks Digest 3.4. 20 Nov. 2004 <http://catless.ncl.ac.uk/Risks/3.04.html#subj2.1>

10 Internal Revenue Service. Handbook for Authorized IRS e-file Providers of Individual Income Tax Returns. 20 Nov. 2004 <http://www.irs.gov/pub/irs-pdf/p1345.pdf>

11 Block Financial Corporation. TaxCut End-User License Agreement. 20 Nov. 2004 <http://www.taxcut.com/license/TCB2003.pdf>

12 Kaner, Cem. Bad Software: What to do when Software Fails. New York: Wiley Computer Publishing, 1998. 178-181.

13 IDG.net. “UCITA: Summary Information.” 20 Nov. 2004 <http://cgi.infoworld.com/cgi-bin/displayStory.pl?/features/990531ucita_home.htm>

14 Landau, Philip J. “Comment: Products Liability in the New Millennium: Products Liability and the Y2K Crisis.” The Richmond Journal of Law and Technology VI:2. 20 Nov. 2004 <http://law.richmond.edu/jolt/v6i2/note4.html>

15 Interestingly, Dr. Knuth pays a small bounty to every person who finds a bug in his software, documentation or books. Kinch, Richard K. “An example of Donald Knuth’s Reward Check” 20 Nov. 2004 <http://truetex.com/knuthchk.htm>

16 Heckman, Corey. “Two Views on Security Software Liability: Using the Right Legal Tools.” IEEE Security & Privacy 1:1 (2003): pp. 73-75.

17 Raymond, Eric. The Cathedral & the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. Sebastopol, CA: O’Reilly, 1999. p. 32.

18 The secondary effect of getting contributions to make the program better without extra effort is often not seen in open-source software. Usually the number of contributors is extremely small compared to the number of users.

19 Office of the Secretary, Department of Homeland Security. Federal Register 68:200. (2003).

20 Office of Regulatory Affairs, Department of Health and Human Services. “Guidance for Industry: Computerized Systems Used in Clinical Trials” 20 Nov. 2004 <http://www.fda.gov/ora/compliance_ref/bimo/ffinalcct.htm>

21 Assistant Secretary for Legislation, Department of Health and Human Services. “Statement on Medical Devices and the Year 2000 Computer Problem by William K. Hubbard” 20 Nov. 2004 <http://www.hhs.gov/asl/testify/t990525a.html>

22 The XA/21 system failed due to a race condition; race conditions are a well-documented class of problem, for example in the Linux kernel’s swapping of pages in memory. Bovet, Daniel and Cesati, Marco. Understanding the Linux Kernel. Sebastopol, CA: O’Reilly, 2001. p. 474.