Software, bugs, and unintended consequences

I’m a software developer. I code for a living. Lately, I’ve been more of a business systems analyst, which means I look at business problems and determine how existing software can be used to solve them. Sometimes, customizations to the system or entirely new programming are required.

I’d like to claim I write bug-free code. The vast majority of the programming I develop works as intended. Sometimes there are edge cases and other unintended side effects that either create “interesting” output or break the code entirely. Other times, “time to complete”, meaning “get it done now”, requires the work to be done as quickly as possible to meet a business need. Usually this is okay when the end users (my customers) understand that the code didn’t undergo the rigorous testing, edge case checks, etc., that would otherwise be performed.

Software is complex. I don’t say this as an excuse; it is a fact. Software can also be expensive. Requirements gathering, architecting, developing, testing, user acceptance, polishing, documenting - none of these things are free in either time or money. There’s an old adage, “you can have it fast, cheap, or good; pick two”. As a developer, I strive for “fast and good”, meaning code that performs quickly, does its intended job, and is free from known side effects. Much of the time, I’m asked for “fast and cheap”, meaning get it done as soon as possible. So… do I test less? Is the code less than optimized for speed? Are there fewer user controls, inputs, or output options than would otherwise be available? Will it sometimes do “weird” things if the inputs or data aren’t just so? You bet! I try to let my users know this up front - if you want it sooner, this is the compromise.

One of the guys I respect on the iSeries mailing list (http://www.midrange.com/) has an adage: “you can test all you want; production data will somehow screw it up”. Meaning: you can have all the test data and use cases you want in your test environment, but the live environment will always have something different that was not accounted for. Software is complex. There are many interactions among configuration settings, assumptions about the data, assumptions about the process, and even the assumed expectations of the system.

For example: I have a report that allows the user to enter a date range. The report aggregates data based strictly on that date range. Now, a user said, “I put in two months, and I expected the program to provide a summary for each month”. Is it a bug? Is it an assumed action? I contend it is not a bug, because the specification for the report in question did not mention summarization across monthly boundaries. Could the report be changed to accommodate this? Absolutely. Should it? What is the cost/benefit analysis? If it takes, say, four hours to make the change (determining what needs to be changed design-wise, programming the change, testing, etc.), how much benefit is derived? If I ask the user to instead run the report for the first month in question, then the second month, which can be done in two minutes, is that a bug? And what if the user enters two weeks: should a weekly summary be expected?
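To make the request concrete, here is a minimal sketch of what “a summary for each month” would require at its core: splitting the entered range at calendar-month boundaries, so the existing aggregation could be run once per piece. This is not the actual report code; the function name split_by_month and its inclusive start/end parameters are assumptions for illustration only.

```python
import calendar
from datetime import date, timedelta


def split_by_month(start: date, end: date) -> list[tuple[date, date]]:
    """Split an inclusive date range into per-calendar-month sub-ranges.

    Hypothetical helper, not part of the real report.
    """
    if start > end:
        raise ValueError("start must not be after end")

    ranges = []
    cursor = start
    while cursor <= end:
        # monthrange returns (weekday of the 1st, number of days in month)
        last_day = calendar.monthrange(cursor.year, cursor.month)[1]
        month_end = date(cursor.year, cursor.month, last_day)
        # The final chunk stops at the requested end date, not month end
        chunk_end = min(month_end, end)
        ranges.append((cursor, chunk_end))
        cursor = chunk_end + timedelta(days=1)
    return ranges


if __name__ == "__main__":
    for first, last in split_by_month(date(2014, 1, 15), date(2014, 3, 10)):
        print(first, "to", last)
```

The splitting itself is the easy part. The four hours go into deciding how partial months should be labeled, changing the report layout, and testing, which is exactly why the cost/benefit question matters.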

These are the kinds of “unintended consequences” or unexpected usage of programs that developers don’t necessarily foresee. In the above situation, we are talking about a report that has been in production for three years without change. The “summary by month” request was first made last week. Honestly, we have more pressing issues and higher priorities than summarizing a report by month. Should it be done? Not my call - I do have input on what tasks and projects should be worked on in a particular time frame, but management ultimately decides the priorities. That is especially true when there is an easy workaround the user can apply, for a situation that occurs infrequently.

Beware of bugs in the above code; I have only proved it correct, not tried it. — Donald Knuth

Is it possible to write perfect software? Yes, when the business inclination is to do so. Two industries come immediately to mind: healthcare and aerospace.

In healthcare, people can die if software is not perfect. I remember a few cases where computer-controlled dosage equipment gave the wrong drug dosage to patients. Any life-critical system must be as perfect as possible for obvious reasons. Even then, outside forces can cause otherwise “perfect” code to misbehave.

Can we write rigorous, bug-free software? Absolutely. The Space Shuttle requires absolutely perfect software - in space, you don’t test new code, no matter what you see in Star Trek. The Fast Company article They Write the Right Stuff indicates it’s not only possible, but getting better. But it isn’t the software, or the coders, that makes the software nearly perfect.

“It’s the process by which the software is developed that makes it perfect.”

That process includes things such as the development history, extensive testing, code reviews, and voluminous documentation for even the slightest change. There is no “quick fix”, and certainly no change without documentation, testing, and review. Oh yeah, and these guys mostly work regular hours.

How about purchased software? All software has bugs. Again, this is not an excuse, but a tradeoff. Your car has bugs. Your house too (and not only the crawling kind). For example, is your house framed with clear, perfectly straight, knot-free lumber? No, it isn’t. Should it be? If you have the money, you can buy the best lumber for framing. Does it make sense? Probably not. Building codes specify the minimum grade of lumber needed for habitable housing. Are all the walls perfectly plumb, level, and even? Are all corners exactly 90 degrees?

You can see what I’m getting at. There are tolerances in the expectations and performance of software, just as in house (or furniture or car) building. Some bugs only appear once in a blue moon, on February 29, when that day falls on a Tuesday. Should that bug be fixed? Should the release be held up until it is? If a program doesn’t work the way you expect, is that a bug? Or is it an assumption or expectation?
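To put a number on how rare such a condition is, here is a small sketch that lists the years in which February 29 lands on a Tuesday. The function name tuesday_leap_days and the year range are assumptions for illustration, and the “blue moon” part of the condition is ignored.

```python
import calendar
from datetime import date

TUESDAY = 1  # date.weekday() counts Monday as 0


def tuesday_leap_days(first_year: int, last_year: int) -> list[int]:
    """Return the years in [first_year, last_year] where Feb 29 is a Tuesday.

    Hypothetical helper, used only to show how seldom the case occurs.
    """
    return [
        year
        for year in range(first_year, last_year + 1)
        if calendar.isleap(year) and date(year, 2, 29).weekday() == TUESDAY
    ]


if __name__ == "__main__":
    print(tuesday_leap_days(2000, 2030))
```

Between 2000 and 2030 that happens only twice, in 2000 and 2028. Whether a bug that can only fire on those two days is worth holding a release for is a business decision, not a purely technical one.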

Wow, this article is long… hopefully I’ve made a point. If not, I’ll pick up the thought again.

Update 2014-11-02: The URL for They Write the Right Stuff has been updated.