Fail Fast, Fail Often

 

There is a paradigm in software (and, I suspect, other engineering disciplines) called “fail fast.” On first encounter, it sounds odd. Why fail fast? Don’t we want to delay failures? Of course, the context drives the design; and far too often in software, one deals with an increasingly complex system that comprises many moving pieces. So, sometimes, if something fails, it’s much better to know about it sooner rather than later, so it can be addressed. Jim Shore wrote the best article I know of on the advantages of failing fast. He says:

Some people recommend making your software robust by working around problems automatically. This results in the software “failing slowly.” The program continues working right after an error but fails in strange ways later on. A system that fails fast does exactly the opposite: when a problem occurs, it fails immediately and visibly. Failing fast is a nonintuitive technique: “failing immediately and visibly” sounds like it would make your software more fragile, but it actually makes it more robust. Bugs are easier to find and fix, so fewer go into production.
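To make Shore’s contrast concrete, here is a minimal C sketch. The scenario and function names are my own, not from Shore’s article: a hypothetical config reader that “fails slowly” by silently substituting a default, next to one that fails fast.

```c
#include <stdio.h>
#include <stdlib.h>

/* "Fail slowly": quietly substitutes a default when the value is
   missing, so the real error surfaces much later, somewhere else. */
int timeout_fail_slow(const char *value) {
    if (value == NULL) return 0;   /* silent fallback hides the bug */
    return atoi(value);
}

/* "Fail fast": a missing value stops the program immediately,
   naming the problem at the point where it was introduced. */
int timeout_fail_fast(const char *value) {
    if (value == NULL) {
        fprintf(stderr, "config error: timeout not set\n");
        exit(EXIT_FAILURE);
    }
    return atoi(value);
}
```

The slow version works fine in every test that happens to set the value; the zero timeout it fabricates only blows up later, far from the cause.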

I find parallels between my software experience and my life. What works in one paradigm often has uses in others, and — in many cases — I want my experiences to be “fail fast.” In other words, I want to know sooner rather than later whether a new product will be successful, whether a new business relationship will work, whether a new idea will manifest, etc.

If you want to succeed, double your failure rate. – Thomas J. Watson

In business, it is better to learn of looming failures early, to avoid sinking time and money into ideas that are likely to fail. In dating, it’s better to dump a guy up front if he doesn’t want kids and you do. I’d rather know sooner than later. How does this translate to action? To me, it translates to more encounters, more interactions, more ventures, more things to “start.” Not all ventures will succeed, but to increase the number of successes, there must be a correspondingly higher number of failures. And if we have to have failures, why not get them over with quickly?

In order to do so, however, we need to rethink failure as a feedback mechanism. We often make failure out to be a stigma. While failure is not optimal in all situations, sometimes failures are nothing but lessons in disguise. They can be feedback mechanisms for our actions; they tell us how to improve things, how not to do something. How else can you discover a new passion without exposing yourself to failure? How many times have we started a project, failed, but discovered a new way to do things? We have all heard the stories of Edison failing 1,000 times before inventing the light bulb.

So, my friends, fail fast, fail often.

Published in Culture, Marriage

There are 43 comments.

  1. Carey J. Inactive
    Carey J.
    @CareyJ

    Tuck: I hadn’t started this way, but learned pretty quickly.  But some of the old code didn’t have the error-handling.  I had a default error message at the root of the thing that was never supposed to be hit, but would let me know that the error-handling hadn’t been implemented for the code that had failed.  In a whimsical moment, I unfortunately used the phrase “Who the f— wrote this thing?”

    All computer software error messages ought to include “(insert programmer’s name) wrote this program. Call him at (insert programmer’s phone #) and tell him about it.”

    If Microsoft did this, their 1.0 software releases would be fit to use, because the programming staff would actually test their stuff thoroughly before dumping it on the unsuspecting public.

    • #31
  2. Pilli Inactive
    Pilli
    @Pilli

    Speaking of failures…

    When should software be refactored? It seems that there comes a time in the life of every project when programmers start saying, “We ought to leave that alone. It’s so full of patches that we are bound to break a bunch of other things.”

    • #32
  3. david foster Member
    david foster
    @DavidFoster

    It’s true that many people are too afraid of failure, and this phenomenon has surely been made worse by the “self-esteem” fetish.  But there are failures, and then there are failures.  Tom Watson of IBM may have said “double your failure rate” to succeed in business, but if IBM’s System/360 project had failed, that would likely have been the end of the company.

    It is often desirable to release less-than-perfect software to gain market feedback and see where to focus the improvement efforts…but if the software is the system that conducts low-visibility landings for airliners, then failures can be catastrophic.

    The aerodynamicist von Karman telegraphed USAF General Bernard Schriever, after a series of successful missile tests, “Bennie, you must not be taking enough chances”…and he may have been right…but if the system in question were the command-and-control system for nuclear consent, “failure” would have a very different meaning from merely losing an unmanned missile.

    Just about everybody gets their heart broken at some point in early relationships with the opposite sex, and that is normal…indeed, I would have to wonder about someone to whom this never happened…but when a relationship blows up after 15 years and 3 kids, that is another matter entirely.

    • #33
  4. Midget Faded Rattlesnake Member
    Midget Faded Rattlesnake
    @Midge

    david foster:It’s true that many people are too afraid of failure, and this phenomenon has surely been made worse by the “self-esteem” fetish.

    I believe it’s quite possible that fetishizing self-esteem has made it worse, but it’s hard not to notice that people with high levels of genuine self-esteem (whether due to an innate personality quirk or to prior feelings of earned success) tend to bounce back from failure better. Or perhaps there is an optimum, and both the very overconfident and the very underconfident (people who could genuinely be said to need more “self-esteem”) are missing the mark.

    • #34
  5. Barkha Herman Inactive
    Barkha Herman
    @BarkhaHerman

    I was gonna say… willingness to fail = High self esteem.

    • #35
  6. Barfly Member
    Barfly
    @Barfly

    Pilli:Speaking of failures…

    When should software be refactored? It seems that there comes a time in the life of every project when programmers start saying, “We ought to leave that alone. It’s so full of patches that we are bound to break a bunch of other things.”

    Ideally: all the time, every release.

    Practically:  the first time the existing stuff causes you to blow a release.

    Mostly: why do you want to rebuild stuff we already have? I need this other thing …

    • #36
  7. SParker Member
    SParker
    @SParker

    Barkha Herman: How many times have we started a project, failed, but discovered a new way to do things? We have all heard stories of Edison failing to invent the light bulb 1,000 times. So, my friends, fail fast, fail often.

    You know the Rule of Proverbs, right?  For every piece of good advice, there’s an equally good piece of advice that tells you to do the exact opposite. (e.g., “He who hesitates is lost”/”Look before you leap.” )

    Old-timers will recognize “fail fast, fail often” as uncomfortably close to the original meaning of “hacking.”  It meant treating a problem the way an amateur butcher treats a steer carcass.  Naturally, a pejorative–it’s just not a sure sign of brightness–made all the worse in an age when computer time was a precious resource and the “hacker” was using a seat at a terminal that you really needed to finish that compiler project due tomorrow and that in hindsight you should probably have been working on a little more steadily since the 2nd week of the quarter.  Naturally you don’t blame yourself for sloth, but the effing idiot who just can’t seem to work out the details of his bisection algorithm to find the roots of a polynomial (and God help us when he turns the page to Newton’s Method).  Bottom line:  it probably still pays dividends to give a little thought to what you’re doing before you do it.

    • #37
  8. SParker Member
    SParker
    @SParker

    I suspect the counter-example that blows the Rule of Proverbs out of the theoretic water is Jonah Goldberg’s citing of  the Russian proverb, “if you meet a Bulgarian in the street, beat him.  He will know the reason why.”  Jiggered if I can think of the inverse.  I do predict, however, that it will be a Bulgarian who eventually comes up with it.

    • #38
  9. Carey J. Inactive
    Carey J.
    @CareyJ

    SParker:

    Old-timers will recognize “fail fast, fail often” as uncomfortably close to the original meaning of “hacking.” It meant treating a problem the way an amateur butcher treats a steer carcass. Naturally, a pejorative–it’s just not a sure sign of brightness–made all the worse in an age when computer time was a precious resource and the “hacker” was using a seat at a terminal that you really needed to finish that compiler project due tomorrow and that in hindsight you should probably have been working on a little more steadily since the 2nd week of the quarter. Naturally you don’t blame yourself for sloth, but the effing idiot who just can’t seem to work out the details of his bisection algorithm to find the roots of a polynomial (and God help us when he turns the page to Newton’s Method). Bottom line: it probably still pays dividends to give a little thought to what you’re doing before you do it.

    Elaborate exception handling that keeps a system running even though compromised may be necessary for a fly-by-wire control system, but for most data processing applications, responding to an exception with termination and a clear error message identifying the failure point is preferable to an “exception handler” which masks the problem. If function int xyz(int a) fails with a divide by zero error when a=10, you should get a “divide by zero error in xyz(10)” message, immediately.
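A minimal C sketch of the behavior Carey describes — the function `xyz` and the divisor computation are hypothetical, just to make the error message concrete:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical function: when the divisor works out to zero,
   terminate immediately with a message naming the function and
   its argument, instead of returning a junk value. */
int xyz(int a) {
    int divisor = a - 10;              /* zero when a == 10 */
    if (divisor == 0) {
        fprintf(stderr, "divide by zero error in xyz(%d)\n", a);
        exit(EXIT_FAILURE);            /* fail immediately and visibly */
    }
    return 100 / divisor;
}
```

The point is that the message carries the failure point and the offending input, so the bug report writes itself.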

    • #39
  10. Owen Findy Inactive
    Owen Findy
    @OwenFindy

    donald todd: Actually it was about making communications (voice and data) work across nodes by making the nodes recognize and respond to each other by allocating channels and bandwidth to accommodate the voice or data call through its conclusion.

    When I was a mainframe programmer, we said it as a complaint about people (programmers and managers) who did not want to spend the time to do it best the first time, but were instead in a hurry to get it done half-assed.

    • #40
  11. Barfly Member
    Barfly
    @Barfly

    Carey J.: If function int xyz(int a) fails with a divide by zero error when a=10, you should get a “divide by zero error in xyz(10)” message, immediately.

    That’s not exactly the situation with which the “fail-fast” doctrine is concerned. If a divide error occurs then that’s the failure – it has happened. Fast, for that matter. You’re talking about error handling.

    Suppose void setAlarm(int* a) is invoked with a null argument. The function might merrily store that value in some (pointer) variable. Only later, at some asynchronous time, will a failure occur, when the variable is accessed and used. A fail-fast implementation of setAlarm(int*) that detected a null argument would not just write some worrisome lines in the log. It would actually fail at that point, interrupting the program flow and generally invoking whatever exception-handling mechanism is in place.
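A sketch of the fail-fast setAlarm Barfly describes, assuming C and the standard assert macro; the alarm_target variable is illustrative:

```c
#include <assert.h>
#include <stddef.h>

static int *alarm_target = NULL;   /* illustrative state the function sets */

/* Fail-fast version: a null argument trips the check here, at the
   call site, rather than crashing later at some asynchronous time
   when alarm_target is finally dereferenced. */
void setAlarm(int *a) {
    assert(a != NULL && "setAlarm: null alarm pointer");
    alarm_target = a;
}
```

With assertions enabled, a null argument aborts right at the call site; compiling with -DNDEBUG removes the check, so production code might prefer an explicit test that logs and terminates.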

    • #41
  12. Carey J. Inactive
    Carey J.
    @CareyJ

    Barfly:

    That’s not exactly the situation with which the “fail-fast” doctrine is concerned. If a divide error occurs then that’s the failure – it has happened. Fast, for that matter. You’re talking about error handling.

    Suppose void setAlarm(int* a) is invoked with a null argument. The function might merrily store that value in some (pointer) variable. Only later, at some asynchronous time, will a failure occur, when the variable is accessed and used. A fail-fast implementation of setAlarm(int*) that detected a null argument would not just write some worrisome lines in the log. It would actually fail at that point, interrupting the program flow and generally invoking whatever exception-handling mechanism is in place.

    Which, for most application software, is a better choice than implementing an exception handling system (or using one built into the language) to keep a broken program staggering along. Device control software and operating systems have to keep running or people can die. But for just about everything else, failure on detection of an error is cheaper in the long run.

    Agreed that using an uninitialized pointer is bad medicine. Most OSes will zap a process which tries to access memory outside its allocated space, unless the process is running at a high privilege level. But even if it stays inside the process’s memory space, it’s probably going to corrupt data. I personally consider explicit pointers to be in the same category as GOTO statements – more trouble than they’re worth.

    • #42
  13. david foster Member
    david foster
    @DavidFoster

    Some of the most insidious failures are those in which there is a known defect or limitation in a system, but bureaucratic or political reasons prevent it from being appropriately addressed, and the system continues in use as if the defect were not there.

    See the truly horrifying case of Washington Metrorail Train T-111:

    Blood on the Tracks

    • #43