Testing Doesn’t Catch Everything


We put things to the test to discover their limits and to minimize human error in their design. Yet sometimes the test itself is imperfect. Like the product it tests, it’s more prone than we’d like to admit to human error and inexperience.

One supremely stressful testing ground is preparing for war. Ricochet member Percival, who has good reason to know about these things, said in a recent thread, “When you test a new weapons system, you generally do it against a target that you have absolute control over. You don’t do it in or near populated areas. You set up a lot of cameras at different angles so you can record everything that happens”. Engineers know from generations of hard experience that tests don’t always catch everything, though, and the reasons are sometimes only obvious in retrospect.

Anti-nuclear people have long thought that Americans in WWII were cruel and reckless to unleash poisonous fallout via the A-bombings. But fallout, which is essentially radioactive burnt soot, was an unpleasant surprise to Manhattan Project scientists, who expected the vast majority of deaths to be caused by direct blast. Based on the well-instrumented, carefully done Trinity test, they didn’t expect much lingering residual radiation afterward. But that’s because Trinity vaporized sand and rocks, not cities of wood and fabric.

With the war over, and a worldwide economic boom slowly gathering strength, people were hungry for growth after decades of depression and conflict. New industries flourished, like electronics, chemicals, plastics, and a newcomer, atomic energy. In general, testing for medical safety was primitive by our standards. This would change.

Thalidomide was developed in Germany and very nearly reached the American market before an FDA reviewer, unconvinced by the skimpy safety data, blocked it; in Europe, where it was sold freely, it caused an alarming number of birth defects. As many as 40% of babies born to women who took Thalidomide died within a year. There’d been no wide-scale human trial, and none involving pregnant women. The test was: none of the rats died. In 1957 that was enough. By 1962, the uproar over Thalidomide ensured that pharmaceutical testing would be much more demanding. By the way, today it’s an approved, respected drug for certain serious conditions, such as leprosy; but it is kept away from women of childbearing age.

You’d think nothing like this could happen again. In 1975, ads proclaimed “Procter and Gamble, and a woman gynecologist introduce a new tampon. Remember, we call it Rely”. Rely hadn’t been widely tested because legally, it didn’t have to be. It was classified as not much different from facial tissue, not as a quasi-medical product that required testing. It was, in the language of the time, “grandfathered in”. But despite its rosy, pseudo-feminist branding, Rely was a disaster that caused toxic shock syndrome in thousands of women, with consequences up to and including amputated limbs. In 1980 it was pulled from the market.

What about other industries? There’s a reasonable temptation to believe that Detroit’s half-century of quality problems have been caused by rushing out untested junk, but that’s seldom actually been true.

GM was always in a hurry to market this year’s flashy new stuff—“Patented C-Thru Glass, 20% More Transparent!”—and for decades, people traded cars in much more frequently than in our adult lifetimes. Few expected to hold onto them for long, so why prioritize better appearance and performance five years down the road? But the Germans, the Swedes, and the Japanese did. They didn’t have special testing or magic quality control technology that we lacked. They knew that avoiding rusting chrome, snapped torsion bars, or faded paint required long-duration tests that weren’t rushed.

One of the most notorious cases, the Chevy Vega subcompact, had a big test program by the standards of the dawn of the Seventies. But some of the key things GM needed to know were obscured by long-established testing procedures that babied the $100,000-a-copy, hand-built prototypes. Drivers meticulously checked fluid levels and tire pressures before every drive in a way that few American consumers ever would. So something like the inadequacy of a one-gallon radiator wasn’t obvious until the car was sold to the public.

Ironically, GM’s wealth and size actually worked against it. Smaller manufacturers like Daimler-Benz built six prototypes of a new design and ran them for 80,000 miles each before signing off: 480,000 test miles in total. General Motors could afford to build 100 test cars, run each for 10,000 miles, and boast of a one-million-mile test program completed in one eighth the time. Clever! But maybe not so clever, because time is exactly what was needed. Common sense tells you that the car with 80,000 miles on it gives you more information, more confidence about durability and wear, than eight 10,000-mile cars put together.
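The arithmetic above hides a statistical trap that a quick simulation can expose. The sketch below is a toy model, and the failure distribution and all its numbers are invented for illustration: a part that rarely fails early but wears out around 60,000 miles will almost never show up in a hundred 10,000-mile runs, while six 80,000-mile cars will catch it nearly every time.

```python
import random

random.seed(1)

# Hypothetical wear-out failure: a part that almost never fails young
# but wears out around 60,000 miles (Weibull with shape 4, i.e. aging).
def failure_mileage():
    return random.weibullvariate(60_000, 4)

def failures_seen(cars, miles_each, trials=2_000):
    """Average number of failures a test program observes."""
    total = 0
    for _ in range(trials):
        total += sum(1 for _ in range(cars) if failure_mileage() <= miles_each)
    return total / trials

# GM-style: 100 cars x 10,000 miles = 1,000,000 test miles.
# Daimler-style: 6 cars x 80,000 miles = 480,000 test miles.
gm = failures_seen(cars=100, miles_each=10_000)
db = failures_seen(cars=6, miles_each=80_000)
print(f"100 x 10k miles: ~{gm:.2f} failures observed per program")
print(f"  6 x 80k miles: ~{db:.2f} failures observed per program")
```

The point is not the exact numbers; it is that total test miles are not interchangeable when failures cluster late in a part’s life.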

Some gigantic investments were never tested before the money was spent. In a violation of the (admittedly then-limited) traditions of the manned space program, the space shuttle’s first flight in 1981 was manned. Forty years later this decision still causes some head-scratching around NASA. The pilot and commander of that first flight later said flatly that they would have aborted it within seconds of liftoff if they’d realized how badly the sheer sound pressure of the launch threatened to damage vital flight control surfaces, a problem the agency would mitigate in later launches. There was no compelling reason for the gamble, except that at some point in the mid-seventies an unmanned test flight was chopped to help the budget make it through another year.

The Department of Defense is not exempt from human error either. During the Reagan buildup, smaller but appropriately lethal nuclear warheads were designed to fit shorter-range and cruise missiles. The weapons worked just fine when they were carried in bomb bays inside the plane (modern weapons platforms have various jazzy new terms for this, but I’ll stick to familiar ones). Great! But somebody at DoD forgot that in certain uses, the missiles were to be carried underwing. They hadn’t been tested for extreme cold, and they had to be redesigned to still work at 70 below with a wind blast of 550 miles an hour. Note that the test itself wasn’t faulty. The testers simply hadn’t asked all the right questions, and it ended up costing millions of dollars.

But those losses are nothing compared to the consequences of a test that was not only faulty, but outright fraudulent. Volkswagen, still associated in pop culture with hippies and love bugs, did something distinctly unlovable; they realized their diesel-engine cars couldn’t pass standard US pollution tests, so they gamed the tests, in essence creating cheating software that would recognize when a test was being done, and cut the performance of the engine just enough to pass it. Amazingly, they didn’t get caught by Big Eco, or the archipelago of federal and state agencies that monitor these things, but by students at a small, unfashionable college who couldn’t explain nagging inconsistencies in their attempts to verify the test numbers. Otherwise, they might well have gotten away with it.
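For readers curious what “recognizing when a test was being done” means in practice: US emissions tests run the car on a dynamometer through a fixed speed trace, with the steering wheel never moving. The toy sketch below shows how little logic such a defeat device needs; all names and thresholds here are invented for illustration, and this is emphatically not VW’s actual code, which was never published in this form.

```python
# Toy defeat-device logic. On a dynamometer the wheels spin but the
# steering wheel stays dead straight for minutes at a time -- an easy
# signature to detect from ordinary sensor inputs.

def looks_like_dyno_test(speed_mph: float, steering_angle_deg: float,
                         seconds_at_speed: float) -> bool:
    """Wheels turning, steering frozen straight for a long stretch:
    almost certainly a rolling road, not a real one."""
    return (speed_mph > 10
            and abs(steering_angle_deg) < 1
            and seconds_at_speed > 60)

def emissions_mode(speed_mph, steering_angle_deg, seconds_at_speed):
    if looks_like_dyno_test(speed_mph, steering_angle_deg, seconds_at_speed):
        return "full-emissions-controls"   # behave, and pass the test
    return "full-performance"              # pollute on the open road
```

A handful of comparisons, buried in millions of lines of engine-control firmware, was enough to fool a standardized test for years.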

This little stunt, gaming the test, will cost VW a cool $20 billion all told, in fines, lawsuits, and replaced cars. I’m not inclined to cry any Adam Smith tears for the poor Germans. They lied to America. But they’ll pass the one test that really counts: they’ll survive to be allowed to continue to sell cars in the American market.

Published in General
This post was promoted to the Main Feed by a Ricochet Editor at the recommendation of Ricochet members.

There are 83 comments.

  1. Arahant Member
    Arahant
    @Arahant

    Testing on computer programs seldom goes as well as it should, either.

    • #1
  2. Richard Easton Coolidge
    Richard Easton
    @RichardEaston

    Great post as always Gary! This may be a step too far, but I’d like to extend it to how historians test articles or books. They compare the assertions made with primary source materials such as contemporaneous documents. Take for example the dispute over who invented GPS. My Dad’s main rival Brad Parkinson claims that he and about twelve other people created it at the Lonely Halls Meeting at the Pentagon over Labor Day 1973. He’s talked about a few people who were in the group but has never provided any documents from the time supporting this. He’s mentioned a seven page document dated 4 September 1973 but has never quoted from it. A contact gave me a 21 September 73 addendum which is on the website created by my coauthor. http://www.gpsdeclassified.com/wp-content/uploads/2013/08/addendum-to-4Sep73-dcp-dtd-21-Sep-1973.pdf

    It mentions three scenarios which were in the 4 September doc, but the important scenarios 4 and 4a came later (between 4 September and 21 September). This fits in perfectly with my Dad’s assertion that he and Captain David Holmes met with Parkinson over Labor Day. Parkinson said that the system he took over in November 1972 was too expensive and Holmes offered him my dad’s Timation system. Scenarios 1-3 in the Addendum are variants of 621B and 4/4a are variants of Timation and GPS. It’s amazing how the errors in Parkinson’s accounts almost always unfairly attack my Dad’s role and exaggerate his own. For example, he attacked the first atomic clocks launched in my Dad’s NTS-1 in 1974, saying they were definitely not space-hardened. I found an article by Holmes in the mid-70s which said that the clocks, in spite of the lack of time to space-harden them, were approved by the Joint Program Office and worked well. Parkinson was the head of the JPO, so he was the guy who gave the OK. He conveniently forgot this decades later when he was attacking my Dad’s role.

    • #2
  3. Hank Rhody, Freelance Philosopher Contributor
    Hank Rhody, Freelance Philosopher
    @HankRhody

    Excellent article. People really underestimate the importance of epistemology. Another common failure is to design tests that only check for things you expect to be there. You’ll miss something important which is happening just outside your field of view.

    Another aspect of the atom bombs that wasn’t anticipated: the thermal radiation. Heat. It was enough to burn things like a wicker fence pattern into people’s skin at distances where the blast wave didn’t kill them. In normal explosions there isn’t enough energy, enough heat, to make a difference. Anything you set on fire will have already been crushed by the blast.

    • #3
  4. GrannyDude Member
    GrannyDude
    @GrannyDude

    Though you didn’t say so, Gary, you’ve provided excellent reasons for considering cries of “but science!” with some skepticism. Not skepticism about the field and process of science writ large, but skepticism about the capacity of human scientists, with human limitations, to achieve “settled” conclusions, especially over short time-frames (many of your examples are ones in which years reveal what months could not!). 

    • #4
  5. KentForrester Coolidge
    KentForrester
    @KentForrester

    You make testing sound so interesting, Gary, that I’m almost sorry I didn’t major in engineering instead of English.  I’ve always liked puzzles, and engineering, as I understand it, is the solving of puzzles.

    Good job. Ricochet thrives when it’s able to inspire articles like your article on testing.

    • #5
  6. Hank Rhody, Freelance Philosopher Contributor
    Hank Rhody, Freelance Philosopher
    @HankRhody

    KentForrester (View Comment):
    and engineering, as I understand it, is the solving of puzzles. 

    Most things are. Your brain wants to solve problems by its nature (Possibly having to do with your male chromosomes, I don’t know). Video games provide puzzles for people to solve, even if we’ve gotten coordinated enough to forget that “master the timing to jump on a goomba” was a challenge. The repeated presentation of simple and not so simple puzzles is what makes a game good to play.

    Have you tried minecraft? It’s not a normal video game, where speed and reflexes are prized. It’s a landscape, with dirt to dig in, caves to mine, and it lets you build things. Whatever you’d like. Towers. My brother Sam built a stadium, just because. His main achievement is a giant hole in the ground. (Yes, really. It’s impressive.)

    If you’re looking for a way to wile away the endless hours in pleasant puzzle solving, it’s a good pastime. Talk to Omega Paladin, he’s got a server and he’d like more people to play on it. His main accomplishment is a working nuclear reactor. (Sort of; the game doesn’t actually let you split atoms as far as I know.)

    • #6
  7. Seawriter Contributor
    Seawriter
    @Seawriter

    There was a friend I worked with back in my early NASA days who kept a poster in his cubicle. It had the words “Did You Test Your Design First?” with photos of epic (and gory) NASA design failures. Management made him take it down in February 1986.

    • #7
  8. WillowSpring Member
    WillowSpring
    @WillowSpring

    This reminds me of a conversation I had with my father years ago. He had a PhD in Physics and was one of the early proponents of what became “Operations Research”, where models and statistics were used for various problems like the optimal maintenance schedule for military vehicles. I was more of a hands-on type and had to tell him when to put oil in the engine or air in the tires. I was much less impressed with models.

    I was looking through one of his Journals from the IEEE where the fire incident at the Brown’s Ferry Nuclear Power plant was described.  It was caused by a lit candle (!) being used by a technician to try to find a leak in a raceway for cables.  I said to my father, “If you can show me in the reliability model what probability they assigned to the presence of a lit candle in the conduit, I will have more faith in the models.”

    As soon as humans are involved in a situation, it seems that all models and testing will miss something.

    • #8
  9. Arahant Member
    Arahant
    @Arahant

    WillowSpring (View Comment):
    As soon as humans are involved in a situation, it seems that all models and testing will miss something.

    That doesn’t mean we shouldn’t try to test, just that we need to be more creative in our test plans.

    • #9
  10. Stad Coolidge
    Stad
    @Stad

    Dr. Nancy Leveson (https://en.wikipedia.org/wiki/Nancy_Leveson) has done a lot of research into safety.  I’ve touted her work before, particularly in the area of software safety.  One of the takeaways I got was how software failures addressed in system design tend to be inadequate in the real world.  If anyone out there really wants to geek out, read her papers . . .

    • #10
  11. The Reticulator Member
    The Reticulator
    @TheReticulator

    Gary McVey: Clever! But maybe not so clever, because time is exactly what was needed.

    Are you talking about covid-19 vaccines here?

    That reminds me that some people write articles about our “failure” to be prepared for the current pandemic. But I don’t think it’s possible to be prepared for every exigency any more than it is possible for tests to reveal every possibility and mode of failure.  Doesn’t mean we shouldn’t plan and shouldn’t test, of course, or that it’s a good idea to hijack an agency that was supposed to deal with infectious diseases and divert its resources to promote gun control instead.

    Maybe this could be a topic in the upcoming presidential “debates.” 

    • #11
  12. Percival Thatcher
    Percival
    @Percival

    Seawriter (View Comment):

    There was a friend I worked with back in my early NASA days who kept a poster in his cubicle. It had the words “Did You Test Your Design First?” with photos of epic (and gory) NASA design failures. Management made him take it down in February 1986.

    Figuring out how you are going to test that your design meets its requirement is the first step.

    Design your test, then test your design, Grasshopper.

    • #12
  13. Percival Thatcher
    Percival
    @Percival

    GrannyDude (View Comment):

    Though you didn’t say so, Gary, you’ve provided excellent reasons for considering cries of “but science!” with some skepticism. Not skepticism about the field and process of science writ large, but skepticism about the capacity of human scientists, with human limitations, to achieve “settled” conclusions, especially over short time-frames (many of your examples are ones in which years reveal what months could not!).

    The science isn’t done.

    We have learned a lot from experience about how to handle some of the ways we fool ourselves.  One example: Millikan measured the charge on an electron by an experiment with falling oil drops and got an answer which we now know not to be quite right.  It’s a little bit off, because he had the incorrect value for the viscosity of air.  It’s interesting to look at the history of measurements of the charge of the electron, after Millikan.  If you plot them as a function of time, you find that one is a little bigger than Millikan’s, and the next one’s a little bit bigger than that, and the next one’s a little bit bigger than that, until finally they settle down to a number which is higher.

    Why didn’t they discover that the new number was higher right away?  It’s a thing that scientists are ashamed of—this history—because it’s apparent that people did things like this: When they got a number that was too high above Millikan’s, they thought something must be wrong—and they would look for and find a reason why something might be wrong.  When they got a number closer to Millikan’s value they didn’t look so hard.  And so they eliminated the numbers that were too far off, and did other things like that.  We’ve learned those tricks nowadays, and now we don’t have that kind of a disease.

    But this long history of learning how to not fool ourselves—of having utter scientific integrity—is, I’m sorry to say, something that we haven’t specifically included in any particular course that I know of.  We just hope you’ve caught on by osmosis.

    The first principle is that you must not fool yourself—and you are the easiest person to fool.  So you have to be very careful about that.  After you’ve not fooled yourself, it’s easy not to fool other scientists.  You just have to be honest in a conventional way after that. 

    – “Cargo Cult Science“, Richard Feynman

    [Emphasis added]

    • #13
  14. Seawriter Contributor
    Seawriter
    @Seawriter

    WillowSpring (View Comment):

    This reminds me of a conversation I had with my father years ago. He had a PhD in Physics and was one of the early proponents of what became “Operations Research”, where models and statistics were used for various problems like the optimal maintenance schedule for military vehicles. I was more of a hands on type and had to tell him when to put oil in the engine or air in the tires. I was much less impressed with models.

    I was looking through one of his Journals from the IEEE where the fire incident at the Brown’s Ferry Nuclear Power plant was described. It was caused by a lit candle (!) being used by a technician to try to find a leak in a raceway for cables. I said to my father, “If you can show me in the reliability model what probability they assigned to the presence of a lit candle in the conduit, I will have more faith in the models.”

    As soon as humans are involved in a situation, it seems that all models and testing will miss something.

    I have built models all my life, starting with plastic kits as a kid. One thing building models taught me was that every model is an imperfect representation of reality and that no model reproduces actual reality with 100% fidelity. That is one reason I have never been overly impressed by someone showing me a model and telling me I have to believe it “because Science!”

    • #14
  15. Nohaaj Coolidge
    Nohaaj
    @Nohaaj

    Great post. I am surprised no one has linked your thoughts to our current rush to create a vaccine and vaccinate the entire population. It was where I expected your post to lead us. As you have alluded, time will tell if that is a wise decision.

    • #15
  16. Gary McVey Contributor
    Gary McVey
    @GaryMcVey

    Nohaaj (View Comment):

    Great post. I am surprised no one has linked your thoughts to our current rush to create a vaccine and vaccinate the entire population. It was where I expected your post to lead us. As you have alluded, time will tell if that is a wise decision.

    First law of show business: leave ’em wanting a little more instead of a little less…but yes, that was also a thought. 

    • #16
  17. Gary McVey Contributor
    Gary McVey
    @GaryMcVey

    The Reticulator (View Comment):

    Gary McVey: Clever! But maybe not so clever, because time is exactly what was needed.

    Are you talking about covid-19 vaccines here?

    That reminds me that some people write articles about our “failure” to be prepared for the current pandemic. But I don’t think it’s possible to be prepared for every exigency any more than it is possible for tests to reveal every possibility and mode of failure. Doesn’t mean we shouldn’t plan and shouldn’t test, of course, or that it’s a good idea to hijack an agency that was supposed to deal with infectious diseases and divert its resources to promote gun control instead.

    Maybe this could be a topic in the upcoming presidential “debates.”

    After 9/11 I read more than one sneering article in the NYT and WaPo about the US Army’s crass stupidity in not having enough non-commissioned officers ready who spoke Dari or Pashto. How the DoD was supposed to anticipate sudden ground war in Afghanistan is beyond me. How many Pashto speakers did newsrooms have?

    • #17
  18. Gary McVey Contributor
    Gary McVey
    @GaryMcVey

    Arahant (View Comment):

    WillowSpring (View Comment):
    As soon as humans are involved in a situation, it seems that all models and testing will miss something.

    That doesn’t mean we shouldn’t try to test, just that we need to be more creative in our test plans.

    “Plans are useless, but planning is indispensable” is attributed to Napoleon, but you’d know better than I. Personally, I’d bet Marshal Ney said it but the boss grabbed credit.  

    • #18
  19. Gary McVey Contributor
    Gary McVey
    @GaryMcVey

    GrannyDude (View Comment):

    Though you didn’t say so, Gary, you’ve provided excellent reasons for considering cries of “but science!” with some skepticism. Not skepticism about the field and process of science writ large, but skepticism about the capacity of human scientists, with human limitations, to achieve “settled” conclusions, especially over short time-frames (many of your examples are ones in which years reveal what months could not!).

    Thanks as always for reading, GD! People who tut-tut endlessly about how religion should keep its mouth shut about the practical world have few compunctions about misapplying their own prejudices and calling it settled science. There are some docs who are hot stuff on blood flocculation cycles; that doesn’t give them any special insight into Why We’re Here. 
    *

    • #19
  20. Gary McVey Contributor
    Gary McVey
    @GaryMcVey

    Richard Easton (View Comment):

    Great post as always Gary! This may be a step too far, but I’d like to extend it to how historians test articles or books. They compare the assertions made with primary source materials such as contemporaneous documents. Take for example the dispute over who invented GPS. My Dad’s main rival Brad Parkinson claims that he and about twelve other people created it at the Lonely Halls Meeting at the Pentagon over Labor Day 1973. He’s talked about a few people who were in the group but has never provided any documents from the time supporting this. He’s mentioned a seven page document dated 4 September 1973 but has never quoted from it. A contact gave me a 21 September 73 addendum which is on the website created by my coauthor. http://www.gpsdeclassified.com/wp-content/uploads/2013/08/addendum-to-4Sep73-dcp-dtd-21-Sep-1973.pdf

    It mentions three scenarios which were in the 4 September doc but the important scenarios 4 and 4a came later (between 4 September and 21 September). This fits in perfectly with my Dad’s assertion that he and Captain David Holmes met with Parkinson over Labor Day. Parkinson said that the system he took over in November 1972 was too expensive and Holmes offered him my dad’s Timation system. Scenarios 1-3 in the Addendum are variants of 621B and 4/4a are variants of Timation and GPS. It’s amazing how the errors in Parkinson’s accounts almost always unfairly attack my Dad’s role and exaggerate his. For example, he attacked the first atomic clocks launched in my Dad’s NTS-1 in 1974 saying they were definitely not space-hardened. I found an article by Holmes in the mid 70s which said that the clocks, in spite of the lack of time to space-harden them, were approved by the Joint Program Office and worked well. Parkinson was the head of the JPO so he was the guy who gave the OK. He convently forgot this decades later when he was attacking my Dad’s role.

    Sometimes I wonder how you put up with it, Richard. To me, Parkinson has become a malign figure, like deceitful Courtney Massengill, Once an Eagle‘s smiling, untrustworthy villain. There’s a deluded lady out there who thinks she’s the Hidden Figures of GPS. Then there are a few people who were honest but wrong, or who helped in refining or implementing your father’s ideas. To your credit, you are usually fairly generous to them. 

    • #20
  21. The Reticulator Member
    The Reticulator
    @TheReticulator

    Gary McVey (View Comment):
    After 9/11 I read more than one sneering article in the NYT and WaPo about the US Army’s crass stupidity in not having enough non-commissioned officers ready who spoke Dari or Pashto. How the DoD was supposed to anticipate sudden ground war in Afghanistan is beyond me. How many Pashto speakers did newsrooms have?

    And if it did have those speakers, it probably wouldn’t have people who spoke Saami when it came time to intercept the reindeer traffic over the North Pole. Not completely out of the realm of possibility, given Putin’s moves in the north polar region. 

    • #21
  22. Gary McVey Contributor
    Gary McVey
    @GaryMcVey

    The Reticulator (View Comment):

    Gary McVey (View Comment):
    After 9/11 I read more than one sneering article in the NYT and WaPo about the US Army’s crass stupidity in not having enough non-commissioned officers ready who spoke Dari or Pashto. How the DoD was supposed to anticipate sudden ground war in Afghanistan is beyond me. How many Pashto speakers did newsrooms have?

    And if it did have those speakers, it probably wouldn’t have people who spoke Saami when it came time to intercept the reindeer traffic over the North Pole. Not completely out of the realm of possibility, given Putin’s moves in the north polar region.

    And what about an offshore cruise missile sneak attack from India? How many of our uniformed forces are ready to sing love poetry in Oriya? 

    • #22
  23. Flicker Coolidge
    Flicker
    @Flicker

    WillowSpring (View Comment):

    This reminds me of a conversation I had with my father years ago. He had a PhD in Physics and was one of the early proponents of what became “Operations Research”, where models and statistics were used for various problems like the optimal maintenance schedule for military vehicles. I was more of a hands on type and had to tell him when to put oil in the engine or air in the tires. I was much less impressed with models.

    I was looking through one of his Journals from the IEEE where the fire incident at the Brown’s Ferry Nuclear Power plant was described. It was caused by a lit candle (!) being used by a technician to try to find a leak in a raceway for cables. I said to my father, “If you can show me in the reliability model what probability they assigned to the presence of a lit candle in the conduit, I will have more faith in the models.”

    As soon as humans are involved in a situation, it seems that all models and testing will miss something.

    Now this is memorable.  And a very good point.

    • #23
  24. TBA Coolidge
    TBA
    @RobtGilsdorf

    Percival (View Comment):

    Seawriter (View Comment):

    There was a friend I worked with back in my early NASA days who kept a poster in his cubicle. It had the words “Did You Test Your Design First?” with photos of epic (and gory) NASA design failures. Management made him take it down in February 1986.

    Figuring out how you are going to test that your design meets its requirement is the first step.

    Design your test, then test your design, Grasshopper.

    “Be the ant whose teamwork and long patience builds cities that last forever, and not the scarab who just throws up a bunch of [redacted] and calls it done.” 

    • #24
  25. TBA Coolidge
    TBA
    @RobtGilsdorf

    Percival (View Comment):

    GrannyDude (View Comment):

    Though you didn’t say so, Gary, you’ve provided excellent reasons for considering cries of “but science!” with some skepticism. Not skepticism about the field and process of science writ large, but skepticism about the capacity of human scientists, with human limitations, to achieve “settled” conclusions, especially over short time-frames (many of your examples are ones in which years reveal what months could not!).

    The science isn’t done.

    We have learned a lot from experience about how to handle some of the ways we fool ourselves. One example: Millikan measured the charge on an electron by an experiment with falling oil drops and got an answer which we now know not to be quite right. It’s a little bit off, because he had the incorrect value for the viscosity of air. It’s interesting to look at the history of measurements of the charge of the electron, after Millikan. If you plot them as a function of time, you find that one is a little bigger than Millikan’s, and the next one’s a little bit bigger than that, and the next one’s a little bit bigger than that, until finally they settle down to a number which is higher.

    Why didn’t they discover that the new number was higher right away? It’s a thing that scientists are ashamed of—this history—because it’s apparent that people did things like this: When they got a number that was too high above Millikan’s, they thought something must be wrong—and they would look for and find a reason why something might be wrong. When they got a number closer to Millikan’s value they didn’t look so hard. And so they eliminated the numbers that were too far off, and did other things like that. We’ve learned those tricks nowadays, and now we don’t have that kind of a disease.

    But this long history of learning how to not fool ourselves—of having utter scientific integrity—is, I’m sorry to say, something that we haven’t specifically included in any particular course that I know of. We just hope you’ve caught on by osmosis.

    The first principle is that you must not fool yourself—and you are the easiest person to fool. So you have to be very careful about that. After you’ve not fooled yourself, it’s easy not to fool other scientists. You just have to be honest in a conventional way after that.

    – “Cargo Cult Science“, Richard Feynman

    [Emphasis added]

    On the shoulder of every scientist is a monkey hoping to gain status in his troop. 

    • #25
  26. Percival Thatcher
    Percival
    @Percival

    TBA (View Comment):

    Percival (View Comment):

    Seawriter (View Comment):

    There was a friend I worked with back in my early NASA days who kept a poster in his cubicle. It had the words “Did You Test Your Design First?” with photos of epic (and gory) NASA design failures. Management made him take it down in February 1986.

    Figuring out how you are going to test that your design meets its requirements is the first step.

    Design your test, then test your design, Grasshopper.

    “Be the ant whose teamwork and long patience builds cities that last forever, and not the scarab who just throws up a bunch of [redacted] and calls it done.”

    Management likes scarabs.

    • #26
  27. WillowSpring Member
    WillowSpring
    @WillowSpring

    Arahant (View Comment):
    That doesn’t mean we shouldn’t try to test, just that we need to be more creative in our test plans.

    And more skeptical of the results.

     

    • #27
  28. TBA Coolidge
    TBA
    @RobtGilsdorf

    WillowSpring (View Comment):

    Arahant (View Comment):
    That doesn’t mean we shouldn’t try to test, just that we need to be more creative in our test plans.

    And more skeptical of the results

    Trust but verify. 

    • #28
  29. EJHill Podcaster
    EJHill
    @EJHill

    The Reticulator: Are you talking about covid-19 vaccines here?

    Now that you bring it up… “Trials of the Oxford vaccine have been paused twice after two participants, both British women, sequentially developed transverse myelitis, an inflammation of the spinal cord that can cause paralysis.”

    • #29
  30. SkipSul Inactive
    SkipSul
    @skipsul

    Gary McVey: But those losses are nothing compared to the consequences of a test that was not only faulty, but outright fraudulent. Volkswagen, still associated in pop culture with hippies and love bugs, did something distinctly unlovable; they realized their diesel-engine cars couldn’t pass standard US pollution tests, so they gamed the tests, in essence creating cheating software that would recognize when a test was being done, and cut the performance of the engine just enough to pass it. Amazingly, they didn’t get caught by Big Eco, or the archipelago of federal and state agencies that monitor these things, but by students at a small, unfashionable college who couldn’t explain nagging inconsistencies in their attempts to verify the test numbers. Otherwise, they might well have gotten away with it.

    To be fair: by that time the EPA diesel requirements had been tightened to the point that they amounted to a de facto ban on small diesels.

    Which neatly illustrates one of the other problems with testing: setting standards that are pure bunkum, then attacking companies for “failing” them.  

    You see this all the time with the crash-test “5-Star” ratchet.  A car that earned 5 stars just a few years ago, and whose design hasn’t been refreshed yet, is suddenly rated 2 or 3 stars and sees its sales crater because the test itself has been made dramatically harder.  The crash-test standards reached a point where head-on fatalities were cut, then side impacts, then various oblique configurations, but at no point are the standards ever allowed to be deemed “good enough”, even as visibility has been severely reduced by massive A-pillars and other design constraints.  The standards are diverging from actual crash incidence, “protecting” drivers against increasingly unlikely scenarios (Our new system tests for a 7200-degree rollover while falling down a cliff and catching fire after hitting a line of bicyclists on a blind mountain hairpin during a car chase…).

    I see this in my own line of work, where our solid-state electronic components are pitched head-to-head against electro-mechanical standards of build and performance, never mind that what we make is often utterly unlike what we’re up against.  We end up getting dinged by some customers because our stuff “failed” some arbitrary bench test that not only bears no resemblance to the real world we operate in, but was designed to test relay contacts, not MOSFETs, microprocessors, and power redirection.

    • #30