The Perils and Pleasures of Modeling

 

The term ‘model’ is much in the news, and I’m not talking about @RightAngles’ trade. It’s the term apparently favored by the media to describe a general area that may also go by cybernetics, system dynamics, advanced statistics, simulation, control theory, and other names. I have some academic and professional background in the domain, so this is my (inevitably simplified) attempt to sketch its limits, so you can be smarter than the average journalist.

So, simplifying, as warned: There are two types of models. One is broadly statistical in approach. The other attempts to be more mechanistic.

And there are two major uses of models. One is descriptive: What’s going on here? The other is control: What can we do about it?

Statistical Modeling

This may also be labeled curve fitting, black-box models, deep learning, stochastic models, and more. It means taking as large a sample as possible of system inputs over time, and correlated outputs over time, and building a statistical description of how they relate to one another.
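For the code-inclined, here is a minimal sketch of what that means in practice. The numbers are invented and the straight-line fit is just the simplest possible choice; real statistical models are fancier, but the shape of the exercise is the same:

```python
import numpy as np

# Hypothetical observations: an input series (say, daily tests run) and a
# correlated output series (say, daily confirmed cases). Numbers are invented.
x = np.array([120, 150, 180, 240, 310, 400, 520, 650], dtype=float)
y = np.array([14, 19, 22, 31, 40, 49, 66, 80], dtype=float)

# Fit a straight line y ~ a*x + b purely from the data. No claim is made about
# the mechanism linking inputs to outputs; it is just a statistical summary.
a, b = np.polyfit(x, y, deg=1)
print(f"fitted relation: y ~ {a:.3f} * x + {b:.2f}")

# The 'model' is then used to predict the output for an input we haven't seen.
print("predicted output for input 700:", round(a * 700 + b, 1))
```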

The farthest the mass media go into this territory is the canonical bell curve: “Here is the distribution of salaries for purple humped clerics. Here is the distribution for green crested clerics. They are different -> discrimination!” Having tried to explain the output of complex statistical models to state-level legislators, I have a bit of empathy.

In our current situation, the best-known statistical model is the IHME model, being used by both media and government to estimate where the pandemic is headed and, importantly, what resources will be required to meet it. IHME is only slightly more complex than the standard bell curve model; it uses something called a logistic or S-curve. Statistical modeling is also widely used in another domain temporarily shoved off the front pages, climate.
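To make the S-curve concrete, here is a toy version of that kind of fit. This is emphatically not the IHME code; the daily counts below are invented and the three-parameter logistic is just the textbook form:

```python
import numpy as np
from scipy.optimize import curve_fit

# Invented cumulative counts by day -- NOT real IHME inputs.
days = np.arange(20)
cum = np.array([2, 4, 6, 11, 17, 27, 43, 66, 97, 136, 180, 224, 263,
                294, 317, 333, 343, 349, 354, 356], dtype=float)

def logistic(t, k, r, t0):
    """S-curve: k = eventual total, r = growth rate, t0 = day of peak growth."""
    return k / (1.0 + np.exp(-r * (t - t0)))

# Least-squares fit of the three parameters to the observed curve.
(k, r, t0), _ = curve_fit(logistic, days, cum, p0=[400.0, 0.5, 10.0])
print(f"projected eventual total ~ {k:.0f}, peak growth around day {t0:.1f}")

# Forecasting beyond the data is exactly where such models get risky.
print("day 30 projection:", round(logistic(30, k, r, t0)))
```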

Why use this? It’s easy to get running – just start watching and recording what’s going on. No need for fancy experiments to isolate cause and effect – which might not be possible anyway – just watch the trends. You can refine things as you go along and get more data. Note that IHME is doing exactly that as more data comes in from states and from countries that are further along in the pandemic. These are very compelling arguments when you are under the gun for forecasts and lives depend on it.

What can go wrong? Just a few things…

Biased or inaccurate sampling. All statistical techniques depend on having a representative sample of the domain in question. What happens if some of the data going into the model has been deliberately perturbed (*cough* China *cough*)? What happens if your sampling space, say South Korea or northern Italy, has economic or social practices that differ from where you are attempting to forecast, North America? Nothing good.

Under-sampling and over-extrapolation. These often go together. An unbiased statistical model may be good in areas where you have lots of data, but fall apart outside that sample space, quite a problem if you are trying to forecast extreme conditions. Climate models are notorious for this, using techniques like principal components analysis on limited historical records, and attempting to extrapolate the results into extreme conditions of CO2 and temperature.
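Here is a toy illustration of the over-extrapolation trap, with made-up numbers: a quadratic process observed only over a narrow range looks perfectly linear, and the linear fit goes badly wrong once you leave the sample space.

```python
import numpy as np

# The 'true' system response is quadratic, but we only observe x in [0, 3].
def true_response(x):
    return 0.5 * x**2 + x + 2.0

x_obs = np.linspace(0.0, 3.0, 10)
y_obs = true_response(x_obs)

# Within that narrow window a straight line fits almost perfectly...
a, b = np.polyfit(x_obs, y_obs, deg=1)

# ...but the error explodes once we leave the sample space.
for x in (2.0, 10.0):
    print(f"x={x:4.1f}  true={true_response(x):6.1f}  linear fit={a * x + b:6.1f}")
```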

Overfitting and incorrect model assumptions. Again, these often go together. It’s an aphorism in the field that you can fit an elephant with enough parameters, meaning roughly that you can always pile on fudge factors to conceal the fact that your underlying system concept is wrong. Hockey sticks come to mind. Simplified statistical epidemic models may fall apart if we try to restart an economy without reaching a steady-state of virus.
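And the elephant, in a few lines of made-up data: give the fit as many parameters as there are data points and the in-sample error vanishes while the forecast turns to garbage.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 9, 10)
y = 2.0 * x + rng.normal(0.0, 1.0, size=x.size)   # noisy but basically linear

# Ten coefficients for ten points: the residual vanishes, every noise wiggle
# gets 'explained', and the forecast goes wild the moment we leave the data.
coeffs = np.polyfit(x, y, deg=9)
print("largest in-sample residual:", abs(np.polyval(coeffs, x) - y).max())
print("prediction at x = 12:", np.polyval(coeffs, 12.0), "(true trend is about 24)")
```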

Hidden variables and lack of understanding. These are not the same thing, but they will both destroy attempts to use a statistical model for control purposes. Something you can’t currently observe (asymptomatic carriers?) may turn out to be a major driving variable. If you don’t really know what’s in the black box, attempts to drive its inputs to create desired outputs may not go well, particularly when there are inevitable time delays between taking an action and seeing its results.
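The time-delay problem is easy to demonstrate with a toy feedback loop (all the gains and delays below are invented, not drawn from any real policy): a controller that reacts to stale observations overshoots, and the swings grow even though every individual decision looked reasonable at the time.

```python
# Toy feedback loop: we try to hold a level at a target, but we only ever see
# the level as it stood DELAY steps ago. All numbers here are invented.
DELAY, TARGET, GAIN = 4, 100.0, 0.6
level = 20.0
history = [level] * DELAY   # pipeline of not-yet-visible observations

for step in range(1, 26):
    observed = history[0]                  # the stale measurement we act on
    action = GAIN * (TARGET - observed)    # each decision looks reasonable...
    level += action                        # ...the system responds immediately,
    history = history[1:] + [level]        # ...but we won't see it for a while.
    if step % 5 == 0:
        print(f"step {step:2d}: level = {level:9.1f}")

# The level blows past the target, then swings back and forth ever harder --
# a plausible fate for policy driven by lagging indicators.
```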

Simulations

Also known as mechanistic models or just plain science. This is where you attempt to understand cause and effect in some detail, going into internal processes of the system as necessary, and build a mathematical replica. If you’re doing climatology, you’ll model things like carbon fixing by plants depending on temperature and CO2 levels. If you are doing epidemiology, you’ll have things like social network density and incubation periods. In the current situation, the best-known model of this type comes from Imperial College London. This is a simulation that was constructed after the H1N1 pandemic and embeds a detailed model of epidemic spread that was retrospectively tested against the pandemic records.
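For contrast with the statistical sketches above, here is a mechanistic model in miniature: the classic SIR compartment equations, which encode a cause-and-effect story (contacts cause infections, infections resolve at some rate) rather than a curve shape. The parameter values below are placeholders for illustration, not estimates for any real disease.

```python
def sir(beta, gamma, pop=1_000_000, infected0=100, days=200, dt=0.25):
    """Euler-integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    s, i, r = pop - infected0, float(infected0), 0.0
    peak = 0.0
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / pop * dt   # contacts between S and I
        new_recoveries = gamma * i * dt            # infections resolve at rate gamma
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        peak = max(peak, i)
    return peak, r

# Placeholder parameters: beta bundles contact rate and transmissibility,
# gamma is one over the infectious period. These are NOT real estimates.
peak, total = sir(beta=0.30, gamma=0.10)
print(f"peak simultaneous infections ~ {peak:,.0f}, total ever infected ~ {total:,.0f}")

# The payoff of a mechanistic model: interventions map onto parameters.
peak2, total2 = sir(beta=0.15, gamma=0.10)   # e.g., distancing halves the contact rate
print(f"with halved contact rate: peak ~ {peak2:,.0f}, total ~ {total2:,.0f}")
```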

Why use this? In a word, understanding. If you have some validation of cause-and-effect mechanisms, you are on firmer ground when trying to reach beyond your previous experience and when coming up with control strategies, both of which are perilous with purely statistical models.

What could go wrong? Just a few things…

Taking the model out of context. What worked very well for Carboniferous forests may not when we have things like managed tree farms. H1N1 is all well and good, but the Wuflu isn’t actually a flu and propagates differently.

Time, we have no time! Understanding takes time and often controlled experiments, and often neither is nor will ever be available before decisions must be made.

Incompleteness. There are very few simulations of any complexity that are completely mechanistic. There’s always some statistical modeling buried in there. The Imperial College model doesn’t actually have all the churches, schools and airports described; instead it has a ‘synthetic population’ generated in accordance with a statistical description. Components like that are subject to all the problems described above for statistical models.
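A rough sketch of what a ‘synthetic population’ means, using invented marginal statistics rather than anything Imperial College actually uses: individuals are drawn at random so that the population matches the statistical description in aggregate, even though no individual in it is real.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 10_000

# Invented marginal statistics standing in for a real region's description.
age_group = rng.choice(["child", "adult", "senior"], size=N, p=[0.20, 0.60, 0.20])
household_size = rng.poisson(lam=1.6, size=N) + 1
commutes = rng.random(N) < np.where(age_group == "adult", 0.70, 0.05)

# No specific church, school, or airport exists in this population; the agents
# only match the assumed statistics in aggregate, and the simulation runs on them.
print("mean household size:", round(household_size.mean(), 2))
print("share who commute:  ", round(commutes.mean(), 2))
```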

There’s no neat conclusion to this post. None of the models being tossed about are completely right or wrong. They are incomplete. This might give you some sympathy, beyond what the MSM spin will ever provoke, for those modelers being sweated by decision-makers who have trillions of dollars and thousands of lives on the line.


There are 17 comments.

  1. RightAngles (@RightAngles) Member

    This so needs dissemination. I’ve been noticing the word “models” too,  and couldn’t help being reminded of their use in the ongoing “Climate Change” debate. The hard fact is that they can be used to show whatever someone wants them to show.

     

    • #1
  2. David Foster (@DavidFoster) Member

    Good post.  I don’t think the average journalist or politician has any idea what people are talking about when they use the word ‘model’.  (Well, this kind of model, at least)….It is just some sort of magical crystal ball developed by people with the Right Credentials.

     

    • #2
  3. Stina (@CM) Member

Interestingly enough, there’s not much difference between RightAngles’ type of model and statistical models. They are representations of some real thing in the world. Some models are better representations of reality than others and some real events are well represented by models…

    But those are quite rare.

As I’ve learned more about tailoring and pattern making in sewing, I’ve come to appreciate models and “generic”, average measurements. But these measurements only get you so far. The model shows off fashion in its best way, and general measurements capture a large group, never really fitting anyone perfectly. If I try to make something fit one specific person and “overfit” without enough allowance in the measurements, my real person would never be able to move or eat while wearing what I make.

    So, the model is only useful as a narrow representative, and all my curve fitting should provide enough wiggle room for real life scenarios.

    See, not so different.

    • #3
  4. Jerry Giordano (Arizona Patrio…) (@ArizonaPatriot) Member

    Good post.  You have a typo.  It’s the IHME model, not IMHE.  I made the same error myself in one of my prior posts on the subject.

    • #4
  5. Locke On (@LockeOn) Member

    Jerry Giordano (Arizona Patrio… (View Comment):

    Good post. You have a typo. It’s the IHME model, not IMHE. I made the same error myself in one of my prior posts on the subject.

    Good catch, fixed.  Thank you!

    • #5
  6. philo (@philo) Member

    Locke On: This may also be labelled curve fitting…It means taking as large a sample as possible of system inputs over time, and correlated outputs over time, and building a statistical description of how they relate to one another.

    It is the habitual abuse of the correlation coefficient that would scare most observers. (Not all correlations are equal.)

    • #6
  7. Clifford A. Brown (@CliffordBrown) Member

    And now to add an extra wrinkle. Listening to the description of the way the IHME model is constantly updated, based on the latest observed data, we are not even talking simple statistical modeling. It is, apparently, Bayesian. That is, there is a special area of econo/socio/biometric analysis, not to be confused with the basic “statistics” necessary to enter the subject area, that provides theoretical underpinnings for adding in information while your “statistical model” is running. You get to continuously improve, if all the conditions are right, your predictions.

    • #7
  8. Locke On (@LockeOn) Member

    Clifford A. Brown (View Comment):

    And now to add an extra wrinkle. Listening to the description of the way the IHME model is constantly updated, based on the latest observed data, we are not even talking simple statistical modeling. It is, apparently, Bayesian. That is, there is a special area of econo/socio/biometric analysis, not to be confused with the basic “statistics” necessary to enter the subject area, that provides theoretical underpinnings for adding in information while your “statistical model” is running. You get to continuously improve, if all the conditions are right, your predictions.

    I’m seeing that as a good thing.  They bootstrapped the model using data reports of pandemic behavior out of other countries as their ‘priors’, did a minimum variance fit to their proposed logistic model, and then mapped the states against that model.  If they are rerunning the fit as more data comes in, it may average out any bias in the original samples.   Unfortunately, as I’ve watched that process over the last couple of days, their forecast of total deaths has trended up (perhaps an effect of washing out understated Chinese numbers??).

    • #8
  9. Brian Wyneken (@BrianWyneken) Member

    I saw the title but didn’t have time to read the post – does this help?

    • #9
  10. LC (@LidensCheng) Member

    Locke On (View Comment):

    Clifford A. Brown (View Comment):

    And now to add an extra wrinkle. Listening to the description of the way the IHME model is constantly updated, based on the latest observed data, we are not even talking simple statistical modeling. It is, apparently, Bayesian. That is, there is a special area of econo/socio/biometric analysis, not to be confused with the basic “statistics” necessary to enter the subject area, that provides theoretical underpinnings for adding in information while your “statistical model” is running. You get to continuously improve, if all the conditions are right, your predictions.

    I’m seeing that as a good thing. They bootstrapped the model using data reports of pandemic behavior out of other countries as their ‘priors’, did a minimum variance fit to their proposed logistic model, and then mapped the states against that model. If they are rerunning the fit as more data comes in, it may average out any bias in the original samples. Unfortunately, as I’ve watched that process over the last couple of days, their forecast of total deaths has trended up (perhaps an effect of washing out understated Chinese numbers??).

    Agreed, I’m always going to advocate Bayesian modeling. And totally not because Bayesian statistics essentially sums up all of my grad school work and now professional work…

    • #10
  11. David Foster (@DavidFoster) Member

    Some interesting news about the coronavirus model developed at Imperial College London, the projections of which (initially 500K+ dead, later reduced with different assumptions about social distancing) have received wide publicity and have influenced UK government policy…

    Several researchers have apparently asked to see Imperial’s calculations, but Prof. Neil Ferguson, the man leading the team, has said that the computer code is 13 years old and thousands of lines of it “undocumented,” making it hard for anyone to work with, let alone take it apart to identify potential errors. He has promised that it will be published in a week or so….

    https://www.wsj.com/articles/coronavirus-lessons-from-the-asteroid-that-didnt-hit-earth-11585780465?mod=searchresults&page=1&pos=1

     

    • #11
  12. Stad (@Stad) Coolidge

My master’s thesis was a model of the first wall of a fusion reactor undergoing a plasma disruption. Because there were no mathematical solutions for the equations I used, I wrote a computer program (i.e., a model) using the Crank-Nicolson method, a central finite difference approach.

One of the things I had to do was a sensitivity analysis, which was to vary the input parameters one at a time by 10%. This way, I could determine which inputs were most important when running the program to get results. However, many of the input parameters were coefficients for the equations. Once I realized this, I knew I could make whatever outcome I wanted (within reason) if I could justify the parameters used. When I coupled this with knowledge gained by reading How To Lie With Statistics by Darrell Huff, I knew I was home free.

The bottom line is this: many of those jokes and sayings about statistics are true:

    “If you torture the data long enough, it will say whatever you want.”

    “There are lies, damn lies, and statistics.”

    “Facts are stubborn things, but statistics are pliable.”

    And I believe this one quote sums it all up:

    “Statisticians, like artists, have the bad habit of falling in love with their models.”

    • #12
  13. philo (@philo) Member

    Stad (View Comment): …if I could justify the parameters used.

    Drop this silly constraint from your resume and you have the makings of a rather serviceable climate scientist.

    • #13
  14. Brian Clendinen (@BrianClendinen) Inactive

I posted this EconTalk from last year on another post, about how bad the 2009 H1N1 models from Google were. Gerd Gigerenzer instead used flu-related doctor visits in a region from the two previous weeks and tested it. Guess what: their model was better at predicting the flu than Google’s peer-reviewed (Nature magazine) model.

He explains Google’s model below:

    Gerd Gigerenzer:  So, what I’m studying is: What are those simple heuristics that just look at a few variables in order to deal with these huge amounts of uncertainty.

    One example I would like to give you is Google Flu Trends. You may recall that Google tried to prove that big data analytics can predict the spread of the flu. And it was hailed with fanfares all around the world when they published a Nature article in 2008 or 2009. And so they had done everything right. So they had fitted four years of data and then tested data means[?]. They had about 550 million search terms and then they had maybe 100,000 algorithms that they tried and took the best one, and had also tested it in the following year.

    And then they made predictions. And here we are really under uncertainty.

    The flu is hard to control, and people’s search terms are also hard to control. And, what happened is something unexpected–namely, the swine flu came in 2009, while Google Flu Trends, the algorithm had learned that flu is high in winter and low in summer. The swine flu came in the summer. So, it started early in April and had its peak late in September. And of course the algorithm failed because it fine-tuned on the past and couldn’t know that.

Now, the Google engineers revised the algorithm. By the way, the algorithm was a secret, a business secret. We only knew that it had 45 variables and probably was a linear algorithm. Now, in our research, what I would do is now realize you are under uncertainty: Make it simpler. No. The Google engineers had the idea if a complex algorithm fails, make it more complex.

    Russ Roberts: It just didn’t have enough variables.

    Gerd Gigerenzer: Yes, yes.

    Russ Roberts: It had a cubic term or a quadratic term.

    Gerd Gigerenzer: And they changed it to 160 variables–so up from 45–and made predictions for four years. It didn’t do well. And then it silently was out [inaudible 00:26:27] buried it.

    • #14
  15. Locke On (@LockeOn) Member

    Brian Clendinen (View Comment):

I posted this EconTalk from last year on another post, about how bad the 2009 H1N1 models from Google were. Gerd Gigerenzer instead used flu-related doctor visits in a region from the two previous weeks and tested it. Guess what: their model was better at predicting the flu than Google’s peer-reviewed (Nature magazine) model.

    He explains googles model below:

    ..

    The flu is hard to control, and people’s search terms are also hard to control. And, what happened is something unexpected–namely, the swine flu came in 2009, while Google Flu Trends, the algorithm had learned that flu is high in winter and low in summer. The swine flu came in the summer. So, it started early in April and had its peak late in September. And of course the algorithm failed because it fine-tuned on the past and couldn’t know that.

    Now, the Google engineers revised the algorithm. By the way, the algorithm was a secret, a business secret. We only knew that it had 45 variables and probably was a linear algorithm. Now, in our research, what I would do is now realize you are under uncertainty: Make it simpler. No. The Google engineers had the idea if a complex algorithm fails, make it more complex.

    ….

    See the OP, under ‘overfitting’….

     

    • #15
  16. MISTER BITCOIN (@MISTERBITCOIN) Inactive

    Locke On (View Comment):

    Jerry Giordano (Arizona Patrio… (View Comment):

    Good post. You have a typo. It’s the IHME model, not IMHE. I made the same error myself in one of my prior posts on the subject.

    Good catch, fixed. Thank you!

    the website is healthdata.org which is easier to remember

     

    • #16
  17. MISTER BITCOIN (@MISTERBITCOIN) Inactive

    Locke On (View Comment):

    Brian Clendinen (View Comment):

I posted this EconTalk from last year on another post, about how bad the 2009 H1N1 models from Google were. Gerd Gigerenzer instead used flu-related doctor visits in a region from the two previous weeks and tested it. Guess what: their model was better at predicting the flu than Google’s peer-reviewed (Nature magazine) model.

    He explains googles model below:

    ..

    The flu is hard to control, and people’s search terms are also hard to control. And, what happened is something unexpected–namely, the swine flu came in 2009, while Google Flu Trends, the algorithm had learned that flu is high in winter and low in summer. The swine flu came in the summer. So, it started early in April and had its peak late in September. And of course the algorithm failed because it fine-tuned on the past and couldn’t know that.

    Now, the Google engineers revised the algorithm. By the way, the algorithm was a secret, a business secret. We only knew that it had 45 variables and probably was a linear algorithm. Now, in our research, what I would do is now realize you are under uncertainty: Make it simpler. No. The Google engineers had the idea if a complex algorithm fails, make it more complex.

    ….

    See the OP, under ‘overfitting’….

     

    45 variables?

the curse of multidimensionality

    more variables = higher r^2 or correlation

    more variables = less predictive value

    the devil is in the assumptions underlying the model

     

    • #17