This is an elaboration of a comment I made on @indymb’s post “Is there any point in writing to a Congressperson?”, and I’m indebted to him and to @Misthiocracy (who, I understand, has experience working for a Canadian Member of Parliament) for the details on how all this works. We’ll look at a simple task done every day in the houses of government and at how we’d train a computer to do it better.
As you may have expected, the letter to your Senator isn’t so much read as processed for the minimum amount of information and interaction required. I’ll quote the meat of his description of the process and then describe how I’d go about automating it. You’re encouraged to go back and read his post (and, though it should go without saying on Ricochet, the comments too).
A regular constituent would write a letter. Based on that, I knew that it should go to an [Legislative Correspondent]. Each LC and [Legislative Assistant] are assigned different subjects, or policy areas. I, the lowly intern, would read the letter just long enough to figure out the subject. Once I knew that, I knew which LC should receive it, and I put it in his or her pile of mail.
So for example, let’s say you write a letter about immigration. Jane is the lucky LC who has been assigned the topic of immigration. I put your letter on Jane’s desk.
Jane adds your letter to her large stack of letters. Jane will read your letter and determine whether you are for or against immigration. No, Jane will not be considering the nuances of what you wrote, such as being for legal immigration, but against illegal immigration.
When she finally gets to your letter, Jane will read it just long enough to determine which side of the issue you are on, so that you receive the form letter intended for people who agree with you.
To cut it down to its essence:
- The unpaid intern opens the letter, determines whether the writer is mildly important or unimportant, and reads just enough to know what they’re talking about.
- Based on those two data points he routes it to the appropriate LC or LA.
- The LC or LA reads the letter just enough to know if the writer is ‘for’ or ‘against’.
- The LC or LA sends a form letter.
This has several hallmarks of a task that can be automated.
- Rote decision making which doesn’t require a terrible lot of creativity.
- Lots of human labor.
- The possibility of easy improvements to get more out of the task.
In our improved process, we still start with the intern. Unpaid labor is cheap; no need to automate away interns. The intern takes the envelope, scans the return address, opens the envelope, and scans the letter. That’s all the human action that you need on this end; everything else gets done in the program. At this point, the program resets its data entry fields so the intern can scan the next letter. I estimate 10-15 seconds of human time required per letter.
The program takes the picture of the address and letter and converts it into digital text. There’s long been work done on computers reading text from pictures, and while the results still aren’t perfect they’re good enough for, as it were, government work. You’ll lose some data that way but that isn’t important. Individual letters can slip through the cracks as long as you’ve got good enough results with the rest. If you were treating constituent correspondence as important you wouldn’t be treating it this way to begin with. Handwriting is harder to read than printed words, but I’m told the majority of what you get these days is print.
With emails you can skip the first two steps; the message is already computer-readable text. You just need to route your public email inbox into the same engine as the scanned letters.
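Here's a minimal sketch of what "the same engine" might look like on the input side: both scanned letters and emails get boiled down to one record type. The `Letter` class and the two helper functions are names I made up for illustration, and the scan path assumes some OCR step has already handed you text.

```python
from dataclasses import dataclass
from email import message_from_string
from email.utils import parseaddr


@dataclass
class Letter:
    """Hypothetical common record for anything a constituent sends in."""
    sender: str  # name or return address of the constituent
    body: str    # plain text of the letter
    source: str  # "scan" or "email"


def from_scan(address_text: str, letter_text: str) -> Letter:
    """Wrap text that an (assumed) OCR step has already produced."""
    return Letter(sender=address_text.strip(), body=letter_text.strip(), source="scan")


def from_email(raw: str) -> Letter:
    """Parse a raw email; no scanning or OCR needed."""
    msg = message_from_string(raw)
    name, addr = parseaddr(msg.get("From", ""))
    if msg.is_multipart():
        # Take the first plain-text part, if there is one.
        parts = [p for p in msg.walk() if p.get_content_type() == "text/plain"]
        body = parts[0].get_payload() if parts else ""
    else:
        body = msg.get_payload()
    return Letter(sender=name or addr, body=body.strip(), source="email")
```

Everything downstream only ever sees a `Letter`, so it doesn't care whether the intern scanned it or the inbox delivered it.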
Now comes the hard part; the program needs to understand what the letter is about. This is called “Natural Language Processing”, and is actually a very difficult problem for computers to solve. However, in this day and age, it gets solved every time you say “Hey Alexa, play some George Clinton.” In practice, this would require either training an A.I. to read constituent ravings or outsourcing to a cloud service like Amazon to do that for you. This problem is probably why this hasn’t been done before. That and the customer base for your solution is sort of narrow. It’s still a problem that can be solved.
What are you asking the program to process? You’re attempting to duplicate the extant human labor. Imagine a three-page screed directed at Dianne Feinstein broken down into its constituent ideas: (Climate Change : Bad), (Gun Control : Good), (Orange Man : Bad). The fact that you can extract more than one position from the text already provides more value than your LCs did.
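To make the (Issue : Position) idea concrete, here is a deliberately crude sketch that swaps real Natural Language Processing for keyword matching. The issue list and the cue words are my own invention, and a serious version would use a trained model or a cloud service instead; the point is only the shape of the output.

```python
import re

# Hypothetical issue phrases and stance cue words, invented for this example.
ISSUES = {
    "climate_change": ["climate change", "global warming"],
    "gun_control": ["gun control", "background checks"],
    "immigration": ["immigration", "border"],
}
GOOD_WORDS = {"support", "favor", "need", "pass"}
BAD_WORDS = {"oppose", "against", "stop", "repeal"}


def extract_positions(text: str) -> dict:
    """Return {issue: "Good" or "Bad"} for issues with a clear cue in the same sentence."""
    positions = {}
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z']+", sentence))
        for issue, phrases in ISSUES.items():
            if not any(phrase in sentence for phrase in phrases):
                continue  # this sentence doesn't mention the issue
            good = len(words & GOOD_WORDS)
            bad = len(words & BAD_WORDS)
            if good > bad:
                positions[issue] = "Good"
            elif bad > good:
                positions[issue] = "Bad"
            # A tie or no cue words: leave the issue out entirely.
    return positions
```

Issues the letter never mentions, or mentions without a clear cue, simply don't appear in the result. That's the nuance-free reading Jane was doing by hand, only for several issues at once.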
Note that you’re only reading the positions you already told the program to look for. If you get a new issue (Wall : Bad) it might be harder to train your system to look at it. Then again it might not; I don’t have much experience working with machine learning. Adding a column to your data table is simplicity itself.
These bits of data get stored in a table. Here, let’s say that we processed that manifesto. In the table it looks like this:
[name changed] | Climate Change: Bad | Gun Control: Good | Orange Man: Bad | LGBT: NULL
We see the name (which has been changed to protect the stupid), and the positions stripped of all nuance. It’s either good, bad, or “NULL” if they didn’t have anything to say about it. (I know that it staggers belief that a ranting leftist could go three whole pages without mentioning LGBT issues but work with me here.) Every line in the database is associated with one person. When the program reads their first letter it writes what opinions it sees into the table, leaving anything missing null. Every letter after that writes over the previous values (as long as the letter covers that issue; don’t reset things to null.) This ensures you keep up with changing minds as much as you can from letter to letter. You can also queue up the appropriate form letter to be printed out so that the intern can pick ’em up and drop ’em in the mailbox.
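The write-over-but-never-reset-to-null rule is easy to sketch with SQLite from Python's standard library. The table name, the column names, and the `record_letter` helper are all hypothetical; any database would do.

```python
import sqlite3

# Hypothetical issue columns; adding an issue means adding a column.
ISSUE_COLUMNS = ("climate_change", "gun_control", "orange_man", "lgbt")


def open_db(path=":memory:"):
    """Create the one-row-per-constituent table if it doesn't exist yet."""
    conn = sqlite3.connect(path)
    cols = ", ".join(f"{c} TEXT" for c in ISSUE_COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS positions (name TEXT PRIMARY KEY, {cols})")
    return conn


def record_letter(conn, name, positions):
    """Store the opinions from one letter without wiping earlier ones."""
    row = conn.execute("SELECT 1 FROM positions WHERE name = ?", (name,)).fetchone()
    if row is None:
        # First letter from this person: every issue starts out NULL.
        conn.execute("INSERT INTO positions (name) VALUES (?)", (name,))
    for issue, stance in positions.items():
        # Only touch issues this letter actually covers; column names are
        # checked against the whitelist before going into the SQL text.
        if issue in ISSUE_COLUMNS and stance is not None:
            conn.execute(f"UPDATE positions SET {issue} = ? WHERE name = ?", (stance, name))
    conn.commit()
```

Record a first letter with three opinions and then a second that only changes the gun control stance, and the row keeps the other two opinions from the first letter while the issue nobody mentioned stays NULL.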
Now think about how you could use that data. First of all, one of the Senator’s minions can crunch those results into bright and shiny graphs. Higher-ups always like graphs. The group of people who write letters to legislators doesn’t make a representative sample of the population, but some data is better than none. This sort of information should help your politician prevaricate with the shifting political winds. Just be sure you’ve got a good spam filter on your email; you don’t want him developing positions on enhancement pills and Nigerian finance opportunities.
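The crunching itself is not much work. A rough sketch, assuming each constituent row has been pulled out as a plain dictionary (the function names and the text-only "chart" are mine, standing in for whatever graphing tool the minion prefers):

```python
from collections import Counter


def tally(rows, issues):
    """Count Good / Bad / no-opinion per issue across constituent rows."""
    summary = {}
    for issue in issues:
        # A missing key and an explicit None both count as "NULL".
        counts = Counter(row.get(issue) or "NULL" for row in rows)
        summary[issue] = {k: counts[k] for k in ("Good", "Bad", "NULL")}
    return summary


def text_chart(summary):
    """Render one crude bar per issue, one '#' per letter writer."""
    lines = []
    for issue, counts in summary.items():
        lines.append(f"{issue}: Good {'#' * counts['Good']} | Bad {'#' * counts['Bad']}")
    return "\n".join(lines)
```

Feed the output of `tally` to a real plotting library and you have your bright and shiny graphs.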
There’s more than that. Misthiocracy mentioned the more on-the-ball legislator staffs keeping an index of constituents who have written in. When the time comes to send campaign mailers you then know which issue to pound the table on. It’s much easier to keep track of that in a database, and from there you can also get much better customization in your form letters. Recall in the Obama campaign how people would receive messages with “If you could donate just $X it would really help our cause”, and different people would get the exact same message with the dollar value switched out depending on the wealth of the recipient. How do you think they accomplished that?
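I have no inside knowledge of how any campaign actually did it, but the mechanics are about as simple as mail merge gets. A sketch, with the wording of the mailer, the issue labels, and the `build_mailer` helper all invented for illustration (how you pick the dollar amount is left to whoever has the donor data):

```python
from string import Template

# Hypothetical mailer text; "$$" is a literal dollar sign in Template syntax.
MAILER = Template(
    "Dear $name, the Senator is fighting for you on $issue. "
    "If you could donate just $$${amount} it would really help our cause."
)

# Hypothetical mapping from database columns to readable issue names,
# listed in the order the campaign would prefer to lead with them.
ISSUE_LABELS = {"climate_change": "climate change", "gun_control": "gun control"}


def build_mailer(name, positions, amount):
    """Fill in the mailer with the first issue this person has written about."""
    issue = next((i for i in ISSUE_LABELS if positions.get(i) is not None), None)
    topic = ISSUE_LABELS[issue] if issue else "the issues that matter to you"
    return MAILER.substitute(name=name, issue=topic, amount=amount)
```

Same template for everyone, with the issue and the dollar value switched out per recipient.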
That’s the kind of thing you can do with “Big Data”, and congratulations, you’ve just entered into the wonderful world of Big Data. If you’re feeling a bit of sleaze on you right now, well, recall that we’re talking Washington D.C. There’s more where that came from.
What then does that imply for the congressional staff? Not much. The interns still fetch coffee. The LAs and LCs no longer do much correspondence duty, but I doubt their numbers would decrease. As Misthiocracy reminds us, those phony-baloney government jobs are payback for donors and volunteers from campaign time. Still, if someone were to actually write that program and my ideas caused some swamp denizen to lose its job, well, I wouldn’t be heartbroken.