How to Automate a Job Out of Existence

 

This is an elaboration of a comment I made in @indymb’s post “Is there any point in writing to a Congressperson?” and I’m indebted to him and @Misthiocracy (who has experience working for a Canadian Member of Parliament, I understand) for the details on how all this works. We’ll look at a simple task done every day in the houses of government and at how we’d train a computer to do it better.

Briefly, as you may have expected, the letter to your Senator isn’t so much read as processed for the minimum amount of information and interaction required. I’ll quote the meat of his description of the process and then describe how I’d go about automating it. You’re encouraged to go back and read his post (and, it should go without saying on Ricochet, the comments too).

A regular constituent would write a letter. Based on that, I knew that it should go to a [Legislative Correspondent]. Each LC and [Legislative Assistant] are assigned different subjects, or policy areas. I, the lowly intern, would read the letter just long enough to figure out the subject. Once I knew that, I knew which LC should receive it, and I put it in his or her pile of mail.

So for example, let’s say you write a letter about immigration. Jane is the lucky LC who has been assigned the topic of immigration. I put your letter on Jane’s desk.

Jane adds your letter to her large stack of letters. Jane will read your letter and determine whether you are for or against immigration. No, Jane will not be considering the nuances of what you wrote, such as being for legal immigration, but against illegal immigration.

When she finally gets to your letter, Jane will read it just long enough to determine which side of the issue you are on, so that you receive the form letter intended for people who agree with you.

To cut it down to its essence:

  1. The unpaid intern opens the letter and reads just enough to determine whether the writer is mildly important or unimportant and what the letter is about.
  2. Based on those two data points, he routes it to the appropriate LC or LA.
  3. The LC or LA reads the letter just enough to know if the writer is ‘for’ or ‘against’.
  4. The LC or LA sends a form letter.

This has several hallmarks of a task that can be automated.

  • Rote decision making which doesn’t require a terrible lot of creativity.
  • Lots of human labor.
  • The possibility of easy improvements to get more out of the task.

In our improved process, we still start with the intern. Unpaid labor is cheap; no need to automate away interns. The intern takes the envelope, scans the return address, opens the envelope, and scans the letter. That’s all the human action that you need on this end; everything else gets done in the program. At this point, the program resets its data entry fields so the intern can scan the next letter. I estimate 10-15 seconds of human time required per letter.

The program takes the picture of the address and letter and converts it into digital text. There’s long been work done on computers reading text from pictures, and while the results still aren’t perfect they’re good enough for, as it were, government work. You’ll lose some data that way but that isn’t important. Individual letters can slip through the cracks as long as you’ve got good enough results with the rest. If you were treating constituent correspondence as important you wouldn’t be treating it this way to begin with. Handwriting is harder to read than printed words, but I’m told the majority of what you get these days is print.

With emails you can skip the first two steps; the message is already computer-readable text. You just need to route your public email inbox into the same engine as the scanned letters.

Now comes the hard part; the program needs to understand what the letter is about. This is called “Natural Language Processing”, and is actually a very difficult problem for computers to solve. However, in this day and age, it gets solved every time you say “Hey Alexa, play some George Clinton.” In practice, this would require either training an A.I. to read constituent ravings or outsourcing to a cloud service like Amazon to do that for you. This problem is probably why this hasn’t been done before. That and the customer base for your solution is sort of narrow. It’s still a problem that can be solved.

What are you asking the program to process? You’re attempting to duplicate the extant human labor. Imagine a three-page screed directed at Dianne Feinstein broken down into its constituent ideas: (Climate Change : Bad), (Gun Control : Good), (Orange Man : Bad). The fact that you can extract more than one position from the text already provides more value than your LCs did.
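As a toy illustration of that reduction, here is a minimal sketch in Python. The keyword table and cue-word lists are made up for this example, and plain keyword matching is only a stand-in for the trained model or cloud service a real system would need; it will misread sarcasm, negation, and most nuance.

```python
import re

# Hypothetical keyword table: issue -> phrases that signal the topic.
ISSUE_KEYWORDS = {
    "climate_change": ["climate change", "global warming"],
    "gun_control": ["gun control", "assault weapons"],
    "immigration": ["immigration", "border"],
}
# Hypothetical cue words; a real system would learn these from labeled letters.
GOOD_CUES = {"support", "favor", "need", "pass"}
BAD_CUES = {"oppose", "stop", "against", "disaster", "threat"}


def extract_positions(letter):
    """Return {issue: 'good' | 'bad' | None} for every issue we know about."""
    positions = {issue: None for issue in ISSUE_KEYWORDS}
    # Look at one sentence at a time so cues attach to the nearby topic.
    for sentence in re.split(r"[.!?]+", letter.lower()):
        words = set(re.findall(r"[a-z']+", sentence))
        for issue, phrases in ISSUE_KEYWORDS.items():
            if not any(phrase in sentence for phrase in phrases):
                continue  # this sentence is not about this issue
            if words & BAD_CUES:
                positions[issue] = "bad"
            elif words & GOOD_CUES:
                positions[issue] = "good"
    return positions


letter = "Climate change is a threat to us all. We need gun control now!"
print(extract_positions(letter))
```

Issues the letter never mentions stay `None`, which is what lets one letter yield several positions without inventing any.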

Note that you’re only reading the positions you already told the program to look for. If you get a new issue (Wall : Bad) it might be harder to train your system to look at it. Then again it might not; I don’t have much experience working with machine learning. Adding a column to your data table is simplicity itself.

These bits of data get stored in a table. Here, let’s say that we processed that manifesto. In the table it looks like this:

Apparently, this crazy lady didn’t leave a return address. Wonder why.

We see the name (which has been changed to protect the stupid), and the positions stripped of all nuance. It’s either good, bad, or “NULL” if they didn’t have anything to say about it. (I know that it staggers belief that a ranting leftist could go three whole pages without mentioning LGBT issues but work with me here.) Every row in the table is associated with one person. When the program reads their first letter it writes what opinions it sees into the table, leaving anything missing null. Every letter after that writes over the previous values (as long as the letter covers that issue; don’t reset things to null). This ensures you keep up with changing minds as much as you can from letter to letter. You can also queue up the appropriate form letter to be printed out so that the intern can pick ’em up and drop ’em in the mailbox.
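The overwrite-but-never-reset-to-null rule can be sketched with Python’s built-in `sqlite3` module. The table name, the column list, and the use of the writer’s name as the key are all assumptions for illustration; a real system would need a sturdier way to identify a constituent.

```python
import sqlite3

# Hypothetical issue columns, one per position the program looks for.
ISSUES = ["climate_change", "gun_control", "orange_man", "lgbt"]


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    cols = ", ".join(f"{issue} TEXT" for issue in ISSUES)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS constituents "
        f"(name TEXT PRIMARY KEY, address TEXT, {cols})"
    )
    return conn


def record_letter(conn, name, address, positions):
    """Store one letter's results; missing opinions never erase old ones."""
    exists = conn.execute(
        "SELECT 1 FROM constituents WHERE name = ?", (name,)
    ).fetchone()
    if exists is None:
        conn.execute("INSERT INTO constituents (name) VALUES (?)", (name,))
    # COALESCE(?, col) keeps the stored value whenever the new value is NULL.
    # Column names come from the fixed ISSUES list, never from letter text.
    sets = ", ".join(f"{col} = COALESCE(?, {col})" for col in ["address"] + ISSUES)
    values = [address] + [positions.get(issue) for issue in ISSUES]
    conn.execute(f"UPDATE constituents SET {sets} WHERE name = ?", values + [name])
    conn.commit()


conn = open_db()
record_letter(conn, "A. Constituent", None,
              {"climate_change": "bad", "gun_control": "good", "orange_man": "bad"})
# A later letter changes one opinion and is silent on the rest.
record_letter(conn, "A. Constituent", None, {"climate_change": "good"})
print(conn.execute("SELECT * FROM constituents").fetchall())
```

After the second letter only the climate column changes; the gun control and Orange Man entries survive, and the LGBT and address columns stay null.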

Now think about how you could use that data. First of all, one of the Senator’s minions can crunch those results into bright and shiny graphs. Higher-ups always like graphs. The group of people who write letters to legislators doesn’t make a representative sample of the population, but some data is better than none. This sort of information should help your politician prevaricate with the shifting political winds. Just be sure you’ve got a good spam filter on your email; you don’t want him developing positions on enhancement pills and Nigerian finance opportunities.
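Crunching the table into something graph-shaped takes very little. A minimal sketch, assuming each constituent record has been read out as a dictionary of issue positions (the record layout and the text-only bar chart are illustrative choices, not part of any described system):

```python
from collections import Counter


def tally(records, issue):
    """Count good / bad / no opinion for one issue across all records."""
    return Counter(record.get(issue) or "no opinion" for record in records)


def text_bar_chart(counts, width=40):
    """Render the counts as rows of '#' marks scaled to the given width."""
    total = sum(counts.values()) or 1  # avoid dividing by zero on empty input
    lines = []
    for label, n in counts.most_common():
        bar = "#" * round(width * n / total)
        lines.append(f"{label:<10} {bar} {n}")
    return "\n".join(lines)


records = [
    {"gun_control": "good"},
    {"gun_control": "bad"},
    {"gun_control": "good"},
    {},  # wrote in, but not about gun control
]
print(text_bar_chart(tally(records, "gun_control")))
```

Keeping "no opinion" as its own bar is a reminder of how many writers never mentioned the issue at all.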

There’s more than that. Misthiocracy mentioned the more on-the-ball legislator staffs keeping an index of constituents who have written in. When the time comes to send campaign mailers you then know to pound the table on which issue. It’s much easier to keep track of that in a database, and from there you can also get much better customization in your form letters. Recall in the Obama Campaign how people would receive messages with “If you could donate just $X it would really help our cause”, and different people would get the exact same message with the dollar value switched out depending on the wealth of the recipient. How do you think they accomplished that?
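Mechanically, swapping one value into an otherwise identical message is simple. A minimal sketch using Python’s `string.Template`; the letter text, the wealth bands, and the dollar amounts are invented for illustration and are not a claim about how any actual campaign did it:

```python
from string import Template

# Hypothetical form letter. "$$" is Template's escape for a literal dollar sign.
FORM_LETTER = Template(
    "Dear $name,\n"
    "Thank you for writing about $issue.\n"
    "If you could donate just $$${amount} it would really help our cause.\n"
)

# Hypothetical mapping from a wealth band in the database to an ask.
ASK_BY_BAND = {"low": 10, "middle": 50, "high": 500}


def render_letter(name, issue, wealth_band):
    """Fill in the form letter, falling back to a default ask of 25."""
    amount = ASK_BY_BAND.get(wealth_band, 25)
    return FORM_LETTER.substitute(name=name, issue=issue, amount=amount)


print(render_letter("A. Constituent", "gun control", "high"))
```

The same function can pick the issue to pound the table on by reading it from the constituent’s row instead of taking it as an argument.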

That’s the kind of thing you can do with “Big Data”, and congratulations, you’ve just entered into the wonderful world of Big Data. If you’re feeling a bit of sleaze on you right now, well, recall that we’re talking Washington D.C. There’s more where that came from.

What then does that imply for the congressional staff? Not much. The interns still fetch coffee. The LAs and LCs no longer do much correspondence duty, but I doubt their numbers would decrease. As Misthiocracy reminds us those phony-baloney government jobs are payback for donors and volunteers from campaign time. Still, if someone were to actually write that program and my ideas caused some swamp denizen to lose its job, well, I wouldn’t be heartbroken.

There are 50 comments.

  1. RightAngles Member

    Wow. Well, the last reply I got from my senator might as well have been processed by a machine, or a kangaroo for that matter, for all the relevance it had to what I wrote.

     

    • #1
    • January 14, 2019, at 4:58 PM PST
    • 9 likes
  2. Hank Rhody, Badgeless Bandito Contributor

    RightAngles (View Comment):

    Wow. Well, the last reply I got from my senator might as well have been processed by a machine, or a kangaroo for that matter, for all the relevance it had to what I wrote.

    Exactly! You don’t have to duplicate actual human interaction; if you can be marginally more convincing than the replies constituents get now you’re at an advantage.

    • #2
    • January 14, 2019, at 4:59 PM PST
    • 3 likes
  3. Arahant Member

    I see you’re bringing the Funk to Ricochet.

    • #3
    • January 14, 2019, at 5:09 PM PST
    • 5 likes
  4. Hank Rhody, Badgeless Bandito Contributor

    Arahant (View Comment):

    I see you’re bringing the Funk to Ricochet.

    Somehow I already had the Parliament Funkadelic in mind and it just came up as I’m writing the post.

    • #4
    • January 14, 2019, at 5:10 PM PST
    • 4 likes
  5. The Reticulator Member

    I like to make my opinions hard to categorize. Unfortunately, it’s not always easy. And whether or not you put the effort into it, they’ll categorize your opinion anyway.

    • #5
    • January 14, 2019, at 5:19 PM PST
    • 4 likes
  6. RightAngles Member

    The Reticulator (View Comment):

    I like to make my opinions hard to categorize. Unfortunately, it’s not always easy. And whether or not you put the effort into it, they’ll categorize your opinion anyway.

    Hah. When I wrote asking him to support all efforts to prosecute and indict Hillary Clinton, they replied “Thank you for your interest in the U.S. Prison System.”

    • #6
    • January 14, 2019, at 5:23 PM PST
    • 8 likes
  7. Hank Rhody, Badgeless Bandito Contributor

    RightAngles (View Comment):

    The Reticulator (View Comment):

    I like to make my opinions hard to categorize. Unfortunately, it’s not always easy. And whether or not you put the effort into it, they’ll categorize your opinion anyway.

    Hah. When I wrote asking him to support all efforts to prosecute and indict Hillary Clinton, they replied “Thank you for your interest in the U.S. Prison System.”

    Trying to be better than the current system really is a low bar.

    • #7
    • January 14, 2019, at 5:24 PM PST
    • 4 likes
  8. Clifford A. Brown Contributor

    Aliens and Robots versus the American worker!

    • #8
    • January 14, 2019, at 8:09 PM PST
    • 2 likes
  9. Judge Mental Member

    Text recognition software has about a 99% accuracy rate, which sounds good until you think about how many letters there are on a page. For a typical name and address, you have about a 50-50 chance on an error that will cause a failure to deliver.

    I would recommend instead digitizing the return address as an image, storing that image as a BLOB in your database, and then printing the image on the envelope. Let the post office deal with their crappy handwriting.
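
    The 50-50 figure can be checked with a short sketch. It assumes each character is recognized independently with 99% accuracy and that a name and address run to roughly 70 characters; both numbers are assumptions for illustration, not measurements of any OCR product.

```python
def p_all_correct(per_char_accuracy, n_chars):
    """Chance that every one of n_chars characters is read correctly,
    assuming errors are independent from character to character."""
    return per_char_accuracy ** n_chars


# Hypothetical address length of about 70 characters.
print(round(p_all_correct(0.99, 70), 3))
```

    Under those assumptions the chance of a flawless read comes out just under one half. Not every misread character would stop delivery, so this is best taken as a rough upper bound on the failure rate.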

    • #9
    • January 14, 2019, at 8:56 PM PST
    • 6 likes
  10. The Reticulator Member

    Judge Mental (View Comment):

    Text recognition software has about a 99% accuracy rate, which sounds good until you think about how many letters there are on a page. For a typical name and address, you have about a 50-50 chance on an error that will cause a failure to deliver.

    I would recommend instead digitizing the return address as an image, storing that image as a BLOB in your database, and then printing the image on the envelope. Let the post office deal with their crappy handwriting.

    Want our government to work more efficiently? Write all your letters out in longhand, thus keeping it from mischief.

    • #10
    • January 14, 2019, at 9:01 PM PST
    • 5 likes
  11. Hank Rhody, Badgeless Bandito Contributor

    Judge Mental (View Comment):
    I would recommend instead digitizing the return address as an image, storing that image as a BLOB in your database, and then printing the image on the envelope. Let the post office deal with their crappy handwriting.

    Yeah, that’s pretty smart. Hadn’t thought of that.

    • #11
    • January 14, 2019, at 9:03 PM PST
    • 4 likes
  12. Hank Rhody, Badgeless Bandito Contributor

    The Reticulator (View Comment):
    Want our government to work more efficiently? Write all your letters out in longhand, thus keeping it from mischief.

    Thinking about this one. You have to waste more of their time than time of your own. Only it isn’t a straight 1 to 1 ratio; there are more of us than there are of them. Might get it to work. That is, if they decided that answering letters was more important than just about anything else they might be doing. Seems unlikely.

    • #12
    • January 14, 2019, at 9:20 PM PST
    • 2 likes
  13. dnewlander Coolidge

    Judge Mental (View Comment):

    Text recognition software has about a 99% accuracy rate, which sounds good until you think about how many letters there are on a page. For a typical name and address, you have about a 50-50 chance on an error that will cause a failure to deliver.

    I would recommend instead digitizing the return address as an image, storing that image as a BLOB in your database, and then printing the image on the envelope. Let the post office deal with their crappy handwriting.

    You know that the Post Office OCRs the addresses, right? And they’ve done it for decades. Really, really, really fast.

    That part’s not the problem.

    Discerning meaning from written text is the problem. I’ve actually looked into this problem in the past, and it’s far from solved.

    • #13
    • January 14, 2019, at 9:41 PM PST
    • 3 likes
  14. Judge Mental Member

    dnewlander (View Comment):

    You know that the Post Office OCRs the addresses, right? And they’ve done it for decades. Really, really, really fast.

     

    Could be wrong, but to my knowledge they OCR the zip code, not the full address.

    • #14
    • January 14, 2019, at 9:46 PM PST
    • 2 likes
  15. dnewlander Coolidge

    Judge Mental (View Comment):

    dnewlander (View Comment):

    You know that the Post Office OCRs the addresses, right? And they’ve done it for decades. Really, really, really fast.

     

    Could be wrong, but to my knowledge they OCR the zip code, not the full address.

    That’s not what I learned 30 years ago, working in a print shop that did a lot of postcards.

    A certain percentage get hand-sorted, but they OCR the vast majority of them. That’s why there are regulations for where the address should be placed on the envelope or card.

    But, still, that’s not the issue with automating Congressional responses (or any other business mail).

    • #15
    • January 14, 2019, at 9:51 PM PST
    • 1 like
  16. dnewlander Coolidge

    Judge Mental (View Comment):

    dnewlander (View Comment):

    You know that the Post Office OCRs the addresses, right? And they’ve done it for decades. Really, really, really fast.

     

    Could be wrong, but to my knowledge they OCR the zip code, not the full address.

    Here’s a bit for you:

    https://pe.usps.com/text/pub28/28c2_002.htm

    • #16
    • January 14, 2019, at 9:53 PM PST
    • 2 likes
  17. Judge Mental Member

    dnewlander (View Comment):
    30 years ago

    My info is probably older than that, a 60 Minutes or 20/20 story on modernization of the P.O.

    • #17
    • January 14, 2019, at 9:55 PM PST
    • 2 likes
  18. dnewlander Coolidge

    dnewlander (View Comment):

    Judge Mental (View Comment):

    dnewlander (View Comment):

    You know that the Post Office OCRs the addresses, right? And they’ve done it for decades. Really, really, really fast.

     

    Could be wrong, but to my knowledge they OCR the zip code, not the full address.

    Here’s a bit for you:

    https://pe.usps.com/text/pub28/28c2_002.htm

    I know that page talks about “automated rates”, but they OCR everything, regardless.

    • #18
    • January 14, 2019, at 9:55 PM PST
    • 2 likes
  19. dnewlander Coolidge

    Judge Mental (View Comment):

    dnewlander (View Comment):
    30 years ago

    My info is probably older than that, a 60 Minutes or 20/20 story on modernization of the P.O.

    No worries. I was actually amazed when I learned it, having dallied in using OCR software on Macs at that time.

    Which was nigh-on unusable.

    It really is fairly incredible, considering the volumes of mail the USPS used to have to sort, all over the country.

    • #19
    • January 14, 2019, at 9:58 PM PST
    • 2 likes
  20. Judge Mental Member

    dnewlander (View Comment):

    Judge Mental (View Comment):

    dnewlander (View Comment):
    30 years ago

    My info is probably older than that, a 60 Minutes or 20/20 story on modernization of the P.O.

    No worries. I was actually amazed when I learned it, having dallied in using OCR software on Macs at that time.

    Which was nigh-on unusable.

    It really is fairly incredible, considering the volumes of mail the USPS used to have to sort, all over the country.

    To the rest of your point, I did a design for natural language processing more than 20 years ago that I still believe to be a realistic and valid approach and that, given time, I could produce. I never moved forward from the design stage, because I was looking for something I could build in my spare time and get rich, whereas this would have been at least 20 man-years, meaning more like 50 years in my spare time. And I had already been burned, building something cool only to have one of the big players release the same thing before I could finish.

    • #20
    • January 14, 2019, at 10:04 PM PST
    • 5 likes
  21. dnewlander Coolidge

    Judge Mental (View Comment):

    dnewlander (View Comment):

    Judge Mental (View Comment):

    dnewlander (View Comment):
    30 years ago

    My info is probably older than that, a 60 Minutes or 20/20 story on modernization of the P.O.

    No worries. I was actually amazed when I learned it, having dallied in using OCR software on Macs at that time.

    Which was nigh-on unusable.

    It really is fairly incredible, considering the volumes of mail the USPS used to have to sort, all over the country.

    To the rest of your point, I did a design for natural language processing more than 20 years ago that I still believe to be a realistic and valid approach and that, given time, I could produce. I never moved forward from the design stage, because I was looking for something I could build in my spare time and get rich, whereas this would have been at least 20 man-years, meaning more like 50 years in my spare time. And I had already been burned, building something cool only to have one of the big players release the same thing before I could finish.

    Oh, I believe that part can be done, it’s the further classification of the results that would get hard.

    20 years ago a few friends and I had an idea to write some software to automatically respond to RFPs based on something similar. We, likewise, never had time to actually implement it.

    • #21
    • January 14, 2019, at 10:10 PM PST
    • 2 likes
  22. Judge Mental Member

    dnewlander (View Comment):

    it’s the further classification of the results that would get hard.

     

    No, it wouldn’t. I started out thinking NLP, but ended up with something much more like AI. My code would have “understood” the meaning of the text in the same manner as a human.

    The downside of mine would have been the need to educate it like a human child.

    • #22
    • January 14, 2019, at 10:15 PM PST
    • 3 likes
  23. dnewlander Coolidge

    Judge Mental (View Comment):

    dnewlander (View Comment):

    it’s the further classification of the results that would get hard.

     

    No, it wouldn’t. I started out thinking NLP, but ended up with something much more like AI. My code would have “understood” the meaning of the text in the same manner as a human.

    The downside of mine would have been the need to educate it like a human child.

    ???

    I think there’s an introduction missing there.

    • #23
    • January 14, 2019, at 10:59 PM PST
    • 1 like
  24. Judge Mental Member

    dnewlander (View Comment):

    Judge Mental (View Comment):

    dnewlander (View Comment):

    it’s the further classification of the results that would get hard.

     

    No, it wouldn’t. I started out thinking NLP, but ended up with something much more like AI. My code would have “understood” the meaning of the text in the same manner as a human.

    The downside of mine would have been the need to educate it like a human child.

    ???

    I think there’s an introduction missing there.

    My point was that the code I designed could have done any classification that a human could do, by reading and digesting the text. If you meant it would be hard for a human, then my point is moot.

    • #24
    • January 14, 2019, at 11:07 PM PST
    • 3 likes
  25. Arahant Member

    dnewlander (View Comment):
    I think there’s an introduction missing there.

    Not to mention how he was going to put diapers on AI.

    • #25
    • January 14, 2019, at 11:11 PM PST
    • 3 likes
  26. dnewlander Coolidge

    Judge Mental (View Comment):

    dnewlander (View Comment):

    Judge Mental (View Comment):

    dnewlander (View Comment):

    it’s the further classification of the results that would get hard.

     

    No, it wouldn’t. I started out thinking NLP, but ended up with something much more like AI. My code would have “understood” the meaning of the text in the same manner as a human.

    The downside of mine would have been the need to educate it like a human child.

    ???

    I think there’s an introduction missing there.

    My point was that the code I designed could have done any classification that a human could do, by reading and digesting the text. If you meant it would be hard for a human, then my point is moot.

    Well, that’s the point. Machines have to learn that “Build the fraking wall!!” = not only “Build the wall”, but “stop illegal immigration” and “illegal immigration costs us ‘way too much money!”. Etc. And vice versa.

    And unless you have samples of them all, then a machine can’t do it as well as a human, because those types of inferences are hard.

    Not impossible, mind you, but hard.

    Just like self-driving cars. They work under many conditions. But not enough. Not by a longshot.

    Now, could a system like what @hankrhody describes improve Congressional (and other business’) correspondence with clients? Yes. Undoubtedly.

    But we’re all still going to get form letters, just like we all get boilerplate when we write any company’s support staff. Or when we interact with those infernal “bots” that pop up ALL THE FRAKING TIME.

    Because understanding what people say is hard. And I don’t think that problem is solvable within my lifetime.

    No matter how much I like my Echo Dots.

     

    • #26
    • January 14, 2019, at 11:19 PM PST
    • 3 likes
  27. Judge Mental Member

    dnewlander (View Comment):

    Well, that’s the point. Machines have to learn that “Build the fraking wall!!” = not only “Build the wall”, but “stop illegal immigration” and “illegal immigration costs us ‘way too much money!”. Etc. And vice versa.

     

    But what I’m saying is that it would have handled exactly that sort of thing. By teaching it the meaning of words. It’s not looking for stock phrases, it’s discerning meaning from whatever words are used.

    • #27
    • January 14, 2019, at 11:25 PM PST
    • 4 likes
  28. dnewlander Coolidge

    Judge Mental (View Comment):

    dnewlander (View Comment):

    Well, that’s the point. Machines have to learn that “Build the fraking wall!!” = not only “Build the wall”, but “stop illegal immigration” and “illegal immigration costs us ‘way too much money!”. Etc. And vice versa.

     

    But what I’m saying is that it would have handled exactly that sort of thing. By teaching it the meaning of words. It’s not looking for stock phrases, it’s discerning meaning from whatever words are used.

    How?

    Because it’s not just what words are used, but how and in what order they’re used. I still think that’s a right-now-intractable problem.

    Hell, humans do a crappy job of it a fair percentage of the time.

    Add in the problem of parsing English, which I don’t think many programmers adequately understand, and I just don’t see it anytime soon.

    • #28
    • January 14, 2019, at 11:30 PM PST
    • 3 likes
  29. Judge Mental Member

    dnewlander (View Comment):

    How?

     

    That’s way too long for a comment. Maybe I’ll write a post someday, although it’s been a long time ago.

    But the basic answer is the same way that you do it; through the manipulation of symbols, that are themselves aggregations of simpler symbols. And the comparison of those symbols to other known symbols to discern meaning.

    • #29
    • January 14, 2019, at 11:39 PM PST
    • 3 likes
  30. Arahant Member

    dnewlander (View Comment):
    Hell, humans do a crappy job of it a fair percentage of the time.

    He has a point here.

    • #30
    • January 14, 2019, at 11:49 PM PST
    • 2 likes