Can robots write sports previews?

Considered a creative skill, writing has long been seen as mostly immune to automation and commoditization — the seemingly inevitable end-state of anything touched by the Internet. Perhaps no longer.

What’s the score?

One of the more ubiquitous writing genres is sports reporting. Countless publications, portals, aggregators and distributors in print, on radio, on TV and on the Internet cover team rosters, game previews, schedules, results and all manner of short notices, from Little League to college games to professional sports. An army of writers is routinely tasked with generating the base content for this wide spectrum of sports coverage.

Here’s a recent example. The Brooklyn Nets and the New York Knicks, both promoted as championship contenders this year but currently at the very bottom of the NBA standings, recently met. The day before the game, as is customary, a “preview” of the upcoming game had to be written for general syndication. Something with a lede like this:

Lede1

Now remember, there are games in all sports. At all levels. Across the entire world. Every single day. There are also daily and hourly developments to be covered in finance, weather, healthcare, marketing, real estate, politics, entertainment, transportation, technology and myriad other fields. There’s always been an insatiable demand for expository writing across the board. While the domains are very different, to an analytical eye all such data-driven writing shares two important traits: it’s very structured and highly automatable. Everything in the game preview above is simple prose, wrapped around stored data, shown in blue here:

Lede2

It turns out one NBA game preview is pretty much the same as any other. We could structurally separate the parts that can be swapped out for data about the other 28 teams while keeping roughly the same compositional logic:

Lede3

If we can now plug in team-specific names, places and data wherever there’s one of those blue-bracketed placeholders above, we could customize a game preview so specifically to a given event that I’m confident 95% of the reading public couldn’t tell if those sentences were composed by a human writer or an algorithm, like the one I pseudo-reverse-engineered and highly simplified below:

Lede4
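
The placeholder substitution described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind any real preview: the template wording, the field names and the two fictional teams in the sample record are all assumptions made up for the example.

```python
from string import Template

# Hypothetical template: fixed prose wrapped around named placeholders.
PREVIEW = Template(
    "The $home_name ($home_wins-$home_losses) host the "
    "$away_name ($away_wins-$away_losses) at $venue on $day."
)


def render_preview(game):
    """Fill every placeholder from one game record.

    Template.substitute raises KeyError if the record lacks a field,
    which surfaces incomplete data instead of publishing a broken lede.
    """
    return PREVIEW.substitute(game)


# Made-up record for two fictional teams, for illustration only.
game = {
    "home_name": "Springfield Comets",
    "home_wins": 3,
    "home_losses": 9,
    "away_name": "Shelbyville Owls",
    "away_wins": 4,
    "away_losses": 8,
    "venue": "Comet Arena",
    "day": "Thursday",
}

print(render_preview(game))
```

Swapping in a different record produces a preview for a different game without touching the prose, which is the whole point of separating the template from the data.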

Fortunately, or unfortunately depending on your perspective and profession, such algorithmic writing is not some hovering, hyperlooping fantasy. Here’s the actual preview that ran across many sites on the Internet and elsewhere before the game:

Full Preview

And syndicated in one of the biggest such venues, Yahoo Sports:

Human Version

Who’s your daddy?

See the non-human byline below the headline, Automated Insights? That’s one of the new generation of companies involved in algorithmic writing. There are, and will be, others. For the initiated, the technology is quite straightforward. Often structured data is the gating factor, not compositional technology. Parsing and conditional templating technology is well understood by now. It’s tedious, but low-scale pieces could be done with procedural programming, larger ones with rules engines, and truly scalable and flexible ones with semantic coupling of the domain-specific data.
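
To make “conditional templating” concrete, here is a small procedural sketch in Python: an ordered list of rules maps a team’s record to a clause, and the first rule that matches wins. The thresholds, the phrases and the function names are arbitrary assumptions chosen for illustration; a production system would more likely keep such rules in a rules engine or in data rather than in code.

```python
# Hypothetical rules: (minimum win percentage, clause), checked in order.
FORM_RULES = [
    (0.6, "has been rolling"),
    (0.4, "has been inconsistent"),
    (0.0, "has struggled"),
]


def form_phrase(wins, losses):
    """Return the clause for the first rule whose threshold is met."""
    games = wins + losses
    if games == 0:
        return "has yet to play"
    pct = wins / games
    for min_pct, phrase in FORM_RULES:
        if pct >= min_pct:
            return phrase
    return "has struggled"  # unreachable with the rules above; kept as a guard


def team_sentence(name, wins, losses):
    """Wrap the chosen clause and the raw record in a fixed sentence."""
    return f"{name} {form_phrase(wins, losses)} at {wins}-{losses}."


print(team_sentence("Springfield", 3, 9))
```

Adding a new condition means adding a row to the rule list, which is why this approach stays manageable for small pieces but gets tedious as the number of conditions grows.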

In fact, many aspects of the writing itself are amenable to conditional embellishment of the parts of speech. For example, in the piece above, we could have pre-programmed a list of synonyms for “struggled” and picked a substitute randomly or one specific to geography, audience or sport. Lexical stylization can indeed get very sophisticated through contextual or randomized algorithms. Management of such conditional logic and metadata at scale has been possible for a couple of decades. When composing a personalized investment report or answering a question on your iPhone, your broker and Siri (though using different technologies underneath) already do something similar.
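
The synonym idea above might look something like the following Python sketch. The word lists and the audience labels are invented for the example, and the function name `pick_verb` is hypothetical; the only point is that the choice can be random, keyed to an audience, or both.

```python
import random

# Hypothetical synonym lists for "struggled", keyed by audience.
SYNONYMS = {
    "default": ["struggled", "stumbled", "sputtered", "faltered"],
    "youth": ["struggled", "had a tough time"],
}


def pick_verb(audience="default", rng=None):
    """Pick a synonym for the given audience.

    Unknown audiences fall back to the default list. Passing a seeded
    random.Random makes the choice reproducible, e.g. for testing.
    """
    rng = rng or random.Random()
    options = SYNONYMS.get(audience, SYNONYMS["default"])
    return rng.choice(options)


print(f"The home side has {pick_verb()} of late.")
```

Seeding the generator per article would keep a regenerated piece stable, while leaving it unseeded gives each run a slightly different wording.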

The advantage

In our example, the day before the game there was another “Knicks-Nets Preview” written by a human, Associated Press basketball writer Brian Mahoney, also syndicated in Yahoo Sports. The two pieces clearly serve different purposes. Mahoney’s article is much longer, as well as being significantly more detailed, colorful and analytical. Automated Insights’s preview is all about brevity, information, timeliness and, ultimately, volume, coverage and cost-effectiveness. In one millionth of the time it takes Mahoney to write one of his NBA previews, Automated Insights can generate previews for all the games not just in the NBA but in all sports, anywhere on the planet, as long as there’s underlying data. And in a domain like sports, there’s plenty of data.

The differentiating cost of algorithmic writing is nearly all front-loaded in template and conditional logic programming. When done properly, this can obviate post-production fact-checking and proofreading. Once set up, these pieces can be auto-produced when underlying data changes or when schedules are triggered. Thus the marginal cost of iterative articles approaches zero.

The day has arrived

Clearly, programmed robots can in fact write sports previews. And many other types of writing suitable for algorithmic automation. As is the case with the Internet, this will displace a lot of writers and also create concomitant technology jobs elsewhere.

It would be easy to dismiss this as procedural, utilitarian writing that doesn’t share much with literary prose. Granted. But such competition is not the focus of algorithmic writing. Not yet, anyway. Given enough nouns, verbs and associations in a specific knowledge domain, you’d be surprised how close you can come in compositional “believability” even today. Tomorrow, don’t be surprised if your next textbook or travel guide or cookbook is written mostly by domain-specific algorithms. And welcome to the ["brave" | "splendid" | "efficient" | "fearful" | "faceless" | "decimating"] new world of algorithms…eating yet another profession!

Follow Kontra on Twitter

8 thoughts on “Can robots write sports previews?”

  7. I’m going to guess that for many teams, the recent track record is only marginally different from both the beginning-of-season expectations and the whole-season track record. In those circumstances, leading with the mismatch between the expectations and recent results would misuse the first sentence’s importance.

    It would also get dreadfully dull after a fan checks in a couple of days in a row. Or flits over to the college games’ summaries to see how his college team and its arch-rival(s) were doing.

    I certainly think there’s a place for form-letter sports reporting, but it’ll have to be at least an order of magnitude better than what you posit, before it’s anything besides a joke.

    PS: Phun Phacts Dept: 80 years ago, Ronald Reagan was a “play-by-play” announcer for the Cubs at a radio station in Iowa. He made up all the “Mulhooney winds up… and there’s a swing and a miss” type color based on simple stats that were Telexed to his station. Automated Insights is re-creating a phenomenon that brought Reagan to national attention; hope it works out as well for them!
