Features of the next-generation editorial CMS

This morning’s displacement activity for a very-last-minute double-check of my tax return was an interesting Twitter conversation about the functionality of future news content management systems started by Mark Ng (who is, BTW, doing some very interesting work with the Media Standards Trust on developing microformats for increasing the transparency of news).

Chris Edwards had a particularly interesting response:

One area where I think a CMS could make a difference would be if stories acquired more metadata. Again, this is an area fraught with difficulty as the last thing you want to do is force writers to add metadata just to keep the machines happy, which is what tends to happen today. But if you were able to do something akin to what Microsoft tried to do with Smart Tags earlier in the decade, I can see some advantages down the road. Instead of writing “Q4 revenues were $2.4bn”, you point to the source data within the financial repository. When someone clicks on the tag, they get taken to a virtual P&L sheet which shows the same number and, if it all works properly, whether that number was later restated.

Similarly, you might tag people so that the system can log all stories about them and build dynamic timelines so you can see when they moved companies or said certain things. That would go some way to making the information that goes into stories more remixable. You don’t create new stories from the parts, but you can at least extract some structured information that may prove useful to a reader.

I couldn’t agree more. The really interesting developments in newsroom technology concern how news organisations organise the data and metadata their journalists generate, and how they make it useful both internally and externally.
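To make Chris’s person-tagging idea a little more concrete, here is a rough Python sketch of how a CMS might turn tagged stories into per-person timelines. It is only an illustration: the `Story` class, its field names, the `build_timelines` function and the sample headlines are all invented for this example and don’t describe any real system.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Story:
    """A story plus the structured tags a CMS might attach (hypothetical schema)."""
    headline: str
    published: date
    people: list = field(default_factory=list)  # names tagged in the story


def build_timelines(stories):
    """Group stories by tagged person, each list sorted oldest first."""
    timelines = defaultdict(list)
    for story in stories:
        for person in story.people:
            timelines[person].append(story)
    for tagged in timelines.values():
        tagged.sort(key=lambda s: s.published)
    return dict(timelines)


# Invented sample data, purely for illustration.
stories = [
    Story("Jane Doe joins Acme as CFO", date(2008, 6, 2), ["Jane Doe"]),
    Story("Jane Doe leaves Example Corp", date(2008, 3, 14), ["Jane Doe"]),
    Story("Acme names John Roe chairman", date(2008, 9, 1), ["John Roe", "Jane Doe"]),
]

for story in build_timelines(stories)["Jane Doe"]:
    print(story.published.isoformat(), story.headline)
```

The point of the sketch is that the timeline costs nothing extra once the tags exist; the hard part, as Chris says, is getting the tags without making writers do chores for the machine.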

Last year, I wrote a story about Reuters’ Open Calais project. Reuters used automated semantic tagging to organise and interlink its vast archive of content and is now making (a simplified version of) the software available to outside developers.

The idea of “making the information that goes into stories more remixable”, as Chris puts it, is very much at the heart of the various <a href="http://blogs.pressgazette.co.uk/fleetstreet/2008/08/28/news-media-apis-more-on-mashups/">news API projects</a>. The New York Times APIs, for example, are part of a bigger internal project called “data universe” which involves the paper reorganising some of its internally-held information as structured data.

In a key blog post a few years ago, Adrian Holovaty argued that embracing structured data was a fundamental change newspapers need to make. At the time, he noted one of the key barriers to that change:

… newspaper companies’ current software and organizational setup overwhelmingly discourages any sort of “information special-casing.” Just about every newspaper Web site content-management system I’ve ever seen is unabashedly story-centric. Want to post event calendar information into your news-site CMS? Post it as a “news article” object. Want to publish listings of recent crimes in your town? It goes in as a “news article.” There’s not much Joe Reporter, or even Jane Online Editor, can do about this, because Oh We’ve Invested So Much Into This CMS, and/or Our Newspaper Web Site Doesn’t Employ Any Computer Programmers.

Besides simply being as easy to use as WordPress, the next-generation CMS therefore needs one key development: giving non-technical journalists the ability to build this sort of “information special-casing” for themselves.
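As a very rough sketch of what that might look like under the bonnet, imagine a content type being nothing more than a name and a list of typed fields, something a journalist could declare through a form rather than in code. The `ContentType` class, its `validate` method and the “crime report” example below are hypothetical, made up for illustration rather than taken from any existing CMS.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ContentType:
    """A non-article content type: a name plus typed fields (hypothetical)."""
    name: str
    fields: dict  # field name -> expected Python type

    def validate(self, record):
        """Return a list of problems; an empty list means the record fits."""
        problems = []
        for field_name, expected in self.fields.items():
            if field_name not in record:
                problems.append(f"missing field: {field_name}")
            elif not isinstance(record[field_name], expected):
                problems.append(f"{field_name} should be {expected.__name__}")
        return problems


# A crime listing declared as its own type instead of a "news article".
crime_report = ContentType(
    "crime report",
    {"offence": str, "street": str, "reported": date},
)

good = {"offence": "burglary", "street": "High Street", "reported": date(2009, 1, 30)}
bad = {"offence": "burglary", "reported": "last Friday"}

print(crime_report.validate(good))
print(crime_report.validate(bad))
```

Once entries are held to a declared shape like this, they can be listed, mapped or counted rather than just read, which is the whole argument for not filing them as articles.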