Philip Meyer: "After personal computers with user-friendly software became common, using a computer wasn't such a big deal. But the term CAR, for computer-assisted reporting, is still used today to describe what I prefer to think of as the application of scientific method to reporting."
"Reading the Riots is modelled on an acclaimed survey conducted in the aftermath of the Detroit riots in 1967. The findings of that study, the result of a groundbreaking collaboration between the Detroit Free Press newspaper and Michigan's Institute for Social Research, challenged prevailing assumptions about the cause of the unrest. Prof Phil Meyer, who co-ordinated the Detroit study more than four decades ago, will advise the research into the English riots."
"The Wall Street Journal filed several Freedom of Information Act requests with the Federal Aviation Administration for the entire Enhanced Traffic Management System database, which contains flight records for aircraft that flew in the U.S. under instrument flight rules. The Journal analyzed the flight data for non-commercial jet aircraft traffic for a four-year period, 2007 through 2010. ... The Journal has included in the flights database an estimated cost to operate each flight. The estimates are based on per-hour cost figures for each model of jet, provided by Conklin & de Decker Aviation Information, an industry consulting firm used by some public companies to provide aircraft-cost estimates for regulatory filings."
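The arithmetic behind that kind of estimate is simple: join each flight to a per-hour operating cost by aircraft model, then multiply by flight duration. A minimal sketch of the idea — the model names, cost figures, and flight records below are hypothetical, not the Journal's or Conklin & de Decker's actual data:

```python
# Hypothetical per-hour operating costs by jet model (USD)
hourly_cost = {
    "Gulfstream G450": 4800,
    "Cessna Citation X": 3400,
}

# Hypothetical flight records: (model, duration in hours)
flights = [
    ("Gulfstream G450", 2.5),
    ("Cessna Citation X", 1.2),
]

# Estimated cost = per-hour figure for the model x flight duration
for model, hours in flights:
    est = hourly_cost[model] * hours
    print(f"{model}: estimated cost ${est:,.0f}")
```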
"Texas Tribune reporter Matt Stiles and Duke University computational journalism professor Sarah Cohen explain how they find good stories in a sea of government data."
Deadline is 11 May: "He or she will have a strong background in investigative work, and Pulitzer-sized ambition. A strong background in computer-assisted and database reporting, a proven track record at some of the world's biggest news organisations, international frontline reporting experience, in-depth knowledge of multimedia and an interest in mentoring and coaching would ensure a successful application."
How to use the Google Maps API and Google Refine to geocode and improve partial addresses in a dataset.
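A rough sketch of the geocoding half of that workflow — the Google Maps Geocoding API is a JSON endpoint that Google Refine can call from an "Add column by fetching URLs" expression, or that you can hit directly from a script. An API key is required; the example address and key below are placeholders:

```python
import json
import urllib.parse
import urllib.request

def build_geocode_url(address, api_key):
    """Build the Geocoding API request URL for a (possibly partial) address."""
    query = urllib.parse.urlencode({"address": address, "key": api_key})
    return "https://maps.googleapis.com/maps/api/geocode/json?" + query

def geocode(address, api_key):
    """Return (lat, lng) for the first match, or None if nothing matched."""
    with urllib.request.urlopen(build_geocode_url(address, api_key)) as resp:
        data = json.load(resp)
    if data.get("results"):
        loc = data["results"][0]["geometry"]["location"]
        return loc["lat"], loc["lng"]
    return None

# Usage (needs a real key): geocode("10 Downing St, London", "YOUR_KEY")
```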
"WhoRunsHK is an interactive database of a select group of leaders in Hong Kong and their connections. It is based on publicly available documents and is not comprehensive. It will be expanded over time."
"James Ball’s [Guardian] story [on lobbyist influence in the UK Parliament] is helped and supported by a ScraperWiki script that took data from registers across parliament that is located on different servers and aggregates them into one source table that can be viewed in a spreadsheet or document."
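The aggregation pattern described there — rows scraped from several separately hosted registers, normalised into one queryable table — can be sketched in a few lines. This is not the actual ScraperWiki script; the source names and rows are hypothetical, and the real script would fetch and parse the registers before this step:

```python
import sqlite3

# Hypothetical rows already scraped from each register: (member, interest)
sources = {
    "commons_register": [("A. Member MP", "Consultancy, Acme Ltd")],
    "lords_register": [("Lord Example", "Directorship, Widget plc")],
}

# One combined table, tagged by source, ready to export as a spreadsheet
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE interests
                (source TEXT, member TEXT, interest TEXT)""")
for source, rows in sources.items():
    conn.executemany(
        "INSERT INTO interests VALUES (?, ?, ?)",
        [(source, member, interest) for member, interest in rows])

for row in conn.execute("SELECT * FROM interests ORDER BY source"):
    print(row)
```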
"The New York Times is considering options to create an in-house submission system that could make it easier for would-be leakers to provide large files to the paper."
"[A]midst all [the 'data journalism'] hype, earnestness and spreadsheet-geekery, here's the truth about so-called 'data journalism'. It's still about the story, stupid. ... [S]urely what's shocking is how few stories journalists actually managed to uncover [from recent major data dumps] ... No doubt we'll get better at this. Over time, journalists will learn how to pick out the stories that matter from these huge data releases - and it will help hugely whenever a single news outlet has control of the data, as the Telegraph did with MPs' expenses, so that they can drip-feed the top lines one at a time rather than see the whole lot drown in the 24-hour news cycle."
Sir Tim Berners-Lee: "the responsibility needs to be with the press. Journalists need to be data-savvy. These are the people whose jobs are to interpret what government is doing to the people. So it used to be that you would get stories by chatting to people in bars, and it still might be that you'll do it that way some times. But now it's also going to be about poring over data and equipping yourself with the tools to analyse it and picking out what's interesting. And keeping it in perspective, helping people out by really seeing where it all fits together, and what's going on in the country."
"The data came to us as a huge excel file – over 92,201 rows of data, some with nothing in at all or were the result of poor formatting. Anything over 60,000 rows or so brings excel down in dramatic fashion – saving takes a painfully long period of time (tip number one – turn automatic saving off in preferences…). It doesn't help reporters trying to trawl through the data for stories and it's too big to run meaningful reports on. Fortunately, after COINS, huge datasets hold no fear for us. ..."
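One common way around Excel's struggles at that scale — a sketch of the general technique, not the Guardian's actual workflow — is to stream the CSV export into a database such as SQLite, dropping blank or malformed rows on the way in, and run reports as queries. The column names and figures below are made up for illustration:

```python
import csv
import io
import sqlite3

# Stand-in for the real 90,000-row export, including one blank row
raw = io.StringIO("department,amount\nHealth,1200\n,\nEducation,800\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spending (department TEXT, amount REAL)")

for row in csv.DictReader(raw):
    if not row["department"]:          # skip empty/badly formatted rows
        continue
    conn.execute("INSERT INTO spending VALUES (?, ?)",
                 (row["department"], float(row["amount"])))

# Reports that would choke Excel become a one-line query
total, = conn.execute("SELECT SUM(amount) FROM spending").fetchone()
print(total)  # 2000.0
```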
How the New York Times built an interactive graphic based on 1.9 million records of video rental queues obtained from Netflix.
AP memo on how reporters who found an accidentally pre-released copy of Sarah Palin's book produced a story in 40 minutes: "They bought a copy, ripped it from its spine and scanned it into the system so it could be read and electronically searched. A NewsNow moved within 40 minutes, followed quickly by multiple leads as details were gleaned from the 413-page manuscript."