"James Ball’s [Guardian] story [on lobbyist influence in the UK Parliament] is helped and supported by a ScraperWiki script that took data from registers across parliament that is located on different servers and aggregates them into one source table that can be viewed in a spreadsheet or document."
"The New York Times is considering options to create an in-house submission system that could make it easier for would-be leakers to provide large files to the paper."
"InMaps [linkedinlabs.com] is a new service that visualizes a collection of LinkedIn 'connections' as a single network graph."
"[A]midst all [the 'data journalism'] hype, earnestness and spreadsheet-geekery, here's the truth about so-called 'data journalism'. It's still about the story, stupid. ... [S]urely what's shocking is how few stories journalists actually managed to uncover [from recent major data dumps] ... No doubt we'll get better at this. Over time, journalists will learn how to pick out the stories that matter from these huge data releases - and it will help hugely whenever a single news outlet has control of the data, as the Telegraph did with MPs' expenses, so that they can drip-feed the top lines one at a time rather than see the whole lot drown in the 24-hour news cycle."
Sir Tim Berners-Lee: "the responsibility needs to be with the press. Journalists need to be data-savvy. These are the people whose jobs are to interpret what government is doing to the people. So it used to be that you would get stories by chatting to people in bars, and it still might be that you'll do it that way some times. But now it's also going to be about poring over data and equipping yourself with the tools to analyse it and picking out what's interesting. And keeping it in perspective, helping people out by really seeing where it all fits together, and what's going on in the country."
"The data came to us as a huge excel file – over 92,201 rows of data, some with nothing in at all or were the result of poor formatting. Anything over 60,000 rows or so brings excel down in dramatic fashion – saving takes a painfully long period of time (tip number one – turn automatic saving off in preferences…). It doesn't help reporters trying to trawl through the data for stories and it's too big to run meaningful reports on. Fortunately, after COINS, huge datasets hold no fear for us. ..."
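One way to cope with rows that have "nothing in at all" is to filter them out before the file ever reaches a spreadsheet. The following is a minimal sketch, assuming the data has been exported to CSV; the column names and values are invented for illustration and are not taken from the dataset described above.

```python
import csv
import io


def drop_empty_rows(reader):
    """Yield only rows that contain at least one non-blank cell."""
    for row in reader:
        if any(cell.strip() for cell in row):
            yield row


# Invented sample standing in for a large export with blank rows in it.
raw = "dept,amount\n,,\nHealth,100\n   ,\nDefence,250\n"

cleaned = list(drop_empty_rows(csv.reader(io.StringIO(raw))))
for row in cleaned:
    print(row)
```

Because the function is a generator, it can stream a file of any size row by row rather than loading it all into memory at once.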
How the New York Times built an interactive graphic based on 1.9 million records of video rental queues obtained from Netflix.
AP memo on how reporters who found an accidentally pre-released copy of Sarah Palin's book produced a story in 40 minutes: "They bought a copy, ripped it from its spine and scanned it into the system so it could be read and electronically searched. A NewsNow moved within 40 minutes, followed quickly by multiple leads as details were gleaned from the 413-page manuscript."
"Who Knows Who is Channel 4's new website which shows the connections between politicians, celebrities and business leaders, and where power really lies in the UK. We hope that it will reveal the surprising and often hidden stories behind the headlines. This is the first iteration of an ongoing process to develop this tool to be rich in content and functionality and over time build the biggest network of connections in the UK."
"At Medill, we’re using a Firefox add-on called SQLite. It’s small, fast and free; in other words, it’s perfect for a journalist on deadline."
"SQLite is my choice for the candidate to replace Access in journalism education. In addition to the advantages listed above, it’s also easy to “install.” If you can download files, unzip them and move them to a location on your hard drive, you can “install” SQLite. If you can install a Firefox add-on, you can manage it in the browser. And you can take your database files home with you or email them around. The add-on supports importing CSV files, SQL dumps and XML (although all databases can have issues with importing XML). It looks and works the same on a PC or a Mac. Most importantly, it demands an understanding of SQL that you can avoid when learning Access."
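The CSV-import-then-query workflow described above can also be tried without the Firefox add-on. The sketch below uses Python's built-in `sqlite3` module instead, which is a substitution and not the tool the quote refers to. The table name, column names and figures are hypothetical sample data.

```python
import csv
import io
import sqlite3

# Invented sample data standing in for a CSV file a reporter might receive.
SAMPLE_CSV = """ward,tickets
North,120
South,45
East,300
"""


def load_csv(conn, table, csv_text):
    """Create a table from the CSV header row and insert the remaining rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


conn = sqlite3.connect(":memory:")  # a file path here would give a portable .sqlite file
load_csv(conn, "parking", SAMPLE_CSV)

# Values arrive as text, so cast before sorting numerically.
top = conn.execute(
    "SELECT ward, CAST(tickets AS INTEGER) AS n "
    "FROM parking ORDER BY n DESC LIMIT 1"
).fetchone()
print(top)
```

Passing a file path instead of `":memory:"` produces a single database file that can be copied or emailed, which is the portability the quote highlights.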
"News Dots scans all the articles from major publications—about 500 a day—and submits them to Calais ... Each time two tags appear in the same story, this tool tallies one connection between them. ... As this tool scans hundreds of stories, this network grows rapidly, and "communities" begin to form among the tags. ... The news network that results is visualized using Slate's custom News Dots tool, which is built using an open-source Actionscript library called Flare."
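The tallying step described above, one connection each time two tags appear in the same story, can be sketched as a pair count. This is an illustrative guess at the idea, not Slate's actual code; the function name and the sample tags are invented, and the tagging step (Calais) is assumed to have already produced a list of tags per story.

```python
from collections import Counter
from itertools import combinations


def tally_connections(stories):
    """Count how often each pair of tags appears together in one story."""
    connections = Counter()
    for tags in stories:
        # Deduplicate and sort so (A, B) and (B, A) share a single key.
        for pair in combinations(sorted(set(tags)), 2):
            connections[pair] += 1
    return connections


# Invented tags, one list per story.
stories = [
    ["Senate", "Budget", "Election"],
    ["Senate", "Budget"],
    ["Budget", "Election"],
]

connections = tally_connections(stories)
for (a, b), count in connections.most_common():
    print(a, b, count)
```

The resulting counts are the edge weights of the tag network; finding the "communities" the quote mentions would need a separate graph-clustering step that is not shown here.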
Megan Taylor: "I see a very clear progression from CAR to the programmer/journalist trend via the web. CAR is meant to be invisible. You analyze a database as part of the reporting process, but you don't want to clog up a story with too many numbers. The ability to add details online has changed this process. Data has become a part of the story. And that's the key connection between CAR and programming in journalism: data."
"A study by a groundbreaking public journalism project has revealed the worst place to park in Birmingham. ... Freedom of Information specialist Heather Brooke obtained the figures from Birmingham City Council, and the data was collated and sorted by Help Me Investigate user Neil Houston."