data journalism | in English (without translation)

Data doesn’t grow in tables: dealing with large sets of documents

10:30 - 12:00 Saturday 18/04/2015 - Hotel Sangallo

We would all like our journalistic evidence to be delivered to our doorsteps in nicely formatted spreadsheets, but more often than not that is not the case. Instead, information tends to arrive as a large stash of unstructured documents. Once these collections grow, reading through every document stops being an option.

This workshop will discuss alternatives: what tools and technologies are available for the automated analysis of large document sets? How can you automatically identify the recurring topics in a document stash? How can important concepts, people and companies be traced across the documents in a leak? Level: intermediate.

Organised in association with the European Journalism Centre and Open Knowledge Foundation.

Friedrich Lindenberg