Lesson 17: Investigative journalism in the digital age
How can an investigative journalist benefit from data-driven journalism? According to one definition, given in UNESCO’s investigative journalism manual, investigative journalism is about exposing the truth about issues of public interest whose details are kept under wraps (deliberately or otherwise) by the people involved. Under this definition, data and the stories extracted from it can constitute a central pillar of investigative journalism. Data is the ‘raw material’ that a journalist uses to cast light into darkness, clear up ambiguities and resolve apparent contradictions in their story.
Investigative journalism and data-driven journalism overlap considerably. Both involve in-depth research and sifting through information to remove impurities (fake news or misleading data). Data thus plays an important role in the various stages of an investigation, including how it is presented within the story. You should state the importance of the data clearly within your report. You should also be careful to distinguish data from facts: data does not necessarily mean fact. There may be biases in the way data is collected, and you should always test your data and establish how it is linked to the incident you are investigating.
When you first start working on an investigation, you should look up the data already available, whether official or unofficial, in order to answer as many of your questions as possible. Only then should you move on to posing questions whose answers are unknown and coming up with hypotheses and possibilities. A story begins with looking up data. You will need a way of collecting and presenting data, which is exactly what data-driven journalism provides. Data will help make your story measurable. It will allow you to turn each ‘how’ into a measurable ‘how much’, so that the reader can see more clearly both the scope of the problem and the added value of any new information that you obtain from private sources of data or from data not initially included.
Let’s take a look at the different stages involved in data-driven journalism:
Stage 1: Looking for sources
First of all, review the available open source data relevant to the investigation, and before doing anything else make sure you know how often it is published. You should also review any private, verifiable sources of data that you or your employer might gain access to.
Open source data includes reports from the World Bank, the World Health Organisation (WHO) and the Food and Agriculture Organisation (FAO), annual government statistical reports, and social media websites.
Tools like Gapminder or Google Public Data Explorer allow you to easily collate this data. Journalists interested in environmental investigations can access data from relevant local and international organisations through sites like that of the US Geological Survey. Data on clinical trials is provided by clinicaltrials.gov. The FIFA website may be of interest to journalists investigating sport.
Stage 2: Handling the data
There are several programs that may be of use in handling data: Microsoft Excel (spreadsheets), OpenRefine (data refining), Fusion Tables (verification), MySQL (databases), Solr and Access (databases).
When refining large quantities of data, you can analyse and compare using a particular chronological or geographical filter. This can give your story new dimensions that may not have been immediately clear. If you go deep into the data, you may even find new stories.
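A minimal sketch of such a chronological and geographical filter, using only Python's standard library. The records, the field names (date, region, incidents) and the function filter_records are hypothetical and invented for illustration; real data would normally be loaded from a spreadsheet export or a database.

```python
from datetime import date

# Hypothetical records, invented for illustration only.
RECORDS = [
    {"date": date(2019, 3, 1), "region": "North", "incidents": 4},
    {"date": date(2019, 7, 15), "region": "South", "incidents": 9},
    {"date": date(2020, 1, 10), "region": "North", "incidents": 7},
    {"date": date(2020, 6, 5), "region": "North", "incidents": 2},
]


def filter_records(records, start, end, region):
    """Keep records dated between start and end (inclusive) in one region."""
    return [
        r for r in records
        if start <= r["date"] <= end and r["region"] == region
    ]


# Narrow the data to one year and one region, then total the incidents.
north_2020 = filter_records(RECORDS, date(2020, 1, 1), date(2020, 12, 31), "North")
total = sum(r["incidents"] for r in north_2020)
print(len(north_2020), "records,", total, "incidents")
```

Running the same filter for different periods or regions and comparing the totals is one simple way to spot a pattern worth investigating further.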
There are several new techniques from data-driven journalism that may be of assistance in investigations:
• Analysing data taken from the social media profiles of perpetrators or influential people. This can help you tease out lines of investigation or access information from non-traditional sources (Donald Trump’s tweets about a particular incident, for example, predating his presidency by many years).
• Analysing audience reactions to prominent public issues.
• Accessing historical data relevant to your investigation. For example, working out the dates on which something happened can provide you with new ways of understanding present problems (the date of a famine in a particular country with chronic water supply problems...)
• Working out where something happened (a military operation in a particular country, for example) or where a photo or video clip was taken. You can also recover data that has been deleted by using archiving tools like the Internet Archive’s Wayback Machine.
Notable examples of data-driven investigations include:
1. The Guardian was able to establish who was responsible for looting during the rioting that took place across the UK in August 2011. Its Reading the Riots project, conducted in cooperation with the LSE, was heavily data-driven.
2. The Panama Papers project drew on more than 11.5 million documents making up 2.6 terabytes of data dating from 1977 to 2015 and concerning about 214,000 corporate entities. The International Consortium of Investigative Journalists (ICIJ) incorporated the data into a database that makes sifting through it and searching it much easier.
3. The Paradise Papers project likewise drew on about 13.4 million documents, obtained by the Süddeutsche Zeitung, showing how the world’s super-rich invest their money (ICIJ).
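Looking up a deleted page in the Wayback Machine, one of the techniques listed above, can be partly automated. The sketch below only builds the web addresses and makes no network request. The function names are hypothetical, and example.com is a placeholder rather than a real investigation target.

```python
from urllib.parse import urlencode


def wayback_snapshot_url(page_url, timestamp):
    """Build a Wayback Machine address for the capture closest to timestamp.

    timestamp is a string of digits such as YYYYMMDD or YYYYMMDDhhmmss.
    """
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def availability_query_url(page_url, timestamp):
    """Build a query for the Internet Archive's availability endpoint,
    which reports the closest archived capture as JSON."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return f"https://archive.org/wayback/available?{query}"


# example.com is a placeholder address used for illustration only.
print(wayback_snapshot_url("http://example.com/", "20110815"))
print(availability_query_url("example.com", "20110815"))
```

Opening the first address in a browser should show the archived copy nearest to that date, if one exists. Not every page has been captured, so an empty result does not prove that the page never existed.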
Stage 3: Analysis
After collating and refining the data, there are several methods you can use to analyse it:
• Descriptive analysis: answers the questions “what?”, “who?”, “how?”, “where?” and “when?”
• Diagnostic analysis: answers the question “why?”
• Advanced analysis: makes predictions about future scenarios. A successful example is provided by Noun Post’s report Golden Generals.
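Descriptive analysis, the first method in the list above, can be sketched with Python's standard library alone. The dataset below is hypothetical and invented for illustration; the column meanings (year, district, reported cases) are assumptions, not figures from any real investigation.

```python
import statistics
from collections import Counter

# Hypothetical dataset, invented for illustration only:
# (year, district, reported cases)
ROWS = [
    (2018, "East", 12),
    (2018, "West", 30),
    (2019, "East", 18),
    (2019, "West", 45),
    (2020, "East", 24),
    (2020, "West", 51),
]

cases = [n for _, _, n in ROWS]

# "How much?": a typical value for the whole dataset.
mean_cases = statistics.mean(cases)
median_cases = statistics.median(cases)

# "Where?": total cases per district.
by_district = Counter()
for _, district, n in ROWS:
    by_district[district] += n

# "When?": total cases per year.
by_year = Counter()
for year, _, n in ROWS:
    by_year[year] += n

print("mean:", mean_cases, "median:", median_cases)
print("by district:", dict(by_district))
print("by year:", dict(by_year))
```

Summaries like these describe what the data contains. They do not explain why a district or year stands out, which is the job of diagnostic analysis and of reporting.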
Stage 4: Preparing a data-driven investigation
When putting together your story, there are various tools you can use to present data in a way that is easier to understand: charts and graphs, infographics and interactive maps.
Tools that may be of interest include: Tableau Public and Many Eyes, which will allow you to present data visually in a range of different ways, and Geocommons and Google Fusion Tables, which will allow you to produce maps using coordinates.
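Mapping tools generally need coordinates in a structured format. The sketch below converts a list of places into GeoJSON, a widely used open format for geographic data. It assumes that the mapping tool you choose can import GeoJSON, which this lesson does not confirm for the tools named above. The locations and the function name to_geojson are hypothetical.

```python
import json

# Hypothetical locations, invented for illustration only:
# (name, latitude, longitude)
LOCATIONS = [
    ("Site A", 51.5072, -0.1276),
    ("Site B", 53.4808, -2.2426),
]


def to_geojson(locations):
    """Convert (name, lat, lon) tuples into a GeoJSON FeatureCollection."""
    features = []
    for name, lat, lon in locations:
        features.append({
            "type": "Feature",
            # GeoJSON lists longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_geojson(LOCATIONS)
print(json.dumps(collection, indent=2))
```

A common mistake is swapping latitude and longitude, which places points in the wrong part of the world, so it is worth checking one or two points by hand before publishing a map.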
The AJMI has produced a guidebook to data-driven journalism that provides detailed instructions on how to go about doing this.
Saving data and documents
There are various programs you can use to store data:
Google Drive: Google Drive is linked to your Google email account. It can be used as a digital memory folder allowing you to save data. You can also work on your files directly, whether through the Google Docs interface or through Google Sheets (a spreadsheet tool comparable to Excel).
Xperia Companion: Download this program to produce backup copies of your data. It allows you to transfer files easily from one device to another and store them safely.
Dropbox: Dropbox allows you to keep your files safe in a cloud folder. You can then access them wherever you are in the world.