Lesson 9: Building and analyzing databases

Around 130 countries have adopted laws encouraging the sharing of information, and some of them have joined the Open Government Partnership. This has improved the flow of data from governments to the media and onto the internet. The same applies to companies and individuals. The world produces vast quantities of data every day, estimated at 2.5 quintillion bytes (a quintillion is 1 followed by 18 zeros). It is the journalist’s job to sift through this data, confirm its accuracy, and analyse and collate it in order to produce high-quality investigations serving the public interest. Data does not only mean numbers.
It means any raw material – numbers, letters, and symbols – whether it has been processed or not, even if it is simply individual facts obtained in different contexts. To put together a database that will help produce a quality investigation, you need to be aware of where to find that data, whether it is descriptive or numerical, and how best to get at, develop, and verify it. Much published or available data will fail to totally meet your hopes, requirements and expectations. For this reason, you will need to put together a database tailored to your specific needs. You can do this by using existing databases, collecting the missing data yourself or with the help of others and conducting surveys, interviews and field visits. In the absence of more effective ways of getting at data, this will be your best option, despite the effort and cost involved.
In a 2016 investigation published on the Jordanian website Amman.net and overseen by the Arab Investigative Journalism Network (ARIJ), journalists demonstrated that members of the House of Representatives had been taking home millions in illegal and unconstitutional tendering money. There was no database giving the names of all the members of the 17th House of Representatives who had taken government contracts or the value of those contracts, or showing their voting, oversight and legislative behaviors after winning these contracts. Nor did requests for information help the investigative team to get hold of the names of the companies that had won those contracts or find out their financial value. This made the first part of the investigation very difficult. With some effort, however, the team managed to find a list of all government contracts given to companies on the Government Contracts Office’s website.
They also found data on the tenders put out by the Public Procurement Bureau and the Greater Amman Municipality. This formed the kernel of the investigation. The team also drew on an open database of companies and their shareholders as well as the House of Representatives website, which provides members’ names. This allowed them to make sure that they had the right representatives and there had been no mix-ups because of similar names. A leaked list of Civil Status records listing names, spouses and children also provided some clues.
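The check against similar names can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's actual method: the records, the field names and the `normalize` helper are all hypothetical, and the date of birth stands in for whatever secondary identifier is available.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace (hypothetical helper)."""
    return " ".join(name.lower().split())


def find_ambiguous_names(records):
    """Return normalized names that appear with more than one date of birth.

    Such names may belong to different people and need manual checking
    before they are linked to a representative.
    """
    births_by_name = defaultdict(set)
    for record in records:
        births_by_name[normalize(record["name"])].add(record["date_of_birth"])
    return {
        name: sorted(births)
        for name, births in births_by_name.items()
        if len(births) > 1
    }


# Placeholder records for illustration only, not real data.
shareholders = [
    {"name": "Person  A", "date_of_birth": "1960-01-01"},
    {"name": "person a", "date_of_birth": "1975-05-05"},
    {"name": "Person B", "date_of_birth": "1980-03-03"},
]

ambiguous = find_ambiguous_names(shareholders)
for name, births in ambiguous.items():
    print(f"Check manually: '{name}' appears with dates of birth {births}")
```

A name flagged in this way is only a prompt for further checking against other sources, such as official member lists, and never a conclusion in itself.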
Using all this information, the team came up with a spreadsheet format appropriate to the investigation, which they filled in with the names of representatives, their dates of birth, their election dates, the names of the companies in which they held shares, these companies’ purposes and capital, the government contracts they had won, the value and date of those contracts, and other relevant information.
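A spreadsheet of this kind can be thought of as three linked tables: representatives, their shareholdings, and the contracts won by each company. The Python sketch below shows one possible way to link them and total the contract value associated with each representative. The table layout, the field names (`rep_id`, `value_jd`) and all rows are assumptions made for illustration and do not reproduce the team's spreadsheet or any real figures.

```python
from collections import defaultdict

# Placeholder rows for illustration only, not real data.
representatives = [
    {"rep_id": 1, "name": "Representative A"},
    {"rep_id": 2, "name": "Representative B"},
]
shareholdings = [
    {"rep_id": 1, "company": "Company X"},
    {"rep_id": 1, "company": "Company Y"},
    {"rep_id": 2, "company": "Company Z"},
]
contracts = [
    {"company": "Company X", "value_jd": 1_000_000, "date": "2014-02-01"},
    {"company": "Company Y", "value_jd": 250_000, "date": "2015-06-15"},
    {"company": "Company X", "value_jd": 500_000, "date": "2015-09-30"},
]


def contract_totals(representatives, shareholdings, contracts):
    """Total the contract value of the companies each representative holds shares in."""
    # Step 1: sum contract values per company.
    value_by_company = defaultdict(int)
    for contract in contracts:
        value_by_company[contract["company"]] += contract["value_jd"]

    # Step 2: attribute each company's total to its shareholder representatives.
    names = {rep["rep_id"]: rep["name"] for rep in representatives}
    totals = {rep["name"]: 0 for rep in representatives}
    for holding in shareholdings:
        totals[names[holding["rep_id"]]] += value_by_company.get(holding["company"], 0)
    return totals


totals = contract_totals(representatives, shareholdings, contracts)
for name, total in totals.items():
    print(f"{name}: {total:,} JD")
```

Note the simplification: a company's full contract value is attributed to every representative holding shares in it, regardless of the size of the stake. Any figure produced this way still has to be verified against the original documents before publication.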
A lot was achieved in this investigation by analyzing, collating, indexing, and arranging data extracted from publicly available documents. This data helped to create a map of companies, people associated with them, and government contracts. This allowed the team to take the first steps towards producing and publishing this information about fifteen days before the next set of elections. The timeframe made this investigation the most important in the country at the time, and it revealed a number of headline figures:
• A quarter of a billion JD in government money went to companies belonging to HoR member Hazem Majali and his son
• 128 million went to the siblings of the Atyeh brothers, both HoR members
• 150 million went to companies owned by the brothers of HoR member Atef Tarawneh
• 92 members of the 17th House of Representatives held shares in companies with a combined capital of one and a half billion JD
Nonetheless, data cannot take the place of traditional journalism, and even the best data analysis cannot replace fieldwork and investigation. The team thus submitted some 27 freedom of information requests to various ministries and institutions asking about government contracts that had not been published on the three official websites. They also interviewed experts on the constitutional and legal aspects of the case, on transparency and on companies, and confronted representatives who had violated the constitution and the law. Verifying data protects journalists from misleading the public. Data is at least as dishonest as people: at the end of the day, it is collected and collated by human beings. As such, data cannot be trusted absolutely, because it is influenced by the biases of those who collect it. You should always make an effort to verify your data and ask questions to those who collected it in order to avoid publishing inaccurate observations.
We advise journalists to ask government bodies for raw data, because analysis of raw data helps produce a successful investigation. It is also often necessary, however, to access information that the government does not want to reveal. In this case you will have to negotiate with officials, work out how to classify it, digitalize it and enter it into a database. And of course, you will have to sift through it carefully and review it in order to guarantee that it is free of mistakes. While data provided by government bodies is essential, you should also try and work out how to get access to other additional or supporting data, in order to make sure the investigation is as watertight as possible. You can try following the accounts of people assumed to be close to the issue or influential or who are involved in public affairs, whether because they hold an administrative or legislative office or because of their position in the world of finance and business.
Following their posts and the people who interact with them may reveal new information or lines of investigation. You may also find it useful to look for relevant data by analyzing or reviewing old media publications on the subject. This can also give you an insight into public opinion on the issue at hand or on similar issues in the past. After this, you can sift through the data, in order to get a new angle that will give it an added value not present in traditional analysis.

Written evidence

Written and documentary evidence is an important source for any journalist seeking to access and prove facts in order to use them as evidence in their investigation. The official documents used should confirm the incidents and acts that the journalist wants to prove. They should be related to the facts and the characters involved. Documents should not be used simply for decoration.
An investigation produced by the Syrian Investigative Journalism Unit (SIRAJ) showed that five patients had died and a sixth had lost a hand because of negligence and inadequate precautions taken by doctors. Both the doctors and the hospitals that they worked at had escaped being held accountable because of the lack of a medical negligence law in Syria. The investigation used autopsies listing negligence as the cause of death as well as death certificates in order to demonstrate this. The investigation team made sure to verify that the documents were accurate before incorporating them into the investigation, closely examining the form, the content, the handwriting and the signatures and consulting sources mentioned in the reports. A similar approach was taken by the Guardian team that analyzed the so-called Palestine Papers (documents pertaining to the Palestine-Israel peace negotiations), which were leaked to Al Jazeera and published in partnership with the British newspaper. Before publishing the documents, they consulted independent sources, former participants in talks and various diplomatic and intelligence sources, who all confirmed that they were accurate.
Official documents issued by responsible government bodies are some of the strongest proof you can muster, so long as they do not turn out to be forged. There is a legal principle that an identical copy is as effective as the original unless it can be proven invalid. This principle applies equally to investigative journalism: evidence collected by journalists is no less effective than that collected by detectives, particularly given how often investigations end up in court (either seeking justice for victims or because the journalist is being sued). If the evidence cited is weak, then this can cause you real legal trouble. Verifying documentary evidence is essential. You should assess the form and content, the dates, the stamps, the handwriting, the signatures, the letterhead, the CCs and BCCs. Documents can be just as imprecise as human sources, and may even have been forged or issued by someone with no right to do so. All this means that they can be misleading. Take care to ensure that they are accurate before using them. If the document is an official letter, then there will be a sender and a recipient, as well as CCs for reference, each of them giving names and job titles. You can contact these people, present them with the document and ask them to confirm that it is real. Documents that are not covered by secrecy laws may also be subject to freedom of information requests.
When power struggles take place between different elites or factions within the regime itself, many documents will be issued, some entirely valid and some less so. Journalists may find themselves being used as puppets in these conflicts. Be particularly careful in these circumstances: publishing a fake document may end your career. Leaks that do not come in document form but rather as copy-pasted or encoded text require particular care. Anyone whose name appears in such a leak can easily dispute or deny their involvement, and journalists and their employers may be sued for publishing unsubstantiated information.
When no original copy is provided, we have no way of knowing whether sentences or particular words have been lost or added, whether this has been done deliberately, or whether the figures, dates and spelling of names are correct. In the case of the Panama Papers, for example, the leaks came in the form of documents, passports, emails and signatures, which helped to confirm their veracity (the public admission by Fonseca that the documents were real and had been leaked as the result of a hack also lent the documents great credence).
Government institutions also took steps to verify the documents. All this transformed them from documents into solid proof. The same applies to the famous HSBC leaks, whose veracity was confirmed by the bank itself after a former employee shared them with the French Tax Authority. But none of this means that journalists are exempted from carrying out their own traditional examination and verification of documents that they want to use in their investigations.
There are several kinds of written evidence. In law, these documents have varying probative value:
• Documents produced by government bodies, which are assumed to be proof of their contents unless they are proven to be forged and can be marshalled as evidence of anything and against anyone. These documents include passports, birth certificates, university and school graduation certificates, government contracts, official communications and letters, anything published in the official gazette (so long as it has not subsequently been amended or repealed), and documents originating elsewhere that have been signed or stamped by a competent government official.
• Documents produced and ratified by organizations or companies and issued by a competent employee. These can be used as proof against their originating organizations, so long as they are not proven to be forgeries. Documents of this kind include budgets, decisions taken by the board of directors, other administrative decisions, payrolls, etc.
• Documents produced and ratified by natural persons (i.e. signed, stamped or fingerprinted by them). These can be used as proof against their originator, so long as they are not proven to be forgeries, but cannot serve as proof against others. Documents of this kind include personal letters, wills and diaries.
• Documents produced by organizations or natural persons that have not been ratified or signed can also be cited as supporting evidence, i.e. supporting a case already substantiated with other forms of proof.
• Emails can be used as evidence against their sender, so long as the sender cannot prove that someone else sent the email.
• Government emails, so long as they are not proven to be forged, can always be used as proof.
• Final court judgements can be used as proof against anyone with regard to their content and the facts of the case. Non-final judgements and investigation files are not proof because they may include confessions extracted by force, but they can be used as supporting evidence.
• The minutes of parliamentary sessions and meetings of municipal councils, local committees, party committees, associations, unions and publicly traded companies can be used as proof of what was said, even if it is defamatory, as well as any decisions reached and the general progress of the session.
