The International Consortium of Investigative Journalists (ICIJ) wants to make it easier for its more than 200 members in 70 countries to collaborate over the Internet and to analyze large data sets using data mining techniques. To this end, the Spring Laboratory of the École Polytechnique Fédérale de Lausanne (EPFL), commissioned by the consortium, has developed the “Datashare Network”, a “completely anonymous decentralized research and information exchange system”.
The ICIJ has made a name for itself especially since 2013, when it published 260 gigabytes of data on the masterminds and beneficiaries of tax havens in the joint operation “Offshore Leaks”. Later, participating media such as the Süddeutsche Zeitung and NDR used the Panama Papers and the Paradise Papers to reveal the existence of several hundred thousand shell companies whose shareholders included figures from art, politics, sports and business, including the IT industry.
Principle of anonymity
Only international cooperation makes it possible to conduct such investigations involving millions of leaked documents, emphasizes the head of the consortium, Gerard Ryle. A leak would also endanger the safety of journalists and witnesses. The core element of the decentralized search engine is therefore anonymity. “Both the research and the exchange of information can be done without disclosing the identity to colleagues or the organization,” the project's description states. The ICIJ guarantees the functioning of the system, but has “no insight into the exchange actions”.
According to the EPFL, the network is based on virtual tokens “that journalists attach to their messages and documents in order to guarantee other people that they belong to the consortium”. Central storage was out of the question because it would be too obvious a target for hackers. Because the organization does not have decentralized servers for all of its areas of responsibility, the documents remain on the members' own servers or computers. Users store only essential basic information and metadata in the system, which enables other participants to join an analysis.
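The article does not say how these tokens are constructed. One well-known way to let an organization certify membership without being able to recognize the token later is a blind signature. The following is a minimal sketch under that assumption, using textbook RSA with a deliberately tiny toy key; all names (`blind`, `issuer_sign`, `unblind`, `verify`) are hypothetical and not taken from the Datashare Network.

```python
import hashlib
import math
import secrets

# Toy RSA key for a hypothetical issuer, built from two Mersenne primes.
# Illustration only: real deployments use large random primes and padding.
PRIME_1 = 2**61 - 1
PRIME_2 = 2**89 - 1
N = PRIME_1 * PRIME_2
E = 65537                                        # public exponent
D = pow(E, -1, (PRIME_1 - 1) * (PRIME_2 - 1))    # issuer's private exponent


def digest(token: bytes) -> int:
    """Map a token to an integer modulo N."""
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % N


def blind(token: bytes):
    """Member side: hide the token digest behind a random factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (digest(token) * pow(r, E, N)) % N, r


def issuer_sign(blinded: int) -> int:
    """Issuer side: sign the blinded value without seeing the token."""
    return pow(blinded, D, N)


def unblind(blind_signature: int, r: int) -> int:
    """Member side: strip the random factor to obtain a normal signature."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(token: bytes, signature: int) -> bool:
    """Anyone with the public key (N, E) can check the token."""
    return pow(signature, E, N) == digest(token)


token = b"example-token"
blinded, r = blind(token)
signature = unblind(issuer_sign(blinded), r)
print(verify(token, signature))  # True
```

In this sketch the issuer only ever sees the blinded value, so it cannot link the finished token to the member who requested it, while any colleague can still verify it against the issuer's public key.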
“Multi-Set Private Set Intersection”
Anyone looking for information “enters a few keywords into the search engine”, the researchers involved explain. If the search is successful, the journalist “can contact his colleagues, whose identity he does not yet know but who have potentially interesting information, via a messaging system and, if necessary, forward his message to everyone”. The recipients receive the request with a delay and decide for themselves whether they want to start communicating and share their own information.
The team uses anonymous authentication and communication mechanisms as well as its own solutions for the further exchange, emphasizes Carmela Troncoso, head of the Spring Laboratory. The comparatively new “Multi-Set Private Set Intersection” (MS-PSI) protocol guarantees the security of the search engine across numerous databases. The mail server uses a very large number of virtual mailboxes, each of which can be used only once. The team plans to present further details at the Usenix Security Symposium in mid-August.
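The details of MS-PSI are not given here. To illustrate the general idea of private set intersection, the following minimal sketch shows a classic two-party variant based on commutative exponentiation: each side blinds hashed keywords with its own secret exponent, and only keywords held by both sides produce matching values. It is not the MS-PSI protocol itself; the function names and the toy modulus are assumptions for illustration, and both parties are simulated in one process.

```python
import hashlib
import math
import secrets

# Toy modulus (the Mersenne prime 2**127 - 1): far too small for real use.
P = 2**127 - 1


def hash_to_group(keyword: str) -> int:
    """Map a keyword (case-insensitively) to a value in [1, P - 1]."""
    data = keyword.lower().encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 1) + 1


def random_exponent() -> int:
    """Pick a secret exponent coprime to P - 1, so blinding is a permutation."""
    while True:
        e = secrets.randbelow(P - 3) + 2
        if math.gcd(e, P - 1) == 1:
            return e


def blind(values, exponent):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, exponent, P) for v in values]


def private_intersection(queries, document_keywords):
    """Return the queries that also occur among the holder's keywords."""
    queries = list(queries)
    a = random_exponent()  # secret of the searching journalist
    b = random_exponent()  # secret of the document holder

    # Searcher -> holder: queries blinded with a.
    blinded_queries = blind(map(hash_to_group, queries), a)

    # Holder -> searcher: queries blinded again with b, plus own keywords
    # blinded with b. The holder never sees the queries in the clear.
    double_blinded_queries = blind(blinded_queries, b)
    blinded_documents = blind(map(hash_to_group, document_keywords), b)

    # Searcher: add a to the holder's values and compare.
    double_blinded_documents = set(blind(blinded_documents, a))
    return {
        query
        for query, tag in zip(queries, double_blinded_queries)
        if tag in double_blinded_documents
    }


matches = private_intersection(
    ["offshore", "trust", "yacht"], ["trust", "bank", "offshore"]
)
print(sorted(matches))  # ['offshore', 'trust']
```

Because exponentiation commutes, a keyword blinded first with `a` and then with `b` equals the same keyword blinded with `b` and then `a`, which is what lets the searcher detect matches without either side revealing its non-matching keywords.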