How our smallest FinCEN Files stories captured the biggest data lessons

Each Confidential Client profile amounts to only a few hundred words, but the huge data lift involved was emblematic of the entire FinCEN Files investigation.

Unlike previous ICIJ investigations such as Luanda Leaks, Mauritius Leaks, or Panama Papers, the FinCEN Files documents didn’t number in the hundreds of thousands or millions.

In fact, the entire investigation was based primarily on a cache of just over 2,600 documents that BuzzFeed News shared with the International Consortium of Investigative Journalists and 108 media partners. The cache included more than 2,100 suspicious activity reports, several hundred spreadsheets, a few dozen Word documents, and emails.

But don’t let the size of the leak fool you. It took months of delib­er­ate, metic­u­lous data work and dili­gent report­ing to eke out some of the investigation’s big sto­ries and find­ings, like ana­lyz­ing the more than $2 tril­lion in sus­pi­cious trans­ac­tions flagged by U.S. banks, or iden­ti­fy­ing the net­works of shell com­pa­nies used to laun­der poten­tial­ly dirty money.

With any data jour­nal­ism project, it’s impor­tant to put a human face to the fig­ures. Our Confidential Clients fea­ture was a selec­tion of busi­ness­men, fraud­sters and polit­i­cal lead­ers whose sto­ries appeared in the leaked files. The aim was to show how glob­al banks con­tin­ued mov­ing bil­lions of dol­lars around the world for clients they sus­pect­ed were fund­ing illic­it or ille­gal activities.

We quick­ly dis­cov­ered that each pro­file offered a micro­cosm of the huge amount of report­ing and analy­sis that would even­tu­al­ly go into every sto­ry across the entire inves­ti­ga­tion, whether it was a 7,000-word fea­ture on British shell com­pa­nies or a 300-word pro­file of a cor­rupt politi­cian. Here’s what we learned along the way:

The data required reading between the lines

The files we were look­ing at were not the sort of frank exchanges between a client and their trust­ed wealth man­ag­er, accoun­tant or lawyer that we’ve seen in past inves­ti­ga­tions. There were no details of inti­mate client meet­ings or clear indi­ca­tions of inten­tions or motivations.

The suspicious activity reports (SARs) and spreadsheet lists of transactions we accessed were put together by banks’ compliance officers reporting their suspicions to the U.S. Treasury’s Financial Crimes Enforcement Network, known as FinCEN. These officers often had no direct connection to the transaction or the client, and were often either ill-informed or limited in how much detail they could provide to FinCEN.

The data we had was also quite patchwork. Most of the files were unstructured narratives outlining officers’ suspicions, written into documents from which we could extract some details about transactions. On occasion, we had access to transactional data in spreadsheets. Those files gave us very detailed information from the point of view of the filing bank; some came in the FinCEN Files with the corresponding reports. But there is no mandatory format for filing transactional data, so each bank had chosen its own way of presenting it. Only 46 of these spreadsheets came with contextual narrative files attached.

As a result, the reports had to be treated with a lot of caution, though they still contained a wealth of useful information to use as a starting point. For example, JPMorgan Chase reported hundreds of millions of dollars’ worth of transactions made by Paul Manafort, the former manager of Donald Trump’s 2016 presidential campaign, and his associates. The SAR itself wasn’t particularly detailed, but we fortunately had access to the corresponding list of transactions, which showed the money flows tied to Manafort and his companies.
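Because each bank laid out its transaction spreadsheets differently, one plausible first step is to map every bank's columns onto a single common layout before comparing anything. The sketch below is only an illustration of that idea, not a description of ICIJ's actual tooling: the bank labels, column headers, and sample rows are all invented for the example.

```python
import csv
import io

# Hypothetical per-bank column mappings: source header -> common field name.
# Real filings would need one mapping per layout actually encountered.
COLUMN_MAPS = {
    "bank_a": {"Txn Date": "date", "Amount (USD)": "amount",
               "Sender": "originator", "Receiver": "beneficiary"},
    "bank_b": {"date_of_transfer": "date", "usd": "amount",
               "from": "originator", "to": "beneficiary"},
}


def normalize_rows(bank: str, csv_text: str) -> list[dict]:
    """Read one bank's CSV export and return rows in the common layout."""
    mapping = COLUMN_MAPS[bank]
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {common: raw[source].strip() for source, common in mapping.items()}
        row["filing_bank"] = bank  # keep track of whose point of view this is
        rows.append(row)
    return rows


# Invented sample data in two different layouts.
BANK_A_CSV = (
    "Txn Date,Amount (USD),Sender,Receiver\n"
    "2016-03-01,250000.00,Alpha Trading Ltd.,Beta Holdings Inc\n"
)
BANK_B_CSV = (
    "from,to,usd,date_of_transfer\n"
    "Gamma LLC,Delta GmbH,1000.50,2016-03-02\n"
)

rows = normalize_rows("bank_a", BANK_A_CSV) + normalize_rows("bank_b", BANK_B_CSV)
for row in rows:
    print(row)
```

Keeping the filing bank on every row matters here, since each spreadsheet only reflects what that one institution saw.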

For each Confidential Client pro­file, we had to assess what infor­ma­tion we had, what infor­ma­tion we need­ed to get, and then piece it all together.

The data was selective

The FinCEN Files documents represent less than 0.02% of the more than 12 million suspicious activity reports that financial institutions filed between 2011 and 2017. According to BuzzFeed News, some of the records were gathered as part of U.S. congressional investigations into Russian interference in the 2016 U.S. presidential election; others were put together following requests to FinCEN from law enforcement agencies. What we had was but a very small window into the world of a few banks, which were selectively picking the information they deemed worthy of being reported to the U.S. government.
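The "less than 0.02%" figure can be sanity-checked with simple arithmetic. The sketch below assumes the round numbers quoted in this article (roughly 2,100 reports in the leak and roughly 12 million filed overall); both are stated as "more than," so the result is an approximation, not an exact share.

```python
def share_percent(sample: int, total: int) -> float:
    """Return sample as a percentage of total."""
    return 100 * sample / total


# Round figures taken from the article; both are lower bounds.
sars_in_leak = 2_100
sars_filed_2011_2017 = 12_000_000

share = share_percent(sars_in_leak, sars_filed_2011_2017)
print(f"{share:.4f}% of all reports filed")
```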

It wasn’t possible to pick a bank’s famous client and follow a steady trail of information exchanged between them over the years. Nor was it possible to carefully build a complete picture of each bank’s relationships with its correspondent clients around the world.

A few names of individuals and companies popped up quite quickly. Still, it took a lot of heavy lifting to fill in the blanks. For example, we knew Deutsche Bank took an interest in a few companies owned by Ukrainian businessman Ihor Kolomoisky, especially in 2016 when it filed several reports mentioning transactions made by Kolomoisky’s Ukrainian International Airlines. But we couldn’t know what exactly prompted the reports to be filed in the first place, whether more reports were filed than those we had, and what happened after they were sent to FinCEN. What we had was not dissimilar to the feeling you get when driving on a sparsely lit highway. You can distinguish patches illuminated by streetlights, but you cannot know what lies in the darkness beyond.

The data was full of duplicates

One of the things we like to do in the ICIJ data team is to count things. We count documents, we add up amounts, we calculate date ranges, and we classify entities, such as banks, all in the name of contextualizing information and building understanding. I like to use the analogy of a person collecting shells on the beach: whereas most people would select the most beautiful or original shells and discard the rest, the data team goes through every shell there is, and considers each one against a methodology. In the end, the questions we need to answer are plentiful: Is it useful? Is it representative? Can it be trusted? What can we safely say, and what don’t we have enough information about?

The first “confidential client” we looked at was Mukhtar Ablyazov, a Kazakh national who has lived in exile for more than ten years and is accused by Kazakh authorities of having embezzled billions of dollars during his tenure as chairman of a bank. Ablyazov has appeared in a number of previous ICIJ investigations. In the FinCEN Files, his companies were mentioned in suspicious activity reports filed by seven different banks. This gave us an inkling that it was a good profile to focus on. Not only were we able to review the banks’ reports, but we also had access to FinCEN’s own summary reports about Ablyazov, which provided additional background information.

But because the sus­pi­cious activ­i­ty reports came from numer­ous banks, filed over the course of sev­er­al years, we ran the risk of hav­ing mul­ti­ple banks report­ing the same trans­ac­tions. Different banks can be involved in the same trans­ac­tion, or a series of trans­ac­tions can part­ly inter­sect with anoth­er series report­ed by anoth­er insti­tu­tion. This risk of dou­ble-count­ing also exist­ed when banks filed mul­ti­ple reports and failed to explain what trans­ac­tions they had includ­ed in pre­vi­ous reports. We had to cross-ref­er­ence the trans­ac­tion data from the sus­pi­cious activ­i­ty reports, the trans­ac­tion spread­sheets, and FinCEN’s own reports to be able to iden­ti­fy and set aside poten­tial­ly dupli­cate records.
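The cross-referencing described above can be pictured as grouping transactions by a normalized key and flagging any group that shows up in more than one source. This is a minimal sketch under stated assumptions, not ICIJ's actual method: the field names, the choice of key (date, amount, originator, beneficiary), and the sample records are all hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def normalize_name(name: str) -> str:
    """Uppercase and drop punctuation so 'Ltd.' and 'LTD' compare equal."""
    cleaned = name.upper().replace(".", "").replace(",", " ")
    return " ".join(cleaned.split())


def transaction_key(tx: dict) -> tuple:
    """Assumed matching key: same date, amount, and normalized parties."""
    amount = Decimal(str(tx["amount"])).quantize(Decimal("0.01"))
    return (tx["date"], amount,
            normalize_name(tx["originator"]),
            normalize_name(tx["beneficiary"]))


def flag_duplicates(transactions: list[dict]) -> tuple[list[dict], list[list[dict]]]:
    """Return one record per key, plus the groups that need human review."""
    groups = defaultdict(list)
    for tx in transactions:
        groups[transaction_key(tx)].append(tx)
    unique = [group[0] for group in groups.values()]
    needs_review = [group for group in groups.values() if len(group) > 1]
    return unique, needs_review


# Invented records: the first two describe the same transfer from two sources.
transactions = [
    {"source": "bank_a_sar", "date": "2016-03-01", "amount": "250000.00",
     "originator": "Alpha Trading Ltd.", "beneficiary": "Beta Holdings Inc"},
    {"source": "bank_b_spreadsheet", "date": "2016-03-01", "amount": "250000",
     "originator": "ALPHA TRADING LTD", "beneficiary": "Beta Holdings, Inc."},
    {"source": "bank_a_sar", "date": "2016-03-02", "amount": "1000.50",
     "originator": "Gamma LLC", "beneficiary": "Delta GmbH"},
]

unique, needs_review = flag_duplicates(transactions)
total = sum(Decimal(tx["amount"]) for tx in unique)
print(len(unique), "unique transactions totalling", total)
print(len(needs_review), "group(s) flagged as potential duplicates")
```

A matching key only marks records as potentially duplicate. Two genuine transfers can share a date, amount, and parties, so flagged groups would still need to be checked by a person before being set aside.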

The data needed more data

Because so many of the rich and pow­er­ful rely on shell com­pa­nies, includ­ing for legal rea­sons, it is often very dif­fi­cult to iden­ti­fy the indi­vid­u­als that ulti­mate­ly ben­e­fit from oper­at­ing those com­pa­nies. The U.S. banks them­selves some­times have a hard time track­ing those ulti­mate ben­e­fi­cial own­ers (or UBOs), as shown in the FinCEN Files. We had to research what com­pa­nies were owned by each of the indi­vid­u­als por­trayed in the files, and check whether we had addi­tion­al trans­ac­tions in the data tied to those com­pa­nies which hadn’t been iden­ti­fied by the banks.

This happened for transactions tied to Isabel dos Santos, which Standard Chartered flagged as part of a report about various companies called Unitel. Thanks to our previous Luanda Leaks investigation, we were able to identify the transactions that were relevant to the Unitel company partly owned by Isabel dos Santos, and include this amount in her Confidential Clients profile.

We came across several other cases where banks reported suspicious transactions we could tie to one of the Confidential Clients through outside research. One such case was Oleg Deripaska, a Russian businessman whose companies used accounts at Expobank in Latvia. The Bank of New York Mellon reported suspicious transactions to FinCEN that included Deripaska companies using the Latvian bank, but the U.S. bank didn’t report on Deripaska directly. It’s only through contextualizing the documents that the profiles could be built. Thankfully, we could rely on a network of knowledgeable media partners to help with that contextualization, and also on ICIJ’s access to previous data leaks.
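Tying transactions to a person through outside research amounts to checking the transaction data against a list of companies already known to belong to that person. The sketch below illustrates one way to do that, assuming such a list exists. The entity names, country codes, and field names are hypothetical. It matches on name and jurisdiction together, because a name alone can be shared by several unrelated companies.

```python
from decimal import Decimal


def normalize(name: str) -> str:
    """Uppercase and drop punctuation so spelling variants compare equal."""
    cleaned = name.upper().replace(".", "").replace(",", " ")
    return " ".join(cleaned.split())


def find_linked_transactions(transactions: list[dict],
                             known_entities: list[tuple[str, str]]) -> list[dict]:
    """Return transactions where either party matches a known (name, country)."""
    wanted = {(normalize(name), country.upper()) for name, country in known_entities}
    linked = []
    for tx in transactions:
        parties = [
            (normalize(tx["originator"]), tx["originator_country"].upper()),
            (normalize(tx["beneficiary"]), tx["beneficiary_country"].upper()),
        ]
        if any(party in wanted for party in parties):
            linked.append(tx)
    return linked


# Hypothetical result of outside research: one company, in one jurisdiction.
known_entities = [("Example Telecom S.A.", "AO")]

# Invented transactions; the second shares the name but not the jurisdiction.
transactions = [
    {"amount": "500000.00", "originator": "Gamma LLC", "originator_country": "US",
     "beneficiary": "EXAMPLE TELECOM SA", "beneficiary_country": "AO"},
    {"amount": "75000.00", "originator": "Example Telecom S.A.", "originator_country": "ST",
     "beneficiary": "Delta GmbH", "beneficiary_country": "DE"},
    {"amount": "1200.00", "originator": "Gamma LLC", "originator_country": "US",
     "beneficiary": "Delta GmbH", "beneficiary_country": "DE"},
]

linked = find_linked_transactions(transactions, known_entities)
linked_total = sum(Decimal(tx["amount"]) for tx in linked)
print(len(linked), "linked transaction(s) totalling", linked_total)
```

Exact matching like this will miss misspellings and transliterations, which is one reason the help of knowledgeable reporting partners remains essential.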

Finally, ICIJ reached out for com­ment to the indi­vid­u­als, com­pa­nies, and banks men­tioned in the Confidential Clients inter­ac­tive, includ­ing those that were part of the trans­ac­tions we chose to visu­al­ize with each client. In some cas­es their respons­es helped con­tex­tu­al­ize and inform our report­ing; in all cas­es we includ­ed their respons­es along­side their profiles.

Delphine Reuter, ICIJ

