The data we used on the 2016 US presidential campaign


Cambridge Analytica did not use Facebook data from the research company GSR in its work on the 2016 presidential election. We provided polling, data analytics and digital marketing to the campaign as part of a collaborative effort under Brad Parscale, alongside the RNC and other vendors.


What data did we use?

After Cambridge Analytica was hired by the campaign in June 2016, the company initially used its own commercial and political data (licensed from vendors such as L2, Infogroup and Bridgetree) as well as its own suite of models. 

From August onwards, our data team used the following sources of data:

  • Voter files from the RNC.
  • Polling using continuous large-scale research surveys in each battleground state (a combination of face-to-face, phone and online). 
  • The campaign: data that the campaign itself collected on donors, event attendees, volunteers and store purchases.
  • Early and absentee voting returns released by each state.
  • Consumer data available from commercial brokers.

What did we use the data for?

We used the data to identify “persuadable” voters, how likely they were to vote, the issues they cared about, and who was most likely to donate. We also built a polling tracker for every key state, and provided dashboards for the campaign, including the group that planned the candidate’s travel agenda. Our analysis was also used for targeted advertising. 

Isn’t that the same as every other campaign?

We used the same kind of political preference models used by the Obama and Clinton campaigns; however, we started five months out from election day and did it with far fewer resources and less data. It’s very rare to have one vendor working across so many different campaign functions – we’re proud that we integrated polling, data science and marketing into a single operation. Having a large amount of control over each of these three areas allowed us to be extremely efficient and reactive.

Did we use personality models?

We didn’t have the opportunity to get into personality models. Building a presidential data program often takes campaigns well over a year. We focused on the core elements of a political data science program, as we explained at public events, in media interviews, and on our website. 

What else did we do?

We managed a large proportion of the digital advertising budget on behalf of Giles-Parscale. In doing this we used a suite of models produced by the data science team, which outlined profiles such as undecided voters or inactive supporters, and matched these audiences to online cookies, mobile devices, and social IDs. Onboarding data through ad tech platforms is standard digital advertising practice.  

We also relied on the audience segments that Facebook and other online platforms make available to all advertisers. These are based on interests and demographics, to help serve the most relevant ads to the most relevant people. Our digital marketing used core campaign messages with “paid for by” disclaimers to persuade voters to vote, increase turnout among supporters, and boost volunteer numbers and donations. In the case of Facebook and Twitter ads, these were clearly linked to the official Trump presidential campaign Facebook and Twitter accounts. Cambridge Analytica did not use bots.

What was our impact? 

Elections are won or lost by candidates, not data science. Data is important in modern campaigns for deciding how to allocate resources and for making advertising more efficient, but of course the candidate and their message ultimately need to connect with the electorate.