Monday 6 February 2017

Trump: what the left-brain saw, but the right-brain didn't


So, this blog takes a few tangents, and here is the first: I have noticed, increasingly, that I think in tangents. There are two practical manifestations of this. The first is my tendency to use em dashes in my writing, which – according to Grammar Book (case in point) – are used to indicate "an interruption, or an abrupt change of thought" (aka, a tangent!). The second is that, instead of the usual A4 or A5 notepad – or, increasingly, a tablet of some description – that accompanies virtually everyone in the world of work, I have started using an A3 clipboard and paper, because it enables me to better visualise all of the tangents I take in any given meeting!


[Incidentally, this very blog is a tangent: I told myself I was going to do all sorts of other things tonight, but – having spent my morning bike ride thinking about this (another one, I’ll try to do one per paragraph) – couldn't resist spending my evening writing it.]


Anyway, I do some guest lecturing and was recently working with some first-year Marketing students via a lecture and a series of tutorials on 'The Principles of Marketing Research'. This was one small element within a broad marketing syllabus, and I wanted to get across the basic principles, of course, but also draw their attention to some other points that I think will stand them in good stead for their future careers.

Foremost among those was my view that there is a fundamental conflict at the very heart of the term "Marketing Research". Namely, that each constituent part – 'marketing' and 'research' – is a specialist discipline, wholly different from the other, and each tends to suit people who think predominantly with either the creative right side of the brain (the former) or the calculated left side (the latter).

I have seen this dichotomy innumerable times in my career, as I mostly work as the sole analyst (left-brain) in teams of marketers (right-brain). For the most part, this is highly productive because it ensures that someone plays Devil's Advocate, and that an issue is considered from varying perspectives, leading to a well-tested outcome. But I won't talk about any specific work examples, because I think we can have much more fun than that.

Instead, I'm going to talk – as I did to the students – about The Donald (sorry, President Trump). First, a boast; second, a question. When very few others did, I forecast a win for The Donald; and that leads me to ask: how did more people not see this coming? It was right there in the data.

Newsweek infamously did a print run of its "Madam President" cover, while the Huffington Post – like many media outlets – declared that Clinton had a 98% chance of victory. Paddy Power paid out early on a Clinton win, to the tune of £800,000; then, when Trump won, it cost them a further £3.5m, in what became their biggest ever political payout.


I shook my head through all of this, for reasons that boil down to two (three) things: 1. confidence levels; 2. the margin of error (MoE); (3. triangulation).

Most of the assessments that we all made were based on polls; the difference, therefore, was in how we interpreted those polls. The first chart below shows the extent to which Trump was ahead (above the line, red) or behind (below the line, blue) in national polls (160-odd of them) in the six months leading up to the election. In only 16 cases – roughly 10% – was he ahead. So, most journalists and marketers – drawing heavily on the right side of their brain – looked at that kind of chart and concluded it was clearly in Clinton's favour: "Madam President", £800k payout, thank you very much, and goodnight.

But history shows that they were wrong; and we need to understand a bit about polls (and surveys in general) before we continue. Firstly, they can never be 100% accurate, because it is never possible to speak to every single voter in the population (c.250m); but, statistically, if we ask a representative sample of the population and weight it accordingly, we can be confident to a certain degree. Polls are expressed with a confidence level, usually 95%, which says, in effect, "if we ran this poll many times, 95% of the time the range we quote would contain the true value." The size and structure of our sample, relative to the population of interest, determines our margin of error. The results of a survey will never be bang on, so, taken together, we use a range within which we are 95% confident that the true value lies; typically, on a sample size of 1,000 the MoE is +/-3%, and on a sample of 2,000 it is +/-2% (there is more to this, still, but this blog is not about that extent of statistical detail – I save that for weekends!)
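
To see where those +/-3% and +/-2% figures come from, here is a minimal sketch in Python, using the standard worst-case formula for the margin of error of a proportion (assuming p = 0.5 and z = 1.96 for 95% confidence; the function name is mine, purely for illustration):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a polled proportion.

    n: sample size; p: assumed proportion (0.5 is the worst case);
    z: z-score for the confidence level (1.96 for 95%).
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (1000, 2000):
    print(f"Sample of {n}: MoE = +/-{margin_of_error(n):.1%}")
# Sample of 1000: MoE = +/-3.1%
# Sample of 2000: MoE = +/-2.2%
```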

So, if a poll of 1,000 people says that Trump will get 49% of the vote and Clinton 51%, what it is actually saying is: there's a 95% chance that Trump will get between 46% and 52%, and a 95% chance that Clinton will get between 48% and 54%. The key point? Those ranges overlap one another. It is, effectively, saying that the outcome is statistically too close to call.
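
The same point as a quick check, using that hypothetical 49/51 poll and the +/-3% MoE from above (all numbers in percentage points):

```python
# Hypothetical 49/51 poll of 1,000 people, +/-3%pt margin of error.
trump, clinton, moe = 49, 51, 3

trump_range = (trump - moe, trump + moe)        # (46, 52)
clinton_range = (clinton - moe, clinton + moe)  # (48, 54)

# If the intervals overlap, the poll cannot separate the candidates.
too_close = trump_range[1] >= clinton_range[0] and clinton_range[1] >= trump_range[0]
print(f"Trump {trump_range}%, Clinton {clinton_range}%; too close to call: {too_close}")
# Trump (46, 52)%, Clinton (48, 54)%; too close to call: True
```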

So what happens when we superimpose the full range of the margin of error on top of the previous chart? Firstly, below, we can see that the margin by which either candidate was leading was rarely outside the MoE – in fewer than one in four cases; and, when we magnify the polls closest to election day, it is even less likely: only once was Clinton ahead and outside the MoE and, for the most part, it wasn't all that close to being outside of it, either.


But, guess what? These polls were broadly correct: Clinton did win the popular vote. The thing is that US elections are not decided on the popular vote (nor are UK ones), but by the Electoral College system, which renders national polling almost irrelevant and means that only state-level polling bears any direct relevance to the outcome.

Some states are always red (Alaska, Idaho, Kansas, Utah, Wyoming), and some are always blue (DC, Minnesota). And there are always a handful of key battleground states: in 2016, Florida was chief among them (29 Electoral College votes), along with Ohio (18), North Carolina (15), and Michigan (16).

State-level polling, in the equivalent format to the previous charts and covering the couple of months up to the election, is shown for Florida, Ohio and North Carolina below. It shows the following (a sketch of the 'distance from the edge of the MoE' calculation follows the list):

  • Florida: Neither candidate was outside of the margin of error at any point. While Clinton was ahead in more polls (11 vs. 8), this was only the case in one of the five polls closest to the election. What’s more, Trump consistently came closer to being outside the MoE: his average distance from the edge of the MoE was 2.8%pts, against 5.1%pts for Clinton.
  • Ohio: Only once was a candidate outside the MoE, and that was Trump in the poll closest to the election. Trump was ahead more often, on 16 occasions vs. 6. Again, Trump consistently came closer to being outside of the MoE: 3.6%pts, against 4.8%pts for Clinton (or 5.6%pts if we exclude the outlier two months from the election).
  • North Carolina: Only once was a candidate outside of the MoE, and that was Clinton about a month before the election. And, again, she was ahead more often, but not in the polls closest to the election. Significantly, as the election drew closer, it was Trump who came closer to being outside of the MoE. Overall, on average, Trump was 3.1%pts away from the edge of the MoE (or 1.8%pts if you only consider the polls closer to election day), against 4.7%pts for Clinton.
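
For transparency, this is roughly how I derived the 'distance from the edge of the MoE' figures above – a sketch only, with made-up example numbers rather than the actual poll dataset:

```python
def distance_from_moe_edge(lead_pts, moe_pts):
    """How far a lead sits inside the margin of error, in %pts.

    lead_pts: the candidate's point-estimate lead (absolute value);
    moe_pts: the poll's margin of error. Zero or negative means the
    lead is outside the MoE, i.e. statistically meaningful.
    """
    return moe_pts - lead_pts

# Illustrative only: a 1.5pt lead in a poll with a 3pt MoE sits
# 1.5pts short of being outside the margin of error.
print(distance_from_moe_edge(1.5, 3.0))  # 1.5

# Averaging that distance across a candidate's polls gives the
# 2.8%pts-vs-5.1%pts style comparisons quoted for Florida above.
leads = [1.5, 0.5, 2.0]  # hypothetical poll leads, in %pts
moes = [3.0, 3.5, 4.0]   # the matching polls' MoEs, in %pts
avg = sum(distance_from_moe_edge(l, m) for l, m in zip(leads, moes)) / len(leads)
print(f"Average distance from the MoE edge: {avg:.1f}%pts")  # 2.2%pts
```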




Trump won all of these states; and it was looking at this Florida and Ohio polling ahead of election day that convinced me that he was going to win. The clues were there.

Now, margin of error doesn't explain all of this, and nor was it all the fault of right-brainers (journalists, Paddy Power's marketing team) misinterpreting the data. The left-brainers made plenty of mistakes too and, indeed, it is the polling industry that has come in for most of the flak over the last eighteen months (UK General Election: 'wrong'; EU Referendum: 'wrong'; US Presidential Election: 'wrong'). There were methodological flaws in the polling, from sampling to weighting (e.g. under-representation of historical non-voters), and flaws like these meant many polls missed the 'Trump Effect'. Further, states like Michigan and Wisconsin consistently had Clinton ahead, often outside of the MoE, yet Trump ended up winning both.

And further to that, my charts above are a little simplified: the point estimate is still the most likely result, and the closer we get to the extremes of the margin of error, the less likely those results are. This is why the finding that Trump was often closer to being outside of the margin of error matters – it means the point estimates were more in his favour.
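
To put a rough number on that drop-off (a sketch, assuming the poll's sampling error is approximately normal, so the 95% MoE sits 1.96 standard errors from the point estimate):

```python
import math

# Ratio of the normal density at the edge of the 95% interval to the
# density at the point estimate: exp(-z^2 / 2) with z = 1.96.
z_edge = 1.96
relative_likelihood = math.exp(-z_edge ** 2 / 2)

print(f"A result at the MoE edge is ~{relative_likelihood:.0%} as likely as "
      f"the point estimate, i.e. roughly {1 / relative_likelihood:.0f}x less likely.")
# A result at the MoE edge is ~15% as likely, i.e. roughly 7x less likely.
```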

Finally, of course, there is the confidence level. A 95% confidence level means that there is a 1-in-20 chance that a poll is entirely wrong – point estimate, margin of error, everything. With an election every four years, any given poll can therefore be expected to call entirely the wrong outcome once every 80 years – perhaps this was a once-in-80-years event, one of Taleb's 'Black Swans'. That logic doesn't really apply here, because there were so many polls, but it is important for those conducting one-off polls/surveys to remember; and the likes of the Economist/YouGov published 22 polls in the campaign, NBC/WSJ 25, and Reuters/Ipsos 21 – each of those series is likely to have contained one entirely wrong poll.
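
As rough arithmetic, using the poll counts above (0.05 is simply the 1-in-20 chance implied by the 95% confidence level):

```python
# Expected number of 'entirely wrong' polls per series at 95% confidence.
poll_series = {"Economist/YouGov": 22, "NBC/WSJ": 25, "Reuters/Ipsos": 21}

for name, n_polls in poll_series.items():
    print(f"{name}: {n_polls} polls -> ~{n_polls * 0.05:.1f} expected misses")
# Each series can expect roughly one poll to have been entirely wrong.
```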

Beyond this (my third point) – which I won't elaborate on too much because this blog is already far too long – is the need for triangulation. Don't just take one source (polls) and be satisfied with that; look to test your findings via other means. In this case, I found evidence that I deemed fairly compelling on social media: the left (wing, not brain) usually dominates Twitter, but Trump was doing so here; and, in the example of Brexit – which I forecast to within a decimal point, and where most of the above was also true – I triangulated by speaking to people 'on the ground'.

All of this highlights where the right ("marketing") and left ("research") brains come into conflict. The creative right brain is looking for the headline, the one key point that makes a compelling story – we're always told that we need to be succinct, that we need an 'elevator pitch' (I am incapable of making an elevator pitch). But, in this case, the left-brain would tell you that there literally was no easy headline, no compelling point. The election was statistically too close to call, but there was plenty of evidence that Trump would win. The difference can be summed up by considering two different headlines, one less catchy than the other:

Left brain headline: [image]

Right brain headline: [image]
Hardly 'breaking news', is it?

This doesn't explain the phenomenon in its entirety; but the point is that the polling was, categorically, not telling us anything conclusive – and, at the more granular level, was strongly hinting at what would come – yet the media outlets, and others, were acting as if it were conclusive. Whoever told Paddy Power to pay out clearly had no grasp of the data, and cost them millions (unless, of course, it was a pure marketing ploy, which is possible).

To finish by bringing it back to my first-year undergraduate marketers: the point I was trying to get across to them was to make sure that you aren't doing a 'Madam President' for your company or clients; make sure you aren't telling them something categorically that the data isn't actually saying. Be clear about what the data is telling you, the confidence level, the margin of error, the assumptions you've made… even if the right side of your brain is telling you otherwise.

