Following the release of the Cambridge Analytica exposé, which revealed the underhanded tactics the company uses to influence and manipulate people through a combination of data analytics, espionage, and ‘honeypotting’, I think it’s incredibly important to discuss how the analytical community can ensure a commitment to honest, unbiased analytics and ethically sourced data.

Analysts should always keep in mind how their work may be used and interpreted by everyone who consumes the analysis they’ve produced. This underlines the importance of clarity in data visualisation. Producing visualisations that have few or no confounding elements (such as dual axes and scales that don’t start at zero) helps ensure that our analysis can be correctly and easily interpreted, and minimises the risk that our audience draws false conclusions from it. If any nuances in the data could lead to an incorrect interpretation, it’s important to make them clear to the user so that they don’t cloud the user’s judgement. Techniques like visualising uncertainty are therefore incredibly useful when presenting work such as forecasting or correlation analysis, which relies heavily on confidence intervals.
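To make the point about uncertainty a little more concrete, here is a minimal sketch of reporting an estimate together with an interval rather than as a bare number. The function name `mean_with_interval`, the made-up `weekly_sales` figures, and the choice of a normal approximation (z = 1.96 for roughly 95%) are all assumptions for illustration; with a sample this small, a t-based interval would be wider and arguably more honest.

```python
import math
import statistics


def mean_with_interval(values, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval.

    z=1.96 corresponds to roughly 95% coverage under a normal assumption.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    margin = z * std_err
    return mean, mean - margin, mean + margin


# Hypothetical data, invented purely for this example
weekly_sales = [102, 98, 110, 95, 105, 99, 101, 108]

mean, lower, upper = mean_with_interval(weekly_sales)
print(f"mean = {mean:.1f}, approx. 95% interval = [{lower:.1f}, {upper:.1f}]")
```

The lower and upper values are what you would draw as error bars or a shaded band around the estimate, so the audience can see how much the headline figure might plausibly move.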

Evidently, when it comes to putting these methods into practice, there is wide scope for interpreting how clarity in data visualisation should be achieved. That makes data visualisation subjective, which opens a gap that is ripe for ethical dilemmas! One thing that we, as analysts, have to be mindful of is making sure we don’t inject our biases into the analyses we carry out. When trying to ensure our audience doesn’t draw false conclusions from the data we’re presenting, we need to make sure that these ‘false’ conclusions aren’t simply the conclusions we don’t want them to reach because they don’t align with our world view. This is especially important in the land of data journalism. The vast majority of publications come with a litany of biases: which side of the political spectrum they lean toward, their views on particular hot-button topics, and whoever sponsors their content. It’s important that the data presented in these publications isn’t twisted to create misleading and false narratives that align with the publication’s biases. Doing so is morally reprehensible and incredibly manipulative, but we see it time and time again, and members of the general public who aren’t well versed in statistics and good data visualisation will fall for these cheap tricks again and again.

This is only a small part of a much broader topic. It’s obvious that, despite my misgivings, analytics will still be used for a variety of dishonest and malicious practices. Hell, most marketing companies use analytics to make sure us chumps in the public are enticed as much as possible by their product/message/service, and that’s not likely to stop any time soon. Hopefully, in time, and likely after a large enough scandal involving big data, we’ll solidify a code of ethics that all analysts will strive to follow (much as the Enron scandal pushed the accounting profession towards a greater emphasis on ethics, and the Manhattan Project did for physics).
