In some of my previous articles I’ve talked about the need for good, balanced training for Machine Learning to work properly[1] – the need for it to be trained upon both the outcomes you seek and the outcomes you don’t – but there is a more subtle and worrisome impact of this need too.

Machine Learning models can only be as good as the data they’re trained with, and herein lies the problem. Like the tech industry itself, the data that models have been trained on and tested with overrepresents Western white men.

Bias creeping into systems and processes in society has always had the potential for damaging social and economic effects, but the risk is amplified with AI because the algorithms are often built, tested and then turned into a ‘black box’: a closed system that people don’t look inside or question, and whose outcomes they don’t fully understand.

In February this year Joy Buolamwini of MIT highlighted that for photos of a white man, the AIs from IBM, Microsoft and Megvii could correctly identify the gender 99% of the time. If the photo contains a dark-skinned woman, the error rate climbs – dramatically – to as much as 35%[2].
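A gap like this is easy to miss if a system is only judged on one overall accuracy figure. The sketch below is a minimal illustration of breaking accuracy down by group; the records are invented for the example and are not taken from the study, and the function names are my own rather than part of any library.

```python
from collections import defaultdict


def overall_accuracy(records):
    """Share of records where the predicted label matches the true label."""
    records = list(records)
    hits = sum(1 for _, truth, predicted in records if truth == predicted)
    return hits / len(records)


def accuracy_by_group(records):
    """Accuracy computed separately for each group.

    records: iterable of (group, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Invented test set: 90 photos from group "A" but only 10 from group "B".
records = (
    [("A", "male", "male")] * 89
    + [("A", "male", "female")] * 1
    + [("B", "female", "female")] * 6
    + [("B", "female", "male")] * 4
)

print(overall_accuracy(records))   # headline figure across everyone
print(accuracy_by_group(records))  # the same results, split by group
```

With these made-up numbers the headline figure looks healthy, because the well-served group dominates the test set, while the per-group view shows the smaller group being misclassified far more often.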

As AI moves into more and more components of our daily lives – everything from shortlisting CVs for jobs you apply for, to analyzing your application for a mortgage, to predicting where crimes will occur to focus policing – there is an inherent risk that the AI will fuel a self-perpetuating bias.

If, historically, only male candidates with specific university and career backgrounds have been hired as CEO, then that data, if used to train a CV-sifting system, would lead it to prefer those same segments of society for those roles.
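To make the mechanism concrete, here is a deliberately naive sketch of a CV sifter that learns nothing but historical hire rates. The data, names and threshold are all hypothetical, and real systems are far more complex, but the feedback loop is the same in spirit.

```python
from collections import Counter


def fit_hire_rates(history):
    """Learn the historical hire rate per group from past decisions.

    history: iterable of (group, was_hired) pairs.
    """
    seen = Counter()
    hired = Counter()
    for group, was_hired in history:
        seen[group] += 1
        hired[group] += int(was_hired)
    return {group: hired[group] / seen[group] for group in seen}


def shortlist(candidates, hire_rates, threshold=0.5):
    """Keep candidates whose group was hired often enough in the past."""
    return [
        name
        for name, group in candidates
        if hire_rates.get(group, 0.0) >= threshold
    ]


# Invented history: every past hire for the role was male.
history = (
    [("male", True)] * 8
    + [("male", False)] * 2
    + [("female", False)] * 10
)
rates = fit_hire_rates(history)

# Two hypothetical applicants who differ only in group.
candidates = [("Alex", "male"), ("Sam", "female")]
print(shortlist(candidates, rates))
```

Nothing in the code mentions merit. The sifter simply replays the past, so a group that was never hired is never shortlisted, and the next round of training data looks the same. In practice a model rarely sees gender directly, but features that correlate with it can carry the same pattern.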

The humanity of data, what it means and how it can be applied fairly is a critical topic. Phil Harvey and Noelia Jimenez Martinez are in the process of shaping a book, “Data: A Guide to Humans”, and this topic is one that will become more and more critical as AI pervades every aspect of our lives.

Part of the solution will come from democratizing AI so that it’s not something that belongs to one culture[3]. If we make it something that multiple cultures, multiple countries and diverse groups take part in modelling, and if we can ensure that it’s developed not just in predominantly Western cultures but in Africa, Latin America, APAC and everywhere else, then we can hopefully start to balance the equation and start working out the bias.

Like teaching our kids bad habits – when they throw a tantrum, they get what they asked for – if we train our AI on bad habits, bad patterns from our past and bad behaviours we exhibit today, we’re only fueling trouble down the road. If we can consider our actions carefully, thinking about the consequences, the implications and the nuances, perhaps AI can be part of how we kick the habit of our addiction to bias.