We’ve heard a lot about Machine Learning (ML) in recent years. The power of this approach and its related technologies is fuelling a huge demand for the skills and expertise needed to build machine learning models, and that demand creates its own set of challenges.

Machine learning sets out to extract knowledge from data. It requires expert data science skills to organise that data into a well-classified, labelled set, and building the ML model that delivers the power we’re looking for often consumes vast resources (in reality, more people than compute).

So how can we democratise this process? How can we make it more accessible and more pervasive, and enable a bigger portion of our workforce to make it happen?

It turns out one promising approach lies in how human knowledge is built – teaching.

Teaching is an iterative process. Rather than giving your child hundreds of grammar rules, you start by giving them examples of words, then simple phrases, followed by simple and then ever more sophisticated sentences. Their knowledge evolves through breaking the problem down into its constituent parts, teaching by example of both right and wrong, and iterating repeatedly to refine.

The same approach can be applied to machine intelligence, or machine knowledge models. The traditional approach to machine learning is to provide the machine with vast amounts of labelled data, but that’s not what we do when we teach another human. A teacher provides the label for what you’re looking for, tells you where to look, breaks the concept down, provides examples and tests for understanding.

Separating the teaching information from the algorithm

Where traditional development of ML models centres on building the algorithm, and hence requires deeply specialised skills, machine teaching abstracts the teaching information away from the algorithm. This allows the teaching to be performed not by ML specialists but by subject matter experts in the domain being taught.
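
To make that separation concrete, here’s a minimal sketch in Python, with hypothetical names throughout: the teaching information is plain data that a subject matter expert can author and edit, while the generic learning code contains nothing specific to the domain being taught.

```python
# A minimal sketch of separating teaching information from the algorithm.
# RECIPE_SPEC and train() are illustrative assumptions, not a real
# machine-teaching API.

# Authored by the teacher (a subject matter expert), not an ML engineer.
RECIPE_SPEC = {
    "concept": "recipe",
    "positive_examples": ["preheat the oven to 180C",
                          "whisk the eggs and sugar"],
    "negative_examples": ["the stock market fell today"],
}

def train(spec):
    """Generic learner: works for any concept spec and knows nothing
    about recipes. Any word seen in positive examples but never in
    negative ones is treated as evidence for the concept."""
    pos, neg = set(), set()
    for text in spec["positive_examples"]:
        pos.update(text.lower().split())
    for text in spec["negative_examples"]:
        neg.update(text.lower().split())
    evidence = pos - neg
    return lambda text: any(w in evidence for w in text.lower().split())

is_recipe = train(RECIPE_SPEC)
print(is_recipe("whisk in the butter"))        # True
print(is_recipe("markets closed lower today")) # False
```

Changing the domain means changing the spec, not the code, which is exactly the separation that lets the subject matter expert do the teaching.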

The teacher takes examples, breaks them down into components and labels those components. This labelling is performed as part of an interactive data exploration and debugging process: you start with no labelled data and rapidly build a library of labels by example. You can then manage by exception, continuing to iterate and refine the labels by testing, flagging which examples did or did not match, and labelling those further to teach the model why.
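
The sketch below simulates that teach-by-exception loop with a deliberately naive keyword learner and a stand-in teacher oracle; both are illustrative assumptions rather than a real machine-teaching API, but they show the shape of the process: label a little, review only the model’s misses, and fold the corrections back in.

```python
# A minimal sketch of the teach-by-exception loop; build_model() and
# teacher() are hypothetical stand-ins for illustration.

def build_model(labelled):
    """Naive classifier: any word seen in positive examples but never
    in negative ones counts as evidence for the concept."""
    pos, neg = set(), set()
    for text, is_positive in labelled:
        (pos if is_positive else neg).update(text.lower().split())
    evidence = pos - neg
    return lambda text: any(w in evidence for w in text.lower().split())

def teacher(text):
    """Stand-in for the subject matter expert's judgement."""
    return any(w in text for w in ("oven", "whisk", "simmer"))

labelled = [("preheat the oven to 180C", True),
            ("the stock market fell today", False)]
pool = ["whisk the eggs and sugar",
        "shares rallied after the announcement",
        "simmer the sauce for ten minutes"]

for step in range(3):
    model = build_model(labelled)
    # Manage by exception: only the examples the model got wrong go
    # back to the teacher, who labels them to teach the model why.
    exceptions = [t for t in pool if model(t) != teacher(t)]
    for text in exceptions:
        labelled.append((text, teacher(text)))
        pool.remove(text)
    print(f"iteration {step}: corrected {len(exceptions)} exceptions")
```

On each pass the teacher sees only the model’s mistakes, so the library of labels grows no faster than the model’s misunderstanding demands.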

Decomposing the labelling allows training on more atomic components of the problem, which can then be applied across a much larger set of data without the need to label every example individually. It also promotes reuse, as components of one model can be carried over into another.

A simplified example is the use of text analysis to recognise a recipe on an arbitrary web page. By teaching the model some phrases that are often found in a recipe, and teaching it what an ingredient looks like in a list, the model can quickly learn what does and does not look like a recipe, using examples of both positive and negative reinforcement. Further examples of the decomposed components of a recipe, such as a heading or an ingredient, can be added as enhancements and potentially reused in other applications that need to recognise an ingredient.
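
As a rough illustration, the sketch below decomposes the recipe recogniser into small component recognisers; the regular expression, the phrase list and the evidence threshold are illustrative assumptions rather than a real recipe model.

```python
# A minimal sketch of recognising a recipe from decomposed components;
# the patterns and threshold here are illustrative assumptions.
import re

def looks_like_ingredient(line):
    """Atomic component: a quantity, an optional unit and a food word,
    e.g. '200 g flour'. Reusable anywhere ingredients must be spotted."""
    return re.match(r"^\s*\d+(\.\d+)?\s*(g|kg|ml|cups?|tbsp|tsp)?\s+\w+",
                    line) is not None

def looks_like_recipe_phrase(line):
    """Atomic component: phrases the teacher says often occur in recipes."""
    phrases = ("preheat the oven", "serves", "prep time", "ingredients")
    return any(p in line.lower() for p in phrases)

def looks_like_recipe(page_text, min_evidence=3):
    """Compose the components: a page with enough ingredient lines and
    recipe phrases is judged to be a recipe."""
    lines = page_text.splitlines()
    evidence = sum(looks_like_ingredient(l) for l in lines) \
             + sum(looks_like_recipe_phrase(l) for l in lines)
    return evidence >= min_evidence

recipe_page = """Chocolate cake
Serves 8
Ingredients
200 g flour
2 eggs
Preheat the oven to 180C."""

print(looks_like_recipe(recipe_page))                      # True
print(looks_like_recipe("Shares rallied after results."))  # False
```

Because looks_like_ingredient knows nothing about recipes as a whole, the same component could be reused in, say, a shopping-list application.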

Human experience has taught us that teaching one another means breaking a problem into smaller components and teaching by specific example, so that the learner can then recognise the general case. It turns out that teaching machines may benefit significantly from this most human of approaches.