What is machine learning?
Dr. Frank Sauberlich discusses the inner workings of machine learning.
Every useful machine – from advanced AI through to the humble toaster – has to possess ‘knowledge’ about the world in which it operates. For example, within the design of a household toaster, much knowledge about the world must have already been implemented. This is implicit knowledge.
The toaster ‘knows’ what electrical voltage is available through the design of its electric components. Through its physical dimensions it knows the size of a slice of bread (with some allowance for variation). Within the toaster’s regulation mechanisms lies knowledge of how long toasting typically takes, and in its switch and knob design it even knows the typical size of the fingers that push and turn them.
Similarly, an AI has to possess a lot of knowledge. For example, an AI may know which medicine should be prescribed for neurodermatitis; another AI may know which phonemes people use to utter “call John!”; the AI in your car may know which visual features extracted from a camera stream indicate the markings on the left and right of a highway lane.
Such knowledge is a requirement for the proper operation of any machine. However, an AI is particularly demanding in the amount of knowledge it needs to succeed – perhaps a million times more bits and pieces than a successful toaster needs. Without an extensive amount of knowledge, there is no AI. This requirement for knowledge about the world goes back to Conant and Ashby’s good regulator theorem, which states that every good regulator of a system must be a model of that system.
Direct design vs machine learning
The question any machine creator faces is how best to impart the necessary knowledge into the machine. Essentially, there are two ways to give a machine knowledge: direct design and machine learning. Direct design encompasses any features inserted into a machine directly by a human designer, i.e. by an engineer who understands the problem and knows how to solve it. For example, an engineer will choose heating elements of adequate resistance to reach the right temperature for toasting bread, given the voltage supplied by the electrical grid. Similarly, a computer programmer may use IF/THEN statements to achieve the desired behaviour of a piece of software. This is the traditional way of engineering.
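To make the IF/THEN idea concrete, here is a minimal sketch of direct design in Python. The function name toasting_seconds, the browning levels and the durations are all hypothetical values chosen for illustration; the point is only that every number in it was put there by a person.

```python
def toasting_seconds(browning_level: int) -> int:
    """Direct design: each threshold and duration below was chosen by a human.

    The levels and durations are made-up illustration values.
    """
    if browning_level <= 1:
        return 60
    elif browning_level == 2:
        return 90
    elif browning_level == 3:
        return 120
    else:
        return 150


print(toasting_seconds(2))  # prints 90
```

Nothing here is learned. If the rule turns out to be wrong, a human has to edit the code.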
When we look to insert especially large amounts of knowledge, we may want to do it differently – automatically. We may endow a machine with a component that automatically fills up other parts of the machine with knowledge. This is the basis of machine learning.
Rather than getting its world-related knowledge explicitly from a human designer, a machine is given only a few rules defining how to extract knowledge. The machine is then left to interact with the world on its own: it learns. As a proxy for the world, a data set is usually provided containing condensed information on the correct action to take in a given situation.
The machine is left to find the rules in the data, and express those rules in a way that is suitable for the machine. The results are often big matrices of numbers – the machine figures out how to get the numbers in the matrices right. The process typically involves some sort of optimization algorithm whereby the error, defined as the disparity between what the machine does and what it should do, is gradually reduced.
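The loop described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular product: the data set is made up, the model is a straight line with just two numbers (w and b) standing in for the "big matrices", and plain gradient descent is assumed as the optimization algorithm. The names fit_line and mean_squared_error are hypothetical.

```python
def mean_squared_error(w, b, xs, ys):
    """Error: average squared gap between what the model outputs and what it should output."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, steps=5000):
    """Learn w and b for y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # the machine starts out knowing nothing
    n = len(xs)
    errors = []
    for _ in range(steps):
        errors.append(mean_squared_error(w, b, xs, ys))
        # Gradients say in which direction each number should move to shrink the error.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, errors


# Made-up data set: browning level -> correct toasting time in seconds.
levels = [1, 2, 3, 4, 5]
seconds = [60, 90, 120, 150, 180]

w, b, errors = fit_line(levels, seconds)
print(round(w, 2), round(b, 2))  # both should come out close to 30
print(errors[0], errors[-1])     # the error at the start versus at the end
```

No human wrote "30 seconds per level" into the program. The two numbers were found by repeatedly nudging them in whichever direction reduced the error.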
Interestingly, at the end of such a learning process a machine will often perform better than if humans had attempted to achieve the same outcomes through direct design. Also, what a machine learns is often not easily understandable by humans. Machine ways of solving problems are often nothing like the way a human would explain how to do a task, e.g., “Once this light switches off, press this key.” Machine wisdom is expressed in a pool of numbers that are virtually impossible to track and understand.
Supervised and unsupervised learning
There are two main categories of machine learning algorithms: supervised learning and unsupervised learning (see this blog – Machine learning goes back to the future – for more details). There’s also a special subset of machine learning algorithms that are designed to be human interpretable, but typically the cost of making these algorithms interpretable is that of lower overall performance (see Objectives and accuracy of machine learning algorithms).
Machine learning is often set up such that the majority of the learning takes place during initial production, i.e. before the product reaches the customer. AI diagnostic systems used by physicians, for example, capitalise on a supervised machine learning algorithm trained using data from historical patients. The algorithm is then applied to predict the most likely diagnosis for a new set of symptoms and patient health history. Learning may also continue after delivery, with the machine improving through interactions with the customer.
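The "train first, then predict" pattern can be sketched as follows. This is a toy example and not how a real diagnostic system works: the features are two made-up numbers per case, the labels "A" and "B" are placeholders, and a nearest-centroid classifier is assumed because it is one of the simplest supervised methods. The names train and predict are hypothetical.

```python
import math
from collections import defaultdict


def train(examples):
    """Supervised learning step: compute the average feature vector (centroid) per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, features):
    """Prediction step: return the label whose centroid is closest to the new case."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Made-up historical cases: (features, known correct label).
history = [
    ((1.0, 1.2), "A"),
    ((0.8, 1.0), "A"),
    ((3.0, 3.2), "B"),
    ((3.2, 2.9), "B"),
]

model = train(history)            # happens before the product ships
print(predict(model, (1.1, 0.9))) # prints A
print(predict(model, (2.9, 3.1))) # prints B
```

Continued learning after delivery would amount to adding newly confirmed cases to the history and calling train again.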
A successful AI needs more than just machine learning. To understand the other components necessary to create an accomplished AI, in the next blog we will look at what a complete AI looks like without any machine-learned knowledge.
Don't miss part one of this series, "Wait, machine learning and artificial intelligence aren’t the same thing?"