The media update team looks at unsupervised learning algorithms and the interesting ways in which they are being used.

Machine learning, a component of artificial intelligence (AI), is helping humans in many ways, from predicting your best route to work to learning what products to recommend to you.

But we don’t always have complete datasets that machines can learn from. Without comprehensive information, they can’t predict future data or deliver the answers we need.

We might, for instance, not have a dataset of both fraudulent credit card transactions and the attributes that made these incidents fake. In this scenario, a machine wouldn’t be able to learn what the typical characteristics of fake transactions are – and would therefore be unable to determine whether new transactions are fraudulent or not.

Enter unsupervised machine learning. Unsupervised machine learning can be defined as a type of machine learning algorithm that can draw conclusions from data that is not classified, categorised or labelled.

It allows machines to explore data to find intrinsic structures within unlabelled datasets. It’s used when input data is available in a dataset, but the corresponding output data is not. Unsupervised learning is useful in our credit card transaction scenario above, where only the attributes of the transactions are available in a database – but no data on which transactions were fake.

Unsupervised learning can ‘explore’ data like this and discover knowledge within it, but it doesn’t answer a specific question or determine a specific value.

It’s one of the main types of machine learning, with supervised learning being another. The media update team covered supervised machine learning in a previous article, #Definition: What is supervised learning in AI?

The cool ways unsupervised learning is being used

Unsupervised learning requires only input data, making it perfect for examining massive datasets like customer behaviour databases, genetic data and patient health information.  

Using this form of learning means humans don’t have to spend months entering the corresponding output data, or label, for each row in a database.

One way unsupervised learning does this is via clustering, a well-known approach to unsupervised learning. Clustering basically groups similar data points together, so you end up with data that has been sorted based on the factors the points have in common.

What’s great about clustering is that it can quickly recognise a number of factors that data points have in common – which would take humans much longer to do. Not only that, but because humans are subjective, there are some factors they might never consider when looking for similarities in data points.
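To make the idea concrete, here is a minimal sketch of k-means, one common clustering algorithm, in plain Python. The article doesn’t name a specific algorithm, so k-means is simply our pick for illustration. The `kmeans` function, its parameters and the customer figures are all hypothetical.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters using a basic k-means loop."""
    # Assumption for this sketch: use the first k points as starting
    # centres so the result is repeatable. Real libraries choose
    # starting centres more carefully.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centre.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 2: move each centre to the average of its points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical customers: (visits per month, average spend per visit)
customers = [(1, 20), (12, 210), (2, 25), (11, 190), (1, 30), (13, 200)]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

Nobody told the code which customers belong together; it only received the raw numbers. The occasional, low-spending visitors end up in one group and the frequent, high-spending ones in the other. In practice you would usually rescale the columns first, so that a large-valued attribute like spend doesn’t drown out the others.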

Take the example of segmenting consumers.

Marketers use information about the various segments – or types – of consumers they are targeting. But, if you have a list of 20 000 customers with their corresponding attributes, then it could take a long time to figure out what some of them have in common and group them accordingly. Not only that, but you’re bound to miss some of the similarities between the data points.

The same goes for monitoring the behaviour of users on a company’s network. Unsupervised machine learning can detect anomalies in user activity by recognising patterns in thousands of users’ typical conduct on the network. Once a user behaves in a way that is different from the rest, the company is notified of a possible breach of security.
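One simple way to picture this kind of anomaly detection is sketched below. It is not how any particular security product works. It assumes each user is summarised by a single number (files downloaded in a day) and flags anyone more than two standard deviations away from the group average. The user names, figures, `find_unusual_users` function and threshold are all made up for illustration.

```python
from statistics import mean, stdev


def find_unusual_users(activity, threshold=2.0):
    """Return users whose activity is far from the group's typical level."""
    values = list(activity.values())
    typical = mean(values)
    spread = stdev(values)
    if spread == 0:
        # Everyone behaves identically, so nobody stands out.
        return []
    return [
        user
        for user, value in activity.items()
        if abs(value - typical) / spread > threshold
    ]


# Hypothetical data: files downloaded per user in one day
downloads = {
    "anna": 12, "ben": 15, "carla": 11, "dev": 14,
    "eli": 13, "fay": 12, "gus": 16, "hana": 240,
}
flagged = find_unusual_users(downloads)
print(flagged)
```

Again, no one labelled any user as suspicious beforehand. The code only learns what ‘typical’ looks like from the data itself and reports whoever falls well outside it. A real system would track many more signals per user and would likely use a measure that is less easily skewed by the outliers it is trying to find.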

Unsupervised learning can quickly find similarities and patterns in data that humans can’t always recognise, and uncover insights that give businesses the competitive edge.  

And that’s what makes it cool.
