In the past few years, machine learning (ML) has revolutionized the way we do business. What sets machine learning apart from other approaches to automation is a step away from rules-based programming. ML algorithms allow engineers to leverage data without explicitly programming machines to follow specific problem-solving paths. Instead, machines arrive at the right answers themselves based on the data they have. This capability has made business executives reconsider the ways they use data to make decisions.
In layman’s terms, machine learning is applied to make forecasts on incoming data using historic data as training examples. For instance, you may want to predict customer lifetime value in an eCommerce store, i.e. the net profit of the future relationship with a customer. If you already have historic data on different customer interactions with your website and the net profits associated with these customers, you may want to use machine learning. It will allow for early detection of the customers who are likely to bring the most net profit, enabling you to focus greater service effort on them.
While there are multiple learning styles, i.e. approaches to training algorithms using data, the most common style is called supervised learning. This time, we’ll talk about this branch of data science and explain why it is considered low-hanging fruit for businesses that plan to embark on an ML initiative. We’ll also describe the most common use cases.
How supervised machine learning works
Supervised machine learning assumes that the expected answer to a problem is unknown for upcoming data but is already known in a historic dataset. In other words, historic data contains correct answers, and the task of the algorithm is to learn from them how to predict the answers for new data.
As an example, let’s have a look at a public dataset gathered by a Portuguese banking institution during a 2012 marketing campaign. The bank aimed to encourage its customers to subscribe to term deposits by calling them and pitching the service.
Usually, datasets are tables with data items (e.g. bank customers) organized in rows and variables (e.g. age, job, education, account balance) in columns. Labeled datasets also have target variables (labels), the values to be predicted in future data. In this dataset, the target variable records whether or not a customer subscribed to a term deposit after a call.
Applying ML to datasets of this kind will help determine the likelihood of other bank clients subscribing to a term deposit.
Dataset credit: S. Moro, P. Cortez, and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014
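The table layout described above can be sketched in a few lines of Python. This is only an illustration: the three rows, their values, and the column name "subscribed" are invented for the example and are not taken from the actual bank dataset.

```python
# Hypothetical rows: values are invented for illustration and are
# NOT copied from the real bank marketing dataset.
labeled_rows = [
    {"age": 58, "job": "management", "education": "tertiary", "balance": 2143, "subscribed": "no"},
    {"age": 33, "job": "technician", "education": "secondary", "balance": 480, "subscribed": "yes"},
    {"age": 41, "job": "services", "education": "secondary", "balance": 15, "subscribed": "no"},
]

# Assumed name of the target variable (label) column.
TARGET = "subscribed"


def split_features_and_label(row, target=TARGET):
    """Separate the input variables of one row from its target variable."""
    features = {name: value for name, value in row.items() if name != target}
    return features, row[target]


features, labels = [], []
for row in labeled_rows:
    x, y = split_features_and_label(row)
    features.append(x)
    labels.append(y)

print(features[0])  # the variables an algorithm learns from
print(labels)       # the known answers for the historic rows
```

Each row is one data item, each key is a variable, and the label column holds the answer that is known for historic data but missing for new customers.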
Training an ML algorithm means feeding this data into one of several mathematical methods that fit a model to it. The resulting model can then predict the target variable in future data. In this case, the task of the algorithm would be to classify data items into two categories (yes/no). Generally, supervised learning deals with three main tasks:
Binary classification. The case of binary classification is described above. The algorithm classifies data into two categories.
Multiclass classification. This requires the algorithm to choose between more than two possible values of a target variable.
Regression. Regression models predict continuous values, while classification models consider categorical ones. For instance, predicting net profit as a measurement of the customer lifetime value is a standard regression problem.
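To make the difference between these tasks concrete, here is a minimal sketch that uses a k-nearest-neighbors approach, chosen here only because it fits in a few lines; the article does not prescribe any particular algorithm. The same neighbor search serves both task types: classification takes a majority vote over labels, while regression averages continuous targets. All data points are hypothetical, and the features are left unscaled for brevity.

```python
from collections import Counter
from math import dist
from statistics import mean


def nearest_targets(train, query, k):
    """Return the targets of the k training items closest to the query."""
    ranked = sorted(train, key=lambda item: dist(item[0], query))
    return [target for _, target in ranked[:k]]


def classify(train, query, k=3):
    """Classification: majority vote among the neighbors' labels."""
    votes = Counter(nearest_targets(train, query, k))
    return votes.most_common(1)[0][0]


def regress(train, query, k=3):
    """Regression: average of the neighbors' continuous targets."""
    return mean(nearest_targets(train, query, k))


# Hypothetical data: (age, balance in thousands) -> subscribed yes/no
subscriptions = [
    ((25, 0.2), "no"), ((30, 0.5), "no"), ((35, 0.4), "no"),
    ((45, 3.0), "yes"), ((50, 4.2), "yes"), ((55, 3.8), "yes"),
]

# Hypothetical data: (orders per year, average order value) -> net profit
profits = [
    ((2, 20.0), 15.0), ((3, 25.0), 30.0), ((4, 22.0), 36.0),
    ((10, 60.0), 240.0), ((12, 55.0), 260.0), ((11, 65.0), 280.0),
]

print(classify(subscriptions, (48, 3.5)))  # a categorical answer
print(regress(profits, (11, 60.0)))        # a continuous answer
```

The classify function would handle a multiclass problem unchanged, since the vote works for any number of distinct labels.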
3 main problems that should be considered for supervised learning
That said, business executives should consider three main aspects to justify decisions about adopting supervised learning techniques.
Collecting data
Data is the backbone of machine learning: The more records you have, the higher the chance of building accurate models. Collecting and organizing your data the right way is a form of art. Not only do you have to set up a coherent data collection mechanism, but you also have to ensure that your variables are relevant for prediction.
Labeling data
In the bank case mentioned above, labeling doesn’t seem like a challenge. If the data collection was done right, the labels were assigned straightaway after the marketing call or after the campaign was finished. But, usually, things are more complex than that.
Imagine that you want to automatically sort out rotten apples from good ones on a packaging line. If you apply image recognition techniques, you will have to build a large set of images containing various apples, both rotten and good ones. Then you’ll have to manually assign labels to them (good/rotten). Considering that image recognition is only possible if you have thousands of examples, labeling may take too much time.
In 2006, Google crowdsourced its image labeling by offering users a game-like experience that asked people to simply label images, thus contributing to the company’s AI development. Another leading AI company, Amazon, also crowdsourced its labeling duties by creating the Mechanical Turk platform, where people can earn money by assigning data labels.
Guru Banavar, the IBM data scientist behind the Watson AI platform, estimated that about 70 percent of complex analytical tasks today are related to data preparation and suggested the term “data labeler.” There have to be people who prepare and label data for machines to understand. Here’s a situation in which the automation of human labor driven by ML creates new job opportunities.
Prediction accuracy
The prediction accuracy of coin flipping is 50 percent for binary classification, according to probability theory. Good prediction accuracy in machine learning is often considered to be about 90 percent, although this number can vary depending on the task. But the point is clear: a model’s predictions are probabilistic, so you can’t expect the same certainty you would get from a standard rules-based approach when making critical decisions. There’ll always be a chance that your prediction is wrong.
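A short sketch shows how accuracy is computed and why the number is easier to judge next to a naive reference point. The reference used here, always predicting the most frequent label, is an assumption of this example rather than something the text above prescribes, and the labels are hypothetical.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Share of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def majority_baseline(actual):
    """Accuracy of always predicting the most frequent label."""
    most_common_count = Counter(actual).most_common(1)[0][1]
    return most_common_count / len(actual)


# Hypothetical labels: 8 of 10 customers did not subscribe.
actual    = ["no", "no", "no", "no", "no", "no", "no", "no", "yes", "yes"]
predicted = ["no", "no", "no", "no", "no", "no", "no", "yes", "yes", "no"]

print(accuracy(predicted, actual))   # 0.8
print(majority_baseline(actual))     # 0.8
```

In this made-up case the model is right 80 percent of the time, yet so is a rule that answers "no" for everyone, which is one reason why a "good" accuracy figure depends on the task.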
That’s exactly the issue behind the new Face ID technology used in the iPhone X. Apple claims that Face ID uses machine learning to adapt image recognition to constant changes in human appearance, whether you’ve grown a beard or put on Hunter-Thompson-type glasses. The company doesn’t disclose the ML algorithms under the hood, but the rising concerns basically address this prediction accuracy problem. Can we delegate such important decision-making tasks to a machine that can’t be 100 percent right? While Apple thinks we can, we recommend keeping in mind that ML solutions don’t always give the right answers.
However, even these challenges don’t prevent supervised learning from being the most business-oriented style of ML. It’s less independent than unsupervised learning, where data isn’t labeled because analysts may not know the target variables. (Unsupervised learning is used to find anomalies in data or to cluster data items into groups that humans wouldn’t identify on their own.) It’s also more practical than reinforcement learning, which currently thrives only in closed game-like systems.
Common use cases for supervised learning
In November 2016, Tech Emergence published the results of a small survey among artificial intelligence experts to outline low-hanging-fruit applications of machine learning for medium and large companies. Although there were only 26 respondents, each of whom could vote multiple times, the results confirmed what was already evident.
Please note that the survey covered both supervised and unsupervised learning. While supervised learning covers the lion’s share of ML applications, in data security the unsupervised style is dominant.
Interestingly, the groups used by Tech Emergence provide only a vague understanding of how use cases are distributed among different machine learning tasks. For example, Big Data can apply to any of the mentioned groups, given that the algorithms process large and poorly structured datasets regardless of the industry and operations field the data comes from. Also, sales tasks usually overlap with marketing ones when it comes to analytics. That’s why we suggest a slightly different breakdown of the most common use cases.