Machine Learning Models in Insurance: Understanding Applicability and Usage

The terms “artificial intelligence” and “machine learning” are used with growing frequency within the insurance field. While the language is gaining ground in popular understanding, the concepts themselves are often oversimplified, conflated, misused, or misunderstood. Without a clear grasp of the applicability and usage of models in machine learning, business development and sales staff may be at a disadvantage as they try to position products.

As you expand your offerings, a clear understanding of what machine learning (ML) is (and isn’t) will help managers see how ML can support business goals. Where and how does your product really use machine learning? Being able to pinpoint and accurately describe the answers will provide an advantage over competing products, improving your overall value proposition.

What is Machine Learning?

Too frequently in discussions about topics related to artificial intelligence (AI), terminology gets muddled. To highlight the role of machine learning within AI and focus on its applicability in insurance, let’s take a quick look at important terms and how they relate to each other:

  • Artificial intelligence (AI): Tricky to define, AI generally means an automated system that can perform work that would otherwise require human involvement. Applying rules that are to some extent generalizable to unseen examples is usually considered AI. AI work includes the three steps of problem representation (translating a business problem into computer-based representation), goal definition (usually in the form of a mathematical equation to calculate error/utility), and target optimization (to arrive at a solution, usually via a series of iterative computations).
  • Machine learning (ML): A field within AI, ML aims to allow computers to learn and improve at a task from data, without being explicitly programmed. ML models learn with experience; the rise of bigger data sets (big data) is propelling the rise of ML. It can be divided into supervised (using labeled data samples), unsupervised (finding hidden patterns in unlabeled data), semi-supervised (a hybrid approach that trains models on a small amount of labeled data together with a larger amount of unlabeled data), and reinforcement (enabling the computer to create a decisioning model that chooses the best action) learning.
  • Neural networks: A class of machine learning models based on neural units where information is received, processed, and transmitted as an output.
  • Deep learning: Machine learning that uses neural networks that have many layers of neural units.
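
The three steps listed under AI above (problem representation, goal definition, and target optimization) can be sketched in a few lines of Python. This is a minimal illustration, not a production approach: the data points are invented, and the linear model, the mean squared error, and the gradient descent settings are assumptions chosen only to keep the example small.

```python
# Step 1: problem representation.
# Hypothetical toy data: (input value, target value) pairs, invented for illustration.
DATA = [(1.0, 9.0), (2.0, 8.0), (3.0, 7.0), (4.0, 6.0), (5.0, 5.0)]


def predict(w, b, x):
    """A straight line is the computer-based representation of the problem."""
    return w * x + b


# Step 2: goal definition.
# Mean squared error is the equation that measures how wrong the model is.
def mean_squared_error(w, b, data):
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)


# Step 3: target optimization.
# Gradient descent lowers the error through a series of iterative computations.
def fit(data, learning_rate=0.05, steps=5000):
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


W, B = fit(DATA)
print("slope:", W, "intercept:", B, "error:", mean_squared_error(W, B, DATA))
```

The same three-step pattern applies to far larger models; only the representation, the error equation, and the optimizer become more elaborate.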

The Leading Role of Supervised Learning

Within the insurance industry, supervised learning is the form of ML that’s most impactful. This type of learning is seeing high adoption in an operational context, leading to greater efficiencies. This is the case both with structured data (most frequently in tabular form) and with more complex unstructured data (e.g., images and free text). Structured data can be handled more easily than unstructured data, making it the low-hanging fruit within the insurance industry. (ML models on unstructured data are also gaining ground, as demonstrated by the rise of chatbots.)

Depending on the data type of the target variable, one of two categories of supervised learning will be more appropriate. The first, regression, involves the prediction of a continuous variable (e.g., prices or probabilities). The independent (input) variables can be continuous or discrete. For example, plotting drivers’ amount of experience against the probability of making a claim this month can show the expected probability that a driver with a particular amount of experience makes a claim this month. Regression models in insurance could be used for pricing optimization, risk/credit scoring, and product recommendation engines (with output either for customers or for agents).
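
The driver-experience example can be sketched as a one-variable least-squares regression. The figures below are made up purely for illustration, and the function names are hypothetical. A straight line is also an assumption made for brevity; models built for real pricing work would usually be chosen so that predictions stay within the 0 to 1 range on their own, which is why the sketch clamps its output.

```python
# Hypothetical, made-up observations for illustration only:
# years of driving experience and the observed share of drivers claiming in a month.
EXPERIENCE_YEARS = [1, 2, 3, 5, 8, 10]
MONTHLY_CLAIM_RATE = [0.051, 0.045, 0.043, 0.033, 0.023, 0.013]


def fit_line(xs, ys):
    """Ordinary least squares for one input variable; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def expected_claim_probability(slope, intercept, years):
    """Predict, then clamp to [0, 1] because a straight line can leave that range."""
    return min(1.0, max(0.0, intercept + slope * years))


SLOPE, INTERCEPT = fit_line(EXPERIENCE_YEARS, MONTHLY_CLAIM_RATE)

# Expected probability of a claim this month for a driver with 4 years of experience.
print(expected_claim_probability(SLOPE, INTERCEPT, 4))
```

Here the target (claim probability) is continuous, which is what makes this a regression problem rather than a classification problem.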

The other category, classification, employs discrete data types (e.g., gender or blood type) for the target variable. For example, is the insured a smoker or a non-smoker? In this instance, nondisclosure can be problematic. By using other pieces of disclosed information (such as the mean number of cups of coffee consumed each week), a classification model can predict whether an applicant is actually a smoker. Identity verification, through voice or facial recognition, is another use case for classification.
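
The smoker example can be sketched as a small logistic regression classifier. Everything specific here is an assumption for illustration: the training records are invented, coffee consumption is the only input, and the scaling factor, learning rate, and 0.5 threshold are arbitrary choices rather than recommended settings.

```python
import math

# Hypothetical training records, invented for illustration:
# (cups of coffee per week, 1 if the person is a known smoker, else 0).
TRAINING = [(2, 0), (5, 0), (7, 0), (10, 0), (14, 0), (20, 0),
            (12, 1), (18, 1), (21, 1), (25, 1), (28, 1), (30, 1)]
SCALE = 10.0  # divide cups by 10 so gradient descent behaves well


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, learning_rate=0.1, steps=5000):
    """Fit a one-input logistic regression with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for cups, label in data:
            x = cups / SCALE
            error = sigmoid(w * x + b) - label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def smoker_probability(w, b, cups):
    return sigmoid(w * cups / SCALE + b)


def predict_smoker(w, b, cups, threshold=0.5):
    """Turn the probability into a discrete class: smoker (True) or non-smoker (False)."""
    return smoker_probability(w, b, cups) >= threshold


W, B = train(TRAINING)
print(predict_smoker(W, B, 27), predict_smoker(W, B, 3))
```

The output is a discrete label, which is what distinguishes classification from regression. In practice, a prediction like this would more plausibly flag an application for review than settle the question on its own.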

Varied Needs, Varied Models

As underlying data becomes increasingly complex, more sophisticated machine learning models become necessary (and larger data sets give those models more to learn from). Yet more complex models aren’t always the most beneficial ones to apply, depending on resources and needs. If response time is critical, simpler models can deliver more useful real-time predictions than more complex ones. To this end, working with fewer variables can be beneficial within ML, as it allows use of simple models that run quickly.


Four Examples of Machine Learning

Insurance has benefitted, and will continue to benefit, from longstanding use of statistical machine learning techniques, particularly within actuarial underwriting and pricing. However, it’s getting harder to separate fact from fiction in ML and AI claims. For the insurance industry, the key is to understand business goals before kicking off machine learning projects.




Craig Beattie

Craig Beattie is a Senior Analyst in Celent’s insurance practice. He leverages his experience in the use of enterprise architecture and application architecture practices in the insurance industry to advise insurers on topics including IT strategy, legacy modernization, insurance vendor analysis, and enterprise architecture. More in-depth reports are available from