Chapter 7 Risk Categorization/Classification Model – Artificial Intelligence for Risk Management


Risk Categorization/Classification Model

Define the Goal

Define the goal of the risk categorization/classification model.

The goal is to categorize and classify risks associated with the application area using artificial intelligence (AI)/machine learning (ML) to automate the process. The output from a previous risk identification model is the input to this risk categorization.

Evaluation Steps

Follow the risk identification model evaluation steps in a similar manner. See Figure 7.1.

Figure 7.1 Evaluation steps for risk categorization

Evaluate Measure or Key Performance Indicator


  • List the measures, key performance indicators (KPIs), and risk measures identified in Step 1 of the risk identification model.


  • List identified risk measures with their risk category names

Evaluate Business Processes


  • List business processes and measures/KPIs with identified risks.


  • List identified risk measures in business processes in the risk category.

Identify the list of measures > Identify the risks list > Identify the list of categories.


  • Feed all identified measures of the given business process from “identify risk model” to train the model.

Evaluate the Given Dataset

From the given dataset (retail industry, POS dataset):

Identify the business process > identify the list of measures > identify the risk


  • Feed all risk measures identified in the “identify risk or threat model” step.


  • List respective risk category names

Identify the business process > Identify the list of measures > Identify the risks list > Identify the list of categories.

Evaluate Project-Related Documents

Collect the data from the given project-related documents.

Now we go into data collection to collect risk category data.

Data Collection

Collect Risk Category Data

Collecting a risk category dataset for the application area may require manually labeling category data, based on the identified risks, so that it can be maintained as the input.


  • Output of the risk identification model.
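A minimal sketch of what such a manually labeled dataset might look like, assuming each record pairs a risk measure from the risk identification model with a category name. The two rows reuse risk measures and categories that appear in this chapter; the field names and the `encode_categories` helper are illustrative assumptions, not part of the original method.

```python
# Hypothetical labeled records: a risk measure from the risk identification
# model paired with a manually assigned risk category.
labeled_risks = [
    {"risk_measure": "Sales amount decrease", "risk_category": "Financial/competitive risk"},
    {"risk_measure": "Sales amount increase", "risk_category": "Inventory risk"},
]


def encode_categories(records):
    """Map each distinct category name to an integer code (sorted for stability)."""
    names = sorted({record["risk_category"] for record in records})
    return {name: code for code, name in enumerate(names)}


category_codes = encode_categories(labeled_risks)
encoded_labels = [category_codes[r["risk_category"]] for r in labeled_risks]
print(category_codes)
print(encoded_labels)
```

Keeping the name-to-code mapping alongside the data makes it possible to translate encoded predictions back into risk category names later.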

Design Algorithm

Design an algorithm for the risk categorization model.

The risk categorization model is a typical ML classification problem. Thus, apply a classification algorithm.

The output is the risk category to which the identified risk measure belongs, that is, the risk category name.

ML/AI use case:

This is a multiclass classification ML problem solved with supervised learning. Now we dive deeply into the multiclass classification steps.

Identify the List of Multiclass Classification Models

Here is the list of typical algorithms used for a multiclass classification ML problem.

  • K-nearest neighbors
  • Naive Bayes
  • Decision trees
  • Support Vector Machine (SVM)
  • Random Forest
  • Decomposing into binary classification
    • One-versus-all
    • All-versus-all
  • Hierarchical clustering
  • Hierarchical SVM
  • Neural networks
  • Deep learning (with softmax technique)
    • Convolutional Neural Network (CNN)
    • Recurrent Neural Network (RNN)
    • Hierarchical attention network

Finalized Algorithm for a Multiclass Classification Problem

For our experiment, we limit ourselves to only the following algorithms:

  • RNN
  • CNN
  • GaussianNB Classifier

Data Preparation

As discussed in the data preparation section of the ML/AI project chapter, the data must be prepared so that no risk category is overrepresented or underrepresented in the dataset.

  • Oversampling/undersampling techniques are used to adjust the class distribution in the dataset.
  • Apply the dimensionality reduction technique as part of the data preparation step.
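As one illustration of adjusting the class distribution, the following is a minimal sketch of random oversampling, assuming the minority classes are simply duplicated until every class is as large as the largest one. The function name, the sample values, and the class labels are hypothetical; other resampling strategies are equally valid.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until all classes
    are as large as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical imbalanced data: four "inventory" rows, one "financial" row.
samples = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["inventory", "inventory", "inventory", "inventory", "financial"]
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Resampling should normally be applied to the training split only, so that the test and validation sets keep the original class distribution.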

The goal is to reduce the number of features to be examined in the model evaluation process. For example, if there are more than 10 risk categories, apply the dimensionality reduction process to reduce the number to below 10.

Apply any one of the following dimensionality reduction techniques to limit the number of possibilities to experiment with. This is a trade-off among time, cost, and usefulness.

List of dimensionality reduction techniques:

  • Principal component analysis (PCA)
  • Non-negative matrix factorization
  • Kernel PCA
  • Graph-based kernel PCA
  • Linear Discriminant Analysis
  • Generalized discriminant analysis
  • Autoencoder

Measure risk name     | Change percentage | Risk category
Sales amount decrease |                   | Financial/competitive risk
Sales amount increase |                   | Inventory risk

Train the Model

Use the identified datasets to train the model.

  • Split the dataset into three sets as follows:
    • Training dataset (60 percent)
    • Test dataset (20 percent)
    • Validation dataset (20 percent)
  • Tune the hyperparameters to train the model, as was done for the risk identification model.
  • Train the model over multiple iterations with 100 epochs per iteration.
  • Train the model with various combinations of features to find the best model.
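The 60/20/20 split above can be sketched as follows. This is a minimal illustration that assumes the records can be shuffled freely; the function name, the seed, and the placeholder rows are assumptions.

```python
import random


def split_dataset(rows, train=0.6, test=0.2, seed=42):
    """Shuffle rows and split them into training, test, and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    train_set = shuffled[:n_train]
    test_set = shuffled[n_train:n_train + n_test]
    validation_set = shuffled[n_train + n_test:]  # the remainder, about 20 percent
    return train_set, test_set, validation_set


rows = list(range(100))  # placeholder for 100 labeled risk records
train_set, test_set, validation_set = split_dataset(rows)
print(len(train_set), len(test_set), len(validation_set))
```

Fixing the seed keeps the split reproducible across training runs, which makes comparisons between hyperparameter settings fair.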

Tune the Model

Tune the hyperparameters to train the model.

Run the training multiple times with different combinations of the hyperparameters of the selected algorithms. Typically, vary batch size, epochs, optimizer, learning rate, momentum, and dropout rate to find the optimum combination of hyperparameters.

Compare models trained with different hyperparameters and choose the best-fit hyperparameters; then train the model further with the full dataset.
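Trying different hyperparameter combinations can be sketched as an exhaustive grid search. In this minimal illustration, `toy_score` is a made-up stand-in for a real train-and-evaluate run, and the grid values are assumptions chosen only to keep the example small.

```python
from itertools import product


def grid_search(param_grid, train_and_score):
    """Try every hyperparameter combination; return the best one and its score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Stand-in for real training: a made-up score that peaks at
    learning_rate=0.01 and dropout_rate=0.2 and ignores batch_size."""
    return 1.0 - abs(params["learning_rate"] - 0.01) * 10 - abs(params["dropout_rate"] - 0.2)


param_grid = {
    "batch_size": [32, 64],
    "learning_rate": [0.001, 0.01, 0.1],
    "dropout_rate": [0.2, 0.5],
}
best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

In practice `train_and_score` would train the chosen algorithm with the given settings and return its validation accuracy; the number of combinations grows quickly, so the grid is usually kept coarse.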

Test the Model

Test the model in the same way as the identify risk or threat model, except that this is a multiclass classification model.

Evaluate the Model

Evaluate the model using accuracy and mean squared error, and determine the learning rate.
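The two metrics can be computed as in the following minimal sketch. It assumes accuracy is measured on encoded class labels and that mean squared error is measured between one-hot targets and predicted class probabilities; the chapter does not specify the exact inputs, so both choices are assumptions.

```python
def accuracy(targets, predictions):
    """Fraction of samples whose predicted class equals the target class."""
    correct = sum(1 for t, p in zip(targets, predictions) if t == p)
    return correct / len(targets)


def mean_squared_error(target_probs, predicted_probs):
    """Average squared difference between one-hot targets and predicted probabilities."""
    total, count = 0.0, 0
    for target_row, predicted_row in zip(target_probs, predicted_probs):
        for t, p in zip(target_row, predicted_row):
            total += (t - p) ** 2
            count += 1
    return total / count


# Hypothetical encoded labels and probabilities.
acc = accuracy([0, 1, 2, 1], [0, 1, 1, 1])
mse = mean_squared_error([[1, 0], [0, 1]], [[0.5, 0.5], [0.0, 1.0]])
print(acc, mse)
```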

Performance analysis steps.

Use a Confusion Matrix

A confusion matrix measures the performance of classification algorithms. Refer to Figure 7.2.

Figure 7.2 Confusion matrix predicted versus targeted

  • X—Predicted Value [Risk Categories]—5 (1 to 5)
  • Y—Targeted Value [Target Categories]—5 (1 to 5)

Use encoded values instead of actual risk categories.
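A minimal sketch of building such a confusion matrix from encoded labels. It assumes the five categories are encoded as 0 to 4 rather than 1 to 5 so that they can index a list directly; the sample labels are hypothetical.

```python
def confusion_matrix(targets, predictions, n_classes):
    """Count (target, predicted) pairs: rows are target classes (Y),
    columns are predicted classes (X)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for target, predicted in zip(targets, predictions):
        matrix[target][predicted] += 1
    return matrix


# Hypothetical encoded labels for five risk categories (0 to 4).
targets = [0, 1, 2, 3, 4, 1, 2, 2]
predictions = [0, 1, 2, 3, 4, 2, 2, 1]
matrix = confusion_matrix(targets, predictions, n_classes=5)
for row in matrix:
    print(row)
```

Entries on the diagonal are correct predictions; every off-diagonal entry shows which category was confused with which.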

Precision–Recall Curves

Precision–recall curves measure the success of the classification model (refer to Figure 7.3).

Check the mistakes of the classification model on each label and analyze them to fine-tune the model (refer to Figure 7.4).
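Per-label mistakes can be summarized with precision and recall for each class. The sketch below computes both at a single operating point from a confusion matrix; it does not trace a full precision–recall curve, which would require repeating the calculation across decision thresholds. The matrix values are hypothetical.

```python
def precision_recall_per_class(matrix):
    """Return {class_index: (precision, recall)} from a confusion matrix whose
    rows are target classes and whose columns are predicted classes."""
    n = len(matrix)
    results = {}
    for c in range(n):
        true_positive = matrix[c][c]
        predicted_as_c = sum(matrix[r][c] for r in range(n))
        actually_c = sum(matrix[c])
        precision = true_positive / predicted_as_c if predicted_as_c else 0.0
        recall = true_positive / actually_c if actually_c else 0.0
        results[c] = (precision, recall)
    return results


# Hypothetical 3-class confusion matrix.
matrix = [
    [4, 1, 0],
    [0, 3, 1],
    [0, 0, 5],
]
results = precision_recall_per_class(matrix)
for label, (precision, recall) in results.items():
    print(label, round(precision, 2), round(recall, 2))
```

A label with low recall is often missed, while a label with low precision is often predicted when it should not be; each case suggests a different kind of fix to the training data.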

Figure 7.3 Precision and recall versus K

Figure 7.4 Accuracy versus epoch graph

Accuracy Versus Epoch Graph

Decide on the final model based on accuracy:

  • If the classification prediction is wrong, you can improve the accuracy by using more training data. Collect more data, then retrain the classifier and reevaluate the model for improvement in accuracy.
  • If the classification prediction is wrong because of wrong training data, use the urAI-text annotation tool to correct the input training data, then retrain the classifier and reevaluate the model for improvement in accuracy.
  • Another way to improve model accuracy is to use ensemble learning methods. Ensemble learning methods are:
    • Sequential ensemble methods
    • Parallel ensemble methods
    • A combination of both
    • Weighted majority rule ensemble classifier

Based on our experiments, the final ensemble method is the “weighted majority rule ensemble classifier.” It provided better accuracy than the other ensemble methods we tried. Here, we look deeply into this method and how it is done.

Weighted Majority Rule Ensemble Classifier

We use three different classification models to classify the samples: logistic regression, a naive Bayes classifier with a Gaussian kernel, and a random forest classifier, combined into an ensemble method.

  • Cross-validation (five-fold cross-validation) results provide the following:
    • Accuracy: 0.90 (±0.05) [logistic regression]
    • Accuracy: 0.92 (±0.05) [random forest]
    • Accuracy: 0.91 (±0.04) [naive Bayes]

These cross-validation results show that the performance of the three models is almost equal.
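For readers who want to see how figures of the form “accuracy (± spread)” can be produced, here is a minimal five-fold cross-validation sketch. The `majority_class_baseline` model, the data, and the use of the sample standard deviation for the spread are all assumptions for illustration; they are not the three models or the dataset used in the experiment.

```python
from statistics import mean, stdev


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(fit_predict, features, labels, k=5):
    """Return per-fold accuracies for a fit_predict(train_X, train_y, test_X) callable."""
    scores = []
    for train, test in k_fold_indices(len(labels), k):
        predictions = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        hits = sum(1 for p, i in zip(predictions, test) if p == labels[i])
        scores.append(hits / len(test))
    return scores


def majority_class_baseline(train_X, train_y, test_X):
    """Hypothetical stand-in model: always predict the most frequent training label."""
    most_common = max(set(train_y), key=train_y.count)
    return [most_common] * len(test_X)


features = [[i] for i in range(10)]
labels = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
scores = cross_validate(majority_class_baseline, features, labels)
print(f"Accuracy: {mean(scores):.2f} (±{stdev(scores):.2f})")
```

Any classifier can be plugged in through the `fit_predict` callable, so the same loop can score each of the three models under identical folds.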

Now, we implement a simple ensemble classifier class that allows us to combine the three different classifiers. Define a predict method that simply applies the majority rule to the classifiers' predictions. For example, if the predictions for a sample are

  • Classifier 1: class 1
  • Classifier 2: class 1
  • Classifier 3: class 2

then classify the sample as “class 1.”
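The majority rule for this example can be sketched in a few lines. This is an illustrative helper, not the ensemble class itself; the function name and the tie-breaking behavior (the label seen first wins) are assumptions.

```python
from collections import Counter


def majority_vote(predicted_labels):
    """Return the label predicted by the most classifiers.
    On a tie, the label that appears first wins (a simplifying assumption)."""
    return Counter(predicted_labels).most_common(1)[0][0]


winner = majority_vote(["class 1", "class 1", "class 2"])
print(winner)
```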

Furthermore, add a weights parameter that assigns a specific weight to each classifier. To work with the weights, collect the predicted class probabilities for each classifier, multiply them by the classifier weight, and take the average. Based on these weighted average probabilities, we assign the class label.

To illustrate this with a simple example, we assume three classifiers and a three-class classification problem, and we assign equal weights to all classifiers (the default): w1 = 1, w2 = 1, and w3 = 1.

We then calculate the weighted average probabilities for a sample as follows (see Table 7.1).

Table 7.1 Weighted average calculation

Classifier Name  | Class 1  | Class 2  | Class 3
Classifier 1     | w1 * 0.2 | w1 * 0.5 | w1 * 0.3
Classifier 2     | w2 * 0.6 | w2 * 0.3 | w2 * 0.1
Classifier 3     | w3 * 0.3 | w3 * 0.4 | w3 * 0.3
Weighted average | 0.37     | 0.40     | 0.23

As Table 7.1 shows, class 2 has the highest weighted average probability; thus, we classify the sample as class 2.
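The calculation in Table 7.1 can be reproduced with the short sketch below. It assumes the weighted average divides by the sum of the weights, which equals the number of classifiers when all weights are 1; the function name is illustrative.

```python
def weighted_average_probabilities(probabilities, weights):
    """Average each class's probability across classifiers, scaled by classifier weight."""
    n_classes = len(probabilities[0])
    total_weight = sum(weights)
    return [
        sum(w * row[c] for w, row in zip(weights, probabilities)) / total_weight
        for c in range(n_classes)
    ]


# Predicted probabilities from Table 7.1 (rows: classifiers, columns: classes 1 to 3).
probabilities = [
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
    [0.3, 0.4, 0.3],
]
averages = weighted_average_probabilities(probabilities, weights=[1, 1, 1])
predicted_class = averages.index(max(averages)) + 1  # classes are numbered from 1
print([round(a, 2) for a in averages], "-> class", predicted_class)
```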

  • Accuracy: 0.90 (±0.05) [logistic regression]
  • Accuracy: 0.92 (±0.05) [random forest]
  • Accuracy: 0.91 (±0.04) [naive Bayes]
  • Accuracy: 0.95 (±0.03) [ensemble]

The ensemble classifier class applies majority voting purely on the class labels if no weights are provided; otherwise, it uses the weighted predicted probability values.

Prediction based on majority class labels (Table 7.2):

Table 7.2 Prediction based on majority class label

Classifier Name | Class 1 | Class 2
Classifier 1    |         |
Classifier 2    |         |
Classifier 3    |         |





Prediction based on predicted probabilities:

This is for equal weights, weights = [1,1,1]. See Table 7.3.

Table 7.3 Weighted average with prediction


Classifier Name  | Class 1 | Class 2
Classifier 1     |         |
Classifier 2     |         |
Classifier 3     |         |
Weighted average |         |






The results differ depending on whether one applies a majority vote based on the class labels or takes the average of predicted probabilities. In general, it makes more sense to use the predicted probabilities (scenario 2). Here, a “very confident” classifier 1 overrules the very unconfident classifiers 2 and 3.
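The difference between the two scenarios can be demonstrated with a small sketch. The probabilities below are hypothetical values chosen to show one very confident classifier and two barely decided ones; they are not taken from the chapter's tables.

```python
from collections import Counter

# Hypothetical predicted probabilities [class 1, class 2] from three classifiers.
probabilities = [
    [0.99, 0.01],  # classifier 1: very confident in class 1
    [0.49, 0.51],  # classifier 2: barely prefers class 2
    [0.49, 0.51],  # classifier 3: barely prefers class 2
]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Scenario 1: majority vote on the class labels.
labels = [argmax(row) + 1 for row in probabilities]
hard_vote = Counter(labels).most_common(1)[0][0]

# Scenario 2: average the predicted probabilities (equal weights), then pick the largest.
averages = [sum(column) / len(probabilities) for column in zip(*probabilities)]
soft_vote = argmax(averages) + 1

print("majority vote on labels:", hard_vote)
print("average of probabilities:", soft_vote)
```

With these values the label vote picks class 2, while the probability average picks class 1, because the confident classifier contributes far more probability mass than the two uncertain ones.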

Returning to our weights parameter, we use a naive brute-force approach to find the optimal weights for each classifier to increase prediction accuracy (Table 7.4).
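A naive brute-force search over weights might look like the following sketch. It assumes each classifier's predicted probabilities on a held-out set are already available and scores every weight combination by plain accuracy; the candidate weights (1 to 3), the probabilities, and the function names are all hypothetical. Scoring each combination with cross-validation instead would follow the same loop.

```python
from itertools import product


def ensemble_accuracy(weights, all_probabilities, targets):
    """Accuracy of the weighted-probability ensemble for one weight combination.
    all_probabilities[classifier][sample] is a list of class probabilities."""
    hits = 0
    for sample_index, target in enumerate(targets):
        n_classes = len(all_probabilities[0][sample_index])
        scores = [
            sum(w * probs[sample_index][c] for w, probs in zip(weights, all_probabilities))
            for c in range(n_classes)
        ]
        if scores.index(max(scores)) == target:
            hits += 1
    return hits / len(targets)


def brute_force_weights(all_probabilities, targets, candidates=(1, 2, 3)):
    """Try every weight combination; return (best_weights, best_accuracy)."""
    best_weights, best_accuracy = None, -1.0
    for weights in product(candidates, repeat=len(all_probabilities)):
        acc = ensemble_accuracy(weights, all_probabilities, targets)
        if acc > best_accuracy:
            best_weights, best_accuracy = weights, acc
    return best_weights, best_accuracy


# Hypothetical probabilities for four samples and two classes (encoded 0 and 1).
targets = [0, 1, 0, 1]
all_probabilities = [
    [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.3, 0.7]],  # classifier 1 (strong)
    [[0.4, 0.6], [0.6, 0.4], [0.3, 0.7], [0.4, 0.6]],  # classifier 2 (weak)
    [[0.6, 0.4], [0.3, 0.7], [0.4, 0.6], [0.6, 0.4]],  # classifier 3 (weak)
]
best_weights, best_accuracy = brute_force_weights(all_probabilities, targets)
print(best_weights, best_accuracy)
```

Because the search keeps the first combination that reaches the best score, ties are resolved in favor of smaller weights; with more classifiers or finer weight steps the number of combinations grows quickly.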

Table 7.4 Example of a comparison table






(Table body not recoverable; the only surviving column heading is “Standard deviation.”)

Final words: When the ensemble classifier is applied to the previous example, the results look good, but keep in mind that this is just a toy example. The majority rule voting approach might not always work as well in practice, especially if the ensemble consists of more “weak” than “strong” classification models. Use a cross-validation approach to guard against overfitting, and always keep a spare validation dataset to evaluate the results.

Model Conclusion

Based on our experiment, the “Weighted Majority Rule Ensemble Classifier” performs better than the other algorithms we tested for the risk categorization/classification model.

Publish/Production of the Model

Publish/produce the model, as in the identify risk or threat model.


Same as the conclusion of the identify risk or threat model.