Building models with Pega machine learning

6 Tasks

15 mins

Beginner Pega Customer Decision Hub 8.7 English

U+ Bank uses AI to determine which credit card offer to show a customer on the bank's website. To reduce the number of customers who leave the bank, the business wants to use the historical data that it has collected on customers who churned in the past to predict which customers are likely to leave soon. The bank wants to show potential churners retention offers instead of credit card offers.

As a data scientist, your task is to create a predictive model that predicts churn. You decide to create the model by using Pega machine learning.

Use the following credentials to log in to the exercise system:

Role             User name        Password
Data Scientist   DataScientist    rules

Your assignment consists of the following tasks:

Task 1: Create a new predictive model

Create a new predictive model, ChurnPegaML, by using the Churn Modeling template in the Retention category.

Task 2: Prepare the data

Load the data set from the CustomerData.csv file. Set the type of predictors that have no predictive power, such as CustomerID, to Not used. Create a uniform sample that uses 100% of the data. Retain 20% for the test set and 20% for the validation set. In the Outcome definition, use a Binary outcome type and Outcome as the outcome field. Map the values of the outcome field to the outcome categories.

Task 3: Analyze the data

Examine the trends exhibited by the best-performing predictors. Create a virtual field by combining several numerical predictors, and then examine the trend exhibited by this new predictor.

Task 4: Develop predictive models

For predictor grouping, use the best predictor of each group. Create a new bivariate model.

Task 5: Analyze the models

Compare the scores of the three models. Pay particular attention to Discrimination.

Task 6: Select model

Select the Regression model. Make sure that all predictors are mapped to customer properties. Reclassify the classes into a loyal class and a churned class. Save the model.


You must initiate your own Pega instance to complete this Challenge.

Initialization may take up to 5 minutes, so please be patient.

Challenge Walkthrough

Detailed Tasks

1 Create a predictive model

  1. On the exercise system landing page, click Pega CRM suite to log in to Prediction Studio.
  2. Log in as a Data Scientist with user name DataScientist and password rules.
  3. In the navigation pane of Prediction Studio, click Models to open the models landing page.
  4. In the upper-right corner, click New > Predictive model.
  5. In the New predictive model dialog box, in the Name field, enter ChurnPegaML.
  6. In the Category list, select Retention.
  7. In the Template list, select Churn Modeling.
  8. Click Start to proceed to the data preparation step.

2 Prepare the data

  1. Download and extract the CustomerData.csv file.
  2. In the Source selection section, click Choose File, and then select the CustomerData.csv file.
  3. Check the data, and then click Next to proceed to the sample construction step.
  4. In the CustomerID field, change the type to Not used.
  5. In the Hold-out sets section, retain 20% for validation and 20% for testing.
  6. Click Next to proceed to the outcome definition step.
  7. In the Outcome definition section, in the Outcome type list, select Binary.
  8. In the Outcome field to predict list, select Outcome.
  9. In the Churn row, in the Outcome category column, select churned.
  10. In the Loyal row, in the Outcome category column, select loyal.
  11. Confirm that the numbers of cases in the development, validation, and test sets are approximately equal for both outcome categories.
  12. Click Next to proceed to the data analysis step.
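
The sampling and outcome mapping configured in these steps can be pictured outside the tool as a plain shuffle-and-slice. The following is a minimal sketch, not how Prediction Studio implements it. The function names `split_sample` and `map_outcome`, the fixed seed, and the generated rows are all hypothetical; only the 20%/20% hold-out percentages and the Churn/Loyal mapping come from the steps above.

```python
import random

# Hypothetical mapping of raw Outcome values to outcome categories,
# mirroring the Churn -> churned and Loyal -> loyal mapping in the steps above.
OUTCOME_MAP = {"Churn": "churned", "Loyal": "loyal"}


def map_outcome(raw_value):
    """Return the outcome category for a raw Outcome field value."""
    return OUTCOME_MAP[raw_value]


def split_sample(rows, validation_pct=20, test_pct=20, seed=42):
    """Shuffle rows uniformly, then hold out validation and test sets.

    The rows that are not held out form the development set.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = len(shuffled) * validation_pct // 100
    n_test = len(shuffled) * test_pct // 100
    validation = shuffled[:n_validation]
    test = shuffled[n_validation:n_validation + n_test]
    development = shuffled[n_validation + n_test:]
    return development, validation, test


# Illustrative data only: 100 made-up rows, not the contents of CustomerData.csv.
rows = [{"CustomerID": i, "Outcome": "Churn" if i % 2 else "Loyal"} for i in range(100)]
development, validation, test = split_sample(rows)
print(len(development), len(validation), len(test))  # 60 20 20
```

With 100 rows, the sketch leaves 60 rows for development and 20 each for validation and testing.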

3 Analyze the data

  1. In the list of predictors, click CreditScore and examine the grouping for this predictor.
  2. In the upper-right corner, click Cancel to close the predictor report.
  3. Click New virtual field to open the Virtual field dialog box.
  4. In the Virtual field dialog box, in the Name field, enter DebtToIncomeRatio*TotalAssets.
  5. Click Fields, select DebtToIncomeRatio, and then click Insert.
  6. Click Fields, select TotalAssets, and then click Insert.
  7. Complete the expression to read DebtToIncomeRatio * TotalAssets.
  8. Click Save & close.
  9. Confirm that the newly created predictor outperforms the two original predictors on the validation set.
  10. Click Next to proceed to the model development step.
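
As an illustration of what the virtual field expression computes, the sketch below derives the product of the two predictors for a single record. The helper name `add_virtual_field` and the sample values are hypothetical; Prediction Studio evaluates the expression itself, so this is only a picture of the arithmetic.

```python
def add_virtual_field(row):
    """Return a copy of the row with the DebtToIncomeRatio * TotalAssets field added."""
    enriched = dict(row)
    enriched["DebtToIncomeRatio*TotalAssets"] = (
        row["DebtToIncomeRatio"] * row["TotalAssets"]
    )
    return enriched


# Made-up values for illustration; not taken from CustomerData.csv.
customer = {"DebtToIncomeRatio": 0.5, "TotalAssets": 20000}
enriched_customer = add_virtual_field(customer)
print(enriched_customer["DebtToIncomeRatio*TotalAssets"])  # 10000.0
```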

4 Develop predictive models

  1. In the Predictor grouping section, select Use best of each group.
  2. Click Next to proceed to the model creation step.
  3. In the Model creation section, in the Create model list, select Bivariate.
  4. In the upper-right corner, click Submit to add the model to the model list.
  5. Click Next to proceed to the model analysis step.
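
The idea behind using the best predictor of each group can be sketched as a per-group maximum. This is a simplified illustration under stated assumptions: the group names, the group membership, and the performance numbers are invented, and the way Prediction Studio actually forms groups and measures performance is not shown here.

```python
def best_of_each_group(groups):
    """Return the highest-performing predictor name from each group.

    `groups` maps a group name to a dict of predictor name -> performance.
    """
    return [max(members, key=members.get) for members in groups.values()]


# Hypothetical groups and performance values, for illustration only.
groups = {
    "group_1": {"CreditScore": 0.62},
    "group_2": {
        "DebtToIncomeRatio": 0.55,
        "TotalAssets": 0.57,
        "DebtToIncomeRatio*TotalAssets": 0.60,
    },
}
selected = best_of_each_group(groups)
print(selected)  # ['CreditScore', 'DebtToIncomeRatio*TotalAssets']
```

Keeping one predictor per group reduces redundancy among the inputs that are passed to the models.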

5 Analyze the models

  1. On the Score comparison page, ensure that all the model checkboxes are selected.
  2. Click Analyze charts to access the model analysis.
  3. On the Discrimination tab, examine the results.
    Note: The regression model outperforms the decision tree model and the bivariate model because it has the largest area under the curve (AUC). However, before you choose a model, consider the number of predictors that the model requires. Under certain circumstances, you might select a lower-performing model that uses fewer predictors.
  4. In the upper-left corner, click the arrow next to Model analysis charts.
  5. Click Next. Here, you can analyze the score distribution.
  6. Click Next. Here, you can analyze class comparison.
  7. Click Next to proceed to the model selection step.
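
To make the AUC comparison on the Discrimination tab concrete, the following minimal sketch computes the area under the ROC curve as the fraction of (churned, loyal) pairs in which the churned case receives the higher score, counting ties as half. The function name `auc` and the sample scores are hypothetical, and Prediction Studio may compute and scale the metric differently.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    `labels` uses 1 for the positive class (churned) and 0 for the negative class (loyal).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for four cases: two churned (1) and two loyal (0).
example_auc = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
print(example_auc)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the model with the largest AUC discriminates best.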

6 Select the model

  1. In the Model selection section, ensure that the Regression model is selected.
  2. In the Save model section, in the Apply to field, enter UBank-Data-Customer.
  3. Click Finish to select the model.
  4. On the Model tab, in the Expected score distribution section, click the area between Result7 and Result8 in the score distribution chart.
  5. In the Classification groups section, in the Class 1-7 row, in the Name column, enter loyal.
  6. In the Class 8-10 row, in the Name column, enter churned.
  7. On the Mapping tab, ensure that all predictors are mapped to the appropriate customer fields.
  8. In the upper-right corner, click Save.
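
The reclassification in these steps amounts to a single cut between score bands 7 and 8. The sketch below is a hypothetical illustration of that rule; the function name `classify` and the `boundary` parameter are not part of the product, and the band numbers assume the ten result classes shown in the score distribution chart.

```python
def classify(score_band, boundary=7):
    """Map a score band (1-10) to a class: bands up to `boundary` are loyal, the rest churned."""
    if not 1 <= score_band <= 10:
        raise ValueError("score_band must be between 1 and 10")
    return "loyal" if score_band <= boundary else "churned"


print([classify(band) for band in (1, 7, 8, 10)])
# ['loyal', 'loyal', 'churned', 'churned']
```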

Confirm your work

  1. In the upper-right corner, click Run to test the predictive model.
  2. In the Run predictive model dialog box, in the Inputs section, select the Troy data transform as the data source. Troy is a customer who is likely to churn.
  3. Click Run, and then in the Outputs section, verify that the result for Troy is churned.
  4. Run the model again with the Barbara data transform as the data source. Barbara is a customer who is expected to stay loyal.
  5. In the Outputs section, verify that the result for Barbara is loyal.

