
Data scientist tools for Customer Decision Hub
Pega has an open data model. Consequently, data scientists can export adaptive model data, predictor binning data, and historical data for further analysis. The open-source GitHub repository Pega Data Scientist Tools helps you build meaningful plots and more. The tools were recently updated, and this topic will be updated shortly.
Transcript
This video introduces you to the open-source tools that are available to data scientists who work on Pega Customer Decision Hub™ (CDH) projects. Consider the following scenario: U+ Bank uses CDH to optimize customer interactions across multiple channels.
As a data scientist, you can monitor your adaptive models in Prediction Studio. The bubble chart shows you which models perform well, and which models do not.
However, you might want to use a third-party analytical tool to do an in-depth analysis of the performance of the models and predictors. To do so, you export snapshots of the model data and the predictor data from your CDH system.
The Pega Data Scientist Tools (PDS Tools) GitHub repository provides utilities to analyze analytical data from a Pega decisioning system in R and Python.
To showcase the tools in Python, we'll use a Jupyter notebook. After installing the PDS Tools package, you can import the ADMDatamart class.
The ADMDatamart class orchestrates reading, preprocessing, and visualizing the data.
Copy the Model Snapshots and ADM Predictor Snapshots data sets that you exported from your Pega system to your working directory, and then load them into an instance of the ADMDatamart class. For this demo, we use sample data.
You can access your model data and your predictor data as data frames. When both data sources are present, the ADMDatamart class combines them in the background, and you can inspect the resulting data frame.
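To make the idea of the combined data frame concrete, here is a minimal sketch in plain Python. It does not use the PDS Tools API. It assumes that the combination is a join of predictor rows to model rows on a shared model identifier, and the field names (ModelID, Name, Group, PredictorName, PredictorPerformance) and sample values are hypothetical.

```python
# Hypothetical sample rows; field names and values are assumptions for illustration.
model_rows = [
    {"ModelID": "m1", "Name": "GoldCard", "Group": "CreditCards", "Performance": 0.72},
    {"ModelID": "m2", "Name": "AutoLoan", "Group": "Loans", "Performance": 0.58},
]
predictor_rows = [
    {"ModelID": "m1", "PredictorName": "Age", "PredictorPerformance": 0.61},
    {"ModelID": "m1", "PredictorName": "Income", "PredictorPerformance": 0.66},
    {"ModelID": "m2", "PredictorName": "Age", "PredictorPerformance": 0.55},
]


def combine(models, predictors):
    """Inner-join each predictor row to its model row on ModelID."""
    models_by_id = {m["ModelID"]: m for m in models}
    combined = []
    for predictor in predictors:
        model = models_by_id.get(predictor["ModelID"])
        if model is not None:  # skip predictors without a matching model
            combined.append({**model, **predictor})
    return combined


combined_rows = combine(model_rows, predictor_rows)
for row in combined_rows:
    print(row["Name"], row["PredictorName"], row["PredictorPerformance"])
```

Each combined row carries both the model context and the predictor details, which is what makes it possible to compare predictors across models.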
A collection of sample plots and graphs is available for analysis of the adaptive models operating in CDH.
All visualizations need the model data. Some visualizations also need predictor data, and still others need multiple snapshots to create timelines. One of the visualizations is a bubble chart that plots performance versus success rate, similar to the bubble chart available in Prediction Studio. The visualization considers the latest snapshot by default.
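The following sketch illustrates what considering only the latest snapshot could look like. It is plain Python, not the PDS Tools implementation, and it assumes that "latest" means the most recent row per model. The field names (ModelID, SnapshotTime, Performance) and the sample values are hypothetical.

```python
from datetime import datetime

# Hypothetical snapshots: model m1 was exported twice, model m2 once.
snapshots = [
    {"ModelID": "m1", "SnapshotTime": datetime(2024, 1, 1), "Performance": 0.60},
    {"ModelID": "m1", "SnapshotTime": datetime(2024, 2, 1), "Performance": 0.65},
    {"ModelID": "m2", "SnapshotTime": datetime(2024, 2, 1), "Performance": 0.70},
]


def latest_per_model(rows):
    """Keep only the row with the most recent SnapshotTime for each ModelID."""
    latest = {}
    for row in rows:
        current = latest.get(row["ModelID"])
        if current is None or row["SnapshotTime"] > current["SnapshotTime"]:
            latest[row["ModelID"]] = row
    return list(latest.values())


latest_rows = latest_per_model(snapshots)
```

Timeline visualizations would instead keep every snapshot and plot the values against SnapshotTime.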
Zoom in on models that perform very well but have a low success rate to identify actions that need attention, and then report those actions to the business.
You can visualize a subset of the data by supplying a query argument. Let's only consider the models with a high response count within the CreditCards group.
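As a rough illustration of the filtering that such a query expresses, the sketch below selects models in one group with a response count above a threshold. It is plain Python rather than the query syntax of PDS Tools. The field names (ModelID, Group, ResponseCount) and the threshold of 1000 responses are assumptions.

```python
# Hypothetical model rows; field names and values are assumptions for illustration.
models = [
    {"ModelID": "m1", "Group": "CreditCards", "ResponseCount": 12000},
    {"ModelID": "m2", "Group": "CreditCards", "ResponseCount": 150},
    {"ModelID": "m3", "Group": "Loans", "ResponseCount": 20000},
]


def subset(rows, group, min_responses):
    """Return the rows in the given group with more than min_responses responses."""
    return [
        row
        for row in rows
        if row["Group"] == group and row["ResponseCount"] > min_responses
    ]


# The threshold is an arbitrary example value, not a recommended setting.
selected = subset(models, group="CreditCards", min_responses=1000)
```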
The bubble chart shows you which models perform well. You might, however, want to know if performance issues occur in a specific channel, issue, or group. The Treemap visualization offers insight in this situation.
By default, the Treemap shows the performance, weighted by the response count. The number of model IDs within a combination of context keys determines the size of the squares. Instead of performance, you can also visualize another variable, such as SuccessRate or ResponseCount.
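The aggregation behind such a Treemap can be sketched as follows. This is a plain-Python illustration under assumptions, not the PDS Tools implementation: it takes the weighted performance to be the response-count-weighted mean of model performance per combination of context keys, and the field names (Channel, Issue, Group, ModelID, Performance, ResponseCount) and sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical model rows; field names and values are assumptions for illustration.
models = [
    {"ModelID": "m1", "Channel": "Web", "Issue": "Sales", "Group": "CreditCards",
     "Performance": 0.6, "ResponseCount": 100},
    {"ModelID": "m2", "Channel": "Web", "Issue": "Sales", "Group": "CreditCards",
     "Performance": 0.8, "ResponseCount": 300},
    {"ModelID": "m3", "Channel": "Email", "Issue": "Sales", "Group": "Loans",
     "Performance": 0.7, "ResponseCount": 50},
]


def summarize(rows, keys=("Channel", "Issue", "Group")):
    """Per context-key combination: weighted performance and number of models."""
    totals = defaultdict(lambda: {"weighted": 0.0, "responses": 0, "ids": set()})
    for row in rows:
        bucket = totals[tuple(row[k] for k in keys)]
        bucket["weighted"] += row["Performance"] * row["ResponseCount"]
        bucket["responses"] += row["ResponseCount"]
        bucket["ids"].add(row["ModelID"])
    return {
        combo: {
            # Guard against combinations that have no responses yet.
            "Performance": b["weighted"] / b["responses"] if b["responses"] else None,
            "ModelCount": len(b["ids"]),  # would drive the size of the square
        }
        for combo, b in totals.items()
    }


summary = summarize(models)
```

In this sketch, the model with 300 responses pulls the weighted performance of its combination closer to its own value than the model with 100 responses does.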
You might want to look at the performance of specific predictors over multiple models. To make the visualization more legible, limit the number of predictors.
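One way to limit the number of predictors is to rank them and keep only the top few. The sketch below is plain Python, not the PDS Tools API, and it assumes a ranking by mean predictor performance across models. The field names (ModelID, PredictorName, PredictorPerformance) and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical predictor rows across two models.
predictor_rows = [
    {"ModelID": "m1", "PredictorName": "Age", "PredictorPerformance": 0.60},
    {"ModelID": "m2", "PredictorName": "Age", "PredictorPerformance": 0.55},
    {"ModelID": "m1", "PredictorName": "Income", "PredictorPerformance": 0.66},
    {"ModelID": "m2", "PredictorName": "Income", "PredictorPerformance": 0.70},
    {"ModelID": "m1", "PredictorName": "Tenure", "PredictorPerformance": 0.52},
]


def top_predictors(rows, top_n):
    """Return the names of the top_n predictors by mean performance over models."""
    performance = defaultdict(list)
    for row in rows:
        performance[row["PredictorName"]].append(row["PredictorPerformance"])
    ranked = sorted(performance, key=lambda name: mean(performance[name]), reverse=True)
    return ranked[:top_n]


shown = top_predictors(predictor_rows, top_n=2)
```

Other ranking criteria, such as the median, are equally plausible; the mean is used here only to keep the example short.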
The GitHub repository PDS Tools is open source. You can therefore contribute to the repository by creating a pull request. You can also report problems by creating an issue on the main GitHub page.
This demo has concluded. What did it show you?
- How to use model data and predictor data from your CDH system with PDS Tools.
- How to inspect the data frame that combines model and predictor data.
- How to visualize a subset of the data with PDS Tools.