
Setting up data ingestion

7 Tasks

40 mins

Visible to: All users
Beginner
Pega Customer Decision Hub '23
Next Best Action
Decision Management
Data Model
Data Integration
English
Verify the version tags to ensure you are consuming the intended content, or complete the latest version.

Scenario

The project team at U+ Bank performed a data mapping workshop, mapped its existing data model to Pega Customer Decision Hub™, and structured the Customer Insights Cache. The U+ Bank data warehouse team prepared the customer and account data files and the manifest files. IT developed the technical infrastructure to upload the data daily to a repository that Pega Customer Decision Hub can access.

As a decisioning architect, your role is to prepare the data jobs to populate the customer and account entities of the Customer Insights Cache in Pega Customer Decision Hub™.

Use the following credentials to log in to the exercise system:

Role: Decisioning architect
User name: DecisioningArchitect
Password: rules

Your assignment consists of the following tasks:

Task 1: Clear the Customer and Account tables

As a decisioning architect, use the Customer and Account data sets to clear any test data created in Customer Decision Hub.

Note: The exercise system contains customer and account data generated from a Monte Carlo data set in a previous exercise.

Task 2: Create the CustomerFile data set

As a decisioning architect, create the CustomerFile data set using the details from the following table to access the customer data in the repository:

Data set: CustomerFile
Repository: filerepo
File path: /IngestionData/CustomerData/
Manifest file name: CustomerDataIngestManifest.xml
Data file name: CustomerDataIngest.csv
Date field formats:
  Date time format: MM/dd/yyyy HH:mm
  Date format: MM/dd/yyyy
  Time format: HH:mm:ss

Task 3: Create the AccountFile data set

As a decisioning architect, create the AccountFile data set using the details from the following table to access the account data in the repository:

Data set: AccountFile
Repository: filerepo
File path: /IngestionData/AccountData/
Manifest file name: AccountDataIngestManifest.xml
Data file name: AccountDataIngest.csv
Date field formats:
  Date time format: MM/dd/yyyy HH:mm
  Date format: MM/dd/yyyy
  Time format: HH:mm:ss

Task 4: Create a new import data job for the customer data

As a decisioning architect, create a new data job to ingest the customer data file by using the details from the following table:

Target data source: Customer
Source data source: Customer File
Import using: File detection
Failure policy: Fail a run after more than 10 records per file

Task 5: Create a new import data job for the account data

As a decisioning architect, create a new data job to ingest the account data file by using the details from the following table:

Target data source: Account
Source data source: Account File
Import using: File detection
Failure policy: Fail a run after more than 10 records per file

Task 6: Create .tok files to trigger data ingestion

As a decisioning architect, create .tok files in the file repository to trigger the file detection that initiates the Customer and Account data jobs.

Note: For this use case, you have access to a third-party application, Filebrowser.

Task 7: Verify the data ingestion is complete

As a decisioning architect, verify that the data is ingested successfully.

 

You must initiate your own Pega instance to complete this Challenge.

Initialization may take up to 5 minutes, so please be patient.

Challenge Walkthrough

Detailed Tasks

1 Clear the Customer and Account tables

  1. On the exercise system landing page, click Launch Pega Infinity™ to log in to Customer Decision Hub.
  2. Log in as the decisioning architect:
    1. In the User name field, enter DecisioningArchitect.
    2. In the Password field, enter rules.
  3. In the header of Customer Decision Hub, in the search field, enter Customer, and then click the search icon.
    1. In the third filter list, select Exact Match.
    2. In the list of results, select the Customer data set with the Applies to class UBank-CDH-Data-Customer.
  4. In the upper-right corner, click Run to truncate the data set:
    1. In the Run Data Set: Customer window, in the Operation list, select Truncate.
    2. In the upper-right corner, click Run to truncate the customer table.
    3. Close the Status Page window, and then close the Run Data Set: Customer window.
  5. In the lower-left corner, click Back to Customer Decision Hub to return to Customer Decision Hub.
  6. Repeat steps 3–5 for the Account data set to truncate the account data.
  7. In the navigation pane of Customer Decision Hub, click Data > Profile Designer to view data sources.
  8. On the Profile Designer landing page, click Customer to open the customer data source.
  9. On the Data Set: Customer page, click the Records tab to confirm that there are no items.
  10. Optional: To confirm that there are no items for the Account data set, repeat steps 8–9.

2 Create the CustomerFile data set

  1. In the navigation pane of Customer Decision Hub, click Data > Profile Data Sources to create a new data set for accessing the customer data in the repository.
  2. On the Profile Data Sources landing page, in the upper-right corner, click Add > Data Set.
  3. In the Add data set window, select Create new data set, then click Next.
  4. In the Create data set window, configure the following settings:
    1. In the Name field, enter Customer File.
    2. In the Apply to field, enter or select UBank-CDH-Data-Customer.
    3. Click Next.
  5. In the Create data set: Repository (3 of 8) window, select filerepo, then click Next.
    Note: For the purposes of the exercise, all files use the filerepo (a system-managed temporary file storage) repository. In a real-life scenario, all files are typically stored in a file repository such as AWS S3.
  6. In the Create data set: Source location (4 of 8) window, in the IngestionData folder, navigate to the Customer manifest file:
    1. In the Name column, click the IngestionData folder.
    2. In the Name column, click the CustomerData folder.
    3. In the Name column, select CustomerDataIngestManifest.xml.
    4. Click Next.
  7. In the Create data set: File configuration (5 of 8) window, configure the following settings:
    1. Select the First row contains fields (header) checkbox.
    2. In the Delimiter character list, select Comma(,).
    3. In the Date time format field, enter MM/dd/yyyy HH:mm.
    4. In the Date format field, enter MM/dd/yyyy.
    5. Click Next.
  8. In the Create data set: File preview (6 of 8) window, confirm the file contents:
    1. On the Manifest tab, confirm that the manifest file content is valid.
    2. On the Data file tab, confirm that the data file content is valid.
    3. Click Next.
  9. In the Create data set: Field mapping (7 of 8) window, confirm that the field mappings are correct, then click Next.
  10. In the Create data set: Review (8 of 8) window, review the configuration, then click Create.
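The manifest file that you validate in the File preview step lets the data job confirm that the transfer is complete before ingestion starts, typically by listing the data file together with its expected record count. The exact schema depends on your environment; the fragment below is only a hypothetical sketch of the kind of information such a manifest carries:

```xml
<!-- Hypothetical manifest sketch; verify the exact schema your environment expects -->
<manifest>
  <totalRecordCount>10000</totalRecordCount>
  <files>
    <file>
      <name>CustomerDataIngest.csv</name>
      <size>1048576</size>
      <recordCount>10000</recordCount>
    </file>
  </files>
</manifest>
```

If the record count in the manifest does not match the rows actually received, the run can be failed early instead of ingesting a partial file.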

3 Create the AccountFile data set

  1. On the Profile Data Sources landing page, in the upper-right corner, click Add > Data Set.
  2. In the Add data set window, select Create new data set, then click Next.
  3. In the Create data set window, configure the following settings:
    1. In the Name field, enter Account File.
    2. In the Apply to field, enter or select UBank-CDH-Data-Accounts.
    3. Click Next.
  4. In the Create data set: Repository (3 of 8) window, select filerepo, then click Next.
  5. In the Create data set: Source location (4 of 8) window, in the IngestionData folder, navigate to the Account manifest file:
    1. In the Name column, click the IngestionData folder.
    2. In the Name column, click the AccountData folder.
    3. In the Name column, select AccountDataIngestManifest.xml.
    4. Click Next.
  6. In the Create data set: File configuration (5 of 8) window, configure the following settings:
    1. Select the First row contains fields (header) checkbox.
    2. In the Delimiter character list, select Comma(,).
    3. In the Date time format field, enter MM/dd/yyyy HH:mm.
    4. In the Date format field, enter MM/dd/yyyy.
    5. Click Next.
  7. In the Create data set: File preview (6 of 8) window, confirm the file contents:
    1. On the Manifest tab, confirm that the manifest file content is valid.
    2. On the Data file tab, confirm that the data file content is valid.
    3. Click Next.
  8. In the Create data set: Field mapping (7 of 8) window, confirm that the field mappings are correct, then click Next.
  9. In the Create data set: Review (8 of 8) window, review the configuration, then click Create.
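The date patterns that you enter in the File configuration step (MM/dd/yyyy HH:mm, MM/dd/yyyy, and HH:mm:ss) use Java-style pattern letters. If you want to sanity-check a data file before uploading it to the repository, the same patterns translate to Python `strptime` directives as sketched below; the example values are illustrative and not taken from the exercise files:

```python
from datetime import datetime

# Java-style patterns from the wizard mapped to Python strptime directives:
# MM/dd/yyyy HH:mm -> %m/%d/%Y %H:%M, MM/dd/yyyy -> %m/%d/%Y, HH:mm:ss -> %H:%M:%S
PATTERNS = {
    "datetime": "%m/%d/%Y %H:%M",
    "date": "%m/%d/%Y",
    "time": "%H:%M:%S",
}

def parse_field(value: str, kind: str) -> datetime:
    """Parse one CSV field using the format configured in the data set."""
    return datetime.strptime(value, PATTERNS[kind])

# Example values in the configured formats (illustrative only)
print(parse_field("06/15/2023 09:30", "datetime"))
print(parse_field("06/15/2023", "date"))
```

A value that does not match the configured pattern raises `ValueError`, which is exactly the kind of record the failure policy in the data job later counts against its threshold.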

4 Create a new import data job for the customer data

  1. In the navigation pane of Customer Decision Hub, click Data > Data Jobs to create a new data job for ingesting customer data.
  2. On the Data Jobs landing page, click Create import data job.
  3. In the Create import data job: Name & target (1 of 5) window, select Customer, then click Next.
  4. In the Create import data job: Source location (2 of 5) window, select Customer File, then click Next.
  5. In the Create import data job: Trigger (3 of 5) window, select File detection, then click Next.
    Caution: In a real-life scenario, the trigger that you select depends on the data ingestion requirements. Use the File detection option to trigger the data ingestion process with a token file. File detection is the most commonly used option because the process initiates only when all files for ingestion are ready for processing. Use the Schedule option only when you know the exact time and can guarantee that the system can ingest the file; the system does not process a file that is not in the repository at the scheduled time.
  6. In the Create import data job: Failure policy (4 of 5) window, complete the following settings:
    1. In the Fail a run after more than field, enter 10.
    2. Click Next.
  7. In the Create import data job: Review (5 of 5) window, review the configuration, then click Create.
  8. On the Data Jobs landing page, double-click the Import Customer row to see its details.
  9. On the Data Job: Import Customer landing page, in the Runs section, confirm that there are no scheduled jobs.
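The failure policy you configure tolerates up to 10 failed records per file before the whole run is marked as failed. The thresholding idea can be sketched as follows; this is an illustration of the concept, not Pega's implementation, and the `CustomerID` check is a hypothetical stand-in for real record validation:

```python
MAX_FAILURES = 10  # the "Fail a run after more than" value from the data job wizard

def ingest(records):
    """Illustrative sketch: abort and fail the run once more than
    MAX_FAILURES records in a file fail validation."""
    failures = 0
    ingested = []
    for rec in records:
        if not rec.get("CustomerID"):  # stand-in validation rule (hypothetical field)
            failures += 1
            if failures > MAX_FAILURES:
                raise RuntimeError(f"Run failed: more than {MAX_FAILURES} bad records")
            continue
        ingested.append(rec)
    return ingested

good = [{"CustomerID": str(i)} for i in range(50)]
print(len(ingest(good)))
```

A low threshold like this surfaces systemic problems (for example, a wrong delimiter or date format) quickly, while still tolerating a handful of genuinely malformed rows.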

5 Create a new import data job for the account data

  1. In the navigation pane of Customer Decision Hub, click Data > Data Jobs to create a new data job for ingesting account data.
  2. On the Data Jobs landing page, click Create import data job.
  3. In the Create import data job: Name & target (1 of 5) window, select Account, then click Next.
  4. In the Create import data job: Source location (2 of 5) window, select Account File, then click Next.
  5. In the Create import data job: Trigger (3 of 5) window, select File detection, then click Next.
  6. In the Create import data job: Failure policy (4 of 5) window, complete the following settings:
    1. In the Fail a run after more than field, enter 10.
    2. Click Next.
  7. In the Create import data job: Review (5 of 5) window, review the configuration, then click Create.
  8. On the Data Jobs landing page, double-click the Import Account row to see its details.
  9. On the Data Job: Import Account landing page, in the Runs section, confirm that there are no scheduled jobs.

6 Create .tok files to trigger data ingestion

  1. On the exercise landing page, in the upper-left corner, click App-Switcher > Filebrowser.
  2. Log in as the repository administrator:
    1. In the User name field, enter pega-filerepo.
    2. In the Password field, enter pega-filerepo.
  3. Navigate to the customer data folder, IngestionData > CustomerData.
  4. To begin the customer data ingestion, create a new .tok file in the IngestionData > CustomerData folder:
    1. In the navigation pane of Filebrowser, click New file.
    2. In the New file window, enter .tok.
    3. Click Create.
  5. In the upper-right corner, click the Save icon.
  6. Click the IngestionData folder.
  7. Click the AccountData folder.
  8. To begin the account data ingestion, create a new .tok file in the IngestionData > AccountData folder:
    1. In the navigation pane of Filebrowser, click New file.
    2. In the New file window, enter .tok.
    3. Click Create.
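In this exercise you create the token files through the Filebrowser UI, but in production the upstream system that uploads the data typically writes the token programmatically after the data and manifest files have finished transferring. Creating an empty token file amounts to a touch, as in this sketch; the commented-out path matches the exercise repository layout:

```python
from pathlib import Path

def write_token(folder: str, name: str = "ingest.tok") -> Path:
    """Create an empty token file. Write it last, after the data and
    manifest files are fully uploaded, so that file detection only
    fires once the complete set is in place."""
    token = Path(folder) / name
    token.touch()
    return token

# Example: trigger the customer ingestion (path from the exercise repository)
# write_token("/IngestionData/CustomerData")
```

Writing the token last is the point of the file-detection trigger: the data job never starts against a half-transferred data file.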

7 Verify the data ingestion is complete

  1. In the navigation pane of Customer Decision Hub, click Data > Data Jobs.
  2. In the Name column, click Import Customer to view the data job details.
  3. In the Details section, in the Target field, click the Customer data source.
  4. On the Data set: Customer landing page, click the Records tab to confirm that the system ingested the customer data.
  5. Optional: To confirm that the system ingested the account data, repeat steps 1–4 for the Import Account data job.
