Setting up data ingestion
7 Tasks
40 mins
Scenario
The project team at U+ Bank performed a data mapping workshop, mapped its existing data model to Pega Customer Decision Hub™, and structured the Customer Insights Cache. The U+ Bank data warehouse team prepared the customer and account data files and the manifest files. IT developed the technical infrastructure to upload the data daily to a repository that Pega Customer Decision Hub can access.
As a decisioning architect, your role is to prepare the data jobs that populate the Customer and Account entities of the Customer Insights Cache in Pega Customer Decision Hub.
Use the following credentials to log in to the exercise system:

Role | User name | Password
---|---|---
Decisioning architect | DecisioningArchitect | rules
Your assignment consists of the following tasks:
Task 1: Clear the Customer and Account tables
As a decisioning architect, use the Customer and Account data sets to clear any test data created in Customer Decision Hub.
Note: The exercise system contains customer and account data generated from a Monte Carlo data set in a previous exercise.
Task 2: Create the CustomerFile data set
As a decisioning architect, create the CustomerFile data set using the details from the following table to access the customer data in the repository:
Requirement | Detail
---|---
Data set | CustomerFile
Repository | filerepo
File path | /IngestionData/CustomerData/
Manifest file name | CustomerDataIngestManifest.xml
Data file name | CustomerDataIngest.csv
Date field formats | Date time format: MM/dd/yyyy HH:mm; Time format: HH:mm:ss
Task 3: Create the AccountFile data set
As a decisioning architect, create the AccountFile data set using the details from the following table to access the account data in the repository:
Requirement | Detail
---|---
Data set | AccountFile
Repository | filerepo
File path | /IngestionData/AccountData/
Manifest file name | AccountDataIngestManifest.xml
Data file name | AccountDataIngest.csv
Date field formats | Date time format: MM/dd/yyyy HH:mm; Time format: HH:mm:ss
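The date patterns in both tables use Java-style letter codes (as is common in Pega), where `MM` is the month, `dd` the day, and `HH` the 24-hour clock. As an illustration only, the sketch below parses sample values shaped like these patterns with Python's `strptime` equivalents; the sample values and the code-to-code mapping are assumptions for demonstration, not taken from the actual data files.

```python
from datetime import datetime

# Assumed mapping from the Java/Pega-style patterns in the tables above
# to Python strptime codes (illustration only):
#   MM/dd/yyyy HH:mm  ->  %m/%d/%Y %H:%M   (date time format)
#   HH:mm:ss          ->  %H:%M:%S         (time format)

# Hypothetical sample values; the real columns of
# CustomerDataIngest.csv are not shown in this exercise.
sample_datetime = "11/23/2024 09:30"
sample_time = "17:45:00"

parsed_dt = datetime.strptime(sample_datetime, "%m/%d/%Y %H:%M")
parsed_t = datetime.strptime(sample_time, "%H:%M:%S").time()

print(parsed_dt)  # 2024-11-23 09:30:00
print(parsed_t)   # 17:45:00
```

If the values in the data file do not match the configured patterns exactly (for example, a single-digit month without a leading zero), parsing fails, which is why the wizard asks you to confirm the formats before previewing the file.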
Task 4: Create a new import data job for the customer data
As a decisioning architect, create a new data job to ingest the customer data file by using the details from the following table:
Requirement | Detail
---|---
Target data source | Customer
Source data source | Customer File
Import using | File detection
Failure policy | Fail a run after more than 10 failed records per file
Task 5: Create a new import data job for the account data
As a decisioning architect, create a new data job to ingest the account data file by using the details from the following table:
Requirement | Detail
---|---
Target data source | Account
Source data source | Account File
Import using | File detection
Failure policy | Fail a run after more than 10 failed records per file
Task 6: Create .tok files to trigger data ingestion
As a decisioning architect, create .tok files in the file repository to trigger the file detection that initiates the Customer and Account data jobs.
Note: For this use case, you have access to a third-party application, Filebrowser.
Task 7: Verify the data ingestion is complete
As a decisioning architect, verify that the data is ingested successfully.
Challenge Walkthrough
Detailed Tasks
1 Clear the Customer and Account tables
- On the exercise system landing page, click Launch Pega Infinity™ to log in to Customer Decision Hub.
- Log in as the decisioning architect:
- In the User name field, enter DecisioningArchitect.
- In the Password field, enter rules.
- In the header of Customer Decision Hub, in the search field, enter Customer, and then click the search icon.
- In the third filter list, select Exact Match.
- In the list of results, select the Customer data set with the Applies to class UBank-CDH-Data-Customer.
- In the upper-right corner, click Run to truncate the data set:
- In the Run Data Set: Customer window, in the Operation list, select Truncate.
- In the upper-right corner, click Run to truncate the customer table.
- Close the Status Page window, and then close the Run Data Set: Customer window.
- In the lower-left corner, click Back to Customer Decision Hub to return to Customer Decision Hub.
- Repeat the search, truncate, and close steps for the Account data set to clear the account data.
- In the navigation pane of Customer Decision Hub, click Data > Profile Designer to view data sources.
- On the Profile Designer landing page, click Customer to open the customer data source.
- On the Data Set: Customer page, click the Records tab to confirm that there are no items.
- Optional: To confirm that there are no items in the Account data set, repeat the two preceding steps for the Account data source.
2 Create the CustomerFile data set
- In the navigation pane of Customer Decision Hub, click Data > Profile Data Sources to create a new data set for accessing the customer data in the repository.
- On the Profile Data Sources landing page, in the upper-right corner, click Add > Data Set.
- In the Add data set window, select Create new data set, then click Next.
- In the Create data set window, configure the following settings:
- In the Name field, enter Customer File.
- In the Apply to field, enter or select UBank-CDH-Data-Customer.
- Click Next.
- In the Create data set: Repository (3 of 8) window, select filerepo, then click Next.
Note: For the purposes of this exercise, all files use the filerepo repository, a system-managed temporary file store. In a real-life scenario, files are typically stored in an external file repository such as AWS S3.
- In the Create data set: Source location (4 of 8) window, in the IngestionData folder, navigate to the Customer manifest file:
- In the Name column, click the IngestionData folder.
- In the Name column, click the CustomerData folder.
- In the Name column, select CustomerDataIngestManifest.xml.
- Click Next.
- In the Create data set: File configuration (5 of 8) window, configure the following settings:
- Select the First row contains fields (header) checkbox.
- In the Delimiter character list, select Comma(,).
- In the Date time format field, enter MM/dd/yyyy HH:mm.
- In the Date format field, enter MM/dd/yyyy.
- Click Next.
- In the Create data set: File preview (6 of 8) window, confirm the file contents:
- On the Manifest tab, confirm that the manifest file content is valid.
- On the Data file tab, confirm that the data file content is valid.
- Click Next.
- In the Create data set: Field mapping (7 of 8) window, confirm that the field mappings are correct, then click Next.
- In the Create data set: Review (8 of 8) window, review the configuration, then click Create.
3 Create the AccountFile data set
- On the Profile Data Sources landing page, in the upper-right corner, click Add > Data Set.
- In the Add data set window, select Create new data set, then click Next.
- In the Create data set window, configure the following settings:
- In the Name field, enter Account File.
- In the Apply to field, enter or select UBank-CDH-Data-Accounts.
- Click Next.
- In the Create data set: Repository (3 of 8) window, select filerepo, then click Next.
- In the Create data set: Source location (4 of 8) window, in the IngestionData folder, navigate to the Account manifest file:
- In the Name column, click the IngestionData folder.
- In the Name column, click the AccountData folder.
- In the Name column, select AccountDataIngestManifest.xml.
- Click Next.
- In the Create data set: File configuration (5 of 8) window, configure the following settings:
- Select the First row contains fields (header) checkbox.
- In the Delimiter character list, select Comma(,).
- In the Date time format field, enter MM/dd/yyyy HH:mm.
- In the Date format field, enter MM/dd/yyyy.
- Click Next.
- In the Create data set: File preview (6 of 8) window, confirm the file contents:
- On the Manifest tab, confirm that the manifest file content is valid.
- On the Data file tab, confirm that the data file content is valid.
- Click Next.
- In the Create data set: Field mapping (7 of 8) window, confirm that the field mappings are correct, then click Next.
- In the Create data set: Review (8 of 8) window, review the configuration, then click Create.
4 Create a new import data job for the customer data
- In the navigation pane of Customer Decision Hub, click Data > Data Jobs to create a new data job for ingesting customer data.
- On the Data Jobs landing page, click Create import data job.
- In the Create import data job: Name & target (1 of 5) window, select Customer, then click Next.
- In the Create import data job: Source location (2 of 5) window, select Customer File, then click Next.
- In the Create import data job: Trigger (3 of 5) window, select File detection, then click Next.
Caution: In a real-life scenario, the trigger that you select depends on the data ingestion requirements. Use the File detection option to trigger the data ingestion process with a token file. File detection is the most commonly used option because the process starts only when all files for ingestion are ready for processing. Use the Schedule option only when you know the exact time at which the files are available and can guarantee that the system can ingest them; the system does not process a file that is not in the repository at the scheduled time.
- In the Create import data job: Failure policy (4 of 5) window, complete the following settings:
- In the Fail a run after more than field, enter 10.
- Click Next.
- In the Create import data job: Review (5 of 5) window, review the configuration, then click Create.
- On the Data Jobs landing page, double-click the Import Customer row to see its details.
- On the Data Job: Import Customer landing page, in the Runs section, confirm that there are no scheduled jobs.
5 Create a new import data job for the account data
- In the navigation pane of Customer Decision Hub, click Data > Data Jobs to create a new data job for ingesting account data.
- On the Data Jobs landing page, click Create import data job.
- In the Create import data job: Name & target (1 of 5) window, select Account, then click Next.
- In the Create import data job: Source location (2 of 5) window, select Account File, then click Next.
- In the Create import data job: Trigger (3 of 5) window, select File detection, then click Next.
- In the Create import data job: Failure policy (4 of 5) window, complete the following settings:
- In the Fail a run after more than field, enter 10.
- Click Next.
- In the Create import data job: Review (5 of 5) window, review the configuration, then click Create.
- On the Data Jobs landing page, double-click the Import Account row to see its details.
- On the Data Job: Import Account landing page, in the Runs section, confirm that there are no scheduled jobs.
6 Create .tok files to trigger data ingestion
- On the exercise landing page, in the upper-left corner, click App-Switcher > Filebrowser.
- Log in as the repository administrator:
- In the User name field, enter pega-filerepo.
- In the Password field, enter pega-filerepo.
- Navigate to the customer data folder, IngestionData > CustomerData.
- To begin the customer data ingestion, create a new .tok file in the IngestionData > CustomerData folder:
- In the navigation pane of Filebrowser, click New file.
- In the New file window, enter .tok.
- Click Create.
- In the upper-right corner, click the Save icon.
- Navigate to the account data folder, IngestionData > AccountData.
- To begin the account data ingestion, create a new .tok file in the IngestionData > AccountData folder:
- In the navigation pane of Filebrowser, click New file.
- In the New file window, enter .tok.
- Click Create.
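The steps above boil down to placing an empty token file named .tok next to each prepared data and manifest file; the file-detection trigger watches for that marker and starts the run only once it appears. As a conceptual sketch only: in the exercise the .tok files are created through the Filebrowser UI inside the filerepo repository, but the same marker pattern can be shown with local folders standing in for the repository paths (an assumption for illustration).

```python
from pathlib import Path

# Hypothetical local stand-in for the filerepo repository root.
repo_root = Path("filerepo")

# The two ingestion folders used in this exercise.
for folder in ("IngestionData/CustomerData", "IngestionData/AccountData"):
    target = repo_root / folder
    target.mkdir(parents=True, exist_ok=True)
    # An empty file named ".tok" acts as the token that file detection
    # watches for; its presence signals that the data and manifest
    # files are complete and ready for processing.
    (target / ".tok").touch()
    print(target / ".tok", "created")
```

The token is created last, after the data and manifest files are fully uploaded, which is what makes file detection safer than a fixed schedule: the run cannot start on a half-written file.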
7 Verify the data ingestion is complete
- In the navigation pane of Customer Decision Hub, click Data > Data Jobs.
- In the Name column, click Import Customer to view data job details.
- In the Details section, in the Target field, click Customer to open the customer data source.
- On the Data set: Customer landing page, click the Records tab to confirm that the system ingested the customer data.
- Optional: To confirm that the system ingested the account data, repeat steps 1–4 for the Import Account data job.