
Data Flow Components
Learn about various components that are available in a Data Flow to enrich and process data.
Transcript
Data Flows consist of various components that transform and enrich the data in the pipeline.
Every Data Flow begins with one or more data sources. When you create a new Data Flow, it has an abstract source and an abstract destination by default.
An abstract source is a source without a specific Data Set or stream. Instead, it allows the Data Flow to accept input data dynamically when the Data Flow runs. This flexibility enables you to use the same Data Flow for processing different sets of data or to process data on-demand, such as in real-time or Single Case Data Flows.
The primary source of a Data Flow can also be a Data Set, another Data Flow, or a Report Definition.
When the source is not abstract, you can preview a sample of the records that pass through each component of a Data Flow. To preview the data, right-click the component, and then select Preview. Some components do not support the Preview function.
To add a new component to a Data Flow, click the Add icon on the right side of the component. To add another destination, click Add branch on the destination shape.
The following components are available in a Data Flow.
Compose: Use the Compose component to combine the data from two sources into a page or page list so that all the necessary data is in a single record. For example, you might compose a customer's account information or product holdings with the customer's data. To compose the data, you must select a secondary data source and identify one or more properties that match between the input and secondary data sources. The system appends the data from the secondary data source to the incoming data record as an embedded page. The secondary data source can be a Data Set, Data Flow, or a Report Definition. The input and output classes of the data record remain the same after the compose operation is complete.
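The compose operation behaves like an enrichment join. The following minimal Python sketch illustrates the idea only; it is not the Pega API, and the property names (CustomerID, Holdings) are illustrative assumptions:

```python
# Conceptual sketch of a Compose-style operation (not Pega API).
# Matching records from the secondary source are appended to each
# primary record as an embedded page list; the record's class (its
# top-level shape) is unchanged.

def compose(primary_records, secondary_records, key):
    """Attach matching secondary records to each primary record
    under an embedded 'Holdings' list (name is illustrative)."""
    composed = []
    for record in primary_records:
        matches = [s for s in secondary_records if s[key] == record[key]]
        enriched = dict(record)           # original properties are kept
        enriched["Holdings"] = matches    # embedded page list
        composed.append(enriched)
    return composed

customers = [{"CustomerID": 1, "Name": "Ada"}]
holdings = [{"CustomerID": 1, "Product": "Savings"},
            {"CustomerID": 1, "Product": "Credit"}]
result = compose(customers, holdings, "CustomerID")
```

After composing, each customer record still carries its own properties, plus the secondary data embedded as a nested list, which mirrors how the composed data arrives as an embedded page.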
Convert: Use the Convert component to convert the class of the incoming data. This action is especially important when the data must be available in a different class than the source. For example, you can store data in a Data Set that uses a different class than the Data Flow itself. When the source and destination properties match, you can auto-copy the properties, or you can add mappings to build the relation manually.
To convert the class of the top-level Data Pages to another class in your application, select Top-level.
It is also possible to use the same class as your top-level page. This way, you can quickly adjust the data through additional mapping.
To extract and convert a property that is embedded in the top-level page list property, select Embedded, and then select the property.
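Conceptually, a convert step copies same-named properties automatically and applies any manual mappings you define. The sketch below is an illustration under assumed property names, not the Pega implementation:

```python
# Conceptual sketch of a Convert-style operation (not Pega API).
# Properties whose names exist in the destination class are
# auto-copied; explicit mappings handle mismatched names.

def convert(record, mappings=None):
    """Copy matching properties and apply manual source->destination
    mappings. The destination class shape below is illustrative."""
    target_properties = {"Name", "Email"}   # properties of the destination class
    out = {k: v for k, v in record.items() if k in target_properties}
    for src, dst in (mappings or {}).items():
        out[dst] = record[src]              # manually mapped property
    return out

source = {"Name": "Ada", "Email": "ada@example.com", "FullAddress": "10 Main St"}
converted = convert(source, mappings={"FullAddress": "Address"})
```

Here Name and Email auto-copy because both classes define them, while FullAddress is mapped manually to the destination's Address property.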
Filter: Use the Filter component to add conditions to the Data Flow and reduce the number of incoming records that the system must process. The Filter shape compares a data record that enters it against the defined filter conditions. When the record matches the conditions, the Filter shape outputs the record for further processing in the remaining data flow shapes. The Filter shape excludes records that do not match the conditions.
Reducing the number of records that your Data Flow needs to process decreases the processing time and hardware use.
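A filter condition can be pictured as a predicate applied to each record: records that match pass through, and the rest are dropped. A minimal sketch, with an assumed Status property as the illustrative condition:

```python
# Conceptual sketch of a Filter-style operation (not Pega API).

def keep(record):
    """Illustrative filter condition: keep only active customers."""
    return record.get("Status") == "Active"

records = [{"ID": 1, "Status": "Active"},
           {"ID": 2, "Status": "Closed"},
           {"ID": 3, "Status": "Active"}]

# Records that match the condition continue to the remaining shapes;
# non-matching records are excluded from further processing.
passed = [r for r in records if keep(r)]
```

Because downstream shapes never see the excluded records, filtering early reduces both processing time and hardware use, as noted above.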
Merge: Use the Merge component to combine data from the primary and secondary paths into a single track, for example to complete an incomplete record with a data record that comes from the secondary data source. After you merge data from two paths, the output records keep only the unique data from both paths. The Merge shape outputs one or multiple records for every incoming data record, depending on the number of records that match the merge condition. To use the merge functionality, the secondary source must support the Browse by keys operation.
It is possible to skip the merge operation for an incoming record when the Skip merge operation condition returns true.
You can exclude the records that do not match the merge condition from further processing.
When both sources have data for the same record, you can select which data source takes precedence:
- Primary path: The merge action takes the value in the primary source.
- Secondary path: The merge action takes the value in the secondary source.
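The precedence options above can be sketched as follows. This is a conceptual illustration, not the Pega API; the secondary source is modeled as a dictionary keyed by the merge key to mirror a Browse by keys lookup, and all property names are illustrative:

```python
# Conceptual sketch of a Merge-style operation (not Pega API).
# The secondary source is indexed by the merge key, mirroring a
# "Browse by keys" lookup. When both paths hold a value for the
# same property, the preferred path wins.

def merge(primary, secondary_by_key, key, prefer="primary"):
    out = []
    for record in primary:
        match = secondary_by_key.get(record[key])
        if match is None:
            out.append(dict(record))      # non-matching records could also be excluded
            continue
        if prefer == "primary":
            merged = {**match, **record}  # primary-path values overwrite
        else:
            merged = {**record, **match}  # secondary-path values overwrite
        out.append(merged)
    return out

primary = [{"ID": 1, "Name": "Ada", "City": "Paris"},
           {"ID": 2, "Name": "Bob"}]
secondary_by_key = {1: {"ID": 1, "City": "London", "Email": "ada@example.com"}}

primary_wins = merge(primary, secondary_by_key, "ID")
secondary_wins = merge(primary, secondary_by_key, "ID", prefer="secondary")
```

With the primary path preferred, record 1 keeps City "Paris" while still gaining the Email property from the secondary path; with the secondary path preferred, City becomes "London".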
Data Transform: A Data Transform is a Pega Platform feature that defines how to convert and manipulate data. Use the Data Transform component to call a Data Transform Rule during Data Flow processing. The Data Transform applies to each record.
Event strategy: Event strategies are a mechanism in the Pega Platform for detecting meaningful patterns across a real-time data stream. They help you react to emerging patterns and identify critical opportunities and risks. Insights from event processing can assist in determining the next best action for your customers, triaging work, updating data, and straight-through processing Cases. Use the Event strategy component to reference Event strategy Rules in the Data Flows.
Decision Strategy: Use the Decision Strategy component to reference Strategy Rules that apply predictive analytics, adaptive analytics, and other business Rules when processing data in your Data Flow. The strategy that the Decision Strategy shape references outputs either the incoming data record with the decision results added, or only the decision results.
Sub Data Flow: Use the Sub Data Flow component to call another Data Flow as an inline function.
To process records on the current page, select the Current page option. To loop over items in a page list property and save the results in that property, select A page list.
Text Analyzer: Use the Text Analyzer component to reference Text Analyzer Rules. Text Analyzer Rules analyze text data to derive business information from it. For example, you can analyze text-based content, such as emails and chat messages. The Text Analyzer shape outputs the incoming data record after enhancing it with the results of sentiment detection, classification, and intent and entity extraction. The input and output classes of the data record remain the same.
A Data Flow can have multiple destinations. To add a new destination, click Add branch on the destination shape. To add additional components to each branch, click the Add icon or the dotted line.
There are five possible destinations that you can use in Data Flows:
- Abstract: To use the results of the current Data Flow in another Data Flow.
- Activity: To use the results of the current Data Flow in an activity.
- Case: To start a Case as the result of a completed Data Flow.
- Data Flow: To continue the process in a different Data Flow.
- Data Set: To save the output into a Data Set.