
Understanding and implementing Data Flows
Use Data Flows to process and move data between data sources. Reference business rules to perform complex data operations.
Data Flows are pipelines that receive data from an input source and output data to one or more destinations. Within a Data Flow, you can transform or enrich the data, for example, by joining it with data from other sources. Data Flows are crucial to any Pega implementation and are optimized to handle large quantities of data efficiently.
In the context of Pega Customer Decision Hub™, Data Flows play a critical role in key activities such as ingesting customer data, exporting customer interaction details, and orchestrating inbound and outbound next-best-actions. For example, in a customer service scenario, a Data Flow can retrieve audio transcripts from the Pega Voice AI™ service. This Data Flow runs on a single thread, retrieves the audio transcripts from the Pega server, and processes the data.
Data Flows consist of various components that transform and enrich the data in the pipeline. The components run concurrently to handle the data, starting from the source and moving towards the destination. A Data Flow can use another Data Flow for its source or destination, create Cases, trigger strategies, and more.
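Although Data Flows are configured as Pega rules rather than written as code, the pipeline concept they implement is easy to sketch. The following minimal Python sketch is illustrative only; none of the names are Pega APIs. It shows records flowing from a source through transform stages to a destination.

```python
from typing import Callable, Iterable

Record = dict

def data_flow(source: Iterable[Record],
              transforms: list[Callable[[Record], Record]],
              destination: Callable[[Record], None]) -> None:
    # Read each record from the source, apply the transform stages
    # in order, and hand the result to the destination.
    for record in source:
        for transform in transforms:
            record = transform(record)
        destination(record)

# Usage: enrich two customer records and print them as the "destination".
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
enrich = lambda r: {**r, "segment": "retail"}
data_flow(customers, [enrich], destination=print)
```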
There are three types of Data Flows:
- Batch Data Flows (Batch processing)
- Real-time Data Flows (Real-time processing)
- Single Case Data Flows (Single Case processing)
Batch Data Flows process a finite number of records from their source, such as a database table, report definition, or file. You can schedule them to run at specific times, with a given frequency, or on demand. When the Data Flow run begins, it reads the data from its source, processes it, and sends it to the destination. Typical examples are ingesting customer records from a CSV file, manipulating the data, and populating the Customer Insights Cache in Customer Decision Hub, or running a 1:1 Next Best Action outbound schedule for a given audience.
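As a conceptual sketch of a batch run, again in illustrative Python (the file name and field names are assumptions, not Pega artifacts), the flow reads a finite source, transforms each record, and finishes once the source is exhausted:

```python
import csv

def run_batch(path: str) -> list[dict]:
    # A batch run: read the finite source (a CSV file), transform each
    # record, and collect the results for the destination.
    results = []
    with open(path, newline="") as source:
        for row in csv.DictReader(source):
            # Enrichment step; field names are hypothetical.
            row["full_name"] = f"{row['first_name']} {row['last_name']}"
            results.append(row)  # stand-in for writing to a destination
    return results  # the run ends when the source is exhausted
```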
Real-time Data Flows process an unbounded stream of records from their source. They are always active and continuously process incoming data. For example, they can process data generated by a customer's web activity, where the system aggregates the stream of clicks and page-navigation data into meaningful summaries, or write all customer interactions to a repository as they occur.
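A real-time run differs only in that its source never ends. In the same illustrative Python style, the flow stays active and keeps a running aggregate of the incoming events, such as clicks per page:

```python
from collections import Counter
from typing import Iterator

def run_realtime(clickstream: Iterator[dict]) -> None:
    # A real-time run: the source never ends, so the flow stays active
    # and maintains a running aggregate of the incoming events.
    clicks_per_page: Counter = Counter()
    for event in clickstream:  # waits for the next event to arrive
        clicks_per_page[event["page"]] += 1
        # ... periodically write the aggregate to a destination ...
```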
Single Case Data Flows always have an abstract data source. They process inbound data, for example, when the call center channel calls Pega Customer Decision Hub to determine the next best actions for a customer. The system runs one Single Case Data Flow to retrieve the customer's data and another to invoke the strategies that determine the best action for the customer.
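In the same illustrative terms, a single-case run handles exactly one record per request: the abstract source is the customer identifier supplied by the caller, and the output is the decision for that customer. All names below are hypothetical, not Pega APIs:

```python
def run_single_case(customer_id: str, lookup, decide) -> dict:
    # A single-case run: the abstract source is the one record supplied
    # by the caller; the flow retrieves the customer's data and invokes
    # decisioning logic to return the next best action.
    customer = lookup(customer_id)
    return decide(customer)

# Usage with stand-in functions (illustrative only).
action = run_single_case(
    "C-42",
    lookup=lambda cid: {"id": cid, "segment": "retail"},
    decide=lambda c: {"action": "OfferCreditCard", "customer": c["id"]},
)
```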