Aggregated data

Data is a foundational element that spans the entire Case Management lifecycle. It enables Case workers, business users, and automated decisioning components to make timely and informed decisions. With increasing system scale, diverse integration touchpoints, and higher data volumes, enterprise-level Pega applications must process and persist data originating from multiple internal and external systems.

To effectively gather, analyze, summarize, and present this data, Pega provides multiple data aggregation mechanisms, each optimized for specific architectural scenarios.

As a Lead System Architect (LSA), it is critical to select the most appropriate aggregation approach based on data volume, source complexity, refresh requirements, and performance considerations. The primary data aggregation options in Pega Platform™ include:

Aggregation of data using Report Definitions
Aggregation of data using Data Transforms
Aggregation of data using Aggregate Data Pages
Aggregation of data using Data Sets and Data Flows

Aggregation of data using Report Definitions

Pega Report Definitions provide a straightforward and optimized way to query and aggregate persisted data and present it to users. Report Definitions support data aggregation through:

Joins and class associations, which combine data from related tables or classes
Sub-reports, which aggregate data from multiple existing reports into a consolidated result set

Joins and associations are suitable when aggregating structurally related data at the database level. Sub-reports are the preferred option when multiple existing reports already meet individual business requirements and must be consolidated without duplicating logic.

For example, consider individual reports that provide:

Customer purchase history
Customer engagement levels (for example, loyal, high-spender, infrequent shopper)
Customer demographic information

To create a customer segmentation report for a targeted marketing campaign, you can aggregate results from these reports into a single main report. The existing reports function as sub-reports, allowing reuse of validated reporting logic and ensuring easier maintenance.
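Report Definitions are configured as rules rather than written as code, but the sub-report idea can be illustrated with plain SQL. The following is a minimal sketch, assuming three hypothetical tables (purchases, engagement, demographics) in an in-memory SQLite database. The inner SELECT plays the role of a sub-report that summarizes purchase history, and the outer query joins it to the other two sources. The table names, columns, and sample rows are illustrative assumptions, not Pega schema.

```python
import sqlite3

# In-memory database with three hypothetical source tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases (customer_id TEXT, amount REAL);
CREATE TABLE engagement (customer_id TEXT, level TEXT);
CREATE TABLE demographics (customer_id TEXT, region TEXT);
INSERT INTO purchases VALUES ('C1', 120.0), ('C1', 80.0), ('C2', 15.0);
INSERT INTO engagement VALUES ('C1', 'loyal'), ('C2', 'infrequent shopper');
INSERT INTO demographics VALUES ('C1', 'EMEA'), ('C2', 'APAC');
""")

# The inner SELECT acts like a sub-report (purchase totals per customer);
# the outer query joins it with engagement and demographic data.
SEGMENTATION_QUERY = """
SELECT d.customer_id, d.region, e.level, p.total_spend
FROM demographics AS d
JOIN engagement AS e ON e.customer_id = d.customer_id
JOIN (
    SELECT customer_id, SUM(amount) AS total_spend
    FROM purchases
    GROUP BY customer_id
) AS p ON p.customer_id = d.customer_id
ORDER BY d.customer_id
"""

segments = conn.execute(SEGMENTATION_QUERY).fetchall()
for row in segments:
    print(row)
```

The aggregation happens inside the database engine, which is the property that makes report-based aggregation attractive for persisted, relational data.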

Report-based aggregation is best suited for:

Persisted data with relational structure
Reporting and user-facing insights
Scenarios that benefit from database-level optimization

Aggregation of data using Data Transforms

Data Transforms are designed for lightweight, in-memory data manipulation and consolidation. The Append to and Append and Map to actions build an aggregate list by copying data from multiple source pages into a single target page or page list.

This approach is effective when:

The data is already loaded into memory
The target page exists (for example, a user page or data page)
Aggregation logic is procedural rather than query-based

While reports and sub-reports are optimal for aggregating persisted data through queries, Data Transforms are better suited for consolidating in-memory data that has already been retrieved or calculated elsewhere in the application.
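The difference between the two append styles can be sketched in a few lines. This is a conceptual illustration only, assuming pages are modeled as Python dictionaries and page lists as Python lists. The function names append_to and append_and_map_to, and the property names in the sample data, are hypothetical stand-ins and not a Pega API.

```python
from copy import deepcopy


def append_to(target, source):
    """Append a copy of every source page to the target list unchanged."""
    target.extend(deepcopy(page) for page in source)


def append_and_map_to(target, source, mapping):
    """Append one new page per source page, copying only mapped properties.

    mapping is {target_property: source_property}.
    """
    for page in source:
        target.append({t_prop: page[s_prop] for t_prop, s_prop in mapping.items()})


# Two in-memory lists with different property names (hypothetical data).
online_orders = [{"OrderID": "O-1", "Total": 40.0}]
store_orders = [{"Receipt": "R-9", "Amount": 25.0}]

all_orders = []
append_to(all_orders, online_orders)
append_and_map_to(all_orders, store_orders, {"OrderID": "Receipt", "Total": "Amount"})
print(all_orders)
```

Copying pages as-is suits sources that already share the target structure. The mapped variant suits sources whose property names differ from the target.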

Aggregation of data using Aggregate Data Pages

Aggregate Data Pages provide a flexible mechanism for aggregating data from multiple sources and are especially effective for complex data collection scenarios. While combinations of Report Definitions, sub-reports, or Data Transforms are typically sufficient for simpler business requirements, Aggregate Data Pages are better suited to situations that require orchestration across multiple systems or data retrieval mechanisms.

Aggregate Data Pages are particularly beneficial when retrieving data from external systems using connectors or robotic automation. They can also aggregate data from internal sources, such as Activities, Data Transforms, Report Definitions, and Lookups. This enables LSAs to model a unified data view without embedding aggregation logic directly into Cases or user interfaces.

Aggregate Data Pages support both single-page and page list structures, making them highly versatile for a wide range of aggregation needs. They are especially advantageous when data from multiple sources must be consolidated into a single logical page that can be reused consistently across the application.

When designing Aggregate Data Pages, LSAs should consider the following architectural aspects:

Asynchronous loading

Enable asynchronous loading when aggregating data from multiple sources or when handling large data volumes, to improve responsiveness and reduce perceived latency.

Keyed page access

Use keyed access where appropriate to avoid repeated traversal of aggregation sources, improving runtime efficiency.

Refresh strategy

Define a clear and intentional refresh strategy to prevent unnecessary reloads of aggregated data, which can negatively impact performance and external system load.

Aggregate Data Pages are the preferred option for complex data aggregation scenarios that require orchestration across multiple systems and runtime contexts.
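The three design aspects above can be seen together in a small conceptual model. This is a sketch under stated assumptions, not how Pega implements data pages: the AggregatePage class, its max_age_seconds parameter, and the loader functions are all hypothetical. Sources are loaded concurrently (a rough analog of asynchronous loading), results are cached per key (keyed access), and a cached page is reused until it is older than a configured age (refresh strategy).

```python
import time
from concurrent.futures import ThreadPoolExecutor


class AggregatePage:
    """Hypothetical model of a keyed, cached page built from several sources."""

    def __init__(self, sources, max_age_seconds, clock=time.monotonic):
        self._sources = sources          # {section_name: loader(key)}
        self._max_age = max_age_seconds  # refresh strategy: reload after this age
        self._clock = clock              # injectable so the example is deterministic
        self._cache = {}                 # key -> (loaded_at, page)

    def get(self, key):
        now = self._clock()
        cached = self._cache.get(key)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]  # still fresh: no source is called again
        # Load every source concurrently, then assemble one logical page.
        with ThreadPoolExecutor() as pool:
            futures = {name: pool.submit(loader, key)
                       for name, loader in self._sources.items()}
            page = {"key": key}
            for name, future in futures.items():
                page[name] = future.result()
        self._cache[key] = (now, page)
        return page


calls = []  # records which loaders ran, to show the effect of caching


def load_profile(customer_id):
    calls.append("profile")
    return {"name": "Sample Customer"}


def load_orders(customer_id):
    calls.append("orders")
    return [{"order": "O-1"}]


fake_now = [0.0]  # simulated clock value in seconds
customer_page = AggregatePage(
    {"profile": load_profile, "orders": load_orders},
    max_age_seconds=60,
    clock=lambda: fake_now[0],
)

first = customer_page.get("C1")   # loads both sources
second = customer_page.get("C1")  # served from the cache
fake_now[0] = 61.0
third = customer_page.get("C1")   # older than 60 seconds, so reloaded
```

In this model, the second request costs no source calls at all. That is the effect an intentional refresh strategy aims for when the sources are slow or rate-limited external systems.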

Aggregation of data using Data Sets and Data Flows

Data Sets are designed to manage complex data structures efficiently. One type of Data Set, the Summary Data Set, aggregates various types of data and refines it for use in decision strategies, models, or Data Flows. Summary Data Sets source their data from Stream Data Sets or from Data Flows with a stream source and an abstract destination.
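The idea of summarizing a stream can be sketched as a running aggregate that is updated one record at a time. This is an illustration of the general technique only. It assumes records are dictionaries and uses a hypothetical RunningSummary class with hypothetical property names, and it does not describe how a Summary Data Set is configured or stored in Pega.

```python
class RunningSummary:
    """Keeps a per-key count and sum that update as records arrive."""

    def __init__(self, key_property, value_property):
        self._key_property = key_property
        self._value_property = value_property
        self._count = {}
        self._total = {}

    def add(self, record):
        key = record[self._key_property]
        self._count[key] = self._count.get(key, 0) + 1
        self._total[key] = self._total.get(key, 0.0) + record[self._value_property]

    def get(self, key):
        count = self._count.get(key, 0)
        total = self._total.get(key, 0.0)
        average = total / count if count else None
        return {"count": count, "sum": total, "average": average}


# Hypothetical stream of purchase events.
stream = [
    {"customer_id": "C1", "amount": 20.0},
    {"customer_id": "C2", "amount": 5.0},
    {"customer_id": "C1", "amount": 30.0},
]

summary = RunningSummary("customer_id", "amount")
for event in stream:
    summary.add(event)

print(summary.get("C1"))
```

Because only the aggregates are kept, the individual events do not need to be retained or re-read to answer a summary request.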

Data Flows function like pipelines. They enable you to sequence and combine data from various sources and write the results to a destination. The source and destination can be abstract or can be driven by Data Sets and other Data Flows. Between the source and the destination, you can apply operations such as composing, converting, merging, and running other strategy instructions.

Data Flows support the combination of data from two sources into a single page or page list, ensuring that all necessary data is consolidated into one record. To combine data, you identify a matching property between the two sources. The system appends data from the secondary source to the incoming data record as an Embedded Data Page.
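The matching-property combination described above can be sketched as a keyed merge. This is a conceptual illustration, assuming records are dictionaries and that the secondary data is embedded as a list under a new property. The merge_on function and the property names CustomerID and Orders are hypothetical and are not part of any Pega API.

```python
def merge_on(primary, secondary, match_property, embed_as):
    """Yield each primary record with its matching secondary records embedded."""
    # Index the secondary source once so each lookup is a dictionary access.
    index = {}
    for record in secondary:
        index.setdefault(record[match_property], []).append(record)

    for record in primary:
        merged = dict(record)  # copy so the incoming record is not mutated
        merged[embed_as] = index.get(record[match_property], [])
        yield merged


# Hypothetical primary and secondary sources.
customers = [
    {"CustomerID": "C1", "Name": "A"},
    {"CustomerID": "C2", "Name": "B"},
]
orders = [
    {"CustomerID": "C1", "OrderID": "O-1"},
    {"CustomerID": "C1", "OrderID": "O-2"},
]

merged_records = list(merge_on(customers, orders, "CustomerID", "Orders"))
for record in merged_records:
    print(record)
```

A primary record with no match still flows through, with an empty embedded list. Whether unmatched records should be kept or dropped is a design decision this sketch leaves open.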

Both Data Sets and Data Flows can be used for data aggregation in complex use cases. As a Lead System Architect for an enterprise Pega implementation, choose the approach that best fits the data volume, source complexity, refresh requirements, and performance considerations of the scenario.

This Topic is available in the following Module:

Reporting data architecture and performance optimization v1
