Greenfield data modeling

Data modeling types

 There are two types of data modeling in Pega:

  1. Greenfield data modeling
  2. Extending the existing data model

Greenfield data modeling means creating a data model from scratch. It is required when no Pega, Partner, or Marketplace model represents the client's data model (or when the client has a licensed model). One technique for developing an object model is to parse a document and extract its nouns and verbs. Nouns become candidate data types, and verbs become candidate processes. A process can be a case type or something that occurs within a case type. Once the nouns are identified, the data modeling work begins.
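
The noun-and-verb technique can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the requirement sentence and the hand-tagged LEXICON are invented for the example, and in practice the tagging comes from analyst review rather than a lookup table.

```python
import re

# Hypothetical hand-tagged lexicon (an assumption for this sketch).
LEXICON = {
    "customer": "noun",
    "quote": "noun",
    "vehicle": "noun",
    "underwriter": "noun",
    "policy": "noun",
    "requests": "verb",
    "adds": "verb",
    "approves": "verb",
}


def extract_candidates(text):
    """Return (data type candidates, process candidates) in order of first appearance."""
    nouns, verbs = [], []
    for word in re.findall(r"[a-z]+", text.lower()):
        tag = LEXICON.get(word)
        if tag == "noun" and word not in nouns:
            nouns.append(word)
        elif tag == "verb" and word not in verbs:
            verbs.append(word)
    return nouns, verbs


# Invented requirement text, used only to exercise the function.
requirement = (
    "The customer requests a quote and adds each vehicle. "
    "The underwriter approves the policy."
)
data_types, processes = extract_candidates(requirement)
```

Here data_types holds the nouns to review as data types, and processes holds the verbs to review as case types or steps within a case type.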

Industry-standard data modeling techniques focus first on the data needs of the business and the application, and defer all physical data model considerations until those needs are met.

Data modeling follows a three-level approach.

  1. Conceptual
  2. Logical
  3. Physical

As a lead system architect (LSA), you play a significant role at every level of data modeling by collaborating with other stakeholders. The following table shows what actions are performed at what level.

Design Level | Stakeholders | Actions
Conceptual | Business / Client / SME for business domain | Identify the data source; understand the flow of information; choose the right data type for data elements
Logical | Business Data Analyst / Data Expert | Data collection decision and design; choose the right layer for elements; logical grouping of elements using inheritance or composition
Physical | DBA / Developer / Client SME for SOR | Choose the right schema and create the database table

Conceptual level in data modeling

At this level, the main stakeholders are the Business, the Client, and the Subject Matter Expert (SME). The purpose is to define the business terms and rules, and to identify the data and how it flows through the process. The output of the conceptual level is a clear understanding of the data (and data types) required to fulfill the business needs.

Logical level in data modeling

At this level, the main stakeholders are the Business Data Analyst and the Data Architecture Expert. The Business Data Analyst clarifies how and from where the data is collected, and provides logical design and data collection decisions in the context of business needs, terms, and policies. The Data Architecture Expert identifies the relationships between data types (defining keys), the reuse layer in which to place each data type (Org layer or a specific layer), and the inheritance path for the data.
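
As a rough illustration of the logical-level choices (keys, inheritance, composition), the following Python sketch uses dataclasses as a stand-in for data types. The names Party, Customer, Address, party_id, and loyalty_tier are hypothetical and do not come from any Pega model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    street: str
    city: str


@dataclass
class Party:
    party_id: str  # key that identifies the record (hypothetical)
    name: str
    # Composition: a Party *has* addresses.
    addresses: List[Address] = field(default_factory=list)


@dataclass
class Customer(Party):
    # Inheritance: a Customer *is a* Party and reuses its fields.
    loyalty_tier: str = "standard"


customer = Customer(party_id="C-100", name="Ada Example")
customer.addresses.append(Address("1 Main St", "Springfield"))
```

Inheritance fits when one type is a specialization of another, while composition fits when a type merely contains another; the sketch shows both on the same small model.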

Physical level in data modeling

At this level, the main stakeholders are DBAs, developers, and the client architects who are SMEs for the system of record (SOR). The DBA creates the physical database tables following guidance from the developer or SME; this is collaborative work. The purpose is to finalize the implementation methodology for data elements. The logical design takes a physical form at this level.

Create the required data classes in Pega and map them to Pega database tables or external database tables. If required, create connectors to access data from external systems. As an LSA, you also need to choose the appropriate schema for the given business requirement: plan what must go into CustomerData and what must go into PegaDATA. Also analyze at this level whether to use the Pega Reporting Database.

Data formatting, calculations, and manipulations are completed at this level so that all data required for the business process is ready.


Polymorphism in data modeling

In Pega, you can model advanced and dynamic data structures by using the Data Relationship field type. The Pega data model is powerful and flexible and supports concepts such as polymorphism. At design time, declare a Data Relationship field mapped to an abstract class; at run time, the pxObjClass of the field is updated with the name of the required concrete class.

Consider the following business scenario:

An auto insurance company's application has a list of vehicles to cover as part of a quote. The list of vehicles can include bikes, cars, and trucks. Each of these vehicle types may have differences in its business rules and processes. The following data models are possible solutions for this business problem.

Solution 1: Separate Data Relationship field type (Multiple Records) for every vehicle type

Create a separate Data Relationship field type (Multiple Records) for each of Bikes, Cars, and Trucks. Each list has a static page class, so developers might have to create a separate user interface for each page class.

Embedded field name | Applies to class | Comments
Bike List | Data-Vehicle-Bike | Different class defined for each type of vehicle
Car List | Data-Vehicle-Car | Different class defined for each type of vehicle
Truck List | Data-Vehicle-Truck | Different class defined for each type of vehicle
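
A loose Python analogy for Solution 1 follows. It is not Pega code: the Quote class, the model field, and count_vehicles are hypothetical names chosen only to show what one statically typed list per vehicle type implies.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bike:
    model: str


@dataclass
class Car:
    model: str


@dataclass
class Truck:
    model: str


@dataclass
class Quote:
    # One list per vehicle type, each with a fixed element class.
    bike_list: List[Bike] = field(default_factory=list)
    car_list: List[Car] = field(default_factory=list)
    truck_list: List[Truck] = field(default_factory=list)


def count_vehicles(quote: Quote) -> int:
    # Every list is named explicitly, so a new vehicle type means editing this function.
    return len(quote.bike_list) + len(quote.car_list) + len(quote.truck_list)


quote = Quote()
quote.bike_list.append(Bike("Roadster"))
quote.car_list.append(Car("Sedan"))
quote.truck_list.append(Truck("Hauler"))
total = count_vehicles(quote)
```

Any logic that spans all vehicles has to enumerate every list, which is where the maintenance cost of this design shows up.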

Solution 2: Single Data Relationship field type (Multiple Records) and single page class for all vehicle types

Another option is to use a single page class for all vehicles and a single Data Relationship field type (Multiple Records). In this case, you must use conditional logic or circumstancing to introduce process and rule differences.

Embedded field name | Applies to class | Comments
Vehicle List | Data-Vehicle | Only one embedded page; use conditional logic to identify the required UI and other rules based on the type of the vehicle
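
Solution 2 can be pictured with the following Python sketch, in which a single class carries a type field and every rule branches on it. The vehicle_type field, the base_premium function, and the premium amounts are assumptions made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    vehicle_type: str  # "Bike", "Car", or "Truck"
    model: str


def base_premium(vehicle: Vehicle) -> int:
    # Conditional logic stands in for the per-type rule differences.
    # The amounts are invented for the example.
    if vehicle.vehicle_type == "Bike":
        return 100
    if vehicle.vehicle_type == "Car":
        return 300
    if vehicle.vehicle_type == "Truck":
        return 500
    raise ValueError(f"Unknown vehicle type: {vehicle.vehicle_type}")


vehicle_list = [
    Vehicle("Bike", "Roadster"),
    Vehicle("Car", "Sedan"),
    Vehicle("Truck", "Hauler"),
]
total_premium = sum(base_premium(v) for v in vehicle_list)
```

Each new vehicle type adds another branch to every such function, so the number of conditions grows with both the types and the rules.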

Solution 3: Single Data Relationship field type (Multiple Records) and different page class for different vehicle types

You can use a single Data Relationship field type (Multiple Records) of covered vehicles where each page can be of a different class type. Rule resolution uses the runtime class of each page to apply the correct rules, processes, and user interface.

Embedded field name | Applies to class | Comments
Vehicle List | Data-Vehicle | Vehicle List is mapped to the Data-Vehicle class, and every page can have its class mapped as required at run time
    Bike | Data-Vehicle-Bike | First page of Vehicle List is Bike
    Car | Data-Vehicle-Car | Second page of Vehicle List is Car
    Truck | Data-Vehicle-Truck | Third page of Vehicle List is Truck

Clipboard View of VehicleList()
This screenshot of a case’s Clipboard shows Bike, Car, and Truck specializations of the FSG-Data-Vehicle class being added to the same embedded Page List named VehicleList.
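
The idea behind Solution 3 resembles subclass dispatch in an object-oriented language. The Python sketch below is an analogy only and does not describe how Pega rule resolution is implemented; the class names mirror the table above, while base_premium and its amounts are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vehicle:
    model: str

    def base_premium(self) -> int:
        # The base class declares the behavior; subclasses supply it.
        raise NotImplementedError


@dataclass
class Bike(Vehicle):
    def base_premium(self) -> int:
        return 100  # invented amount


@dataclass
class Car(Vehicle):
    def base_premium(self) -> int:
        return 300  # invented amount


@dataclass
class Truck(Vehicle):
    def base_premium(self) -> int:
        return 500  # invented amount


# The list is declared against the base class; each entry keeps its own class.
vehicle_list: List[Vehicle] = [Bike("Roadster"), Car("Sedan"), Truck("Hauler")]

# The run-time class of each entry selects the behavior, with no type checks.
runtime_classes = [type(v).__name__ for v in vehicle_list]
premiums = [v.base_premium() for v in vehicle_list]
```

Supporting a new vehicle type in this sketch means adding one subclass; the list declaration and the code that iterates over it stay unchanged.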


  • Solution 3 is recommended. You can easily add a new Vehicle type and map the page in the vehicle list to the new vehicle class.
  • Solution 1 is not recommended because it is not scalable. For example, what happens when a new vehicle type is added, such as Boats? Having multiple page-lists might require modification of multiple rules to implement this change.
  • Solution 2 is not recommended. With only one page-list class, business rules become harder to maintain as there are too many circumstances and more variants of business logic.
Note: Polymorphism is neither the only solution nor always the best one for every data modeling need in Pega. An LSA must always explore the possible approaches and compare them to select the appropriate one.
