Greenfield data modeling
When existing solutions on the Pega Marketplace do not meet an organization's Data Model requirements, creating a greenfield Data Model from scratch becomes essential.
One technique for developing an object model is to parse a business requirement document while extracting nouns and verbs. Nouns become data types, and verbs become processes. Processes can be a Case Type or an Action that occurs in a Case Type. After project development teams identify the nouns, the data modeling job begins.
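As a rough illustration of the noun-and-verb technique, the following Python sketch classifies the words of a requirement statement against small, hypothetical noun and verb lists. In practice, teams identify these terms in workshops or with NLP tooling; the hardcoded lists here are invented purely for the example.

```python
# Illustrative sketch: classify words from a requirement statement as
# candidate data types (nouns) or processes (verbs).
# The noun and verb lists below are hypothetical examples, not a real NLP step.

CANDIDATE_NOUNS = {"customer", "loan", "application", "payment"}
CANDIDATE_VERBS = {"submits", "approves", "reviews", "disburses"}

def extract_model_candidates(requirement: str) -> dict:
    """Return the nouns (candidate data types) and verbs (candidate
    processes) found in a requirement sentence."""
    words = {w.strip(".,").lower() for w in requirement.split()}
    return {
        "data_types": sorted(words & CANDIDATE_NOUNS),
        "processes": sorted(words & CANDIDATE_VERBS),
    }

requirement = "A customer submits a loan application, and a manager reviews the payment."
candidates = extract_model_candidates(requirement)
print(candidates)
```

Each candidate data type then becomes an input to the data modeling work that follows, and each candidate process maps to a Case Type or an Action within one.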
Industry-standard data modeling techniques prioritize the data needs of the business and application first, deferring consideration of the physical Data Model until the business and application requirements are fully defined.
Business-centric data model design
Pega Platform™ emphasizes designing Data Models that reflect business needs rather than technical constraints. This approach improves flexibility and supports evolving requirements by aligning data structures with business processes, terminology, and objectives.
Core principles of Data Model design
To create Data Models that align with business objectives and remain adaptable, apply these principles:
Align with business processes
Analyze processes to understand Data Flow. Group related elements logically to reduce complexity.
Map business terminology
Maintain consistency by mapping business terms to technical implementations. Create a shared vocabulary for business and IT teams.
Plan for future state
Design for scalability by anticipating future needs, such as new products or higher transaction volumes.
Simplify for optimization
Include only data elements with clear business value. Remove redundant attributes for a lean model.
Best practices for Lead System Architects
To ensure that Data Models remain consistent, scalable, and aligned with business goals, apply these best practices:
- Create a glossary to map business terms to technical names for consistency across the application lifecycle.
- Design based on processes by analyzing workflows before defining structures to identify natural data groupings.
- Plan for scalability by documenting growth patterns and designing for expansion without major refactoring.
- Validate for lean modeling by checking each data element against its business purpose and decision-making role.
Implementation example
Conduct collaborative workshops with stakeholders to define entity relationships using business terminology. Validate visual models with business users before translating them into technical specifications. Document governing business rules and reference them during logical implementation.
Approach in greenfield data modeling
Greenfield data modeling follows a three-level approach:
- Conceptual
- Logical
- Physical
As a Lead System Architect (LSA), you play a significant role at every level of data modeling by collaborating with other stakeholders (for example, business users, subject matter experts (SMEs), database administrators (DBAs), and business analysts).
The following table shows what actions different users perform at each level:
| Design Level | Stakeholders | Activities |
|---|---|---|
| Conceptual | Business user and SME for a business domain, Business Analyst, and LSA | Identify the data elements, their types and sources, the flow of data through the business process, and ownership and access control requirements |
| Logical | Business user and SME for a business domain, Business Analyst, and LSA | Define data objects, properties, relationships, constraints and Validation Rules, and Data Pages |
| Physical | DBA and LSA | Define the database schema, indexes, storage and partitioning, integrations with external sources, and security and access control |
The following diagram illustrates how the Conceptual, Logical, and Physical models connect in a business-centric approach:
Conceptual level in data modeling
Conceptual data modeling is the process of creating a high-level, abstract representation of the data requirements in a system or application. It focuses on defining the main data entities, their relationships, and key attributes without going into the specific details of the data structure or implementation. The main purpose of conceptual data modeling is to provide a collective understanding of the data requirements among various stakeholders, facilitating communication and agreement on the overall data structure.
In Pega applications, stakeholders can consider conceptual data modeling to consist of an informal or implicit understanding of the following elements:
- A clear listing of all data elements required for a given business scenario.
- Definition of the type of each data element.
- Identification of the source of each data element.
- The flow of the data through the business process.
- Recognition of the ownership and access control requirements of the data.
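A conceptual model can be captured in any lightweight form. The following Python sketch records hypothetical entities with their sources, owners, and relationships, purely to illustrate the kind of information the elements above hold; the entity and system names are invented for the example.

```python
from dataclasses import dataclass

# Illustrative sketch of a conceptual model captured as plain data:
# entities, their sources and owners, and high-level relationships.
# All entity, system, and team names here are hypothetical.

@dataclass
class Entity:
    name: str
    key_attributes: list
    source: str   # where the data element originates
    owner: str    # who controls access to the data

@dataclass
class Relationship:
    subject: str
    verb: str
    obj: str

entities = [
    Entity("Customer", ["Name", "Email"], source="CRM system", owner="Sales"),
    Entity("Order", ["OrderID", "Date"], source="Order service", owner="Fulfillment"),
]
relationships = [Relationship("Customer", "places", "Order")]

for r in relationships:
    print(f"{r.subject} {r.verb} {r.obj}")
```

At this level the point is shared understanding, not structure: business users can review and correct a list like this without any database knowledge.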
Logical level in data modeling
Logical data modeling is the process of refining the conceptual Data Model by adding more detail and structure to the entities, relationships, and attributes. It defines primary keys, foreign keys, and constraints to ensure data integrity and normalizes data to eliminate redundancy and inconsistencies. The logical Data Model is usually technology-agnostic, focusing on organizing the data independently of any specific database management system (DBMS).
The output of the logical data modeling in Pega applications includes:
- Data objects (data types): Data objects, also known as data types, represent the main entities in the application and encapsulate the data structure and behavior. In Pega software, data objects can be created using App Studio or Dev Studio and are typically based on a single database table or an external data source.
- Application Layer: The Application Layer is where you place the data object. If the object is required across the organization, place it in the enterprise layer. If it fulfills a requirement specific to one application or division, place it in that layer.
- Properties: Properties are the attributes or fields in a data object that define the specific data elements and their characteristics, such as data type, default value, and Validation Rules. In Pega applications, properties can be created and managed using the Property Rule form in Dev Studio.
- Relationships: Associations, foreign keys, or other mechanisms define the relationships between data objects. These relationships determine how data objects interact with one another and enforce data integrity across the application. You must also decide the cardinality of each relationship.
- Constraints and Validation Rules: In Pega applications, you can define constraints and Validation Rules in the data object or property Rule forms to ensure data integrity and consistency. These Rules enforce business logic and help maintain the quality of the data in the application.
- Data Pages: Data Pages in Pega applications load, cache, and manage data from various sources. Consider them a part of logical data modeling because they define how the application retrieves and manipulates data.
- Reporting Database: As an LSA, analyze the business requirements; if the business needs granular reporting at a high frequency, plan for a reporting database.
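To illustrate what a technology-agnostic logical model contains, here is a minimal Python sketch with hypothetical data objects, properties, a validation rule, a primary and foreign key, and relationship cardinality. In a Pega application these would be configured as data types, properties, and associations in App Studio or Dev Studio rather than written as code.

```python
from dataclasses import dataclass
from typing import Callable

# Technology-agnostic logical model sketch: properties with types and
# validation rules, primary and foreign keys, and relationship cardinality.
# All object, property, and key names are hypothetical examples.

@dataclass
class Property:
    name: str
    data_type: str
    required: bool = False
    validate: Callable[[object], bool] = lambda v: True

@dataclass
class DataObject:
    name: str
    primary_key: str
    properties: list

@dataclass
class Relationship:
    parent: str
    child: str
    foreign_key: str
    cardinality: str  # for example "1:N"

customer = DataObject(
    "Customer",
    primary_key="CustomerID",
    properties=[
        Property("CustomerID", "Integer", required=True),
        Property("Email", "Text", required=True, validate=lambda v: "@" in v),
    ],
)
rel = Relationship("Customer", "Order", foreign_key="CustomerID", cardinality="1:N")

# Apply the Email validation rule to a sample value.
email = next(p for p in customer.properties if p.name == "Email")
print(email.validate("a@example.com"))
```

Note that nothing here names a DBMS: keys, constraints, and cardinality are stated in business-neutral terms, which is exactly what keeps the logical model technology-agnostic.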
Physical level in data modeling
Physical data modeling is the process of translating the logical Data Model into a detailed, database-specific implementation. This phase involves defining the actual database objects, such as tables, columns, indexes, constraints, partitions, storage, and other elements that are specific to the chosen database management system (DBMS). The primary goal of physical data modeling is to optimize the Data Model for performance, security, and maintainability within the target database environment.
Physical data modeling in Pega applications typically consists of the following elements:
- Database schema: A schema represents the structure of the database, including tables, columns, data types, and constraints. In Pega applications, the default schema for customer data is CustomerData. As an LSA, you choose the appropriate schema for a given business requirement and determine which information belongs in CustomerData and which belongs in PegaDATA. Pega applications follow a set of default naming conventions and structures for creating tables and columns that correspond to the data objects and properties.
- Indexes: Indexes help optimize the performance of data retrieval operations. In Pega applications, the system automatically creates indexes for some properties, such as primary keys. You can also configure additional custom indexes for properties that are frequently used in search or filtering operations.
- Reporting Database: Create the Pega Reporting Database if you must perform analytical and trend reporting or run reports frequently.
- Data storage and partitioning: Pega applications use a set of default storage configurations, such as BLOB storage, to store application data. However, you can also configure additional storage settings, such as partitioning and archiving, to optimize the performance and manageability of the database.
- Integration with external data sources: In Pega applications, you can integrate external databases or data sources using connectors and integration Rules. You can consider the configuration of these connectors and Rules part of the physical data modeling output because they define how external data is accessed and managed in the Pega application.
- Security and access control: In Pega applications, security and access control configurations, such as Authentication Profiles and data access roles, define how users can access and manipulate the data stored in the application. These configurations can also be considered part of the physical data modeling output because they impact the overall data management and security of the database environment.
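As a simplified illustration of moving from a logical definition to a physical one, the following Python sketch generates generic SQL DDL for a hypothetical table and index. Actual Pega tables are created and managed by the platform and follow its own naming conventions and schemas; the names here are invented for the example.

```python
# Sketch of translating a logical definition into database-specific DDL.
# The table, column, and index names are hypothetical; this is generic SQL,
# not the DDL Pega Platform itself generates.

def create_table_ddl(table: str, columns: dict, primary_key: str) -> str:
    """Render a CREATE TABLE statement from a column-name -> SQL-type map."""
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (f"CREATE TABLE {table} (\n  {cols},\n"
            f"  PRIMARY KEY ({primary_key})\n);")

def create_index_ddl(table: str, column: str) -> str:
    """Render a CREATE INDEX statement for a frequently filtered column."""
    return f"CREATE INDEX idx_{table}_{column} ON {table} ({column});"

columns = {"CustomerID": "INTEGER", "Email": "VARCHAR(255)"}
table_ddl = create_table_ddl("Customer", columns, "CustomerID")
index_ddl = create_index_ddl("Customer", "Email")
print(table_ddl)
print(index_ddl)
```

The physical level is where DBMS-specific choices such as column types, index strategy, and partitioning finally appear, which is why the logical model deliberately omitted them.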