You can use a Stream service to publish and subscribe to streams of data records, store streams of records, and process data in real time. The Stream service is a multi-node component that is based on Apache Kafka. With the Stream service, you can build real-time streaming data pipelines to move data between systems and applications. You can also build applications that transform and react to streams of data.
The Stream service ingests, routes, and delivers high volumes of low-latency data such as web clicks, transactions, sensor data, work objects, data pages, and customer interaction histories. As a resiliency mechanism for operations, the same data is replicated to other nodes in the cluster. Distribution and replication of the stream data records ensure the scalability and fault tolerance of the Stream service. The service runs as a cluster on one or more servers. You can spread the servers among multiple data centers to minimize planned or unplanned downtime.
The Stream service enables the asynchronous flow of data between processes in Pega Platform™. You can use this service, for example, to pass correspondence data to delivery channels in Pega Customer Decision Hub™, process customer responses for the Adaptive Decision Manager (ADM), or initiate background processes in any Pega application.
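As a rough illustration of the publish-and-subscribe model described above, the following minimal sketch models a stream as an in-memory, append-only list in which each subscriber keeps its own read position. It is not the Pega or Kafka API: the class `InMemoryStream`, its methods, the subscriber names, and the record fields are all hypothetical, chosen only to show how a producer and several consumers can be decoupled.

```python
from collections import defaultdict


class InMemoryStream:
    """Toy append-only stream (hypothetical, not a Pega or Kafka class)."""

    def __init__(self):
        self._records = []                # stored stream of records
        self._offsets = defaultdict(int)  # next unread position per subscriber

    def publish(self, record):
        """Append a record and return its position in the stream."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, subscriber, max_records=10):
        """Return unread records for a subscriber and advance its offset."""
        start = self._offsets[subscriber]
        batch = self._records[start:start + max_records]
        self._offsets[subscriber] = start + len(batch)
        return batch


stream = InMemoryStream()
stream.publish({"customer": "C-1", "response": "accepted"})
stream.publish({"customer": "C-2", "response": "rejected"})

# Each subscriber reads independently and at its own pace.
adm_batch = stream.poll("adm")
audit_first = stream.poll("audit", max_records=1)
audit_rest = stream.poll("audit")
```

Because records stay in the stream after they are read, a slow subscriber (here `audit`) does not block a fast one (`adm`), which is the essence of asynchronous data flow between processes.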
To keep your Stream service running without errors, regularly monitor the status of your Stream nodes, partitions, disk space, CPU usage, and database availability.
You can perform a simple, manual status check of the Stream service on the Stream landing page (in the header of Dev Studio, click Configure > Decisioning > Infrastructure > Services > Stream). The Stream service relies on the availability of the Pega Platform database. A slow or unavailable database might have a serious impact on the Stream service.
To understand the health of the Stream service, monitor the queries to the following three databases:
If queries take longer than 1 second, consider tuning your database. In the case of planned or unplanned database unavailability, consider restarting your Stream nodes, especially if you see that they are unhealthy.
By default, the Stream service keeps two replicas of each record. If you increase the number of Stream nodes from two to three or four, ensure that you change the data replication setting to match the number of Stream nodes.
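The one-second guideline for database queries mentioned above can be checked with a small timing wrapper. The following is only a sketch: `timed_query` and its parameters are hypothetical names, and `run_query` stands in for whatever callable actually executes your database query.

```python
import time

SLOW_QUERY_THRESHOLD_SECONDS = 1.0  # guideline taken from the text above


def timed_query(run_query, threshold=SLOW_QUERY_THRESHOLD_SECONDS,
                clock=time.perf_counter):
    """Run a query callable; return (result, elapsed seconds, is_slow)."""
    start = clock()
    result = run_query()
    elapsed = clock() - start
    return result, elapsed, elapsed > threshold


# Usage with a stand-in query; replace the lambda with a real database call.
rows, elapsed, is_slow = timed_query(lambda: ["row-1", "row-2"])
if is_slow:
    print(f"Query took {elapsed:.2f}s; consider tuning the database")
```

The `clock` parameter exists only so that the wrapper can be exercised with a fake clock in tests; in normal use the default monotonic timer is sufficient.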
The Stream service replicates every record across a configurable number of servers. This replication allows automatic failover to these replicas when a server in the cluster fails, so messages remain available in the presence of failures.
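To see why the number of replicas matters for failover, consider the following simplified simulation. It assumes a round-robin placement of partition replicas across nodes, which is an illustrative choice and not necessarily the placement that Kafka or the Stream service uses; the node names and function names are hypothetical.

```python
def assign_replicas(partitions, nodes, replication_factor):
    """Place each partition on `replication_factor` distinct nodes, round-robin."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        p: [nodes[(p + i) % len(nodes)] for i in range(replication_factor)]
        for p in range(partitions)
    }


def available_partitions(assignment, failed_nodes):
    """Return partitions that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return sorted(p for p, replicas in assignment.items() if set(replicas) - failed)


nodes = ["stream-1", "stream-2", "stream-3"]  # hypothetical node names

# Two replicas per record on three nodes.
two_replicas = assign_replicas(partitions=6, nodes=nodes, replication_factor=2)
one_down = available_partitions(two_replicas, ["stream-1"])
two_down = available_partitions(two_replicas, ["stream-1", "stream-2"])

# Replication setting raised to match the number of nodes.
three_replicas = assign_replicas(partitions=6, nodes=nodes, replication_factor=3)
two_down_full = available_partitions(three_replicas, ["stream-1", "stream-2"])
```

In this model, two replicas are enough for every partition to survive the loss of one node, but losing two nodes makes some partitions unavailable. With the replication setting equal to the number of nodes, all partitions remain available as long as one node is still running.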