Stream service
You can now use a Stream service to publish and subscribe to streams of data records, store streams of records, and process data in real time. The Stream service is built on the Apache Kafka platform. With the Stream service, you can build real-time streaming data pipelines to move data between systems and applications. You can also build applications that transform and react to streams of data.
The Stream service ingests, routes, and delivers high volumes of low-latency data such as web clicks, transactions, sensor data, work objects, data pages, and customer interaction history. As a resiliency mechanism for operations, the same data is replicated to other nodes in the cluster. Distribution and replication of the stream data records ensure the scalability and fault tolerance of the Stream service. The service runs as a cluster on one or more servers. You can spread the servers among multiple data centers to minimize planned or unplanned downtime.
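As a rough illustration of how distribution and replication work together, the following Python sketch assigns record keys to partitions and places copies of each partition on more than one node. It is a simplified model, not the Stream service or Apache Kafka implementation: the node names, the partition count, the replication factor of 2, and the CRC32-based hashing are all assumptions chosen for the example, and Kafka's own partitioning and replica assignment are more sophisticated.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition using a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def replica_nodes(partition: int, nodes: list[str], replication_factor: int) -> list[str]:
    """Pick consecutive nodes, round-robin, to hold copies of one partition."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and the number of nodes")
    return [nodes[(partition + i) % len(nodes)] for i in range(replication_factor)]


# Hypothetical three-node cluster with six partitions, two copies of each.
nodes = ["stream-node-1", "stream-node-2", "stream-node-3"]
placement = {p: replica_nodes(p, nodes, 2) for p in range(6)}

for partition, replicas in placement.items():
    print(f"partition {partition}: {replicas}")
```

Because every partition in this sketch lives on two different nodes, losing any single node still leaves one copy of each partition available, which is the fault-tolerance property described above.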
The Stream service enables the asynchronous flow of data between processes in Pega Platform™. The Stream service is a multi-node component that is based on Apache Kafka. You can use this service, for example, to pass correspondence data to delivery channels in Pega Customer Decision Hub, process customer responses for the Adaptive Decision Manager (ADM), or initiate background processes in any Pega application.
To keep the Stream service running without errors, regularly monitor the status of your Stream nodes and partitions, along with disk space, CPU usage, and database availability.
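One of these checks, disk space, can be sketched with the Python standard library. This is a minimal example that assumes the script runs on the Stream node itself. The 80% threshold and the path being checked are hypothetical values for illustration, not Pega recommendations.

```python
import shutil

# Hypothetical alert threshold: flag the volume when more than 80% is used.
MAX_USED_FRACTION = 0.80


def disk_used_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the volume that holds `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def disk_status(path: str, max_used_fraction: float = MAX_USED_FRACTION) -> str:
    """Classify the volume as 'ok' or 'low-space' against the threshold."""
    return "low-space" if disk_used_fraction(path) > max_used_fraction else "ok"


# "." stands in for the directory where the node stores its stream data.
print(disk_status("."))
```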
You can do a simple, manual status check of the Stream service by going to the Stream landing page: in the header of Dev Studio, click Configure > Decisioning > Infrastructure > Services > Stream. The Stream service relies on the availability of the Pega Platform database. A slow or unavailable database might have a serious impact on the Stream service.
To understand the health of the Stream service, monitor the queries to the following three database tables:
- pr_data_stream_sessions
- pr_data_stream_nodes
- pr_data_stream_node_updates
If queries take more than 1 second, consider tuning your database. In the case of planned or unplanned database unavailability, consider restarting your Stream nodes, especially if they appear unhealthy.
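The 1-second guideline can be checked with a small timing script. The sketch below is written in Python against the standard DB-API and uses an in-memory SQLite database with empty stand-in tables so that it runs anywhere. In practice you would pass a connection from the driver for your Pega Platform database and might need to schema-qualify the table names. The `SELECT COUNT(*)` probe is an assumption made for the example. It is not the query that the Stream service itself issues, so treat the timings as a rough indicator only.

```python
import sqlite3
import time

STREAM_TABLES = (
    "pr_data_stream_sessions",
    "pr_data_stream_nodes",
    "pr_data_stream_node_updates",
)
SLOW_QUERY_SECONDS = 1.0  # threshold from the guidance above


def time_stream_queries(conn, tables=STREAM_TABLES, threshold=SLOW_QUERY_SECONDS):
    """Time a simple probe query per table; return {table: (status, seconds)}."""
    results = {}
    cursor = conn.cursor()
    for table in tables:
        start = time.perf_counter()
        try:
            # Table names come from a fixed tuple, never from user input.
            cursor.execute(f"SELECT COUNT(*) FROM {table}")
            cursor.fetchone()
            elapsed = time.perf_counter() - start
            status = "slow" if elapsed > threshold else "ok"
        except Exception:
            # Driver-specific errors (missing table, lost connection, ...).
            elapsed = time.perf_counter() - start
            status = "unavailable"
        results[table] = (status, elapsed)
    return results


# Demo only: empty stand-in tables in an in-memory SQLite database.
demo_conn = sqlite3.connect(":memory:")
for name in STREAM_TABLES:
    demo_conn.execute(f"CREATE TABLE {name} (id INTEGER)")

report = time_stream_queries(demo_conn)
for table, (status, seconds) in report.items():
    print(f"{table}: {status} ({seconds:.4f}s)")
```

A table reported as slow suggests that database tuning is worth investigating, and a table reported as unavailable points to a database outage of the kind after which a Stream node restart might be needed.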