Stream service

You can use a Stream service to publish and subscribe to streams of data records, store streams of records, and process data in real time. The Stream service is a multi-node component that is based on Apache Kafka. With the Stream service, you can build real-time streaming data pipelines to move data between systems and applications. You can also build applications that transform and react to streams of data.

The Stream service ingests, routes, and delivers high volumes of low-latency data such as web clicks, transactions, sensor data, work objects, data pages, and customer interaction histories. As a resiliency mechanism for operations, the same data is replicated to other nodes in the cluster. Distribution and replication of the stream data records ensure the scalability and fault tolerance of the Stream service. The service runs as a cluster on one or more servers. You can spread the servers among multiple data centers to minimize planned or unplanned downtime.

The Stream service enables the asynchronous flow of data between processes in Pega Platform™. You can use this service, for example, to pass correspondence data to delivery channels in Pega Customer Decision Hub™, process customer responses for the Adaptive Decision Manager (ADM), or initiate background processes in any Pega application.

To keep the Stream service running without errors, regularly monitor the status of your Stream nodes, partitions, disk space, CPU usage, and database availability.

Note: Use of internal Kafka deployments is deprecated. On-premises systems that have been updated from earlier versions of Pega Platform can continue to use Kafka in embedded mode. However, to ensure future compatibility, do not create any new environments that use embedded Kafka. When you configure the Stream service in a new environment, use external Kafka. For more information about how to configure external Kafka, see Configuring External Kafka as a Stream service. If the Stream service runs in external mode, the Stream landing page does not show any node details. Instead, the page shows only a NORMAL status message, which means that your installation of Pega Platform can connect to external Kafka and use it for the Stream service.

You can perform a simple, manual status check of the Stream service on the Stream landing page (in the header of Dev Studio, click Configure > Decisioning > Infrastructure > Services > Stream). The Stream service relies on the availability of the Pega Platform database. A slow or unavailable database might have a serious impact on the Stream service.

To understand the health of the Stream service, monitor the queries to the following three database tables:

pr_data_stream_sessions
pr_data_stream_nodes
pr_data_stream_node_updates

If queries to these tables take longer than 1 second, consider tuning your database. After planned or unplanned database unavailability, consider restarting your Stream nodes, especially if they appear unhealthy. By default, the Stream service keeps two replicas of each record. If you increase the number of Stream nodes from two to three or four, ensure that you change the data replication setting to match the number of Stream nodes.
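
The one-second guideline for queries against the three tables listed above can be checked with a small timing probe. The following Python sketch is illustrative only: the table names and the one-second threshold come from this topic, but the COUNT(*) probe query, the function names, and the in-memory SQLite stand-in are assumptions made for the example. In a real environment you would pass a connection to the Pega Platform database and choose a query that reflects your actual workload.

```python
import sqlite3
import time

# Table names are taken from this topic; everything else is hypothetical.
STREAM_TABLES = (
    "pr_data_stream_sessions",
    "pr_data_stream_nodes",
    "pr_data_stream_node_updates",
)
SLOW_QUERY_SECONDS = 1.0  # guideline stated in this topic


def time_table_queries(conn, tables=STREAM_TABLES):
    """Run a simple COUNT(*) probe on each table and return {table: seconds}."""
    timings = {}
    for table in tables:
        start = time.perf_counter()
        cursor = conn.cursor()
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        cursor.fetchone()
        timings[table] = time.perf_counter() - start
    return timings


def slow_tables(timings, threshold=SLOW_QUERY_SECONDS):
    """Return the sorted names of tables whose probe exceeded the threshold."""
    return sorted(name for name, seconds in timings.items() if seconds > threshold)


def demo_connection():
    """Build an in-memory SQLite stand-in with empty placeholder tables.

    The single-column schema is an assumption; it is not the real table layout.
    """
    conn = sqlite3.connect(":memory:")
    for table in STREAM_TABLES:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
    return conn


if __name__ == "__main__":
    timings = time_table_queries(demo_connection())
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.4f} s")
    print("Tables to investigate:", slow_tables(timings))
```

Any table that slow_tables returns is a candidate for database tuning. The probe accepts any Python DB-API connection, so the same functions might work with other database drivers, provided that the table names resolve in the connected schema.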

The Stream service replicates every record across a configurable number of servers. This replication allows automatic failover to these replicas when a server in the cluster fails, so messages remain available in the presence of failures.
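
To illustrate why replication keeps records available after a server failure, the following Python sketch models a toy cluster. It is a simplified illustration and not how the Stream service or Kafka assigns replicas: the round-robin placement, the node names, and both function names are assumptions made for the example.

```python
def place_replicas(record_id, nodes, replication_factor):
    """Pick replication_factor distinct nodes for a record (toy round-robin placement)."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication factor must be between 1 and the node count")
    start = record_id % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def is_available(replica_nodes, failed_nodes):
    """A record stays readable while at least one of its replicas is on a healthy node."""
    return any(node not in failed_nodes for node in replica_nodes)


if __name__ == "__main__":
    nodes = ["node-1", "node-2", "node-3"]  # hypothetical node names
    replicas = place_replicas(0, nodes, replication_factor=2)
    print(replicas)
    print(is_available(replicas, failed_nodes={"node-1"}))
    print(is_available(replicas, failed_nodes={"node-1", "node-2"}))
```

With two replicas, the record survives the loss of one node but not the loss of both nodes that hold it. The check in place_replicas also reflects a general constraint: a record cannot have more replicas than there are nodes to hold them.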
