Data Collections in Pega GenAI Knowledge Buddy

Use data collections to group your data sources into separate collections, which provides greater flexibility and control over data organization.

Data collection

In previous versions of Pega GenAI Knowledge Buddy™, all data was loaded into a single database table, with each piece of content identified by a unique object ID. This setup prevented the same content from existing more than once in the system, even if it came from different data sources.

Data collections solve this limitation by letting users create separate collections that act like individual tables, each with its own set of data sources and content. As a result, the same content can exist in multiple collections without conflict because each collection has its own unique identity.
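
The difference between the two storage models can be illustrated with a minimal Python sketch. This is a conceptual model only: the class names, the object ID `KB-001`, and the collection names `Production` and `Test` are hypothetical, and none of this is a Pega API. The first class rejects a second copy of the same object ID, while the second class keeps one table per collection so that the same ID can appear in several collections.

```python
# Conceptual model only: these classes are hypothetical and are not Pega APIs.

class SingleTableStore:
    """Earlier model: one table, content keyed by object ID alone."""

    def __init__(self):
        self._rows = {}

    def add(self, object_id, content):
        if object_id in self._rows:
            raise ValueError(f"object ID {object_id!r} already exists")
        self._rows[object_id] = content


class CollectionStore:
    """Collection model: each collection behaves like its own table."""

    def __init__(self):
        self._collections = {}

    def add(self, collection, object_id, content):
        rows = self._collections.setdefault(collection, {})
        if object_id in rows:
            raise ValueError(f"{object_id!r} already exists in {collection!r}")
        rows[object_id] = content

    def count(self, object_id):
        """Return how many collections hold this object ID."""
        return sum(object_id in rows for rows in self._collections.values())


# The single table refuses the same content twice.
single = SingleTableStore()
single.add("KB-001", "How to reset a password")
try:
    single.add("KB-001", "How to reset a password")
    duplicate_allowed = True
except ValueError:
    duplicate_allowed = False

# With collections, the same content can live in two places without conflict.
collections = CollectionStore()
collections.add("Production", "KB-001", "How to reset a password")
collections.add("Test", "KB-001", "How to reset a password")
```

Uniqueness still applies inside a single collection in this sketch; only the scope of the check changes from the whole system to one collection.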

Note: A single semantic query cannot span multiple collections, which helps ensure data isolation when you need it.
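
As a rough illustration of this isolation, the following Python sketch scopes every query to exactly one collection. The `query` function, the sample chunks, and the word-overlap scoring are assumptions made for the example; word overlap is only a stand-in for semantic similarity and is not how Knowledge Buddy ranks content.

```python
# Hypothetical sketch: word overlap stands in for semantic similarity.

COLLECTIONS = {
    "CustomerService": [
        "how to reset a customer password",
        "refund policy for returns",
    ],
    "Payroll": [
        "salary bands for engineers",
        "how to reset payroll password",
    ],
}


def query(collection, question, top_k=1):
    """Search exactly one collection; no parameter accepts several."""
    if collection not in COLLECTIONS:
        raise KeyError(f"unknown collection: {collection!r}")
    words = set(question.lower().split())
    scored = [
        (len(words & set(chunk.split())), chunk)
        for chunk in COLLECTIONS[collection]
    ]
    # Highest overlap first; chunks with no overlap are dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:top_k] if score > 0]


results = query("CustomerService", "reset password")
```

Because the function takes a single collection name, payroll content cannot appear in a customer service answer, even when the wording of the question matches it.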

The following figure shows the Data collection landing page in Knowledge Buddy:

[Figure: Data collection in the Buddy portal]

Key benefits

A data collection provides the following benefits for Knowledge Buddy:

  • Test different chunking algorithms or chunk sizes: Create separate collections to test different chunking methods or chunk sizes for content without having to overwrite existing data.
  • Separate production and test data: Keep your production and test data completely separate to prevent any accidental mixing of data.
  • Isolate sensitive data: Collections provide a clear boundary for isolating sensitive data, such as payroll information, from other data sources that the system should not combine in the same semantic query.
  • Improve performance: Segmenting data into smaller collections can make semantic queries run faster because each query has less data to search through.
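
The first benefit can be sketched in Python as follows. The chunker below is a simple fixed-size word splitter chosen for illustration; the source does not describe the chunking algorithms that Knowledge Buddy offers, and the collection names `Chunks-4` and `Chunks-6` are hypothetical.

```python
def chunk_by_words(text, chunk_size):
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


article = (
    "Data collections group data sources into separate collections "
    "for flexibility and control"
)

# One hypothetical collection per chunk size; neither overwrites the other.
collections = {
    "Chunks-4": chunk_by_words(article, 4),
    "Chunks-6": chunk_by_words(article, 6),
}
```

Both chunked versions of the same article exist side by side, so you can compare answer quality across the two collections before deciding which settings to keep.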

Setting up a Data collection

The process of setting up a data collection involves the following steps:

  1. Create a new collection: Specify details such as the name, description, chunking settings, and access permissions for the new collection.
    The following figure shows an example of a CustomerService data collection:
    [Figure: The create data collection window]
  2. Create data sources and assign them to collections: Create data sources (for example, knowledge articles and documents) and assign them to the desired collection category, as shown in the following figure:
    [Figure: Create data source]
    Note: You can assign a collection only when you create the data source, and the system does not allow you to change the collection later. If you assign a data source to the wrong collection, you need to create a new data source with the correct collection.
  3. Associate Knowledge Buddies with collections: During the Knowledge Buddy setup, select which collections and data sources you want the Buddy to use for its semantic queries.
    [Figure: Collection and data sources when creating a Buddy]
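
The three setup steps can be modeled with a short Python sketch. You configure all of this in the Knowledge Buddy portal rather than in code, so every class, field, and value below (including `chunk_size`, `SupportBuddy`, and the choice of one collection per Buddy) is a hypothetical stand-in used only to show how the pieces relate. The frozen data class mirrors the note that a data source's collection cannot be changed after creation.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass
class Collection:
    name: str
    description: str = ""
    chunk_size: int = 500  # assumed setting name and value, for illustration


@dataclass(frozen=True)
class DataSource:
    """Frozen: the collection is fixed when the data source is created."""
    name: str
    collection: str


@dataclass
class Buddy:
    name: str
    collection: str  # this sketch assumes one collection per Buddy
    data_sources: list = field(default_factory=list)


# Step 1: create the collection.
customer_service = Collection("CustomerService", "Support articles")

# Step 2: create a data source and assign it to the collection.
articles = DataSource("KnowledgeArticles", customer_service.name)

# Reassigning the collection is rejected; a new data source is needed instead.
try:
    articles.collection = "Payroll"
    reassigned = True
except FrozenInstanceError:
    reassigned = False

# Step 3: associate a Buddy with the collection and its data sources.
buddy = Buddy("SupportBuddy", customer_service.name, [articles])
```

In this model, correcting a wrong assignment means constructing a second `DataSource` with the right collection name, which matches the guidance in the note under step 2.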