Data Collections in Pega GenAI Knowledge Buddy
Use data collections to group your data sources into separate collections, which provides greater flexibility and control over data organization.
Data collection
In previous versions of Pega GenAI Knowledge Buddy™, all data was loaded into a single database table, with each piece of content identified by a unique object ID. This setup prevented the same content from existing more than once in the system, even if it came from different data sources.
Data collections remove this limitation by letting users create separate collections that act like individual tables, each containing its own set of data sources and content. As a result, the same content can exist in multiple collections without conflict because each collection has its own unique identity.
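The difference between the two models can be illustrated with a small conceptual sketch. This is not Pega's storage layer or API: the class names `SingleTableStore` and `CollectionStore`, the object ID `KB-001`, and the collection names are hypothetical and only show why scoping object IDs to a collection lets the same content exist more than once.

```python
# Conceptual sketch only; this does not reflect how Knowledge Buddy stores data.

class SingleTableStore:
    """Earlier model: one table in which the object ID alone must be unique."""

    def __init__(self):
        self._rows = {}

    def add(self, object_id, content):
        if object_id in self._rows:
            raise ValueError(f"object ID {object_id!r} already exists")
        self._rows[object_id] = content


class CollectionStore:
    """Collection model: each collection behaves like its own table."""

    def __init__(self):
        self._collections = {}

    def add(self, collection, object_id, content):
        # Uniqueness is checked only inside the target collection.
        table = self._collections.setdefault(collection, {})
        if object_id in table:
            raise ValueError(f"{object_id!r} already exists in {collection!r}")
        table[object_id] = content

    def count(self, collection):
        return len(self._collections.get(collection, {}))


# The single table rejects a second copy of the same content.
single = SingleTableStore()
single.add("KB-001", "How to reset a password")
try:
    single.add("KB-001", "How to reset a password")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# With collections, the same object ID can live in two collections.
store = CollectionStore()
store.add("Production", "KB-001", "How to reset a password")
store.add("Test", "KB-001", "How to reset a password")
```

In this sketch, the second `add` call on the single table fails, while the collection-based store accepts the same object ID once per collection.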
The following figure shows the Data collection landing page in Knowledge Buddy:
Key benefits
A data collection provides the following benefits for Knowledge Buddy:
- Test different chunking algorithms or chunk sizes: Create separate collections to test different chunking methods or chunk sizes for content without having to overwrite existing data.
- Separate production and test data: Keep your production and test data completely separate to prevent any accidental mixing of data.
- Isolate sensitive data: Collections provide a clear boundary for isolating sensitive data, such as payroll information, from other data sources that the system should not combine in the same semantic query.
- Improve performance: By segmenting data into smaller collections, semantic queries can run faster because there is less data to search through.
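The isolation and performance benefits above can be sketched with a toy semantic search that is scoped to selected collections. Everything here is an assumption made for illustration: the `INDEX` structure, the three-dimensional embeddings, the chunk texts, and the `search` function are hypothetical and are not part of Knowledge Buddy.

```python
import math

# Hypothetical toy index: collection name -> list of (chunk text, embedding).
INDEX = {
    "CustomerService": [
        ("How to reset a password", [0.9, 0.1, 0.0]),
        ("How to update a billing address", [0.2, 0.9, 0.1]),
    ],
    "Payroll": [
        ("Salary bands", [0.1, 0.2, 0.9]),
    ],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, collections, top_k=1):
    """Rank only the chunks that belong to the requested collections.

    Returns the top_k hits as (score, text, collection) tuples and the
    number of chunks that were compared against the query.
    """
    candidates = [
        (cosine(query_vec, vec), text, name)
        for name in collections
        for text, vec in INDEX.get(name, [])
    ]
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k], len(candidates)


# A query whose embedding is closest to the payroll chunk.
query = [0.1, 0.1, 0.95]

# Scoped to CustomerService: fewer chunks are compared, and payroll
# content cannot appear in the results.
scoped_hits, scoped_scanned = search(query, ["CustomerService"])

# Searching every collection compares more chunks and surfaces payroll data.
all_hits, all_scanned = search(query, list(INDEX))
```

The scoped query compares two chunks instead of three and never touches the Payroll collection, which mirrors the boundary that a collection provides.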
Setting up a Data collection
The process of setting up a data collection involves the following steps:
- Create a new collection: Specify details such as the name, description, chunking settings, and access permissions for the new collection.
The following figure shows an example of a CustomerService data collection:
- Create data sources and assign them to collections: Create data sources (for example, knowledge articles and documents) and assign them to the desired collection category, as shown in the following figure:
Note: You can assign a collection only when you create the data source, and the system does not allow you to change the collection later. If you assign a data source to the wrong collection, you need to create a new data source with the correct data collection.
- Associate Knowledge Buddies with collections: During the Knowledge Buddy setup, select which collections and data sources you want the Buddy to use for its semantic queries.
Note: For more information, see Data collections and Creating a Data collection for Knowledge Buddy.
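The three setup steps, including the rule that a data source's collection cannot change after creation, can be modeled with a short sketch. This is a conceptual illustration, not Knowledge Buddy configuration: the classes `Collection`, `DataSource`, and `Buddy`, the `chunk_size` field, and the names `SupportArticles` and `SupportBuddy` are hypothetical.

```python
from dataclasses import dataclass, field, FrozenInstanceError


@dataclass
class Collection:
    name: str
    description: str = ""
    chunk_size: int = 500  # hypothetical stand-in for chunking settings


@dataclass(frozen=True)
class DataSource:
    # frozen=True models the rule that the collection is fixed at creation.
    name: str
    collection: str


@dataclass
class Buddy:
    name: str
    data_sources: list = field(default_factory=list)


# Step 1: create a new collection.
customer_service = Collection("CustomerService", "Support articles", chunk_size=400)

# Step 2: create a data source, here assigned to the wrong collection by mistake.
articles = DataSource("SupportArticles", collection="Payroll")

# Changing the collection afterwards is not allowed.
try:
    articles.collection = customer_service.name
    reassigned = True
except FrozenInstanceError:
    reassigned = False

# The fix is a new data source created with the correct collection.
articles_fixed = DataSource("SupportArticles", collection=customer_service.name)

# Step 3: associate the Buddy with the data sources it should query.
buddy = Buddy("SupportBuddy", data_sources=[articles_fixed])
```

Using a frozen dataclass is one simple way to express the create-only assignment; the actual enforcement in the product happens in the data source setup.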