Embedding attributes for Pega GenAI Knowledge Buddy
Knowledge Buddy enables you to embed specific attributes in content chunks. These attributes describe the semantic characteristics of each individual chunk and allow Knowledge Buddy to interpret and retrieve chunks in response to a user query, even when the query itself does not contain specific keywords.
For example, the content "The latest version released in this region is 25.1" has the attributes region: ["US","EU"] and product: ["Buddy","KM"] embedded in it.
When a user asks "What is the current Knowledge Buddy version in Europe?", Knowledge Buddy can use the content to build an answer to the question, even though the chunk itself does not contain any keywords that indicate a product or a region.
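The example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Knowledge Buddy is implemented: the Chunk class, the matches function, and the mapping of the question to region "EU" and product "Buddy" are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical chunk: the text is stored separately from its embedded attributes."""
    text: str
    attributes: dict[str, list[str]] = field(default_factory=dict)


def matches(chunk: Chunk, wanted: dict[str, str]) -> bool:
    """Return True if the chunk carries every wanted attribute value."""
    return all(
        value in chunk.attributes.get(name, [])
        for name, value in wanted.items()
    )


chunk = Chunk(
    text="The latest version released in this region is 25.1",
    attributes={"region": ["US", "EU"], "product": ["Buddy", "KM"]},
)

# Assumed interpretation of "What is the current Knowledge Buddy version in Europe?"
query_attributes = {"region": "EU", "product": "Buddy"}

# The attributes match the question...
found = matches(chunk, query_attributes)

# ...even though the chunk text names neither the product nor the region.
keyword_hit = "Europe" in chunk.text or "Buddy" in chunk.text
```

Here, found is true while keyword_hit is false, which mirrors the point of the example: the chunk is a candidate for the answer only because of its embedded attributes.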
Embedding attributes improves content searchability, allowing Knowledge Buddy to provide more precise, higher-quality responses.
Transcript
In this video, you learn how Pega GenAI Knowledge Buddy™ embeds attributes in content, how to configure embedding attributes for a data collection in Knowledge Buddy, and how embedded attributes enhance chunk-level metadata.
To begin, in the Pega GenAI Knowledge Buddy portal, navigate to Data Collection, then click the name of the collection you want to configure to open its details page.
The Advanced Settings section is expanded to show the content processing options for the collection. Several drop-down lists let you configure chunking and attribution for this collection. Focus on the Embedding attributes list.
The Embedding attributes list displays the attributes that can be embedded into your content chunks.
After Knowledge Buddy ingests the test content, return to the Advanced Settings of your collection. In the Embedding attributes list, you now see available values. These values represent all the attributes that exist within your collection based on the ingested content.
From the Embedding attributes list, select the attributes that you want to embed. Selecting at least the Title and Abstract attributes provides valuable context to every chunk and improves the accuracy of search results and Buddy responses.
For example, embedding the Title attribute can replace the Title chunking method. Title chunking prepends the title text to every chunk, which makes chunks appear different from the original content. Embedding hides the attribute information within the chunk metadata, so the chunk text remains unchanged while still inheriting the attribute values.
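The difference between the two approaches can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the placeholder chunk texts are hypothetical and do not reflect the product's internal format; the sketch only contrasts changing the chunk text with storing the title as metadata.

```python
def title_chunking(title: str, chunk_texts: list[str]) -> list[str]:
    """Prepend the title to every chunk, which changes the chunk text."""
    return [f"{title}\n{text}" for text in chunk_texts]


def embed_title(title: str, chunk_texts: list[str]) -> list[dict]:
    """Keep the chunk text as-is and carry the title as chunk metadata."""
    return [{"text": text, "attributes": {"title": title}} for text in chunk_texts]


# Placeholder chunk texts, invented for this sketch.
original = ["First placeholder chunk.", "Second placeholder chunk."]

prepended = title_chunking("What's new", original)
embedded = embed_title("What's new", original)

# Title chunking alters the text; embedding leaves it untouched.
text_changed_by_chunking = prepended[0] != original[0]
text_kept_by_embedding = embedded[0]["text"] == original[0]
```

In both cases every chunk is associated with the title, but only the embedding variant keeps the chunk text identical to the source content.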
Next, consider an example to understand how embedding works. In the navigation pane of the Pega GenAI Knowledge Buddy portal, click Content. Open the article What's new.
Note that for the purpose of this demonstration, the content of the article has been simplified to fit into a single chunk of content.
Focus on the Title, Abstract, and Content fields. If auto-attribution is enabled, any keywords detected in these fields can automatically generate attribute values. For example, if the word "Buddy" appears in the content, Knowledge Buddy auto-attributes the document with product:Knowledge Buddy.
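Conceptually, keyword-based auto-attribution can be pictured as a lookup from detected keywords to attribute values. The sketch below is an assumption-laden illustration, not the product's detection logic: the KEYWORD_RULES table, the auto_attribute function, and the placeholder field texts are invented here, and the real feature may detect keywords differently.

```python
import re

# Hypothetical rule table: keyword -> (attribute name, attribute value).
KEYWORD_RULES = {"buddy": ("product", "Knowledge Buddy")}


def auto_attribute(fields: dict[str, str],
                   rules: dict[str, tuple[str, str]]) -> dict[str, set[str]]:
    """Scan each field for whole-word keyword matches and collect attribute values."""
    attributes: dict[str, set[str]] = {}
    for text in fields.values():
        for keyword, (name, value) in rules.items():
            pattern = rf"\b{re.escape(keyword)}\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                attributes.setdefault(name, set()).add(value)
    return attributes


# Placeholder field texts, invented for this sketch.
document_fields = {
    "title": "What's new",
    "abstract": "Placeholder abstract.",
    "content": "Placeholder content that mentions Buddy.",
}

generated = auto_attribute(document_fields, KEYWORD_RULES)
```

With these placeholder fields, the word "Buddy" in the content field produces the attribute product with the value Knowledge Buddy for the whole document.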
You selected Title and Abstract in the Embedding attributes list for the collection that contains this content. As a result, if any attributes are generated based on keywords from these fields, Knowledge Buddy embeds those attributes into each content chunk when the content is successfully ingested.
In the Global attributes tab, in the Embedded column, you see that Knowledge Buddy embedded the Title and Abstract attributes, as well as the Product and Version attributes.
Now click Chunks to view the individual chunks created from this document, as well as how the embedded attributes are displayed for each chunk. In this example, Knowledge Buddy created one chunk from the abstract and one chunk from the article content, both of which have the attributes embedded.
Even if a specific chunk does not contain the keyword that triggered the attribute, that chunk still inherits the embedded attribute value. In this example, one of the embedded attributes is product:Knowledge Buddy, and every chunk from that document carries this attribute.
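The inheritance described here can be sketched as copying the document-level attributes onto every chunk. The split_and_embed function, the dictionary layout, and the placeholder texts are hypothetical and serve only to show why an attribute lookup finds chunks that a plain keyword lookup misses.

```python
def split_and_embed(parts: list[str],
                    document_attributes: dict[str, str]) -> list[dict]:
    """Create one chunk per content part; each chunk gets a copy of the document attributes."""
    return [
        {"text": part, "attributes": dict(document_attributes)}
        for part in parts
    ]


# Placeholder content parts, invented for this sketch.
parts = [
    "Placeholder part that mentions Buddy.",
    "Placeholder part with no product keyword.",
]

chunks = split_and_embed(parts, {"product": "Knowledge Buddy"})

# A keyword lookup finds only the chunk whose text contains the word.
keyword_hits = [c for c in chunks if "buddy" in c["text"].lower()]

# An attribute lookup finds every chunk from the document.
attribute_hits = [
    c for c in chunks
    if c["attributes"].get("product") == "Knowledge Buddy"
]
```

In this sketch, keyword_hits contains one chunk while attribute_hits contains both, because the second chunk inherits the product attribute without mentioning the keyword in its text.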
This is the key benefit of embedding attributes. When a user asks a question that includes an embedded attribute value, the system can retrieve all relevant chunks, even those that do not explicitly mention the attribute value in their text. This improves search precision and ensures that Buddy can find all relevant information when building an answer.
When you ask Buddy a question, you can analyze the response to see which chunks Knowledge Buddy used to build that answer. Notice that Knowledge Buddy used the chunk created from the article What's new, because of the embedded attributes.
To further demonstrate the feature, return to the article, expand it, and split it into content parts, each of which becomes a single chunk. Then ask Buddy the same question and analyze the response once again. Notice that each individual chunk has the embedded attributes, even though some of the chunks do not contain any keywords connected to the attributes.
To summarize, embedding attributes enhances the metadata of your content chunks by associating attribute values at the chunk level, without modifying the actual text. By configuring embedding attributes at the collection level and selecting key attributes such as Title and Abstract, you ensure that every chunk carries meaningful context, which improves search accuracy and the quality of Buddy responses.
You have reached the end of this video.