Auto-attribution and auto-filtering for Pega GenAI Knowledge Buddy

Auto-attribution and auto-filtering are intelligent, interconnected features in Pega GenAI Knowledge Buddy™ that enhance content organization and search precision. Auto-attribution automatically applies metadata tags to content based on intelligent analysis, while auto-filtering dynamically applies relevant search filters based on question context.

By using natural language processing (NLP) models configured in Prediction Studio, the system can intelligently detect relevant attributes from content and automatically apply appropriate filters during search operations, which creates a seamless experience that adapts to user needs without requiring explicit configuration for each interaction.

Note: For more information about Prediction Studio, see Prediction Studio.

Benefits

Auto-attribution and auto-filtering include the following benefits:

Enhanced user experience

Auto-filtering eliminates the need for users to select appropriate filter attributes, which streamlines the search process and reduces the likelihood of missing relevant content because of forgotten filters. Users can focus on formulating their questions naturally without worrying about the technical aspects of search refinement, which makes Knowledge Buddy more accessible and user-friendly.

Improved content organization

The automatic attribution system categorizes content without relying on manual processes that might be inconsistent or incomplete. The system ensures that all content receives appropriate metadata tags and categories through the three-tier attribution system (manual, auto, and system attributes), which creates a well-organized knowledge base that supports effective information retrieval and management.

Search precision and performance

The system applies relevant filters based on question content, which narrows search results to more targeted and relevant information and improves both response time and result quality. As a result, users have a refined search experience that helps them quickly find the content they need without sifting through numerous irrelevant results. The intelligent filtering ensures that search results are contextually appropriate and aligned with the users' information needs.

Auto-attribution

Auto-attribution is an intelligent content categorization system that automatically applies metadata attributes to knowledge content at both the document and individual chunk levels. This feature uses sophisticated NLP models configured in Prediction Studio to analyze content text to identify and assign relevant attributes based on predefined criteria and matching patterns.

Knowledge Buddy supports multiple levels of attribution granularity:

Content-level attribution applies attributes to entire documents. All chunks in that content inherit the same categorical information. For example, if a document is attributed with "country: United States," every chunk derived from that document carries this attribute to provide consistent categorization across all content segments.
Chunk-level attribution applies attributes to individual content segments. This level provides more precision by assigning specific attributes that are only relevant to particular sections of a document.

This dual-level approach ensures both broad categorization and granular specificity.
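The inheritance described above can be pictured with a minimal Python sketch. The `Chunk` class and the `effective_attributes` function are hypothetical names chosen for illustration and are not part of Knowledge Buddy; the sketch only assumes that a chunk's final attributes are the document's attributes plus any attributes detected for that chunk alone.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Chunk:
    text: str
    # Attributes detected for this chunk only (chunk-level attribution).
    attributes: Dict[str, Set[str]] = field(default_factory=dict)


def effective_attributes(content_attributes, chunk):
    """Combine inherited content-level attributes with the chunk's own (illustrative)."""
    merged = {name: set(values) for name, values in content_attributes.items()}
    for name, values in chunk.attributes.items():
        merged.setdefault(name, set()).update(values)
    return merged


# Content-level attribute from the example above; the chunk texts are made up.
content_attributes = {"country": {"United States"}}
chunks = [
    Chunk("Overview of regional policies."),
    Chunk("Loan products and interest rates.", {"theme": {"finance"}}),
]
resolved = [effective_attributes(content_attributes, chunk) for chunk in chunks]
```

In this sketch, both chunks carry "country: United States," while only the second chunk also carries "theme: finance."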

The attribution process uses hierarchical NLP models. The primary level becomes the attribute name, such as "location" or "theme." The secondary levels define the specific attribute values, such as "Canada" or "finance." These models support sophisticated matching criteria, including words that must match, should match, and should not match, which gives users precise control over how content gets categorized. Based on the configured model parameters, the system can detect entities like country names, product categories, business themes, and other domain-specific classifications.

Note: Knowledge Buddy includes a model for all regions (countries).

Auto-attribution generates three distinct types of attributes that provide comprehensive content categorization:

Manual attributes represent information explicitly provided by users or systems, such as tags and categories passed through APIs or user interfaces.
Auto-attributes are generated by the NLP analyzers based on content analysis, automatically detecting relevant classifications without human intervention.
System attributes are automatically created by Knowledge Buddy, including content identifiers, content keys, and other technical metadata necessary for system operations.

This three-tier approach ensures that content receives user-defined categorization and intelligent system-generated attributes.
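One way to picture the three attribute types is as a single record with a source marker. The following Python sketch is illustrative only: `AttributeSource`, `Attribute`, `group_by_source`, and the sample names and values (such as "content_id" and "C-1001") are assumptions, not Knowledge Buddy data structures.

```python
from dataclasses import dataclass
from enum import Enum


class AttributeSource(Enum):
    MANUAL = "manual"   # passed explicitly through an API or user interface
    AUTO = "auto"       # detected by an NLP analyzer
    SYSTEM = "system"   # created by the system for its own operations


@dataclass(frozen=True)
class Attribute:
    name: str
    value: str
    source: AttributeSource


def group_by_source(attributes):
    """Group attributes by where they came from (hypothetical helper)."""
    grouped = {source: [] for source in AttributeSource}
    for attribute in attributes:
        grouped[attribute.source].append(attribute)
    return grouped


# Sample values are invented for illustration.
attributes = [
    Attribute("category", "onboarding", AttributeSource.MANUAL),
    Attribute("country", "Italy", AttributeSource.AUTO),
    Attribute("content_id", "C-1001", AttributeSource.SYSTEM),
]
grouped = group_by_source(attributes)
```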

The system also supports embedded attributes that it derives from chunking methods. When Knowledge Buddy processes content by using specific chunking strategies (for example, title-based chunking), the system can embed relevant information as attributes with an embedded column that identifies the chunking method used to generate the attribute.
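As a rough illustration of an embedded attribute, the following Python sketch splits text at title lines and stores each title as an attribute whose "embedded" field names the chunking method. The `chunk_by_title` function, the "# " title marker, and the dictionary layout are assumptions made for this example; the source does not describe how Knowledge Buddy implements title-based chunking.

```python
def chunk_by_title(text, method="title-based"):
    """Split text at lines that start with '# ' (an assumed title marker)."""
    sections = []  # list of (title, body_lines) pairs
    for line in text.splitlines():
        if line.startswith("# "):
            sections.append((line[2:].strip(), []))
        elif sections:
            # Lines before the first title are ignored in this sketch.
            sections[-1][1].append(line)

    chunks = []
    for title, lines in sections:
        chunks.append({
            "text": "\n".join(lines).strip(),
            # The "embedded" field records which chunking method produced the attribute.
            "attributes": [{"name": "title", "value": title, "embedded": method}],
        })
    return chunks


# Made-up sample document.
document = (
    "# Returns\nItems can be returned within 30 days.\n"
    "# Shipping\nOrders ship in two days."
)
chunks = chunk_by_title(document)
```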

The following figure shows the global attributes page for a content item in the Knowledge Buddy Portal, with the auto-attribute highlighted:

Global attributes table for Italy content with highlighted Country field

Auto-filtering

Auto-filtering is an intelligent search enhancement feature that automatically applies relevant filter attributes to Knowledge Buddy questions and semantic searches without requiring manual user input. The system analyzes the question text using the NLP models configured in Prediction Studio to identify and apply relevant filter attributes.

Auto-filtering operates at two levels:

Ask execution: When users submit questions to Knowledge Buddy, the system automatically detects relevant filters based on the question content and applies them to the search request before processing.
Semantic search execution: The system applies the same intelligent filtering to direct search operations, so that users receive targeted results regardless of how they interact with the knowledge system.

This dual-mode operation ensures consistent filtering behavior across all search interfaces.

Auto-filtering provides transparency through a dedicated Filters section on the Ask Query landing page. This section shows the automatically applied filters and any filters included in the original request. Users can see exactly what automatic filters influenced their search results, which helps them understand how their search was processed and provides context for the results they receive.

The filtering mechanism uses the same NLP models as auto-attribution, which provides consistency between content categorization and search filtering. When a question contains terms that match the configured model criteria, the system automatically applies the corresponding filter attributes to the search request. As a result, content that was automatically attributed to specific categories is automatically filtered on those categories when users ask related questions.

If users pass both manual filter attributes and auto-detected attributes for the same question, the system applies an AND condition. Only content chunks that match both sets of attributes are returned. This approach gives users the flexibility to combine automatic intelligence with manual control over their search parameters while maintaining precision in search results.
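The AND condition can be sketched in a few lines of Python. The `matches` and `search` functions and the chunk layout are hypothetical, and the sketch assumes that every filter pair in each set must be present on a chunk; the source does not say how multiple values within one set are combined.

```python
def matches(chunk_attributes, filters):
    """True when the chunk carries every requested (name, value) pair."""
    return all(
        value in chunk_attributes.get(name, set())
        for name, value in filters
    )


def search(chunks, manual_filters, auto_filters):
    # AND condition: a chunk must satisfy the manual set and the auto-detected set.
    return [
        chunk for chunk in chunks
        if matches(chunk["attributes"], manual_filters)
        and matches(chunk["attributes"], auto_filters)
    ]


# Made-up chunks for illustration.
chunks = [
    {"id": 1, "attributes": {"country": {"Canada"}, "theme": {"finance"}}},
    {"id": 2, "attributes": {"country": {"Canada"}}},
    {"id": 3, "attributes": {"theme": {"finance"}}},
]
manual_filters = [("theme", "finance")]
auto_filters = [("country", "Canada")]
result_ids = [chunk["id"] for chunk in search(chunks, manual_filters, auto_filters)]
```

Only the first chunk satisfies both sets, so it is the only one returned in this sketch.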

For example, when a user asks, "What are the different content types available in Buddy?" the system automatically analyzes the question text. It detects that it contains terms related to the "buddy" product. Without any manual intervention from the user, the system applies an automatic filter of "products: knowledge buddy" to the search request.

The Buddy then provides a targeted response: "From buddy, we can pass text type content, and from knowledge we can pass all these types, and from knowledge loader also, we can pass all these types of documents." The system automatically narrows the search results to include only content chunks with the "knowledge buddy" product attribute, and the user receives information specifically about Buddy features rather than generic content management information.
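The example above can be approximated with a short Python sketch in which one detection function serves both ingestion and question handling, mirroring the shared use of NLP models. The `MODEL` dictionary, the `detect` function, and the sample chunk texts are assumptions for illustration; real detection is performed by the models configured in Prediction Studio, not by this keyword lookup.

```python
import re

# Hypothetical model: primary level -> secondary level -> trigger keywords.
MODEL = {"products": {"knowledge buddy": ["buddy"]}}


def detect(text, model=MODEL):
    """Return the (attribute name, value) pairs whose keywords occur in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return {
        (name, value)
        for name, values in model.items()
        for value, keywords in values.items()
        if any(keyword in words for keyword in keywords)
    }


# Ingestion: attribute each chunk with the same detection logic.
raw_chunks = [
    "Buddy accepts text type content.",
    "General guidance on content management.",
]
indexed = [{"text": text, "attributes": detect(text)} for text in raw_chunks]

# Question time: derive filters from the question, then keep matching chunks.
question = "What are the different content types available in Buddy?"
auto_filters = detect(question)
hits = [chunk["text"] for chunk in indexed if auto_filters <= chunk["attributes"]]
```

Because the question and the first chunk both trigger "products: knowledge buddy," only that chunk remains after filtering.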

Configuration

The setup and configuration of auto-attribution and auto-filtering involves several coordinated components that enable intelligent content categorization and search filtering.

To configure these features, complete the following process:

Create NLP models in Prediction Studio

Prediction Studio configuration forms the foundation of auto-attribution and auto-filtering functionality. Use Prediction Studio to define the NLP models that trigger specific attributes. These models use a hierarchical structure:

The primary level defines the attribute name (for example, "products" or "location").
The secondary levels define the attribute values (for example, "knowledge buddy" or "Canada"). The models support advanced matching criteria, including words that must match, should match, and should not match, which gives you precise control over content detection and categorization.

The NLP models can detect various entities, including country names, product categories, business themes, and other domain-specific classifications relevant to the organization's knowledge base. For example, a country-based attribution model might include a primary level "country" with secondary levels for "United States," "United Kingdom," "India," and other nations. Each country can have multiple keyword variations: the United States might include "USA," "America," and "US" as matching terms, while the United Kingdom could include "UK," "Great Britain," and "England."
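To make the hierarchy and the matching criteria concrete, the following Python sketch models the country example. It is not the Prediction Studio API: `Topic`, `topic_matches`, and `detect` are hypothetical names, the "New England" exclusion is invented to show a "should not match" term, and the sketch assumes that all "must" words are required, at least one "should" word is required when any are defined, and any "should not" word rejects the match.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    value: str                                            # secondary level
    should: List[str] = field(default_factory=list)       # at least one (assumed)
    must: List[str] = field(default_factory=list)         # all required (assumed)
    should_not: List[str] = field(default_factory=list)   # none allowed (assumed)


def _normalize(text):
    # Lowercase, keep alphanumeric words, and pad with spaces for phrase matching.
    return " " + " ".join(re.findall(r"[a-z0-9]+", text.lower())) + " "


def _contains(normalized_text, phrase):
    return _normalize(phrase) in normalized_text


def topic_matches(topic, text):
    if not topic.must and not topic.should:
        return False  # a topic without positive criteria never matches
    normalized = _normalize(text)
    if any(_contains(normalized, phrase) for phrase in topic.should_not):
        return False
    if not all(_contains(normalized, phrase) for phrase in topic.must):
        return False
    return not topic.should or any(_contains(normalized, p) for p in topic.should)


def detect(model, text):
    """Map each primary level (attribute name) to the matching values."""
    found = {}
    for name, topics in model.items():
        values = [topic.value for topic in topics if topic_matches(topic, text)]
        if values:
            found[name] = values
    return found


model = {
    "country": [
        Topic("United States", should=["United States", "USA", "America"]),
        Topic(
            "United Kingdom",
            should=["United Kingdom", "UK", "Great Britain", "England"],
            should_not=["New England"],  # invented exclusion for illustration
        ),
    ]
}
detected = detect(model, "Shipping rules for Great Britain and the USA.")
```

Because this sketch lowercases text before matching, a short keyword such as "US" would also match the pronoun "us," so it is omitted from the sample model.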

For auto-attribution, create separate NLP models for content-level and chunk-level attribution to provide appropriate granularity for different use cases. Content-level models typically focus on broader categorizations that apply to entire documents, while chunk-level models can identify more specific attributes that might only be relevant to particular content segments.

This dual-model approach supports comprehensive document categorization and granular chunk-specific attribution.

The following figure shows part of the Content attributions prediction model configured in Prediction Studio:

Prediction Studio interface showing United Kingdom topic keywords

Apply collection-level settings

Collection-level configuration connects the NLP models and the actual content processing. When you configure a collection, you can select which NLP analyzer models to use for content-level and chunk-level attribution. The system displays a list of available NLP analyzers configured in the Buddy so that you can apply the appropriate models for your specific attribution needs. This configuration determines how the system automatically categorizes content when it is ingested into the knowledge system.

The collection-level configuration includes three distinct attribution settings:

Content attribution applies attributes to entire documents. All chunks in a document inherit the same attributes.
Chunk attribution applies specific attributes to individual content sections. This setting provides more granular control.
Filter attribution enables auto-filtering. This setting defines which attributes the system automatically detects and applies during search operations.
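The three settings can be thought of as three optional model selections on a collection. The following Python sketch is a hypothetical representation: the `CollectionAttributionSettings` class, its field names, and the "Chunk attributions" model name are assumptions and do not reflect how Knowledge Buddy stores this configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CollectionAttributionSettings:
    # Each field holds the name of an NLP analyzer model, or None when unused.
    content_attribution: Optional[str] = None
    chunk_attribution: Optional[str] = None
    filter_attribution: Optional[str] = None

    def enabled_stages(self):
        """List which attribution stages have a model selected."""
        stages = []
        if self.content_attribution:
            stages.append("content")
        if self.chunk_attribution:
            stages.append("chunk")
        if self.filter_attribution:
            stages.append("filter")
        return stages


# Reusing one model for content and filter attribution keeps tagging and
# filtering consistent; a chunk-level model is left unset in this example.
settings = CollectionAttributionSettings(
    content_attribution="Content attributions",
    filter_attribution="Content attributions",
)
stages = settings.enabled_stages()
```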

The following figure shows the content processing options at the collection level:

Attribution settings at a collection level

Configure Buddy-level behavior

Buddy-level configuration includes a checkbox option to enable the auto-filtering functionality, as shown in the following figure. This setting determines whether the Buddy automatically applies detected attributes as filters during question processing. By default, this option is disabled, so you must explicitly enable it if you want auto-filtering. This approach ensures that the feature is active only when intentionally configured and tested.

The auto-filtering checkbox in a Knowledge Buddy
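The effect of the checkbox can be sketched as a simple gate in Python. `BuddySettings` and `filters_for_request` are hypothetical names used only to illustrate the opt-in behavior; they are not part of the product.

```python
from dataclasses import dataclass


@dataclass
class BuddySettings:
    # Auto-filtering is opt-in, so the flag defaults to False.
    auto_filtering_enabled: bool = False


def filters_for_request(settings, request_filters, detected_filters):
    """Return the filters to apply; detected filters count only when enabled."""
    filters = list(request_filters)
    if settings.auto_filtering_enabled:
        filters += [f for f in detected_filters if f not in filters]
    return filters


# Made-up filters for illustration.
request_filters = [("theme", "finance")]
detected_filters = [("country", "Canada")]

default_result = filters_for_request(BuddySettings(), request_filters, detected_filters)
enabled_result = filters_for_request(
    BuddySettings(auto_filtering_enabled=True), request_filters, detected_filters
)
```

With the default settings, only the filters passed in the request are applied; after the flag is enabled, the detected filters are added to them.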
