The Pega Knowledge Loader

The Pega Knowledge Loader is a feature that enables you to pull content from external repositories into a Pega GenAI Knowledge Buddy™. You can extend the framework to ingest content from multiple sources and enhance the extraction and chunking of data. The Knowledge Loader helps you automate the ingestion and management of external content and complements the existing content management features in Pega Knowledge Management.

The Pega Knowledge Loader works out of the box with SharePoint and Confluence, but you can also use an external integrator.

Figure: Sourcing external repositories with Knowledge Loader

To successfully use Knowledge Loader, you must first create a data collection and assign a data source to the collection. This data collection contains the content that is ingested by Knowledge Loader, and can then be used by a Knowledge Buddy to answer questions.

Note: For more information about data collections and data sources, see Advanced Buddy features.

SharePoint loader

To use content that you store on a SharePoint site, you must first build a SharePoint loader from the Pega Knowledge Loader portal. The following table lists the properties that you must provide when you create the SharePoint loader:

| Value | Mandatory | Note | Example |
|---|---|---|---|
| Collection | YES | Specify the collection in which the content ingestion should occur. The collection must already exist in the Knowledge Buddy application. | Knowledge |
| DataSource | YES | Specify the data source that corresponds to the collection you indicated. The data source must already exist in the Knowledge Buddy application. | ProductGuide |
| Role | YES | Specify the role for which the content should be available. | KnowledgeBuddy:Public |
| SiteName | YES | Specify the name of the SharePoint website from which you want to ingest data. | https://example.sharepoint.com/sites/Guide |
| Resources | YES | Specify the folder path from where you want to begin data ingestion. If you want to indicate the root folder, enter only a forward slash (/). | /Shared Documents/Guides |
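
To make the mandatory properties concrete, the following minimal Python sketch models them as a plain record and checks it before submission. It is an illustration only and not a Pega API: the function `validate_loader_config` and the dictionary layout are hypothetical, and the rule that a folder path starts with a forward slash is an assumption drawn from the examples in the table.

```python
# Hypothetical illustration; none of these names belong to a Pega API.
MANDATORY_KEYS = ("Collection", "DataSource", "Role", "SiteName", "Resources")


def validate_loader_config(config):
    """Return a list of problems found in a SharePoint loader configuration."""
    problems = []
    for key in MANDATORY_KEYS:
        value = config.get(key, "")
        if not str(value).strip():
            problems.append(f"{key} is mandatory")

    resources = config.get("Resources", "")
    # Assumption: folder paths are written from the root, which is "/".
    if resources and not resources.startswith("/"):
        problems.append("Resources must start with a forward slash (/)")
    return problems


# Example values taken from the table of mandatory properties.
example = {
    "Collection": "Knowledge",
    "DataSource": "ProductGuide",
    "Role": "KnowledgeBuddy:Public",
    "SiteName": "https://example.sharepoint.com/sites/Guide",
    "Resources": "/Shared Documents/Guides",
}
```

Calling `validate_loader_config(example)` returns an empty list, while a record with a blank `Role` returns a single message naming that property.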

Figure: Creating a SharePoint loader

You can also choose whether the SharePoint loader should include subfolders, and configure several optional settings:

| Value | Mandatory | Note | Example |
|---|---|---|---|
| File names to include | NO | Control data inclusion based on file types. | .pdf,.docx |
| File names to exclude | NO | Control data exclusion based on file name. | |
| File types to include | NO | Control data inclusion based on file type. | |
| Attributes | NO | Custom attributes created on SharePoint that you want the Knowledge Loader to ingest. | Creator,Tag |
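
The effect of the include and exclude settings can be sketched in Python as a filter applied to each file name. This is a hedged illustration, not the loader's actual logic: the function names are hypothetical, and the matching rules (wildcard patterns for names, exclusions taking priority, an empty include list meaning "include everything") are assumptions, because the source does not describe how the loader evaluates these settings.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def parse_list(raw):
    """Split a comma-separated setting such as '.pdf,.docx' into clean items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def should_ingest(file_name, include_names="", exclude_names="", include_types=""):
    """Decide whether a file passes the optional filters (assumed semantics)."""
    names_in = parse_list(include_names)
    names_out = parse_list(exclude_names)
    types_in = [t.lower() for t in parse_list(include_types)]

    # Assumption: an exclusion always wins over an inclusion.
    if any(fnmatchcase(file_name, pattern) for pattern in names_out):
        return False
    # Assumption: an empty include list places no restriction on names.
    if names_in and not any(fnmatchcase(file_name, p) for p in names_in):
        return False
    # Compare the file extension, ignoring case, against the allowed types.
    suffix = PurePosixPath(file_name).suffix.lower()
    if types_in and suffix not in types_in:
        return False
    return True
```

For example, `should_ingest("UserGuide.pdf", include_types=".pdf,.docx")` passes the filter, while `should_ingest("notes.txt", include_types=".pdf,.docx")` does not.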

Figure: Optional settings for a SharePoint loader

Once you are ready, click Submit to create the SharePoint loader. This creates two background jobs: the first extracts the list of files from SharePoint, and the second extracts the files and ingests them into Knowledge Buddy, or updates them when needed. The jobs run at regular intervals, as configured in the job scheduler.
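
The two-job flow can be sketched in Python as a listing step followed by an ingestion step. This is a simplified model under stated assumptions, not the Pega implementation: the names `RemoteFile`, `list_job`, and `ingest_job` are hypothetical, and detecting updates by comparing last-modified timestamps is an assumption, since the source only says that files are updated "when needed".

```python
from dataclasses import dataclass


@dataclass
class RemoteFile:
    path: str
    modified: int  # last-modified timestamp reported by the repository


def list_job(remote_files, queue):
    """Job 1: record every file found in the repository listing."""
    for remote in remote_files:
        queue[remote.path] = remote.modified


def ingest_job(queue, ingested):
    """Job 2: ingest new files and update files whose timestamp changed."""
    actions = []
    for path, modified in sorted(queue.items()):
        if path not in ingested:
            actions.append(("ingest", path))
        elif ingested[path] < modified:
            actions.append(("update", path))
        else:
            continue  # unchanged since the last run, nothing to do
        ingested[path] = modified
    queue.clear()
    return actions
```

Running both jobs once ingests every listed file. Running them again after one file changes produces a single update action for that file and leaves the unchanged files alone.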

Customizations

You can apply the following customizations to the Knowledge Loader to better match the needs of your organization:

  1. Sourcing from any type of repository: The framework supports pulling data from repositories other than SharePoint. To use another repository, you must create a subclass in the PegaKnowledgeLoaderWorkRepository class for the new repository type. Implement the necessary Data Pages and activities to fetch files and folders from the new repository.
  2. Pushing to Any Destination: By default, the framework pushes data to Knowledge Buddy, but you can extend the framework to push data to other destinations by overriding specific activities.
  3. Decoding File Content: The framework uses Apache Tika by default to decode file content. However, you can use a different framework to decode the content by overriding the provided extension points.
  4. Additional Attributes: The framework enables you to ingest additional custom attributes from the source repository. You can extend the data pages and activities to handle these custom attributes.
  5. Job Schedulers: The Knowledge Loader includes job schedulers that periodically ingest new files and check for updates. You can customize the scheduling intervals and the logic for handling updates.
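
The extension points in the list above follow a familiar plug-in pattern: the source, the decoder, and the destination can each be swapped independently. The Python sketch below is only an analogy for that pattern. In Pega, these extensions are built with classes, data pages, and activities rather than Python code, and every name here (`Repository`, `InMemoryRepository`, `run_loader`) is hypothetical.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Hypothetical stand-in for a repository-specific subclass."""

    @abstractmethod
    def list_files(self, folder):
        """Return the paths of the files under the given folder."""

    @abstractmethod
    def fetch(self, path):
        """Return the raw bytes of one file."""


class InMemoryRepository(Repository):
    """Toy repository used only to exercise the sketch."""

    def __init__(self, files):
        self._files = files  # maps path -> raw bytes

    def list_files(self, folder):
        return sorted(p for p in self._files if p.startswith(folder))

    def fetch(self, path):
        return self._files[path]


def run_loader(repo, folder, decode, push):
    """Fetch every file under folder, decode it, and push the text onward."""
    for path in repo.list_files(folder):
        push(path, decode(repo.fetch(path)))
```

Here `decode` stands in for the content decoder (Apache Tika by default, according to the list above) and `push` stands in for the destination (Knowledge Buddy by default). Passing different callables changes the behavior without touching `run_loader`.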
