PDF Connector component overview
Consider the following scenario:
An underwriting company receives uploaded documents that contain specific regulatory and customer information required to complete the loan origination process. A representative manually reviews each document and copies the necessary information into a loan financing application. The PDF Connector replaces that manual process with a robotic automation.
Using the same loan financing software, the end user generates several documents to send to the applicant after the underwriter approves or denies the loan request. The PDF Connector gathers disparate information and fills in a PDF form template to create the necessary documents for the deal.
The PDF Connector component of Pega Robot Studio, located in the Internal data sources category of the Toolbox, allows the developer to define the necessary elements within the document and add them to the Object Explorer for use in automations.
Formatted and unformatted
The PDF Connector works with both formatted files and files of unknown structure.
Formatted files are documents that are of a known structure, such as an invoice or credit card statement. The automation developer knows where and what data to expect, making it easy to work with the document.
For formatted .pdf files, you can build automations to:
- Read and write form fields like text boxes and radio buttons.
- Read text values based on proximity to landmark text.
- Use Optical Mark Recognition to determine if a form is signed or if handwritten marks are present.
- Read tables and easily convert them into a lookup table.
- Use the Reconcile method to present an interface for reconciling and correcting data read from .pdf files. Document reconciliation often occurs when working with documents that were previously interrogated with Optical Character Recognition (OCR).
- Pre-process or post-process PDF files for attended or unattended robotic process automation (RPA).
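To make the idea of reading a value by its proximity to landmark text more concrete, the following is a minimal conceptual sketch in Python. It is not the PDF Connector API and does not reflect how Pega Robot Studio implements the feature. The `Word` type, the `read_right_of_landmark` function, the coordinate values, and the tolerance parameters are all hypothetical and exist only for illustration. The sketch assumes the words of a page have already been extracted along with their positions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    """A word extracted from a page, with its position (hypothetical model)."""
    text: str
    x: float  # horizontal position of the word's left edge
    y: float  # vertical position of the word's line


def read_right_of_landmark(
    words: List[Word],
    landmark: str,
    max_gap: float = 200.0,
    line_tolerance: float = 3.0,
) -> Optional[str]:
    """Return the text on the landmark's line, to its right, within max_gap."""
    # Find the first word that matches the landmark text exactly.
    anchor = next((w for w in words if w.text == landmark), None)
    if anchor is None:
        return None

    # Keep words on (roughly) the same line that start to the right of the
    # landmark and no farther away than max_gap.
    nearby = [
        w for w in words
        if abs(w.y - anchor.y) <= line_tolerance
        and anchor.x < w.x <= anchor.x + max_gap
    ]
    nearby.sort(key=lambda w: w.x)
    return " ".join(w.text for w in nearby) or None


# Illustrative data for a formatted document such as an invoice.
page_words = [
    Word("Invoice", 50, 100),
    Word("Total:", 50, 200),
    Word("$1,250.00", 120, 200),
    Word("Due:", 50, 220),
    Word("2024-01-31", 120, 220),
]

total = read_right_of_landmark(page_words, "Total:")
due = read_right_of_landmark(page_words, "Due:")
```

Because the layout of a formatted file is known in advance, a fixed landmark such as "Total:" can locate the value even when the value itself changes from document to document.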
Unformatted files are documents that do not have a structured layout or that are received for the first time and are not defined as a document type in the robotic solution. For example, consider a letter received as a .pdf file. You may be able to work with the letter, but the location of the data may vary from document to document, so locating and using the data requires a different approach.
For unknown file formats, you can build automations to:
- Search and extract text in lines, segments, and words.
- Extract images.
- Read tables and easily import the data into a lookup table.
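As a rough illustration of what importing table data into a lookup table involves, the sketch below converts extracted table rows into a dictionary keyed by one column. It is written in plain Python and is not the PDF Connector or Lookup Table API. The `rows_to_lookup` function, the column names, and the sample rows are hypothetical. The sketch assumes the table has already been read as a list of rows, with the header row first.

```python
from typing import Dict, List


def rows_to_lookup(
    rows: List[List[str]], key_column: str
) -> Dict[str, Dict[str, str]]:
    """Build a lookup keyed by key_column from table rows (header row first)."""
    header, *body = rows
    key_index = header.index(key_column)  # raises ValueError if not present

    lookup: Dict[str, Dict[str, str]] = {}
    for row in body:
        # Skip rows whose cell count does not match the header.
        if len(row) != len(header):
            continue
        lookup[row[key_index]] = dict(zip(header, row))
    return lookup


# Illustrative table, as it might be read from a document.
table_rows = [
    ["Code", "Description", "Rate"],
    ["A1", "Fixed 30-year", "6.5"],
    ["B2", "Variable 5-year", "5.9"],
    ["incomplete row"],
]

products = rows_to_lookup(table_rows, "Code")
rate_b2 = products["B2"]["Rate"]
```

Keying the rows by a single column lets later steps retrieve a whole row by its identifier instead of scanning the table again.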
PDF Viewer
Along with the PDF Connector, you can use the PDF Viewer to display the file to a user when necessary. When added to a Windows form, the PDF Viewer provides options to search for or select text, zoom, and print. You can create automations that respond to click events within the text of the displayed file to aid the user.