Accessing PDF data with automations
5 Tasks
15 mins
Scenario
The billing department of the Astend Technology company receives invoices in PDF file format and must pass specific data to another system for proper processing. You will create a PDF document type for the invoice so that the document is identifiable and its values are available in an automation. The automation takes a file location parameter to open the document and loads the invoice table into a lookup table. It then loops through the lookup table to retrieve the subtotal, sales tax, and total values, and stores them in global variables.
Complete the following tasks:
- Add the PDF file connector to the Global container.
- Configure the PDF file document type.
- Configure a lookup Table and global variables to store the automation data.
- Create an automation that opens the PDF document file and updates the lookup table.
- Loop through the table to extract data values for Subtotal, Sales Tax, and Total to global variables.
Starting project:
Download, and then open InvoiceTableDataSln. Extract it to C:\Users\<yourLogin>\Documents\Pega Robot Studio\Projects
Note: The starting project contains the Global Container with a lookup table that has required extensions. Required references are installed.
Download the sample PDF file. Save it to the Desktop.
Detailed Tasks
1 Add the PDF connector to the Global container.
- In the Solution Explorer, double-click _GC_Invoice.os to open the Global Container in the Designer windows.
- In the Toolbox, in the search field, enter PdfConnector.
- Drag the PdfConnector to the Designer windows to add it to the Global container.
- On the Property Grid of the PdfConnector, change the Name property to Invoice.
- Save the changes on the Globals tab.
2 Add and configure a document type.
- In the Object Explorer, right-click Invoice, and then select Add Document Type.
- In the dialog box that opens, select the sample PDF file from the Desktop.
- In the Add New Document Type window, in the Document type name field, enter Astend Invoice.
- In the Threshold configuration pane of the dialog box, in the Line menu, click to highlight lines in the preview area on the right.
- Adjust the Line threshold to ensure correct line identification in the PDF file document type. The following figure shows an example of the lines:
- In the dialog box, in the Table configuration section, select Include rectangles in selection.
This feature increases the number of tables recognized in the document, matching tables based on intersecting lines.
- Open the Identifiers section.
- In the Identifiers section, click Add > Text to configure a unique document type identifier.
- In the upper-right corner, click the blue rectangle to configure the identifier.
- In the Identifier name field, enter Invoice for.
- Draw a rectangle around the Invoice for: text.
- Click , and then click .
- In the Automation values section, select Add > Table.
- Select the blue rectangle on the upper-right corner to configure the table landmark.
- Draw a rectangle around the Invoice for: text.
- Click Validate to confirm the selected landmark.
- In the Table name field, enter tblInvoice.
- In the Table fill option list, select Compact to remove empty cells and move data to the left.
- Click to see the table data output.
- In the Sub/Tax/Total cell, indicate that the Total field is and the character removals are off.
- Click Save, and then click Back twice to return to the Document tab.
- Click Modify, and then accept the change message.
- Clear the Include rectangles in selection checkbox, and then click twice to return to the Values tab for table definition.
- On the Values tab, click the icon on the Invoice For table.
- In the Select Table area, click the blue rectangle to reselect the new table that has no rectangles.
- Click the Show value to confirm the correct table structure.
- In the Select Table section, select My table has headers in row, and then select 1 in the list.
- Select Advanced column options, and then configure the columns based on the following table.
| Column name | New column name | Filter results | Remove spaces from beginning and end of lines | Remove all blank lines | Remove these characters |
| --- | --- | --- | --- | --- | --- |
| Col1 | Item | True | True | True | |
| Col2 | Description | True | True | True | $, |
| Col3 | Qty | True | True | True | |
| Col4 | Unit Price | True | True | True | $, |
| Col5 | Discount | True | True | True | $, |
| Col6 | Price | True | True | True | $, |

- Click Save, and then click Done to finish the Document Type configuration.
- Click File > Save all to save the changes made in the solution.
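The column filter options configured above can be pictured as a small text clean-up step. The following Python sketch is only an illustration of what the three options are expected to do to a cell; the function name `clean_cell` and the order in which the options are applied are assumptions, not the behavior of the PDF connector itself.

```python
def clean_cell(text: str, remove_chars: str = "",
               strip_lines: bool = True,
               drop_blank_lines: bool = True) -> str:
    """Hypothetical helper that mimics the column filter options."""
    lines = text.splitlines()
    if strip_lines:
        # "Remove spaces from beginning and end of lines"
        lines = [line.strip() for line in lines]
    if drop_blank_lines:
        # "Remove all blank lines"
        lines = [line for line in lines if line.strip()]
    cleaned = "\n".join(lines)
    # "Remove these characters": each listed character is deleted.
    for ch in remove_chars:
        cleaned = cleaned.replace(ch, "")
    return cleaned


# A currency cell with the "$," removal setting applied.
print(clean_cell("  $1,250.00 \n\n", remove_chars="$,"))
```

Removing the currency symbol and thousands separator at this stage is what allows the values to be treated as numbers later in the automation.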
3 Configure a Lookup Table and add global variables
- In the Solution Explorer, double-click _GC_Invoice.os to open the Global Container in the Designer windows.
- In the Global Container, select the lktblInvoice lookup table to have access to its Property Grid.
- On the Property Grid, select the Fields property, and then click to open the LookupField Collection Editor.
- In the LookupField Collection Editor, click the Add icon to add the following fields to the lookup table:
| FieldName | Key | Type |
| --- | --- | --- |
| Key | True | System.Int32 |
| Item | False | System.String |
| Description | False | System.String |
| Quantity | False | System.String |
| Unit Price | False | System.String |
| Discount | False | System.String |
| Price | False | System.String |

- In the Toolbox, expand the Variables section.
- Add three Double variables to the Global Container, and then enter the following names:
- decTotal
- decSubtotal
- decSalesTax
- Select File > Save all to save the changes made in the solution.
4 Create a sub-automation to open the file location and populate the lookup table
- In the Solution Explorer, right-click the InvoiceTableDataPrj, and then select Add > New Automation.
- In the Name field, enter E_Invoice_Data_Pull.
- Click Add to add the automation to the project.
- On the Designer windows, right-click, and then add the following information:
- An entry point
- Two labels
- Two exit points
- Name the labels and the Exit points: Success and Failed.
- On the Entry point, add a String parameter named fileLocation.
- On the Failed Exit point, add a String parameter named errMsg.
- On the Object Explorer, expand the _GC_Invoice, and then select Invoice.
- Add the FileName property of the Invoice control to the automation.
- Add the Open method of the Invoice control to the automation.
- On the Project Explorer, select Invoice > Astend_Invoice, and then select the tblInvoice control. Add the Table property to the automation.
- On the Designer windows, right-click, and then add Jump To > Failed. Set the errMsg parameter to Not able to access PDF document.
- On the Object Explorer, select the lookupTable. Add the ReplaceTableAutoKey method to the automation.
- Connect the design blocks as shown in the following figure.
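As a rough mental model of this task, the table extracted from the PDF replaces whatever the lookup table held before, and each row receives an integer key. The Python sketch below illustrates that idea only. The function `replace_table_auto_key`, the assumption that keys start at 1, and the sample rows are all hypothetical; they are not the connector's actual implementation or the contents of the sample invoice.

```python
from typing import Dict, List

Row = Dict[str, str]


def replace_table_auto_key(rows: List[Row], first_key: int = 1) -> Dict[int, Row]:
    """Build a fresh table keyed by consecutive integers (assumed to start at 1)."""
    return {first_key + offset: dict(row) for offset, row in enumerate(rows)}


# Hypothetical rows, shaped like the lookup table fields from task 3.
extracted_rows = [
    {"Item": "Consulting", "Description": "On-site visit", "Quantity": "2",
     "Unit Price": "150.00", "Discount": "0.00", "Price": "300.00"},
    {"Item": "Invoice Subtotal", "Description": "300.00"},
]

lookup_table = replace_table_auto_key(extracted_rows)
for key, row in lookup_table.items():
    print(key, row["Item"])
```

The second sample row assumes that the Compact fill option has moved the summary amount into the second column, which would explain why task 5 reads the value from the Description field.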
5 Loop through the lookup table and store data to global variables
- In the Toolbox, in the search field, enter ForLoop. Drag the ForLoop to the Designer windows to add a loop to the automation.
- Connect lookupTable1.ReplaceTableAutoKey design block to the forLoop1 design block.
- On the Property grid, configure the ForLoop parameters to the following values:
| Parameter | Value |
| --- | --- |
| Initial | 1 |
| Increment | 1 |
| Limit | 32 |

- In the Designer windows, right-click, and then add Jump To > Success.
- In the Toolbox, expand the Variables section, and then drag the Integer to the Designer windows.
- In the Property Grid, change the Name property to intIndex.
- In the Project Explorer, select lookupTable, and then add the GetRecord method to the automation.
- In the Designer windows, right-click, and then add Jump To > Failed. Set the errMsg parameter to Not able to access Data Table.
- In the Toolbox, in the search field, enter Switch, and then add the Switch component to the automation.
- Connect the Item element from the GetRecord design block to the Input port of a Switch component.
- In the switch design block, click the Add icon, and then add three cases:
- TOTAL
- Invoice Subtotal
- Sales Tax
- In the Project Explorer, expand the E_Invoice_Data_Pull. Drag three intIndex variables to the Designer windows.
- Connect the corresponding Switch items to their intIndex properties design blocks.
- In the Project Explorer, select the lookupTable, and then add a GetRowColumn method to the automation.
- In the GetRowColumn method, set the columnName parameter to Description.
- Copy, and then paste the GetRowColumn method design block twice.
- In the Project Explorer, drag the following values to the Designer window:
- decSalesTax
- decSubtotal
- decTotal
- Connect each intIndex design block to its corresponding GetRowColumn method and global variable design blocks.
- Confirm the automation links from the following image.
- Select File > Save all to save the changes made in the automation.
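The logic built in this task can be summarized in a few lines of code. The Python sketch below is a hedged approximation: it loops over keys 1 to 32, switches on the Item field, remembers the row index for each of the three labels, and then reads the Description column of that row into the matching variable. The function name `pull_totals`, treating the loop limit as inclusive, skipping missing records, and the sample table are assumptions; the wiring in the real automation is defined by the figure in the original challenge.

```python
from typing import Dict, Optional

Row = Dict[str, str]

# Switch cases from the automation mapped to the global variable they feed.
CASES = {
    "TOTAL": "decTotal",
    "Invoice Subtotal": "decSubtotal",
    "Sales Tax": "decSalesTax",
}


def pull_totals(table: Dict[int, Row], limit: int = 32) -> Dict[str, float]:
    """Find the summary rows and return their amounts as floats."""
    index_for: Dict[str, Optional[int]] = {label: None for label in CASES}
    for key in range(1, limit + 1):      # ForLoop: Initial 1, Increment 1
        record = table.get(key)          # stands in for GetRecord
        if record is None:
            continue                     # assumption: skip missing keys
        item = record.get("Item", "")
        if item in index_for:            # stands in for the Switch block
            index_for[item] = key        # stands in for intIndex
    results: Dict[str, float] = {}
    for label, variable in CASES.items():
        key = index_for[label]
        if key is None:
            raise LookupError(f"Not able to access Data Table: {label}")
        # stands in for GetRowColumn with columnName = Description
        results[variable] = float(table[key]["Description"])
    return results


# Hypothetical lookup table content.
sample = {
    1: {"Item": "Consulting", "Description": "On-site visit"},
    2: {"Item": "Invoice Subtotal", "Description": "300.00"},
    3: {"Item": "Sales Tax", "Description": "15.00"},
    4: {"Item": "TOTAL", "Description": "315.00"},
}
print(pull_totals(sample))
```

The conversion with `float` only succeeds because the currency symbol and comma were removed from the column in the document type configuration.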
Confirm your work
Adjust runtimeconfig.xml to have the Robot Inspector in the context menu.
- Open the file explorer.
- In the address field enter %appdata%
- Open the runtimeconfig.xml file.
- Under RuntimeTrayMenu, set: <MenuItem item="LoadRobotInspector" label="Show Pega Robot Inspector" show="true" />
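The steps above describe a manual edit. For readers who prefer to check the setting programmatically, the Python sketch below shows one way to switch the `show` attribute on with the standard library. The `SAMPLE` fragment is an assumption made for illustration: only the `MenuItem` element comes from this challenge, and the real runtimeconfig.xml contains additional content that is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Assumed fragment; the real file has more elements around this section.
SAMPLE = """<RuntimeTrayMenu>
  <MenuItem item="LoadRobotInspector" label="Show Pega Robot Inspector" show="false" />
</RuntimeTrayMenu>"""


def enable_inspector(xml_text: str) -> str:
    """Return the XML with the Robot Inspector menu item set to show="true"."""
    root = ET.fromstring(xml_text)
    for menu_item in root.iter("MenuItem"):
        if menu_item.get("item") == "LoadRobotInspector":
            menu_item.set("show", "true")
    return ET.tostring(root, encoding="unicode")


print(enable_inspector(SAMPLE))
```

Back up the configuration file before changing it in any automated way.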
Verify the values flow between the lookup table and variables, using the Robot Inspector.
- In the Designer windows, right-click the first automation link, and then select Toggle breakpoint.
- Select Run > Debug.
- In the Windows taskbar, locate the Pega Robot Runtime icon, and then right-click it.
- In the sub-menu, click Show Pega Robot Inspector.
- In the Robot Inspector window, click the Global Variables tab.
- Click each global variable to display in the Results frame.
- Click the Automations tab, and then select the E_Invoice_Data_Pull automation.
- In the Automation parameter frame, enter the file location of the desktop PDF file.
- Click Execute.
- Press to step through the automation links.
- After the automation passes the ReplaceTableAutoKey method, from the task manager, activate the Robot Inspector window, and then click the Lookup Tables tab.
- Click lookupTable1 to display the results.
- Press the green continue arrow to complete the automation.
- In Robot Inspector, click the Global Variables tab. The Results frame displays the values from the tables. You might have to click Refresh to update the values.
- Stop the debugger.