Accessing PDF data with automations

5 Tasks

15 mins

Pega Robotic Automation 19.1
Visible to: All users
Intermediate Pega Robotic Automation 19.1 Robotic Process Automation English


The billing department of the Astend Technology company receives invoices in PDF format and must pass specific data to another system for processing. You will create a PDF document type for the invoice so that the document is identifiable and its values are available in an automation. The automation takes a file location parameter to open the document and loads the invoice table into a lookup table. It then loops through the lookup table to retrieve the subtotal, sales tax, and total values, and stores them in global variables.

Complete the following tasks:

  1. Add the PDF file connector to the Global container.
  2. Configure the PDF file document type.
  3. Configure a lookup table and global variables to store the automation data.
  4. Create an automation that opens the PDF document file and updates the lookup table.
  5. Loop through the table to extract data values for Subtotal, Sales Tax, and Total to global variables.

Starting project:

Download InvoiceTableDataSln, extract it to C:\Users\<yourLogin>\Documents\Pega Robot Studio\Projects, and then open it.

Note: The starting project contains the Global Container with a lookup table that has required extensions. Required references are installed. 

Download the sample PDF file. Save it to the Desktop.

You must initiate your own Pega instance to complete this Challenge.

Initialization may take up to 5 minutes so please be patient.

Detailed Tasks

1 Add the PDF connector to the Global container.

  1. In the Solution Explorer, double-click _GC_Invoice.os to open the Global Container in the Designer windows. 
  2. In the Toolbox, in the search field, enter PdfConnector.
  3. Drag the PdfConnector to the Designer windows to add it to the Global container. 
  4. On the Property Grid of the PdfConnector, change the Name property to Invoice.
  5. Save the changes on the Globals tab.

2 Add and configure a document type.

  1. In the Object Explorer, right-click Invoice, and then select Add Document Type.
    Adding the document type menu on Globals
  2. In the dialog box that opens, select the sample PDF file that you saved to the Desktop.
  3. In the Add New Document Type window, in the Document type name field, enter Astend Invoice.
  4. In the Threshold configuration pane of the dialog box, in the Line menu, click Show to highlight lines in the preview area on the right. 
  5. Adjust the Line threshold to ensure correct line identification in the PDF file document type. The following figure shows an example of the lines:
    Line threshold configuration
  6. In the dialog box, in the Table configuration section, select Include rectangles in selection.
    This feature increases the number of tables recognized in the document, matching tables based on intersecting lines. 
  7. Click Next to open the Identifiers section.
  8. In the Identifiers section, click Add > Text to configure a unique document type identifier.
    Adding identifier
  9. In the upper-right corner, click the blue rectangle to configure the identifier.
  10. In the Identifier name field, enter Invoice for.
  11. Draw a rectangle around the Invoice for: text. 
    Selecting the identifier
  12. Click Save, and then click Next.
  13. In the Automation values section, select Add > Table.
  14. Select the blue rectangle on the upper-right corner to configure the table landmark.
  15. Draw a rectangle around the Invoice for: text.
  16. Click Validate to confirm the selected landmark.
  17. In the Table name field, enter tblInvoice.
  18. In the Table fill option list, select Compact to remove empty cells and move data to the left. 
  19. Click Show value to see the table data output.
  20. In the Sub/Tax/Total cell, notice that the Total field and the character removals are off.
    Result of table identification
  21. Click Save, and then click Back twice to return to the Document tab.
  22. Click Modify, and then accept the change message.
  23. Clear the Include rectangles in selection checkbox, and then click Next twice to return to the Values tab for table definition.
  24. On the Values tab, click the Edit icon on the Invoice For table.
    Reselecting the table without rectangles
  25. In the Select Table area, click the blue rectangle to reselect the new table that has no rectangles.
  26. Click Show value to confirm the correct table structure.
    Correct table structure
  27. In the Select Table section, select My table has headers in row, and then select 1 in the list.
  28. Select Advanced column options, and then configure the columns based on the following table.
    Column name | New column name | Filter results | Remove spaces from beginning and end of lines | Remove all blank lines | Remove these characters
    Col1        | Item            | True           | True                                          | True                   |
    Col2        | Description     | True           | True                                          | True                   | $,
    Col3        | Qty             | True           | True                                          | True                   |
    Col4        | Unit Price      | True           | True                                          | True                   | $,
    Col5        | Discount        | True           | True                                          | True                   | $,
    Col6        | Price           | True           | True                                          | True                   | $,
  29. Click Save, and then click Done to finish the Document Type configuration.
  30. Click File > Save all to save the changes made in the solution. 
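
The Compact fill option and the advanced column options above can be pictured as simple text clean-up steps. The following Python sketch is only an illustration of that idea, not how Pega Robot Studio implements it. The function names `compact_row` and `clean_cell` are hypothetical, and the sketch assumes that Compact pads the row on the right to keep its width and that the value `$,` means "remove the `$` and `,` characters".

```python
def compact_row(cells):
    """Drop empty cells and shift the remaining values to the left.

    Assumption: the row is padded on the right so it keeps its width.
    """
    kept = [cell for cell in cells if cell.strip()]
    return kept + [""] * (len(cells) - len(kept))


def clean_cell(text, remove_chars=""):
    """Trim each line, drop blank lines, then remove the listed characters."""
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]  # remove all blank lines
    cleaned = "\n".join(lines)
    for ch in remove_chars:  # assumption: each character is removed individually
        cleaned = cleaned.replace(ch, "")
    return cleaned


if __name__ == "__main__":
    print(compact_row(["", "Sales Tax", "", "$8.25"]))
    print(clean_cell("  $1,200.00 \n\n", remove_chars="$,"))
```

Stripping the currency symbol and thousands separator at this stage is what later lets the extracted text be stored in a Double variable.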

3 Configure a Lookup Table and add global variables

  1. In the Solution Explorer, double-click _GC_Invoice.os to open the Global Container in the Designer windows. 
  2. In the Global Container, select the lktblInvoice lookup table to have access to its Property Grid.
  3. On the Property Grid, select the Fields property, and then click More to open the LookupField Collection Editor.
  4. In the LookupField Collection Editor, click the Add icon to add the following fields to the lookup table:
    FieldName   | Key   | Type
    Key         | True  | System.Int32
    Item        | False | System.String
    Description | False | System.String
    Quantity    | False | System.String
    Unit Price  | False | System.String
    Discount    | False | System.String
    Price       | False | System.String
  5. In the Toolbox, expand the Variables section.
  6. Add three Double variables to the Global Container, and then enter the following names:
    • decTotal
    • decSubtotal
    • decSalesTax
  7. Select File > Save all to save the changes made in the solution. 

4 Create a sub-automation to open the file location and populate the lookup table

  1. In the Solution Explorer, right-click the InvoiceTableDataPrj, and then select Add > New Automation.
  2. In the Name field, enter E_Invoice_Data_Pull.
  3. Click Add to add the automation to the project. 
  4. On the Designer windows, right-click, and then add the following information:
    • An entry point
    • Two labels
    • Two exit points
  5. Name one label and one exit point Success, and name the other label and exit point Failed.
  6. On the Entry point, add a String parameter named fileLocation.
  7. On the Failed Exit point, add a String parameter named errMsg.
  8. On the Object Explorer, expand the _GC_Invoice, and then select Invoice.
  9. Add the FileName property of the Invoice control to the automation.
  10. Add the Open method of the Invoice control to the automation.
  11. On the Project Explorer, select Invoice > Astend_Invoice, and then select the tblInvoice control. Add the Table property to the automation.
  12. On the Designer windows, right-click, and then add Jump To > Failed. Set the errMsg parameter to Not able to access PDF document.
  13. On the Object Explorer, select the lookupTable. Add the ReplaceTableAutoKey method to the automation.
  14. Connect the design blocks as shown in the following figure.
    Open file and replace lookup table auto key
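
As a rough mental model of this sub-automation, the Python sketch below opens a document, replaces the contents of a keyed table, and returns through a Success or Failed exit. It is a sketch under stated assumptions rather than the Pega API: `read_table` is a hypothetical stand-in for the PDF connector, `replace_table_auto_key` only imitates the idea of the ReplaceTableAutoKey method, and starting the keys at 1 is an assumption.

```python
def replace_table_auto_key(rows, start_key=1):
    """Build a fresh keyed table, assigning sequential integer keys.

    Assumption: keys start at 1 and follow the row order.
    """
    return {start_key + i: dict(row) for i, row in enumerate(rows)}


def invoice_data_pull(file_location, read_table):
    """Return (exit_name, payload) like the Success and Failed exit points.

    read_table is a hypothetical callable that extracts the invoice rows
    from the PDF at file_location and raises OSError if it cannot.
    """
    try:
        rows = read_table(file_location)
    except OSError:
        return "Failed", "Not able to access PDF document"
    return "Success", replace_table_auto_key(rows)


if __name__ == "__main__":
    fake_reader = lambda path: [{"Item": "Sales Tax", "Description": "8.25"}]
    print(invoice_data_pull("invoice.pdf", fake_reader))
```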

5 Loop through the lookup table and store data to global variables

  1. In the Toolbox, in the search field, enter ForLoop. Drag the ForLoop to the Designer windows to add a loop to the automation.
  2. Connect lookupTable1.ReplaceTableAutoKey design block to the forLoop1 design block.
  3. On the Property grid, configure the ForLoop parameters to the following values: 
    Parameter | Value
    Initial   | 1
    Increment | 1
    Limit     | 32
  4. In the Designer windows, right-click, and then add Jump To > Success.
  5. In the Toolbox, expand the Variables section, and then drag the Integer to the Designer windows. 
  6. In the Property Grid, change the Name property to intIndex.
  7. In the Project Explorer, select lookupTable, and then add the GetRecord method to the automation.
  8. In the Designer windows, right-click, and then add Jump To > Failed. Set the errMsg parameter to Not able to access Data Table.
  9. In the Toolbox, in the search field, enter Switch, and then add the Switch component to the automation.
  10. Connect the Item element from the GetRecord design block to the Input port of a Switch component. 
  11. In the switch design block, click the Add icon, and then add three cases:
    • TOTAL
    • Invoice Subtotal
    • Sales Tax
  12. In the Project Explorer, expand the E_Invoice_Data_Pull. Drag three intIndex variables to the Designer windows. 
  13. Connect each Switch case to its own intIndex design block.
  14. In the Project Explorer, select the lookupTable, and then add the GetRowColumn method to the automation. 
  15. In the GetRowColumn method, set the columnName parameter to Description.
  16. Copy, and then paste the GetRowColumn method design block twice.
  17. In the Project Explorer, drag the following values to the Designer window:
    • decSalesTax
    • decSubtotal
    • decTotal
  18. Connect each intIndex design block to its corresponding GetRowColumn method and global variable design blocks.
    intIndex to GetRecord to global variable automation links
  19. Confirm the automation links from the following image.
    completed PDF Challenge automation
  20. Select File > Save all to save the changes made in the automation.
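
The loop, Switch, and GetRowColumn steps above amount to the logic in the following Python sketch. It is a hedged illustration, not generated automation code: the lookup table is modeled as a dictionary keyed by row number, the Limit value is treated as inclusive (an assumption), and rows that do not exist are simply skipped, which may differ from how the links in your automation route a failed GetRecord call.

```python
def extract_totals(lookup_table, initial=1, increment=1, limit=32):
    """Scan the keyed rows and copy three amounts into named variables."""
    # Switch cases from the automation mapped to the global variable names.
    targets = {
        "TOTAL": "decTotal",
        "Invoice Subtotal": "decSubtotal",
        "Sales Tax": "decSalesTax",
    }
    totals = {name: None for name in targets.values()}
    for index in range(initial, limit + 1, increment):  # assumption: inclusive limit
        record = lookup_table.get(index)
        if record is None:
            continue  # simplification: ignore keys with no record
        variable = targets.get(record["Item"])
        if variable is not None:
            # The amount sits in the Description column, already stripped of "$" and ",".
            totals[variable] = float(record["Description"])
    return totals


if __name__ == "__main__":
    table = {
        1: {"Item": "Widget", "Description": "Blue widget"},
        2: {"Item": "Invoice Subtotal", "Description": "100.00"},
        3: {"Item": "Sales Tax", "Description": "8.25"},
        4: {"Item": "TOTAL", "Description": "108.25"},
    }
    print(extract_totals(table))
```

Only rows whose Item value matches one of the three cases are converted to a number, so descriptive text in other rows never reaches the Double variables.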

Confirm your work

Adjust runtimeconfig.xml to have the Robot Inspector in the context menu.

  1. Open the file explorer.
  2. In the address field, enter %appdata%.
  3. Open the runtimeconfig.xml file.
  4. Under RuntimeTrayMenu, set: <MenuItem item="LoadRobotInspector" label="Show Pega Robot Inspector" show="true" />
    Adding Robot Inspector to the context menu
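
If you prefer to script this change, the Python sketch below flips the show attribute with the standard library XML parser. It is a minimal sketch that assumes the file contains MenuItem elements shaped like the one in step 4; the surrounding structure in the sample is invented for illustration, and the function name `show_robot_inspector` is hypothetical. Back up the real file first, because re-serializing XML this way can drop the XML declaration and comments.

```python
import xml.etree.ElementTree as ET


def show_robot_inspector(xml_text):
    """Return the config text with the Robot Inspector menu item enabled."""
    root = ET.fromstring(xml_text)
    for item in root.iter("MenuItem"):
        if item.get("item") == "LoadRobotInspector":
            item.set("show", "true")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Invented sample; the real runtimeconfig.xml contains more settings.
    sample = (
        "<Config><RuntimeTrayMenu>"
        '<MenuItem item="LoadRobotInspector" '
        'label="Show Pega Robot Inspector" show="false" />'
        "</RuntimeTrayMenu></Config>"
    )
    print(show_robot_inspector(sample))
```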

Verify the values flow between the lookup table and variables, using the Robot Inspector.

  1. In the Designer windows, right-click the first automation link, and then select Toggle breakpoint.
  2. Select Run > Debug.
  3. In the Windows taskbar, locate the Pega Robot Runtime icon, and then right-click it. 
  4. In the sub-menu, click Show Pega Robot Inspector.
  5. In the Robot Inspector window, click the Global Variables tab.
  6. Click each global variable to display in the Results frame.
    Global Variables of Robot Inspector
  7. Click the Automations tab, and then select the E_Invoice_Data_Pull automation.
  8. In the Automation parameter frame, enter the file location of the desktop PDF file.
    enter file location parameter value
  9. Click Execute.
  10. Press F11 to step through the automation links.
  11. After the automation passes the ReplaceTableAutoKey method, activate the Robot Inspector window from the taskbar, and then click the Lookup Tables tab.
  12. Click lookupTable1 to display the results.
    Lookup table values
  13. Press the green continue arrow to complete the automation.
  14. In the Robot Inspector, click the Global Variables tab. The Results frame displays the values from the table. You might have to click Refresh to update the values.
    Variable results
  15. Stop the debugger.


