
Enhancing entity extraction with machine learning

5 Tasks

25 mins

Visible to: All users
Beginner Pega Customer Decision Hub 8.7 English
Verify the version tags to ensure that you are consuming the intended content, or complete the latest version.

Scenario

The U+ Air chatbot channel detects U+ Air ticket numbers using a RUTA-based model. The current entity model recognizes a ticket number in the following format: two letters followed by three digits (for example, WZ266 or AA132). In case of a rescheduled ticket or human error, the application also needs to detect unusual ticket number patterns. Running a RUTA script with machine learning can achieve this business outcome.
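To make the limitation concrete, the default format (two letters followed by three digits) can be approximated with a regular expression. This is only an illustrative Python sketch, not the RUTA script that the application uses; the pattern and the helper name `find_default_tickets` are assumptions introduced for this example.

```python
import re

# Assumed stand-in for the default pattern: exactly two uppercase letters
# followed by three digits, appearing as a standalone token.
DEFAULT_TICKET = re.compile(r"\b[A-Z]{2}\d{3}\b")


def find_default_tickets(message: str) -> list[str]:
    """Return every token that matches the default ticket number format."""
    return DEFAULT_TICKET.findall(message)


print(find_default_tickets("I want to cancel my ticket number WZ266"))   # ['WZ266']
print(find_default_tickets("I want to cancel my ticket number WZ-266"))  # []
print(find_default_tickets("I want to cancel my ticket AAL325"))         # []
```

A fixed pattern of this kind misses both the hyphenated and the three-letter variants, which is the gap that the machine learning model is meant to close.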

As a data scientist, enhance entity extraction with a machine learning model to satisfy the business requirements.

Use the following credentials to log in to the exercise system:

Role                    User name              Password
Data Scientist          DataScientist          rules
Application Developer   ApplicationDeveloper   rules
Caution: To reuse the exercise system from a previous challenge, first complete the Creating entity extraction model using Ruta script challenge. Otherwise, click Initialize Pega or Reset Instance in this challenge.


Your assignment consists of the following tasks:

Task 1: Test the text extraction in the chatbot for unusual ticket number

As an application developer, test the entity extraction for ticket numbers that differ from the default pattern (two letters followed by three digits). Test the chatbot with the following message: I want to cancel my ticket number WZ-266.

Task 2: Train the machine learning entity extraction model by record

As a data scientist, add training data manually by record to train the ticket_number entity extraction machine learning model so that it uses machine learning to detect entities.

Task 3: Train the machine learning entity extraction model with a dataset

Import the Airlines_entity_dataset.xlsx dataset as training data to the ticket_number entity extraction model. Train the entity model with all the available data.

Task 4: Test the entity extraction

Test the entity extraction model, and then observe the results.

Task 5: Test the text extraction in the chatbot after building the model

As an application developer, test the entity extraction model, and then observe the results.

 

You must initiate your own Pega instance to complete this Challenge.

Initialization may take up to 5 minutes so please be patient.

Challenge Walkthrough

Detailed Tasks

1 Test the text extraction in the chatbot for unusual ticket number

  1. On the exercise system landing page, click Pega CRM suite to log in to App Studio.
  2. Log in as an application developer with User name ApplicationDeveloper and Password rules.
  3. In the navigation pane of App Studio, click Channels to view the list of current channels.
  4. In the Current channel interfaces section, click the icon that represents your existing Airline Digital Messaging channel.
  5. In the Preview console on the right, in the Type your message here text box, enter I want to cancel my ticket number WZ-266 to test the chatbot.
  6. Turn on the Show analysis switch to see the details.
  7. Click Yes.
  8. Confirm that the chatbot detects the cancel ticket topic with 87% confidence and runs the preconfigured Cancel a ticket case type, but does not recognize the ticket number because of its unusual pattern. The chatbot requests the ticket number even though the first message already provides it.
  9. In the lower-left corner, click the user icon, and then select Log off to log out of App Studio.

2 Train the machine learning entity extraction model by record

  1. Log in to Prediction Studio as a data scientist with User name DataScientist and Password rules.
  2. On the Predictions landing page, click Airline to open the prediction workspace.
  3. Click the Entities tab to view the list of entities.
  4. In the ticket_number row, click the Gear icon to configure the machine learning data.
    1. In the ticket_number dialog box, click Add training data to add a new piece of data to the ticket_number entity.
    2. In the text box, enter I want to cancel my reservation for ticket number JK-294.
    3. Click Add.
    4. On the right, in the preview pane, select JK-294.
    5. Right-click JK-294, and then select #ticket_number.
    6. Optional: To provide the model with additional training data, select the topic of the message: in the Topic text box, enter or select action > cancel ticket.
    7. Click Save.
    8. Confirm that, in the Total training data column, the new record is displayed as 1.
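For reference, the record labeled in the steps above can be written in the same inline annotation style that the Airlines_entity_dataset.xlsx file uses in the next task. The Python sketch below is only an illustration under stated assumptions: the `annotate` helper is hypothetical, and the source does not state how Prediction Studio stores manually labeled records internally.

```python
def annotate(text: str, entity: str, label: str) -> str:
    """Wrap the first occurrence of `entity` in START/END markers.

    The marker layout mirrors the dataset entry shown in the next task;
    the exact spacing that the tool expects is an assumption here.
    """
    start = text.find(entity)
    if start == -1:
        raise ValueError(f"{entity!r} not found in text")
    end = start + len(entity)
    return f"{text[:start]}<START:{label}> {entity} <END>{text[end:]}"


record = annotate(
    "I want to cancel my reservation for ticket number JK-294.",
    "JK-294",
    "ticket_number",
)
print(record)
```

Labeling in the preview pane achieves the same result interactively, so a helper like this would only matter when preparing many records outside the tool.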

3 Train the machine learning entity extraction model with a dataset

  1. Download the Airlines_entity_dataset.xlsx dataset.
  2. Open, and then inspect the downloaded dataset:
    1. Confirm that the entity in the training data is <START:ticket_number> LO-127 <END>.
      Note: The ticket number has a single space in front of the two letters and a double space after the three digits.
    2. Close the Airlines_entity_dataset.xlsx file.
  3. In the list of entities, in the ticket_number row, click the Gear icon.
    1. In the ticket_number dialog box, click Upload to open the dataset upload dialog box.
      1. Click Choose File, and then select the Airlines_entity_dataset.xlsx file.
      2. Click Upload.
    2. In the ticket_number dialog box, confirm that the status of the newly added entities is Reviewed.
      Note: There are seven pages of new training data to view. You can inspect and edit every record in the same way as manually added training data.
    3. Click Save.
  4. Note that the newly available training data is listed as pending.
  5. In the upper-right corner of the Airline prediction workspace, click Build to build the topic models.
  6. In the Build models dialog box, select two checkboxes:
    • Airline
    • Airline_entities
    Note: The Airline row contains pending training data for topic detection, which was added in the previous task.
  7. Click Build to build the models.
    Note: The build process might take a few minutes. After the process completes, a green information ribbon is displayed at the top of the Airline prediction workspace. If the ribbon does not appear after a few minutes, in the upper-right corner of the prediction window, click Actions > Refresh.
  8. After the build completes, at the top of the prediction window, click View report.
    1. In the Model training report window, review the build result.
    2. Click Close to close the Model training report window.
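The annotation format inspected in the dataset, <START:ticket_number> LO-127 <END>, can also be checked outside Prediction Studio before upload. The Python sketch below is a hedged illustration: the regular expression, the helper names, and the sample sentence are assumptions made for this example, and only the marker layout comes from the dataset entry shown above.

```python
import re

# Marker layout copied from the dataset entry; the pattern tolerates one or
# more spaces on either side of the entity text.
ANNOTATION = re.compile(r"<START:(?P<label>\w+)>\s+(?P<text>.+?)\s+<END>")


def extract_annotations(row: str) -> list[tuple[str, str]]:
    """Return (label, entity text) pairs found in one annotated row."""
    return [(m.group("label"), m.group("text")) for m in ANNOTATION.finditer(row)]


def strip_annotations(row: str) -> str:
    """Return the row with markers removed and only the entity text kept."""
    return ANNOTATION.sub(lambda m: m.group("text"), row)


# Hypothetical sample row; it is not taken from the actual dataset.
sample = "Please cancel ticket <START:ticket_number> LO-127  <END> for me"
print(extract_annotations(sample))
print(strip_annotations(sample))
```

A check like this can reveal rows with missing or misspelled markers before the file is uploaded as training data.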

4 Test the entity extraction

  1. In the upper-right corner of the Airline prediction workspace, click Test.
    1. In the Test prediction dialog box, in the text box, enter I want to cancel my ticket number WZ-266.
    2. Click the Entity tab.
    3. Confirm that the test correctly identifies WZ-266 even though it is not part of the training dataset.
    4. In the Test prediction dialog box, enter I want to cancel my ticket AAL325.
    5. Confirm that the test correctly identifies AAL325 as a ticket number even though it is not part of the training dataset.
    6. Close the Test prediction dialog box.
  2. In the upper-right corner of the Airline prediction workspace, click Save.
  3. In the lower-left corner, click the user icon, and then select Log off to log out of Prediction Studio.

5 Test the text extraction in the chatbot after building the model

  1. Log in as an application developer with User name ApplicationDeveloper and Password rules.
  2. In the navigation pane of App Studio, click Channels to view the list of current channels.
  3. In the Current channel interfaces section, click the icon that represents your existing Airline Digital Messaging channel.
  4. In the Preview console on the right, in the Type your message here text box, enter I want to cancel my ticket number WZ-266 to test the chatbot.
  5. Turn on the Show analysis switch to show the details.
  6. Confirm that the chatbot recognizes WZ-266 as a ticket number.
  7. In the upper-right corner of the Preview console, click Reset.
  8. In the Type your message here text box, enter I want to cancel my ticket number AAL325 to test the chatbot.
  9. Confirm that the chatbot recognizes AAL325 as a ticket number.
  10. Click Yes to initiate the routing of the case.
  11. Optional: To view the automatically created case, in the upper right of the App Studio window, click Preview to access the Customer Service portal:
    1. In the My Work window, click My workbaskets.
    2. In the View queue for list, select Inbound correspondence, and then open the case.
