Enhancing entity extraction with machine learning
5 Tasks
25 mins
Scenario
The U+ Air chatbot channel detects U+ Air ticket numbers using a RUTA-based model. The current entity model recognizes a ticket number in the following format: two letters followed by three digits (for example, WZ266 or AA132). In case of a rescheduled ticket or human error, the application also needs to detect unusual ticket number patterns. Running a RUTA script with machine learning can achieve this business outcome.
As a data scientist, enhance entity extraction with a machine learning model to satisfy the business requirements.
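To make the limitation of the default pattern concrete, the following minimal Python sketch approximates the "two letters followed by three digits" rule with a regular expression. It is not the RUTA script used in the exercise system; the names DEFAULT_TICKET and find_default_tickets are hypothetical, and the assumption that the letters are uppercase comes only from the examples WZ266 and AA132.

```python
import re

# Assumed stand-in for the default rule (not the actual RUTA script):
# exactly two uppercase letters followed by exactly three digits,
# appearing as a standalone token.
DEFAULT_TICKET = re.compile(r"\b[A-Z]{2}\d{3}\b")


def find_default_tickets(message: str) -> list[str]:
    """Return every substring that matches the default ticket pattern."""
    return DEFAULT_TICKET.findall(message)


# A ticket number in the default format is found.
print(find_default_tickets("I want to cancel my ticket number WZ266."))   # ['WZ266']
# The unusual patterns used later in this challenge are not.
print(find_default_tickets("I want to cancel my ticket number WZ-266."))  # []
print(find_default_tickets("I want to cancel my ticket AAL325."))         # []
```

A fixed pattern like this one misses any variation it was not written for, which is the gap that the machine learning model in this challenge is meant to close.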
Use the following credentials to log in to the exercise system:
Role | User name | Password |
---|---|---|
Data Scientist | DataScientist | rules |
Application Developer | ApplicationDeveloper | rules |
Caution: To reuse the exercise system from a previous challenge, first complete the Creating entity extraction model using Ruta script challenge. Otherwise, click Initialize Pega or Reset Instance in this challenge.
Your assignment consists of the following tasks:
Task 1: Test the text extraction in the chatbot for an unusual ticket number
As an application developer, test the entity extraction for ticket numbers that differ from the default pattern (two letters followed by three digits). Test the chatbot with the following message: I want to cancel my ticket number WZ-266.
Task 2: Train the machine learning entity extraction model by record
As a data scientist, add training data manually by record to train the ticket_number entity extraction machine learning model so that it uses machine learning to detect entities.
Task 3: Train the machine learning entity extraction model with a dataset
Import the Airlines_entity_dataset.xlsx dataset as training data to the ticket_number entity extraction model. Train the entity model with all the available data.
Task 4: Test the entity extraction
Test the entity extraction model, and then observe the results.
Task 5: Test the text extraction in the chatbot after building the model
As an application developer, test the entity extraction model, and then observe the results.
Challenge Walkthrough
Detailed Tasks
1 Test the text extraction in the chatbot for an unusual ticket number
- On the exercise system landing page, click Pega CRM suite to log in to App Studio.
- Log in as an application developer with User name ApplicationDeveloper and Password rules.
- In the navigation pane of App Studio, click Channels to view the list of current channels.
- In the Current channel interfaces section, click the icon that represents your existing Airline Digital Messaging channel.
- In the Preview console on the right, in the Type your message here text box, enter I want to cancel my ticket number WZ-266 to test the chatbot.
- Turn on the Show analysis switch to see the details.
- Click Yes.
- Confirm that the chatbot detects the cancel ticket topic with 87% confidence and runs a preconfigured Cancel a ticket case type, but does not recognize the ticket number because of its unusual pattern. The chatbot requests the ticket number even though the first message already provides it.
- In the lower-left corner, click the user icon, and then select Log off to log out of App Studio.
2 Train the machine learning entity extraction model by record
- Log in to Prediction Studio as a data scientist with User name DataScientist and Password rules.
- On the Predictions landing page, click Airline to open the prediction workspace.
- Click the Entities tab to view the list of entities.
- In the ticket_number row, click the Gear icon to configure the machine learning data.
- In the ticket_number dialog box, click Add training data to add a new piece of data to the ticket_number entity.
- In the text box, enter I want to cancel my reservation for ticket number JK-294.
- Click Add.
- On the right, in the preview pane, select JK-294.
- Right-click JK-294, and then select #ticket_number.
- Optional: To provide the model with additional training data, specify the topic of the message: in the Topic text box, enter or select action > cancel ticket.
- Click Save.
- Confirm that, in the Total training data column, the new record is displayed as 1, as shown in the following figure:
3 Train the machine learning entity extraction model with a dataset
- Download the Airlines_entity_dataset.xlsx dataset.
- Open, and then inspect the downloaded dataset:
- Confirm that the entity in the training data is <START:ticket_number> LO-127 <END>.
Note: The ticket number has a single space in front of the two letters and a double space after the three digits.
- Close the Airlines_entity_dataset.xlsx file.
- In the list of entities, in the ticket_number row, click the Gear icon.
- In the ticket_number dialog box, click Upload to open the dataset upload dialog box.
- Click Choose File, and then select the Airlines_entity_dataset.xlsx file.
- Click Upload.
- In the ticket_number dialog box, confirm that the status of the newly added entities is Reviewed, as shown in the following figure:
Note: There are seven pages of new training data to view. You can inspect and edit every record in the same way as manually added training data.
- Click Save.
- Note the newly available training data that is pending:
- In the upper-right corner of the Airline prediction workspace, click Build to build the topic models.
- In the Build models dialog box, select the following two checkboxes:
- Airline
- Airline_entities
Note: The Airline row contains pending training data for topic detection, which was added in the previous task.
- Click Build to build the models.
Note: The building process might take up to a few minutes. A green information ribbon appears at the top of the Airline prediction workspace after the process completes. If the green information ribbon does not appear after a few minutes, in the upper-right corner of the prediction window, click Actions > Refresh.
- After the build completes, at the top of the prediction window, click View report.
- In the Model training report window, review the build result.
- Click Close to close the Model training report window.
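The annotated format inspected in the dataset earlier in this task, <START:ticket_number> LO-127 <END>, can be read programmatically. The following Python sketch is a minimal illustration under stated assumptions: the sample sentence is invented for the example, the function names extract_annotations and strip_markup are hypothetical, and the sketch does not read the .xlsx file or reflect how Prediction Studio processes the upload. The pattern tolerates any amount of whitespace around the value, so it accepts both the single and the double space described in the note above.

```python
import re

# Matches <START:entity_type> value <END>, tolerating any amount of
# whitespace between the markers and the value.
ANNOTATION = re.compile(r"<START:(?P<type>\w+)>\s*(?P<value>.*?)\s*<END>")


def extract_annotations(record: str) -> list[tuple[str, str]]:
    """Return (entity_type, value) pairs for every annotated span."""
    return [(m.group("type"), m.group("value")) for m in ANNOTATION.finditer(record)]


def strip_markup(record: str) -> str:
    """Return the plain sentence with the annotation markers removed."""
    return ANNOTATION.sub(lambda m: m.group("value"), record)


# Hypothetical training sentence; only the annotated entity comes from the dataset.
sample = "Please cancel my ticket <START:ticket_number> LO-127  <END> for tomorrow."

print(extract_annotations(sample))  # [('ticket_number', 'LO-127')]
print(strip_markup(sample))         # Please cancel my ticket LO-127 for tomorrow.
```

A check like this can help spot malformed markers before an upload, but the status shown in the ticket_number dialog box remains the authoritative confirmation that the records were accepted.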
4 Test the entity extraction
- In the upper-right corner of the Airline prediction workspace, click Test.
- In the Test prediction dialog box, in the text box, enter I want to cancel my ticket number WZ-266.
- Click the Entity tab.
- Confirm that the test correctly identifies WZ-266 even though it is not part of the training dataset.
- In the Test prediction dialog box, enter I want to cancel my ticket AAL325.
- Confirm that the test correctly identifies AAL325 as a ticket number even though it is not part of the training dataset.
- Close the Test prediction dialog box.
- In the upper-right corner of the Airline prediction workspace, click Save.
- In the lower-left corner, click the user icon, and then select Log off to log out of Prediction Studio.
5 Test the text extraction in the chatbot after building the model
- Log in as an application developer with User name ApplicationDeveloper and Password rules.
- In the navigation pane of App Studio, click Channels to view the list of current channels.
- In the Current channel interfaces section, click the icon that represents your existing Airline Digital Messaging channel.
- In the Preview console on the right, in the Type your message here text box, enter I want to cancel my ticket number WZ-266 to test the chatbot.
- Turn on the Show analysis switch to show the details.
- Confirm that the chatbot recognizes WZ-266 as a ticket number.
- In the upper-right corner of the Preview console, click Reset.
- In the Type your message here text box, enter I want to cancel my ticket number AAL325 to test the chatbot.
- Confirm that the chatbot recognizes AAL325 as a ticket number.
- Click Yes to initiate the routing of the case.
- Optional: To view the automatically created case, in the upper right of the App Studio window, click Preview to access the Customer Service portal:
- In the My Work window, click My workbaskets.
- In the View queue for list, select Inbound correspondence, and then open the case.