Using a database index for report performance
3 Tasks
1 hr 30 mins
Scenario
MDC wants to investigate the impact of expanding its revenue reporting capability to include filtering on cities that start with a certain string. They are aware of a source of actual city information that they can use. MDC is concerned that as the invoice table grows larger and with greater use of invoice search, query performance could impact their ability to efficiently process cases.
The following table provides the credentials you need to complete the challenge:
Role | User name | Password | User type |
---|---|---|---|
Admin | Admin@deliveryservice | rules | Application |
pgAdmin4 | pega | pega | Database |
Analyze the possible performance impact of expanding the address search capability to include filtering by city. To do this, you first insert a large amount of test data into the MDC-Data-Invoice table, customerdata.pr_mdc_data_invoice.
Run the modified query from the PostSQL pdAdmin4 tool. Analyze the query and prefix the query with EXPLAIN (ANALYZE true, COSTS true, FORMAT JSON). You first run the query and note the result. Then, you add a database index to the city column. Run the query again and compare the result from your first run.
Detailed Tasks
1 Review solution details
Consider indexing a table column that is used as a filter criterion in the SQL query.
2 Configure indexing
- Download the sample data from the following link:
Excercise_City_0.zip (19.02 MB)
- Transfer the .zip file to the
$/home/architect/Desktop
directory. - Load the .csv files' invoice data into the PostgreSQL database by using the query tool in pgAdmin4.
- In the Exercise_City folder, open and run the pr_mdc_data_invoice_import.sql file.
- Run the following query to verify that all the records are loaded successfully:
Select count(*) from pegadata.pr_mdc_data_invoice - Add a filter criterion, such as AND city LIKE 'Bost%', and then perform a simple query within pgAdmin4 against the address table, to the analyze statistics, as in the following table:
EXPLAIN (ANALYZE true, COSTS true, FORMAT JSON) SELECT * FROM pegadata.pr_mdc_data_invoice where city like 'Bost%' ORDER BY pyguid ASC
- Verify and capture the statistics for the sequential scan.
- Under pr_mdc_data_invoice, select Indexes.
- Right-click, and then select Create.
- Provide an index name such as idx_city or CITYIDX.
- In the Definition table, select btree.
- Add the city column to the index.
- Choose any operator class, such as text_pattern_ops.
- Repeat step 6 to see significantly faster performance due to an Index Scan being used.
- Verify that indexed columns result in significantly improved performance readings.
3 Optional: Compare the performance of report definition with different sources
- Load the large amount of data into the system.
- Run the followingquery plan to view the statistics/analysis:
-
Node Type Seq Scan Parent Relationship Outer Parallel Aware FALSE Async Capable FALSE Relation Name pr_mdc_data_invoice Alias pr_mdc_data_invoice Startup Cost 0 Total Cost 272.5 Plan Rows 1 Plan Width 136 Actual Startup Time 3.522 Actual Total Time 3.907 Actual Rows 1178 Actual Loops 1 Filter ((city)::text ~~ 'Spr%'::text) Rows Removed by Filter 10024 loops 1 inclusive 3.907 exclusive 3.907 inclusive_factor 0.34173008 inclusive_flag 2 rows 1178 rowsx_direction Negative rowsx_flag 4 exclusive_factor 0.34173008 exclusive_flag 2
-
- Create the index for the required columns (for example, City).
- After you create the index for the column, run the query plan to view the analysis and statistics.
- Compare the time spent and the elapsed time for the query, both before and after index creation.
- Next, try the report using the elastic search index. You will need to modify the report and create an elastic search index.
- Update the report definition data retrieval preference to Use search data.
- Create a custom search property rule and add the required column and property.
- Create the dedicated index.
- Verify that the required dedicated index is available. If not, index or re-index to make it available.
- Compare the elapsed time of interaction at the end of database retrieval, compared to when you use elastic search index.
The report definition that is sourced by an elastic search index retrieves results faster than when sourced by a database.
Confirm your work
Available in the following mission:
Want to help us improve this content?