Contents IndexTemporary tables used in query processing Monitoring database performance

User's Guide
   Part III. Using SQL Anywhere
     Chapter 21. Monitoring and Improving Performance
      How the optimizer works

The database engine has an optimizer that attempts to pick the best strategy for executing each query. The best strategy is the one that gets the results in the shortest period of time. The optimizer determines the cost of each strategy by estimating the number of disk reads and writes required. The strategy with the lowest cost is chosen.

The optimizer must decide which order to access the tables in a query, and whether or not to use an index for each table. If a query joins N tables, there are N factorial possible ways to access the tables. The optimizer will estimate the cost of executing the query in the different ways and use the ordering with the lowest cost estimate. The query execution plan in the ISQL statistics window shows the table ordering for the current query and indicates in parentheses the index that was used for each table.

Top of page


Optimizer estimates

The optimizer uses heuristics (educated guesses) to help decide the best strategy.

For each table in a potential execution plan, the optimizer must estimate the number of rows that will be part of the results. The number of rows will depend on the size of the table and the restrictions in the WHERE clause or the ON clause of the query.

Percentage of rows that some comparison operations select

The table below shows the simplest guess at the percentage of rows that some of the comparison operations will select. The other comparison operations such as LIKE, IS NULL, and EXISTS are handled in a similar way.

Comparison Percentage
= 5 (see below)
<> 95
<, <=, >, >= 25
between 6

In many cases, the optimizer uses more sophisticated heuristics. For example, the estimate of 5percent for equality is only used in cases where these other heuristics do not apply.

The optimizer makes use of indexes and keys to improve its guess of the number of rows. Here are a few single-column examples:

Single-column examples

Top of page


Self tuning of the query optimizer

One of the most common constraints in a query is equality with a column value. For example,

     SELECT *
     FROM employee
     WHERE sex = 'f'

tests for equality of the sex column. For this type of constraint, the SQL Anywhere optimizer learns from its experience. A query will not always be optimized the same way the second time it is executed. The estimate for an equality constraint will be modified for columns that have an unusual distribution of values. This information is stored permanently in the database. If needed, the statistics can be deleted with the DROP OPTIMIZER STATISTICS command.

Top of page


Providing estimates to improve query performance

Since the query optimizer is guessing at the number of rows in a result based on the size of tables and particular restrictions used in the WHERE clause, it almost always makes inexact guesses. In many cases, the guess that the query optimizer makes is close enough to the real number of rows that the optimizer will have chosen the best search strategy. However, in some cases this does not occur.

The following query displays a list of order items that shipped later than the end of June, 1994:

     SELECT ship_date
     FROM sales_order_items
     WHERE ship_date > '1994/06/30'
     ORDER BY ship_date DESC

The estimated number of rows is 274. However, the actual number of rows returned is only 12. This estimate is wrong because the query optimizer guesses that a test for greater than will succeed 25 percent of the time. In this example, the condition on the ship_date column:

     ship_date > '1994/06/30'

is assumed to choose 25 percent of rows in the sales_order_items table.

Using an estimate

If you know that a condition has a success rate that differs from the optimizer rule, you can tell the database this information by using an estimate. An estimate is formed by enclosing in brackets the expression followed by a comma and a number. The number represents the percentage of rows that the expression is estimated to select. In this case, you could estimate a success rate of one percent:

     SELECT ship_date
     FROM sales_order_items
     WHERE ( ship_date > '1994/06/30', 1 )
     ORDER BY ship_date DESC

With this estimate, the optimizer estimates ten rows in the query.

Note

Incorrect estimates are only a problem when they lead to poorly optimized queries.

Top of page


Contents IndexTemporary tables used in query processing Monitoring database performance