Thursday, 7 May 2015

Query Optimizer

Query Optimizer

In order to ensure that your SELECT statements retrieve data as quickly as possible, you can ensure
that your tables have been properly indexed

SELECT PartName FROM Parts
WHERE ManfID=’jkl123’;


This query search the total table   , if  the values are sorted in ascending order , then it can find the ManfID value of jkl123 much faster .

Searches are faster in indexes because indexes are sorted
Identical values are grouped together and organized in an easy-to-locate order.

Index columns that appear in search conditions. As a general rule, you should consider defining
an index on any column that you commonly use in WHERE, GROUP BY, or HAVING clauses.

Because these columns define the limitations of a query, they are good candidates for improving
performance because they allow MySQL to identify quickly which rows should be included in a
search and which should not.

Index columns that appear in join conditions. Index any columns that appear in a join condition.
Because join conditions are often based on foreign key columns that reference primary key
columns, MySQL creates the indexes automatically when you define the primary keys and foreign keys.

Do not index columns that appear only in the SELECT clause. If a column appears in the
SELECT clause of a SELECT statement, but does not appear in WHERE, GROUP BY, or HAVING
clauses, you usually shouldn’t index these columns because indexing them provides no performance
benefit but does require additional storage. Indexing columns in the SELECT clause provides
no benefit because the SELECT clause is one of the last parts of a SELECT statement to be
processed. MySQL conducts searches based on the other clauses. After MySQL identifies which
rows to return, it then consults the SELECT clause to determine which columns from the identified
rows to return.

Do not index columns that contain only a few different values. If a column contains many
duplicated values, indexing that column provides little benefit

Specify prefixes for indexes on columns that contain large string values. If you’re adding an
index to a string column, consider defining a prefix on that index so that your index includes
only part of the entire values, as they’re stored in the table. 
For example, if your table includes a  CHAR(150) column, you might consider indexing only the first 10 or 15 bytes, or whatever number provides enough unique values without having to store the entire values in the index.

Create only the indexes that you need. Never create more indexes than you need. If a column is
rarely used in a search or join condition, don’t index that column. You want to index only those
columns that are frequently used
to identify the rows being searched.

MySQL supports five types of indexes:

❑ Primary key: Requires that each value or set of values be unique in the columns on which
the primary key is defined. In addition, NULL values are not allowed. A table can include only
one primary key.

❑ Foreign key: Enforces the relationship between the referencing columns in the child table where
the foreign key is defined and the referenced columns in the parent table.

❑ Regular: Provides a basic index that permits duplicate values and NULL values in the columns
on which the index is defined.

❑ Unique: Requires that each value or set of values be unique in the columns on which the index
is defined. Unlike primary key indexes, NULL values are allowed.

❑ Full-text: Supports full-text searches of the values in the columns on which the index is defined.
A full-text index permits duplicate values and NULL values in those columns. A full-text index
can be defined only on MyISAM tables and only on CHAR, VARCHAR, and TEXT columns.

EXPLAIN Statement
EXPLAIN <select statement>
You should include your SELECT statement after the EXPLAIN keyword exactly as you would use the  SELECT statement in a query. The EXPLAIN statement then returns results that provide details about how the SELECT statement will be executed. From these details, you can determine whether indexes are being used effectively, whether you should add new indexes, or whether you should specify the order of how tables are joined together.


EXPLAIN SELECT PartName, ManfName
FROM Parts AS p, Manufacturers as m
WHERE p.ManfID = m.ManfID
ORDER BY PartName;


suppose that you have a column that permits only three values. If you create an index on the column, the cardinality for that index is 3, no matter how many rows are in that table. In general, MySQL does not use an index with a low cardinality because this is not an efficient use of indexes. As a result, when you add an index to a table or modify it significantly in any other way, you should ensure that the cardinality is correctly represented to the query optimizer. The easiest way to do this is to execute an OPTIMIZE TABLE statement.

Using the OPTIMIZE TABLE Statement  the OPTIMIZE TABLE statement performs a number of functions. For example, it defragments the table and sorts the table’s indexes. It also updates the internal table statistics. One of these statistics is the cardinality of its indexes. If you add an index to an existing table, you might have to use the OPTIMIZE TABLE statement to ensure that the table statistics are accurate when read by the query optimizer. The following syntax shows how to create an OPTIMIZE TABLE statement:

OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE <table name> [{, <table name>}...]
OPTIMIZE TABLE Parts;

Do not use unnecessary wildcards in LIKE clauses
MySQL cannot use an index on a column if  that column appears as an argument in a function or an arithmetic expression. For example, suppose your WHERE clause includes the expression YEAR(DateJoined)>1999, where DateJoined is a column that contains date values. If DateJoined is an indexed column, you might want to rewrite the WHERE clause to something similar to the following: DateJoined>’1999-12-31’.

Turn subqueries into joins.
Try using the FORCE INDEX clause.
changing the order of tables specified in a join condition.
Try using the STRAIGHT_JOIN option. When the query optimizer analyzes a SELECT statement, it
determines the order in which tables will be joined. In some cases, you might find that forcing the
optimizer to join tables in the order specified in the SELECT statement improves the statement’s
performance. This occurs because there are times when the query optimizer does not join tables in
the most optimal order. As a result, more rows are examined than need to be examined in order to
perform an effective join operation. By forcing the join order, you can sometimes see an improvement in performance because fewer rows are being searched. To force the order, you can add the STRAIGHT_JOIN option to your SELECT statement.

How to load data from text file..?

Optimizing Data Insertion

Use a LOAD DATA statement rather than an INSERT statement. Whenever possible, use a LOAD
DATA statement to insert data from a text file
, rather than use an INSERT statement. MySQL can
add data in a database up to 20 times faster when using a LOAD DATA statement, as compared to
using an INSERT statement.

Use INSERT statements with multiple VALUES clauses. When using INSERT statements to add
multiple rows in a table, you can use one of two methods to insert that data. The first is to create
an INSERT statement for each row of data, and the second is to create one INSERT statement
that contains multiple VALUES clauses. Using the second option is much faster because MySQL
must process only one SQL statement rather than many statements, and any related indexes must
be flushed only once,

When using multiple INSERT statements, group them together in a transaction. There will be
times when you must use multiple INSERT statements, such as when you’re inserting data in
multiple tables. In that case, you should isolate your INSERT statement in a transaction. This
process reduces the number of times that the index must be flushed.

Let MySQL insert default values. When using INSERT statements to add data to a table, you
can often insert the data without having to specify default values. If this is an option, do not
specify those values and instead allow MySQL to insert them. This method results in shorter
SQL statements, which means that the server must do less processing on each statement.

When possible, use the DELAYED option in your INSERT statements. When you specify the
DELAYED option in an INSERT statement, the execution of that statement is delayed until no
other client connections are accessing the same table that the INSERT statement is accessing.
You can continue to take other actions while the INSERT statement is in queue.

Optimizing Data Modification and Deletion

Use transaction...
Specify data types that have the correct length. When specifying column types, do not specify

types with lengths greater than what you need. For example, if you are defining a numerical
column, don’t use an INT data type if a SMALLINT data type will do. The smaller the column,
the quicker that MySQL can process values used in computations. In addition, the smaller the
columns, the smaller the indexes and the more data that can be held in memory

Define your columns as NOT NULL when appropriate. Whenever you can define a column as
NOT NULL, you should do so. Columns that permit NULL values take longer to process than
those that do not. If you have a column in which values are often not known, you can still
define the column as NOT NULL, but you can also define a default value for that column, such as
Unknown.

Consider defining your columns with the ENUM data type. An ENUM data type allows you to
specify the values that are permitted in a string column. In some cases, you know exactly what
values can be inserted in a column, there are relatively few of those values, and the values will
seldom change, if ever. In cases such as this, you should use the ENUM data type. Because ENUM
values are represented internally as numerical values,
MySQL can process them much more
quickly than a regular string value.

you should also use the OPTIMIZE TABLE statement to defragment some of your tables once they are created.  When two SQL statements have different syntax, but will predictably and regularly produce the same outputs, they are known as transforms of one another. Most good DBMSs do this sort of thing automatically. But some DBMSs won't try transforms when the expression contains multiple parentheses and NOTs. For example, this

SELECT statement can be slow:

SELECT * FROM Table1
WHERE column1 = 5 AND
NOT (column3 = 7 OR column1 = column2)

Applying the transforms ourselves, we came up with this statement:
SELECT * FROM Table1
WHERE column1 = 5
AND column3 <> 7
AND column2 <> 5
GAIN: 5/8

Exercise time: The MySQL online documentation has this example:

... WHERE a < b AND b = c AND a = 5
transforms to:

... WHERE b > 5 AND b = c AND a = 5

... WHERE a - 3 = 5

to:

... WHERE a = 8 /* a - 3 = 5 */

GAIN: 6/8

When there's arithmetic involved, though, only some DBMSs will transpose. For example, we tested
this transform:
... WHERE column1 - 3 = -column2
transforms to:
... WHERE column1 = -column2 + 3
GAIN: 4/8

No comments:

Post a Comment