StudentCodingHUB

Use programming to create innovative things.
  • new post

    Thursday, 7 May 2015

    Query Optimizer

    Query Optimizer

    In order to ensure that your SELECT statements retrieve data as quickly as possible, you can ensure
    that your tables have been properly indexed

    SELECT PartName FROM Parts
    WHERE ManfID=’jkl123’;


    This query search the total table   , if  the values are sorted in ascending order , then it can find the ManfID value of jkl123 much faster .

    Searches are faster in indexes because indexes are sorted
    Identical values are grouped together and organized in an easy-to-locate order.

    Index columns that appear in search conditions. As a general rule, you should consider defining
    an index on any column that you commonly use in WHERE, GROUP BY, or HAVING clauses.

    Because these columns define the limitations of a query, they are good candidates for improving
    performance because they allow MySQL to identify quickly which rows should be included in a
    search and which should not.

    Index columns that appear in join conditions. Index any columns that appear in a join condition.
    Because join conditions are often based on foreign key columns that reference primary key
    columns, MySQL creates the indexes automatically when you define the primary keys and foreign keys.

    Do not index columns that appear only in the SELECT clause. If a column appears in the
    SELECT clause of a SELECT statement, but does not appear in WHERE, GROUP BY, or HAVING
    clauses, you usually shouldn’t index these columns because indexing them provides no performance
    benefit but does require additional storage. Indexing columns in the SELECT clause provides
    no benefit because the SELECT clause is one of the last parts of a SELECT statement to be
    processed. MySQL conducts searches based on the other clauses. After MySQL identifies which
    rows to return, it then consults the SELECT clause to determine which columns from the identified
    rows to return.

    Do not index columns that contain only a few different values. If a column contains many
    duplicated values, indexing that column provides little benefit

    Specify prefixes for indexes on columns that contain large string values. If you’re adding an
    index to a string column, consider defining a prefix on that index so that your index includes
    only part of the entire values, as they’re stored in the table. 
    For example, if your table includes a  CHAR(150) column, you might consider indexing only the first 10 or 15 bytes, or whatever number provides enough unique values without having to store the entire values in the index.

    Create only the indexes that you need. Never create more indexes than you need. If a column is
    rarely used in a search or join condition, don’t index that column. You want to index only those
    columns that are frequently used
    to identify the rows being searched.

    MySQL supports five types of indexes:

    ❑ Primary key: Requires that each value or set of values be unique in the columns on which
    the primary key is defined. In addition, NULL values are not allowed. A table can include only
    one primary key.

    ❑ Foreign key: Enforces the relationship between the referencing columns in the child table where
    the foreign key is defined and the referenced columns in the parent table.

    ❑ Regular: Provides a basic index that permits duplicate values and NULL values in the columns
    on which the index is defined.

    ❑ Unique: Requires that each value or set of values be unique in the columns on which the index
    is defined. Unlike primary key indexes, NULL values are allowed.

    ❑ Full-text: Supports full-text searches of the values in the columns on which the index is defined.
    A full-text index permits duplicate values and NULL values in those columns. A full-text index
    can be defined only on MyISAM tables and only on CHAR, VARCHAR, and TEXT columns.

    EXPLAIN Statement
    EXPLAIN <select statement>
    You should include your SELECT statement after the EXPLAIN keyword exactly as you would use the  SELECT statement in a query. The EXPLAIN statement then returns results that provide details about how the SELECT statement will be executed. From these details, you can determine whether indexes are being used effectively, whether you should add new indexes, or whether you should specify the order of how tables are joined together.


    EXPLAIN SELECT PartName, ManfName
    FROM Parts AS p, Manufacturers as m
    WHERE p.ManfID = m.ManfID
    ORDER BY PartName;


    suppose that you have a column that permits only three values. If you create an index on the column, the cardinality for that index is 3, no matter how many rows are in that table. In general, MySQL does not use an index with a low cardinality because this is not an efficient use of indexes. As a result, when you add an index to a table or modify it significantly in any other way, you should ensure that the cardinality is correctly represented to the query optimizer. The easiest way to do this is to execute an OPTIMIZE TABLE statement.

    Using the OPTIMIZE TABLE Statement  the OPTIMIZE TABLE statement performs a number of functions. For example, it defragments the table and sorts the table’s indexes. It also updates the internal table statistics. One of these statistics is the cardinality of its indexes. If you add an index to an existing table, you might have to use the OPTIMIZE TABLE statement to ensure that the table statistics are accurate when read by the query optimizer. The following syntax shows how to create an OPTIMIZE TABLE statement:

    OPTIMIZE [LOCAL | NO_WRITE_TO_BINLOG] TABLE <table name> [{, <table name>}...]
    OPTIMIZE TABLE Parts;

    Do not use unnecessary wildcards in LIKE clauses
    MySQL cannot use an index on a column if  that column appears as an argument in a function or an arithmetic expression. For example, suppose your WHERE clause includes the expression YEAR(DateJoined)>1999, where DateJoined is a column that contains date values. If DateJoined is an indexed column, you might want to rewrite the WHERE clause to something similar to the following: DateJoined>’1999-12-31’.

    Turn subqueries into joins.
    Try using the FORCE INDEX clause.
    changing the order of tables specified in a join condition.
    Try using the STRAIGHT_JOIN option. When the query optimizer analyzes a SELECT statement, it
    determines the order in which tables will be joined. In some cases, you might find that forcing the
    optimizer to join tables in the order specified in the SELECT statement improves the statement’s
    performance. This occurs because there are times when the query optimizer does not join tables in
    the most optimal order. As a result, more rows are examined than need to be examined in order to
    perform an effective join operation. By forcing the join order, you can sometimes see an improvement in performance because fewer rows are being searched. To force the order, you can add the STRAIGHT_JOIN option to your SELECT statement.

    How to load data from text file..?

    Optimizing Data Insertion

    Use a LOAD DATA statement rather than an INSERT statement. Whenever possible, use a LOAD
    DATA statement to insert data from a text file
    , rather than use an INSERT statement. MySQL can
    add data in a database up to 20 times faster when using a LOAD DATA statement, as compared to
    using an INSERT statement.

    Use INSERT statements with multiple VALUES clauses. When using INSERT statements to add
    multiple rows in a table, you can use one of two methods to insert that data. The first is to create
    an INSERT statement for each row of data, and the second is to create one INSERT statement
    that contains multiple VALUES clauses. Using the second option is much faster because MySQL
    must process only one SQL statement rather than many statements, and any related indexes must
    be flushed only once,

    When using multiple INSERT statements, group them together in a transaction. There will be
    times when you must use multiple INSERT statements, such as when you’re inserting data in
    multiple tables. In that case, you should isolate your INSERT statement in a transaction. This
    process reduces the number of times that the index must be flushed.

    Let MySQL insert default values. When using INSERT statements to add data to a table, you
    can often insert the data without having to specify default values. If this is an option, do not
    specify those values and instead allow MySQL to insert them. This method results in shorter
    SQL statements, which means that the server must do less processing on each statement.

    When possible, use the DELAYED option in your INSERT statements. When you specify the
    DELAYED option in an INSERT statement, the execution of that statement is delayed until no
    other client connections are accessing the same table that the INSERT statement is accessing.
    You can continue to take other actions while the INSERT statement is in queue.

    Optimizing Data Modification and Deletion

    Use transaction...
    Specify data types that have the correct length. When specifying column types, do not specify

    types with lengths greater than what you need. For example, if you are defining a numerical
    column, don’t use an INT data type if a SMALLINT data type will do. The smaller the column,
    the quicker that MySQL can process values used in computations. In addition, the smaller the
    columns, the smaller the indexes and the more data that can be held in memory

    Define your columns as NOT NULL when appropriate. Whenever you can define a column as
    NOT NULL, you should do so. Columns that permit NULL values take longer to process than
    those that do not. If you have a column in which values are often not known, you can still
    define the column as NOT NULL, but you can also define a default value for that column, such as
    Unknown.

    Consider defining your columns with the ENUM data type. An ENUM data type allows you to
    specify the values that are permitted in a string column. In some cases, you know exactly what
    values can be inserted in a column, there are relatively few of those values, and the values will
    seldom change, if ever. In cases such as this, you should use the ENUM data type. Because ENUM
    values are represented internally as numerical values,
    MySQL can process them much more
    quickly than a regular string value.

    you should also use the OPTIMIZE TABLE statement to defragment some of your tables once they are created.  When two SQL statements have different syntax, but will predictably and regularly produce the same outputs, they are known as transforms of one another. Most good DBMSs do this sort of thing automatically. But some DBMSs won't try transforms when the expression contains multiple parentheses and NOTs. For example, this

    SELECT statement can be slow:

    SELECT * FROM Table1
    WHERE column1 = 5 AND
    NOT (column3 = 7 OR column1 = column2)

    Applying the transforms ourselves, we came up with this statement:
    SELECT * FROM Table1
    WHERE column1 = 5
    AND column3 <> 7
    AND column2 <> 5
    GAIN: 5/8

    Exercise time: The MySQL online documentation has this example:

    ... WHERE a < b AND b = c AND a = 5
    transforms to:

    ... WHERE b > 5 AND b = c AND a = 5

    ... WHERE a - 3 = 5

    to:

    ... WHERE a = 8 /* a - 3 = 5 */

    GAIN: 6/8

    When there's arithmetic involved, though, only some DBMSs will transpose. For example, we tested
    this transform:
    ... WHERE column1 - 3 = -column2
    transforms to:
    ... WHERE column1 = -column2 + 3
    GAIN: 4/8

    No comments:

    Post a Comment