[mysql] How do I add indices to MySQL tables?

I've got a very large MySQL table with about 150,000 rows of data. Currently, when I try and run

SELECT * FROM table WHERE id = '1';

the code runs fine as the ID field is the primary index. However, for a recent development in the project, I have to search the database by another field. For example:

SELECT * FROM table WHERE product_id = '1';

This field was not previously indexed; however, I've added one, so mysql now indexes the field, but when I try to run the above query, it runs very slowly. An EXPLAIN query reveals that there is no index for the product_id field when I've already added one, and as a result the query takes any where from 20 minutes to 30 minutes to return a single row.

My full EXPLAIN results are:

| id | select_type | table | type | possible_keys| key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------+------+--------------+------+---------+------+-------+------------------+
|  1 | SIMPLE      | table | ALL  | NULL         | NULL | NULL    | NULL |157211 | Using where |
+----+-------------+-------+------+--------------+------+---------+------+-------+------------------+

It might be helpful to note that I've just taken a look, and ID field is stored as INT whereas the PRODUCT_ID field is stored as VARCHAR. Could this be the source of the problem?

This question is related to mysql optimization indexing row

The answer is


You say you have an index, the explain says otherwise. However, if you really do, this is how to continue:

If you have an index on the column, and MySQL decides not to use it, it may by because:

  1. There's another index in the query MySQL deems more appropriate to use, and it can use only one. The solution is usually an index spanning multiple columns if their normal method of retrieval is by value of more then one column.
  2. MySQL decides there are to many matching rows, and thinks a tablescan is probably faster. If that isn't the case, sometimes an ANALYZE TABLE helps.
  3. In more complex queries, it decides not to use it based on extremely intelligent thought-out voodoo in the query-plan that for some reason just not fits your current requirements.

In the case of (2) or (3), you could coax MySQL into using the index by index hint sytax, but if you do, be sure run some tests to determine whether it actually improves performance to use the index as you hint it.


A better option is to add the constraints directly during CREATE TABLE query (assuming you have the information about the tables)

CREATE TABLE products(
    productId INT AUTO_INCREMENT PRIMARY KEY,
    productName varchar(100) not null,
    categoryId INT NOT NULL,
    CONSTRAINT fk_category
    FOREIGN KEY (categoryId) 
    REFERENCES categories(categoryId)
        ON UPDATE CASCADE
        ON DELETE CASCADE
) ENGINE=INNODB;

You can use this syntax to add an index and control the kind of index (HASH or BTREE).

create index your_index_name on your_table_name(your_column_name) using HASH;
or
create index your_index_name on your_table_name(your_column_name) using BTREE;

You can learn about differences between BTREE and HASH indexes here: http://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html


Indexes of two types can be added: when you define a primary key, MySQL will take it as index by default.

Explanation

Primary key as index

Consider you have a tbl_student table and you want student_id as primary key:

ALTER TABLE `tbl_student` ADD PRIMARY KEY (`student_id`)

Above statement adds a primary key, which means that indexed values must be unique and cannot be NULL.

Specify index name

ALTER TABLE `tbl_student` ADD INDEX student_index (`student_id`)

Above statement will create an ordinary index with student_index name.

Create unique index

ALTER TABLE `tbl_student` ADD UNIQUE student_unique_index (`student_id`)

Here, student_unique_index is the index name assigned to student_id and creates an index for which values must be unique (here null can be accepted).

Fulltext option

ALTER TABLE `tbl_student` ADD FULLTEXT student_fulltext_index (`student_id`)

Above statement will create the Fulltext index name with student_fulltext_index, for which you need MyISAM Mysql Engine.

How to remove indexes ?

DROP INDEX `student_index` ON `tbl_student`

How to check available indexes?

SHOW INDEX FROM `tbl_student`

ALTER TABLE TABLE_NAME ADD INDEX (COLUMN_NAME);

It's worth noting that multiple field indexes can drastically improve your query performance. So in the above example we assume ProductID is the only field to lookup but were the query to say ProductID = 1 AND Category = 7 then a multiple column index helps. This is achieved with the following:

ALTER TABLE `table` ADD INDEX `index_name` (`col1`,`col2`)

Additionally the index should match the order of the query fields. In my extended example the index should be (ProductID,Category) not the other way around.


ALTER TABLE `table` ADD INDEX `product_id_index` (`product_id`)

Never compare integer to strings in MySQL. If id is int, remove the quotes.


Examples related to mysql

Implement specialization in ER diagram How to post query parameters with Axios? PHP with MySQL 8.0+ error: The server requested authentication method unknown to the client Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver' phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' is not supported How to resolve Unable to load authentication plugin 'caching_sha2_password' issue Connection Java-MySql : Public Key Retrieval is not allowed How to grant all privileges to root user in MySQL 8.0 MySQL 8.0 - Client does not support authentication protocol requested by server; consider upgrading MySQL client

Examples related to optimization

Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? Measuring execution time of a function in C++ GROUP BY having MAX date How to efficiently remove duplicates from an array without using Set Storing JSON in database vs. having a new column for each key Read file As String How to write a large buffer into a binary file in C++, fast? Is optimisation level -O3 dangerous in g++? Why is processing a sorted array faster than processing an unsorted array? MySQL my.cnf performance tuning recommendations

Examples related to indexing

numpy array TypeError: only integer scalar arrays can be converted to a scalar index How to print a specific row of a pandas DataFrame? What does 'index 0 is out of bounds for axis 0 with size 0' mean? How does String.Index work in Swift Pandas KeyError: value not in index Update row values where certain condition is met in pandas Pandas split DataFrame by column value Rebuild all indexes in a Database How are iloc and loc different? pandas loc vs. iloc vs. at vs. iat?

Examples related to row

Python Pandas Replacing Header with Top Row Return row number(s) for a particular value in a column in a dataframe How to insert a row in an HTML table body in JavaScript Referencing Row Number in R Python pandas: fill a dataframe row by row MS SQL 2008 - get all table names and their row counts in a DB How do I delete rows in a data frame? Remove table row after clicking table row delete button Javascript get the text value of a column from a particular row of an html table How to add a new row to datagridview programmatically