[mysql] Is there a REAL performance difference between INT and VARCHAR primary keys?

I faced the same dilemma. I made a DW (Constellation schema) with 3 fact tables, Road Accidents, Vehicles in Accidents and Casualties in Accidents. Data includes all accidents recorded in UK from 1979 to 2012, and 60 dimension tables. All together, about 20 million records.

Fact tables relationships:

+----------+          +---------+
| Accident |>--------<| Vehicle |
+-----v----+ 1      * +----v----+
     1|                    |1
      |    +----------+    |
      +---<| Casualty |>---+
         * +----------+ *

RDMS: MySQL 5.6

Natively the Accident index is a varchar(numbers and letters), with 15 digits. I tried not to have surrogate keys, once the accident indexes would never change. In a i7(8 cores) computer, the DW became too slow to query after 12 million records of load depending of the dimensions. After a lot of re-work and adding bigint surrogate keys I got a average 20% speed performance boost. Yet to low performance gain, but valid try. Im working in MySQL tuning and clustering.

Examples related to mysql

Implement specialization in ER diagram How to post query parameters with Axios? PHP with MySQL 8.0+ error: The server requested authentication method unknown to the client Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver' phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' is not supported How to resolve Unable to load authentication plugin 'caching_sha2_password' issue Connection Java-MySql : Public Key Retrieval is not allowed How to grant all privileges to root user in MySQL 8.0 MySQL 8.0 - Client does not support authentication protocol requested by server; consider upgrading MySQL client

Examples related to performance

Why is 2 * (i * i) faster than 2 * i * i in Java? What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism? How to check if a key exists in Json Object and get its value Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? Most efficient way to map function over numpy array The most efficient way to remove first N elements in a list? Fastest way to get the first n elements of a List into an Array Why is "1000000000000000 in range(1000000000000001)" so fast in Python 3? pandas loc vs. iloc vs. at vs. iat? Android Recyclerview vs ListView with Viewholder

Examples related to primary-key

Violation of PRIMARY KEY constraint. Cannot insert duplicate key in object What is Hash and Range Primary Key? Can I use VARCHAR as the PRIMARY KEY? Can a foreign key refer to a primary key in the same table? UUID max character length MySQL duplicate entry error even though there is no duplicate entry Creating composite primary key in SQL Server What are the best practices for using a GUID as a primary key, specifically regarding performance? Add primary key to existing table Create view with primary key?

Examples related to innodb

How can I rebuild indexes and update stats in MySQL innoDB? Database corruption with MariaDB : Table doesn't exist in engine How to regex in a MySQL query mysqldump exports only one table TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT maximum storage sizes What's the difference between MyISAM and InnoDB? Why is MySQL InnoDB insert so slow? How to debug Lock wait timeout exceeded on MySQL? How to change value for innodb_buffer_pool_size in MySQL on Mac OS? #1025 - Error on rename of './database/#sql-2e0f_1254ba7' to './database/table' (errno: 150)

Examples related to myisam

What's the difference between MyISAM and InnoDB? How to test an SQL Update statement before running it? How to properly create composite primary keys - MYSQL Is there a REAL performance difference between INT and VARCHAR primary keys? How can I check MySQL engine type for a specific table? MyISAM versus InnoDB