TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT maximum storage sizes

Question

Per the MySQL docs, there are four TEXT types: TINYTEXT, TEXT, MEDIUMTEXT, and LONGTEXT. What is the maximum length that I can store in a column of each data type, assuming the character encoding is UTF-8?

User · Accepted Answer

From the documentation (MySQL 8):

      Type | Maximum length
-----------+----------------------------------------
  TINYTEXT |           255 (2^8 - 1) bytes
      TEXT |        65,535 (2^16 - 1) bytes = 64 KiB
MEDIUMTEXT |    16,777,215 (2^24 - 1) bytes = 16 MiB
  LONGTEXT | 4,294,967,295 (2^32 - 1) bytes =  4 GiB

Note that the number of characters that can be stored in your column will depend on the character encoding.
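Each limit corresponds to the largest value that fits in the type's length prefix (1, 2, 3, or 4 bytes). The following is a minimal Python sketch of that arithmetic, not anything MySQL itself exposes: `PREFIX_BYTES`, `MAX_BYTES`, and `fits` are illustrative names, and the check assumes the column stores the string as UTF-8.

```python
# Length-prefix size in bytes for each TEXT type.
PREFIX_BYTES = {"TINYTEXT": 1, "TEXT": 2, "MEDIUMTEXT": 3, "LONGTEXT": 4}

# Maximum payload in bytes: the largest value an n-byte prefix can hold.
MAX_BYTES = {name: 2 ** (8 * n) - 1 for name, n in PREFIX_BYTES.items()}


def fits(value: str, text_type: str) -> bool:
    """Return True if `value`, encoded as UTF-8, fits in the given TEXT type."""
    return len(value.encode("utf-8")) <= MAX_BYTES[text_type]


for name, limit in MAX_BYTES.items():
    print(f"{name:>10}: {limit:,} bytes")

# 255 ASCII characters occupy 255 bytes.
print(fits("a" * 255, "TINYTEXT"))
# 128 copies of U+00E9 occupy 256 bytes, because each takes two bytes in UTF-8.
print(fits("\u00e9" * 128, "TINYTEXT"))
```

The second check fails even though the string has only 128 characters, which is the point of the note above: the limit is counted in bytes, not characters.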

User · Answer

Rising to Ankan-Zerob's challenge, this is my estimate of the maximum length which can be stored in each text type, measured in words:

      Type |         Bytes | English words | Multi-byte words
-----------+---------------+---------------+-----------------
  TINYTEXT |           255 |            44 |               23
      TEXT |        65,535 |        11,000 |            5,900
MEDIUMTEXT |    16,777,215 |     2,800,000 |        1,500,000
  LONGTEXT | 4,294,967,295 |   740,000,000 |      380,000,000

In English, 4.8 letters per word is probably a good average (e.g. norvig.com/mayzner.html), though word lengths will vary according to domain (e.g. spoken language vs. academic papers), so there's no point being too precise. English is mostly single-byte ASCII characters, with very occasional multi-byte characters, so close to one byte per letter. An extra character has to be allowed for inter-word spaces, so I've rounded down from 5.8 bytes per word. Languages with lots of accents, such as Polish, would store slightly fewer words, as would e.g. German with its longer words.

Languages requiring multi-byte characters, such as Greek, Arabic, Hebrew, Hindi, Thai, etc., typically require two bytes per character in UTF-8. Guessing wildly at 5 letters per word, I've rounded down from 11 bytes per word.

CJK scripts (Hanzi, Kanji, Hiragana, Katakana, etc.) I know nothing of. I believe characters mostly require 3 bytes in UTF-8, and (with massive simplification) they might be considered to use around 2 characters per word, so they would be somewhere between the other two. (CJK scripts are likely to require less storage using UTF-16, depending.)

This is of course ignoring storage overheads etc.
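The word counts above are a plain division of the byte limit by an assumed bytes-per-word figure. A rough Python sketch of that estimate follows; the two constants come from the answer's own guesses (4.8 letters plus a space, and 5 two-byte letters plus a space), and `estimate_words` is just an illustrative helper name.

```python
MAX_BYTES = {
    "TINYTEXT": 255,
    "TEXT": 65_535,
    "MEDIUMTEXT": 16_777_215,
    "LONGTEXT": 4_294_967_295,
}

# Assumptions taken from the answer, not measured values:
ENGLISH_BYTES_PER_WORD = 5.8     # 4.8 one-byte letters + 1 space
MULTIBYTE_BYTES_PER_WORD = 11.0  # 5 two-byte letters + 1 space


def estimate_words(max_bytes: int, bytes_per_word: float) -> float:
    """Rough number of words that fit in `max_bytes`, ignoring storage overhead."""
    return max_bytes / bytes_per_word


for name, limit in MAX_BYTES.items():
    english = estimate_words(limit, ENGLISH_BYTES_PER_WORD)
    multibyte = estimate_words(limit, MULTIBYTE_BYTES_PER_WORD)
    print(f"{name:>10}: ~{english:,.0f} English words, ~{multibyte:,.0f} multi-byte words")
```

Swapping in a different bytes-per-word figure (for example a guess for a CJK script) only requires changing the second argument.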

User · Answer

This is nice but doesn't answer the question.

"A VARCHAR should always be used instead of TINYTEXT." Tinytext is useful if you have wide rows, since the data is stored off the record. There is a performance overhead, but it does have a use.

User · Answer

Expansion of the same answer.

This SO post outlines in detail the overheads and storage mechanisms. As noted from point 1, a VARCHAR should always be used instead of TINYTEXT. However, when using VARCHAR, the max row size should not exceed 65,535 bytes. As outlined here: http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-utf8.html (max 3 bytes per character for MySQL's utf8).

THIS IS A ROUGH ESTIMATION TABLE FOR QUICK DECISIONS!

So the assumptions range from worst case (3 bytes per UTF-8 char) to best case (1 byte per UTF-8 char). Assuming the English language has an average of 4.5 letters per word, and x is the number of bytes allocated:

      Type | A = worst case (x/3) | B = best case (x) | words estimate (A/4.5) - (B/4.5)
-----------+----------------------+-------------------+----------------------------------
  TINYTEXT |                   85 |               255 | 18 - 56
      TEXT |               21,845 |            65,535 | 4,854.44 - 14,563.33
MEDIUMTEXT |            5,592,405 |        16,777,215 | 1,242,756.6 - 3,728,270
  LONGTEXT |        1,431,655,765 |     4,294,967,295 | 318,145,725.5 - 954,437,176.6

Please refer to Chris V's answer as well: https://stackoverflow.com/a/35785869/1881812
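The worst-case and best-case columns can be recomputed with a few lines of Python. This is only a sketch of the arithmetic described above: `char_range` and `word_range` are hypothetical helper names, and the `worst_bytes_per_char` parameter is an addition so the same estimate can be tried with 4 bytes per character, which is the maximum for MySQL's utf8mb4.

```python
MAX_BYTES = {
    "TINYTEXT": 255,
    "TEXT": 65_535,
    "MEDIUMTEXT": 16_777_215,
    "LONGTEXT": 4_294_967_295,
}

LETTERS_PER_WORD = 4.5  # the answer's assumed English average


def char_range(max_bytes: int, worst_bytes_per_char: int = 3) -> tuple[int, int]:
    """Return (worst case, best case) character counts for a byte limit.

    Worst case: every character needs `worst_bytes_per_char` bytes.
    Best case: every character needs a single byte.
    """
    return max_bytes // worst_bytes_per_char, max_bytes


def word_range(max_bytes: int, worst_bytes_per_char: int = 3) -> tuple[float, float]:
    """Return (worst case, best case) word estimates for a byte limit."""
    worst_chars, best_chars = char_range(max_bytes, worst_bytes_per_char)
    return worst_chars / LETTERS_PER_WORD, best_chars / LETTERS_PER_WORD


for name, limit in MAX_BYTES.items():
    worst_chars, best_chars = char_range(limit)
    worst_words, best_words = word_range(limit)
    print(f"{name:>10}: {worst_chars:,} - {best_chars:,} chars, "
          f"{worst_words:,.1f} - {best_words:,.1f} words")
```

Calling `char_range(limit, worst_bytes_per_char=4)` gives the more pessimistic figure for a column that may hold 4-byte characters.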
