How to do a regular expression replace in MySQL

Question

I have a table with ~500k rows; a varchar(255) UTF8 column, filename, contains a file name.

I'm trying to strip various strange characters out of the filename and thought I'd use a character class: [^a-zA-Z0-9()_ .\-]

Now, is there a function in MySQL that lets you replace through a regular expression? I'm looking for functionality similar to the REPLACE() function. A simplified example follows:

SELECT REPLACE('stackowerflow', 'ower', 'over');

Output: "stackoverflow"

/* does something like this exist? */
SELECT X_REG_REPLACE('Stackoverflow','/[A-Zf]/','-');

Output: "-tackover-low"

I know about REGEXP/RLIKE, but those only check if there is a match, not what the match is.

(I could do a "SELECT pkey_id,filename FROM foo WHERE filename RLIKE '[^a-zA-Z0-9()_ .\-]'" from a PHP script, do a preg_replace and then "UPDATE foo ... WHERE pkey_id=...", but that looks like a last-resort slow & ugly hack.)

User · Answer

We can use an IF() condition in a SELECT query, as below:

Suppose that for any value of the form "ABC1", "ABC2", "ABC3", ..., we want to return "ABC" instead. Combining REGEXP with IF() in the SELECT query achieves this.

Syntax:

SELECT IF(column_name REGEXP 'ABC[0-9]$','ABC',column_name)
FROM table1 
WHERE column_name LIKE 'ABC%';

Example:

SELECT IF('ABC1' REGEXP 'ABC[0-9]$','ABC','ABC1');
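A small sketch of how this behaves on a few literal values (the derived table and its sample values are made up for illustration and are not part of the original answer). Note that IF() swaps the whole value rather than only the matched part, and that the pattern is not anchored at the start:

```sql
SELECT v,
       IF(v REGEXP 'ABC[0-9]$', 'ABC', v) AS replaced
FROM (
    SELECT 'ABC1' AS v        -- matches: whole value becomes 'ABC'
    UNION ALL SELECT 'ABC'    -- no trailing digit: left unchanged
    UNION ALL SELECT 'XABC7'  -- also matches, because the pattern has no leading ^
) AS t;
```

If only values that consist of exactly "ABC" plus one digit should be rewritten, anchoring the pattern as '^ABC[0-9]$' is probably what is wanted.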

User · Answer

I think there is an easy way to achieve this, and it's working fine for me.

To SELECT rows using REGEX:

SELECT * FROM `table_name` WHERE `column_name_to_find` REGEXP 'string-to-find'

To UPDATE rows using REGEX (REGEXP_REPLACE requires MySQL 8.0 or MariaDB):

UPDATE `table_name` SET column_name_to_find=REGEXP_REPLACE(column_name_to_find, 'string-to-find', 'string-to-replace') WHERE column_name_to_find REGEXP 'string-to-find'

REGEXP Reference: https://www.geeksforgeeks.org/mysql-regular-expressions-regexp/

User · Answer

You "can" do it ... but it's not very wise ... this is about as daring as I'll try ... as far as full RegEx support goes, you're much better off using perl or the like.

UPDATE db.tbl
SET column =
CASE
WHEN column REGEXP '[[:<:]]WORD_TO_REPLACE[[:>:]]'
THEN REPLACE(column,'WORD_TO_REPLACE','REPLACEMENT')
END
WHERE column REGEXP '[[:<:]]WORD_TO_REPLACE[[:>:]]'
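Two caveats, offered as a hedged aside. REPLACE() in the statement above substitutes every occurrence of the substring in a matching row, including occurrences inside longer words. Also, the [[:<:]] and [[:>:]] word-boundary markers belong to the older Henry Spencer regex library; MySQL 8.0 uses ICU, where a word boundary is written \b. A minimal sketch with made-up sample text, assuming MySQL 8.0 and the default SQL mode (backslash escapes enabled):

```sql
-- MySQL 8.0+ only. \b is the ICU word-boundary marker; the backslash is
-- doubled because the pattern sits inside a SQL string literal.
SELECT REGEXP_REPLACE('the cat sat on the catalog', '\\bcat\\b', 'dog') AS result;
-- result: the dog sat on the catalog
```

Here only the standalone word is replaced; "catalog" is left alone, which plain REPLACE() would not guarantee.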

User · Answer

If you are using MariaDB or MySQL 8.0, they have a function REGEXP_REPLACE(col, regexp, replace).

See MariaDB docs and PCRE Regular expression enhancements.

Note that you can use regexp grouping as well (I found that very useful):

SELECT REGEXP_REPLACE("stackoverflow", "(stack)(over)(flow)", '\\2 - \\1 - \\3')

returns

over - stack - flow
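The backslash-style group references in the example above are the MariaDB (PCRE) form. MySQL 8.0 builds its regex support on ICU, where capture groups in the replacement string are referenced as $1, $2 and so on. A minimal sketch of the same reordering, assuming MySQL 8.0:

```sql
-- MySQL 8.0+ (ICU): capture groups are referenced as $1, $2, ...
SELECT REGEXP_REPLACE('stackoverflow', '(stack)(over)(flow)', '$2 - $1 - $3') AS result;
-- result: over - stack - flow
```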

User · Answer

My brute force method to get this to work was just:

1. Dump the table - mysqldump -u user -p database table > dump.sql
2. Find and replace a couple of patterns - find /path/to/dump.sql -type f -exec sed -i 's/old_string/new_string/g' {} \; There are obviously other perl regular expressions you could perform on the file as well.
3. Import the table - mysql -u user -p database < dump.sql

If you want to make sure the string isn't elsewhere in your dataset, run a few regular expressions to make sure they all occur in a similar environment. It's also not that tough to create a backup before you run a replace, in case you accidentally destroy something and lose information.

User · Answer

I'm happy to report that since this question was asked, there is now a satisfactory answer! Take a look at this terrific package:

https://github.com/mysqludf/lib_mysqludf_preg

Sample SQL:

SELECT PREG_REPLACE('/(.*?)(fox)/' , 'dog' , 'the quick brown fox' ) AS demo;

I found the package from this blog post as linked on this question.

User · Answer

With MySQL 8.0+ you can use the native REGEXP_REPLACE function.

12.5.2 Regular Expressions:

REGEXP_REPLACE(expr, pat, repl[, pos[, occurrence[, match_type]]])

Replaces occurrences in the string expr that match the regular expression specified by the pattern pat with the replacement string repl, and returns the resulting string. If expr, pat, or repl is NULL, the return value is NULL.

and Regular expression support:

Previously, MySQL used the Henry Spencer regular expression library to support regular expression operators (REGEXP, RLIKE).

Regular expression support has been reimplemented using International Components for Unicode (ICU), which provides full Unicode support and is multibyte safe. The REGEXP_LIKE() function performs regular expression matching in the manner of the REGEXP and RLIKE operators, which now are synonyms for that function. In addition, the REGEXP_INSTR(), REGEXP_REPLACE(), and REGEXP_SUBSTR() functions are available to find match positions and perform substring substitution and extraction, respectively.

SELECT REGEXP_REPLACE('Stackoverflow','[A-Zf]','-',1,0,'c');
-- Output:
-tackover-low

DBFiddle Demo
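To make the optional pos and occurrence arguments concrete, here is a short sketch assuming MySQL 8.0. The final UPDATE is a hypothetical application to the foo table and filename column from the question; its character class is a simplified stand-in for the asker's, so adjust it before use and try it on a copy of the data first.

```sql
-- With no optional arguments, every match is replaced.
SELECT REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X') AS all_matches;
-- all_matches: X X X

-- pos = 1 starts the search at the first character; occurrence = 3
-- replaces only the third match.
SELECT REGEXP_REPLACE('abc def ghi', '[a-z]+', 'X', 1, 3) AS third_only;
-- third_only: abc def X

-- Hypothetical cleanup of the question's table: delete every character
-- that is not a letter, digit, underscore, dot or hyphen.
-- The backslash is doubled because the pattern is a SQL string literal.
UPDATE foo
SET filename = REGEXP_REPLACE(filename, '[^a-zA-Z0-9_.\\-]', '')
WHERE filename REGEXP '[^a-zA-Z0-9_.\\-]';
```

The WHERE clause limits the update to rows that actually contain an unwanted character, so unaffected rows are not rewritten.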

User · Answer

The one below basically finds the first match from the left and then replaces all occurrences of it (tested in mysql-5.6).

Usage:

SELECT REGEX_REPLACE('dis ambiguity', 'dis[[:space:]]*ambiguity', 'disambiguity');

Implementation:

DELIMITER $$
CREATE FUNCTION REGEX_REPLACE(
  var_original VARCHAR(1000),
  var_pattern VARCHAR(1000),
  var_replacement VARCHAR(1000)
  ) RETURNS
    VARCHAR(1000)
  COMMENT 'Based on https://techras.wordpress.com/2011/06/02/regex-replace-for-mysql/'
BEGIN
  DECLARE var_replaced VARCHAR(1000) DEFAULT var_original;
  DECLARE var_leftmost_match VARCHAR(1000) DEFAULT
    REGEX_CAPTURE_LEFTMOST(var_original, var_pattern);
  WHILE var_leftmost_match IS NOT NULL DO
    IF var_replacement <> var_leftmost_match THEN
      SET var_replaced = REPLACE(var_replaced, var_leftmost_match, var_replacement);
      SET var_leftmost_match = REGEX_CAPTURE_LEFTMOST(var_replaced, var_pattern);
    ELSE
      SET var_leftmost_match = NULL;
    END IF;
  END WHILE;
  RETURN var_replaced;
END $$
DELIMITER ;

DELIMITER $$
CREATE FUNCTION REGEX_CAPTURE_LEFTMOST(
  var_original VARCHAR(1000),
  var_pattern VARCHAR(1000)
  ) RETURNS
    VARCHAR(1000)
  COMMENT '
  Captures the leftmost substring that matches the [var_pattern]
  IN [var_original], OR NULL if no match.
  '
BEGIN
  DECLARE var_temp_l VARCHAR(1000);
  DECLARE var_temp_r VARCHAR(1000);
  DECLARE var_left_trim_index INT;
  DECLARE var_right_trim_index INT;
  SET var_left_trim_index = 1;
  SET var_right_trim_index = 1;
  SET var_temp_l = '';
  SET var_temp_r = '';
  WHILE (CHAR_LENGTH(var_original) >= var_left_trim_index) DO
    SET var_temp_l = LEFT(var_original, var_left_trim_index);
    IF var_temp_l REGEXP var_pattern THEN
      WHILE (CHAR_LENGTH(var_temp_l) >= var_right_trim_index) DO
        SET var_temp_r = RIGHT(var_temp_l, var_right_trim_index);
        IF var_temp_r REGEXP var_pattern THEN
          RETURN var_temp_r;
        END IF;
        SET var_right_trim_index = var_right_trim_index + 1;
      END WHILE;
    END IF;
    SET var_left_trim_index = var_left_trim_index + 1;
  END WHILE;
  RETURN NULL;
END $$
DELIMITER ;

User · Answer

UPDATE 2: A useful set of regex functions including REGEXP_REPLACE have now been provided in MySQL 8.0. This renders reading on unnecessary unless you're constrained to using an earlier version.

UPDATE 1: Have now made this into a blog post: http://stevettt.blogspot.co.uk/2018/02/a-mysql-regular-expression-replace.html

The following expands upon the function provided by Rasika Godawatte but trawls through all necessary substrings rather than just testing single characters:

-- ------------------------------------------------------------------------------------
-- USAGE
-- ------------------------------------------------------------------------------------
-- SELECT reg_replace(<subject>,
--                    <pattern>,
--                    <replacement>,
--                    <greedy>,
--                    <minMatchLen>,
--                    <maxMatchLen>);
-- where:
-- <subject> is the string to look in for doing the replacements
-- <pattern> is the regular expression to match against
-- <replacement> is the replacement string
-- <greedy> is TRUE for greedy matching or FALSE for non-greedy matching
-- <minMatchLen> specifies the minimum match length
-- <maxMatchLen> specifies the maximum match length
-- (minMatchLen and maxMatchLen are used to improve efficiency but are
--  optional and can be set to 0 or NULL if not known/required)
-- Example:
-- SELECT reg_replace(txt, '^[Tt][^ ]* ', 'a', TRUE, 2, 0) FROM tbl;
DROP FUNCTION IF EXISTS reg_replace;
DELIMITER //
CREATE FUNCTION reg_replace(subject VARCHAR(21845), pattern VARCHAR(21845),
  replacement VARCHAR(21845), greedy BOOLEAN, minMatchLen INT, maxMatchLen INT)
RETURNS VARCHAR(21845) DETERMINISTIC BEGIN
  DECLARE result, subStr, usePattern VARCHAR(21845);
  DECLARE startPos, prevStartPos, startInc, len, lenInc INT;
  IF subject REGEXP pattern THEN
    SET result = '';
    -- Sanitize input parameter values
    SET minMatchLen = IF(minMatchLen < 1, 1, minMatchLen);
    SET maxMatchLen = IF(maxMatchLen < 1 OR maxMatchLen > CHAR_LENGTH(subject),
                         CHAR_LENGTH(subject), maxMatchLen);
    -- Set the pattern to use to match an entire string rather than part of a string
    SET usePattern = IF (LEFT(pattern, 1) = '^', pattern, CONCAT('^', pattern));
    SET usePattern = IF (RIGHT(pattern, 1) = '$', usePattern, CONCAT(usePattern, '$'));
    -- Set start position to 1 if pattern starts with ^ or doesn't end with $.
    IF LEFT(pattern, 1) = '^' OR RIGHT(pattern, 1) <> '$' THEN
      SET startPos = 1, startInc = 1;
    -- Otherwise (i.e. pattern ends with $ but doesn't start with ^): Set start pos
    -- to the min or max match length from the end (depending on "greedy" flag).
    ELSEIF greedy THEN
      SET startPos = CHAR_LENGTH(subject) - maxMatchLen + 1, startInc = 1;
    ELSE
      SET startPos = CHAR_LENGTH(subject) - minMatchLen + 1, startInc = -1;
    END IF;
    WHILE startPos >= 1 AND startPos <= CHAR_LENGTH(subject)
      AND startPos + minMatchLen - 1 <= CHAR_LENGTH(subject)
      AND NOT (LEFT(pattern, 1) = '^' AND startPos <> 1)
      AND NOT (RIGHT(pattern, 1) = '$'
               AND startPos + maxMatchLen - 1 < CHAR_LENGTH(subject)) DO
      -- Set start length to maximum if matching greedily or pattern ends with $.
      -- Otherwise set starting length to the minimum match length.
      IF greedy OR RIGHT(pattern, 1) = '$' THEN
        SET len = LEAST(CHAR_LENGTH(subject) - startPos + 1, maxMatchLen), lenInc = -1;
      ELSE
        SET len = minMatchLen, lenInc = 1;
      END IF;
      SET prevStartPos = startPos;
      lenLoop: WHILE len >= 1 AND len <= maxMatchLen
                 AND startPos + len - 1 <= CHAR_LENGTH(subject)
                 AND NOT (RIGHT(pattern, 1) = '$'
                          AND startPos + len - 1 <> CHAR_LENGTH(subject)) DO
        SET subStr = SUBSTRING(subject, startPos, len);
        IF subStr REGEXP usePattern THEN
          SET result = IF(startInc = 1,
                          CONCAT(result, replacement), CONCAT(replacement, result));
          SET startPos = startPos + startInc * len;
          LEAVE lenLoop;
        END IF;
        SET len = len + lenInc;
      END WHILE;
      IF (startPos = prevStartPos) THEN
        SET result = IF(startInc = 1, CONCAT(result, SUBSTRING(subject, startPos, 1)),
                        CONCAT(SUBSTRING(subject, startPos, 1), result));
        SET startPos = startPos + startInc;
      END IF;
    END WHILE;
    IF startInc = 1 AND startPos <= CHAR_LENGTH(subject) THEN
      SET result = CONCAT(result, RIGHT(subject, CHAR_LENGTH(subject) + 1 - startPos));
    ELSEIF startInc = -1 AND startPos >= 1 THEN
      SET result = CONCAT(LEFT(subject, startPos), result);
    END IF;
  ELSE
    SET result = subject;
  END IF;
  RETURN result;
END//
DELIMITER ;

Demo: Rextester Demo

Limitations

1. This method is of course going to take a while when the subject string is large. Update: Have now added minimum and maximum match length parameters for improved efficiency when these are known (zero = unknown/unlimited).
2. It won't allow substitution of backreferences (e.g. \1, \2 etc.) to replace capturing groups. If this functionality is needed, please see this answer which attempts to provide a workaround by updating the function to allow a secondary find and replace within each found match (at the expense of increased complexity).
3. If ^ and/or $ is used in the pattern, they must be at the very start and very end respectively - e.g. patterns such as (^start|end$) are not supported.
4. There is a "greedy" flag to specify whether the overall matching should be greedy or non-greedy. Combining greedy and lazy matching within a single regular expression (e.g. a.*?b.*) is not supported.

Usage Examples

The function has been used to answer the following StackOverflow questions:

How to count words in MySQL / regular expression replacer?
How to extract the nth word and count word occurrences in a MySQL string?
How to extract two consecutive digits from a text field in MySQL?
How to remove all non-alpha numeric characters from a string in MySQL?
How to replace every other instance of a particular character in a MySQL string?
How to get all distinct words of a specified minimum length from multiple columns in a MySQL table?

User · Answer

I recently wrote a MySQL function to replace strings using regular expressions. You can find my post at the following location:

http://techras.wordpress.com/2011/06/02/regex-replace-for-mysql/

Here is the function code. Note that it tests the pattern against one character at a time, so it only suits single-character patterns such as character classes.

DELIMITER $$

CREATE FUNCTION `regex_replace`(pattern VARCHAR(1000), replacement VARCHAR(1000), original VARCHAR(1000))
RETURNS VARCHAR(1000)
DETERMINISTIC
BEGIN
  DECLARE temp VARCHAR(1000);
  DECLARE ch VARCHAR(1);
  DECLARE i INT;
  SET i = 1;
  SET temp = '';
  IF original REGEXP pattern THEN
    loop_label: LOOP
      IF i > CHAR_LENGTH(original) THEN
        LEAVE loop_label;
      END IF;
      SET ch = SUBSTRING(original, i, 1);
      IF NOT ch REGEXP pattern THEN
        SET temp = CONCAT(temp, ch);
      ELSE
        SET temp = CONCAT(temp, replacement);
      END IF;
      SET i = i + 1;
    END LOOP;
  ELSE
    SET temp = original;
  END IF;
  RETURN temp;
END$$

DELIMITER ;

Example execution:

mysql> select regex_replace('[^a-zA-Z0-9\-]','','2my test3_text-to. check \\ my- sql (regular) ,expressions');

User · Answer

We can solve this problem without using regex. This query replaces only exact-match strings:

update employee set
employee_firstname =
trim(REPLACE(concat(" ",employee_firstname," "),' jay ',' abc '))

Example:

emp_id  employee_firstname
1       jay
2       jay ajay
3       jay

After executing the query, the result is:

emp_id  employee_firstname
1       abc
2       abc ajay
3       abc
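The padding trick can be checked on literals without touching a table. The sketch below uses made-up inputs; the second query illustrates a limitation worth knowing about: when the word appears twice in a row, the two candidate matches share a single space, so only the first one is replaced.

```sql
-- Whole-word replacement by padding with spaces, then trimming.
SELECT TRIM(REPLACE(CONCAT(' ', 'jay ajay', ' '), ' jay ', ' abc ')) AS result;
-- result: abc ajay

-- Adjacent repeats: the second 'jay' has lost its leading space to the
-- first match, so it is not replaced.
SELECT TRIM(REPLACE(CONCAT(' ', 'jay jay', ' '), ' jay ', ' abc ')) AS result;
-- result: abc jay
```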

User · Answer

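Yes, you can.

UPDATE table_name
  SET column_name = 'seach_str_name'
  WHERE column_name REGEXP '[^a-zA-Z0-9()_ .\-]';

Note that this overwrites the whole column value in every matching row; it does not replace only the matched characters.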

User · Answer

MySQL 8.0+: You can use the native REGEXP_REPLACE function.

Older versions: You can use a user-defined function (UDF) like mysql-udf-regexp.
