How to Replace Multiple Characters in SQL

Question

This is based on a similar question, "How to Replace Multiple Characters in Access SQL?". I wrote this one because SQL Server 2005 seems to limit the REPLACE() function to 19 replacements inside a WHERE clause.

I have the following task: I need to perform a match on a column, and to improve the chances of a match I strip multiple unneeded characters using the REPLACE() function.

    DECLARE @es NVarChar(1) SET @es = ''
    -- @p0, @p1, ... are declared the same way, each holding one character to strip
    ---etc

    SELECT *
    FROM t1, t2
    WHERE REPLACE(REPLACE(t1.stringkey, @p0, @es), @p1, @es)
        = REPLACE(REPLACE(t2.stringkey, @p0, @es), @p1, @es)
    ---etc

If there are more than 19 REPLACE() calls in that WHERE clause, it doesn't work. So the solution I came up with is to create a SQL function, called trimChars in this example (excuse the variables starting at @p22):

    CREATE FUNCTION [dbo].[trimChars] (
        @string varchar(max)
    )
    RETURNS varchar(max)
    AS
    BEGIN
        DECLARE @es NVarChar(1) SET @es = ''
        DECLARE @p23 NVarChar(1) SET @p23 = '&'
        DECLARE @p38 NVarChar(1) SET @p38 = '<'
        DECLARE @p39 NVarChar(1) SET @p39 = '>'
        DECLARE @p41 NVarChar(1) SET @p41 = '-'
        -- @p22, @p24 to @p37 and @p40 are declared the same way, one character to strip each

        RETURN
            REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
            REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                @string, @p22, @es), @p23, @es), @p24, @es), @p25, @es), @p26, @es),
                @p27, @es), @p28, @es), @p29, @es), @p30, @es), @p31, @es),
                @p32, @es), @p33, @es), @p34, @es), @p35, @es), @p36, @es),
                @p37, @es), @p38, @es), @p39, @es), @p40, @es), @p41, @es)
    END

This can then be used in addition to the other replace strings:

    SELECT *
    FROM t1, t2
    WHERE dbo.trimChars(REPLACE(REPLACE(t1.stringkey, @p0, @es), @p1, @es))
        = dbo.trimChars(REPLACE(REPLACE(t2.stringkey, @p0, @es), @p1, @es))

I created a few more functions to do similar replacing, such as trimMoreChars, and nested them like so:

    SELECT *
    FROM t1, t2
    WHERE dbo.trimChars(dbo.trimMoreChars(REPLACE(REPLACE(t1.stringkey, @p0, @es), @p1, @es)))
        = dbo.trimChars(dbo.trimMoreChars(REPLACE(REPLACE(t2.stringkey, @p0, @es), @p1, @es)))

Can someone give me a better solution to this problem, in terms of performance and maybe a cleaner implementation?

User · Answer

I really like Juliet's solution. I would just use a CTE to get all the invalid characters:

    DECLARE @badStrings VARCHAR(100);
    DECLARE @teststring VARCHAR(100);

    SET @badStrings = '><()!?@';
    SET @teststring = 'Juliet ro><0zs my s0x()rz!!?!one!@!@!@!';

    -- Recursive CTE: one row per character of @badStrings
    WITH CTE AS
    (
        SELECT SUBSTRING(@badStrings, 1, 1) AS [String], 1 AS [Start], 1 AS [Counter]
        UNION ALL
        SELECT SUBSTRING(@badStrings, [Start] + 1, 1) AS [String], [Start] + 1, [Counter] + 1
        FROM CTE
        WHERE [Counter] < LEN(@badStrings)
    )
    SELECT @teststring = REPLACE(@teststring, CTE.[String], '')
    FROM CTE;

    SELECT @teststring;  -- Juliet ro0zs my s0xrzone

User · Answer

One option is to use a numbers/tally table to drive an iterative process via a pseudo-set-based query.

The general idea of char replacement can be demonstrated with a simple character map table approach:

    create table charMap (srcChar char(1), replaceChar char(1))
    insert charMap values ('a', 'z')
    insert charMap values ('b', 'y')

    create table testChar (srcChar char(1))
    insert testChar values ('1')
    insert testChar values ('a')
    insert testChar values ('2')
    insert testChar values ('b')

    select coalesce(charMap.replaceChar, testChar.srcChar) as charData
    from testChar
        left join charMap on testChar.srcChar = charMap.srcChar

Then you can bring in the tally table approach to do the lookup on each character position in the string:

    create table tally (i int)
    declare @i int
    set @i = 1
    while @i <= 256 begin
        insert tally values (@i)
        set @i = @i + 1
    end

    create table testData (testString char(10))
    insert testData values ('123a456')
    insert testData values ('123ab456')
    insert testData values ('123b456')

    select
        i,
        SUBSTRING(testString, i, 1) as srcChar,
        coalesce(charMap.replaceChar, SUBSTRING(testString, i, 1)) as charData
    from testData cross join tally
        left join charMap on SUBSTRING(testString, i, 1) = charMap.srcChar
    where i <= LEN(testString)
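The query above returns one row per character, so the rows still have to be stitched back into strings. A minimal sketch of that last step, assuming the charMap, tally and testData tables defined above and SQL Server 2005 or later. FOR XML PATH is just one common concatenation idiom and is my assumption here, not something the answer prescribes:

```sql
-- Rebuild each cleaned string from its per-character rows.
SELECT
    t.testString,
    (
        SELECT COALESCE(m.replaceChar, SUBSTRING(t.testString, n.i, 1))
        FROM tally n
            LEFT JOIN charMap m
                ON SUBSTRING(t.testString, n.i, 1) = m.srcChar
        WHERE n.i <= LEN(t.testString)
        ORDER BY n.i                      -- keep the original character order
        FOR XML PATH(''), TYPE            -- unnamed column: values are concatenated
    ).value('.', 'varchar(10)') AS cleaned
FROM testData t;
-- With the sample rows above, 123a456 becomes 123z456 and 123ab456 becomes 123zy456.
```

The TYPE directive plus .value() avoids XML entity escaping if the data contains characters such as & or <.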

User · Answer

I don't know why Charles Bretana deleted his answer, so I'm adding it back in as a CW answer, but a persisted computed column is a REALLY good way to handle these cases where you need cleansed or transformed data almost all the time, but need to preserve the original garbage. His suggestion is relevant and appropriate REGARDLESS of how you decide to cleanse your data.

Specifically, in my current project, I have a persisted computed column which trims all the leading zeros (luckily this is relatively easily handled in straight T-SQL) from some particular numeric identifiers stored inconsistently with leading zeros. This is stored in persisted computed columns in the tables which need it, and indexed, because that conformed identifier is often used in joins.
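As a rough illustration of the idea, here is a sketch of a persisted, indexed computed column that strips leading zeros. The table, column and index names are hypothetical and the expression is only one possible way to trim zeros. The answer does not show its actual definition:

```sql
-- Hypothetical table: the raw identifier is kept, the cleaned form is computed once and stored.
CREATE TABLE dbo.Accounts
(
    AccountId       int IDENTITY(1, 1) PRIMARY KEY,
    RawIdentifier   varchar(20) NOT NULL,
    -- PATINDEX finds the first non-zero character; the appended '.' guards all-zero values.
    CleanIdentifier AS SUBSTRING(
                           RawIdentifier,
                           PATINDEX('%[^0]%', RawIdentifier + '.'),
                           LEN(RawIdentifier)
                       ) PERSISTED
);

-- Index the cleaned value so joins on it do not recompute the expression per row.
CREATE INDEX IX_Accounts_CleanIdentifier ON dbo.Accounts (CleanIdentifier);

INSERT INTO dbo.Accounts (RawIdentifier) VALUES ('000123');
SELECT RawIdentifier, CleanIdentifier FROM dbo.Accounts;  -- 000123, 123
```

Joins can then compare CleanIdentifier directly instead of wrapping both sides in nested REPLACE calls.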

User · Answer

    declare @testVal varchar(20)

    set @testVal = '?t/es?ti/n*g 1*2?3*'

    -- one REPLACE per row of the derived table
    select @testVal = REPLACE(@testVal, item, '')
    from (select '?' item union select '*' union select '/') list

    select @testVal  -- testing 123

User · Answer

I would seriously consider making a CLR UDF instead and using regular expressions (both the string and the pattern can be passed in as parameters) to do a complete search and replace for a range of characters. It should easily outperform this SQL UDF.

User · Answer

One useful trick in SQL is the ability to use @var = function(...) to assign a value. If you have multiple records in your record set, your variable is assigned multiple times, with side effects:

    declare @badStrings table (item varchar(50))

    INSERT INTO @badStrings (item)
    SELECT '>' UNION ALL
    SELECT '<' UNION ALL
    SELECT '(' UNION ALL
    SELECT ')' UNION ALL
    SELECT '!' UNION ALL
    SELECT '?' UNION ALL
    SELECT '@'

    declare @testString varchar(100), @newString varchar(100)

    set @teststring = 'Juliet ro><0zs my s0x()rz!!?!one!@!@!@!'
    set @newString = @testString

    SELECT @newString = Replace(@newString, item, '') FROM @badStrings

    select @newString -- returns 'Juliet ro0zs my s0xrzone'

User · Answer

I suggest you create a scalar user-defined function. This is an example (sorry in advance, because the variable names are in Spanish):

    CREATE FUNCTION [dbo].[Udf_ReplaceChars] (
        @cadena VARCHAR(500),          -- String to manipulate
        @caracteresElim VARCHAR(100),  -- String of characters to be replaced
        @caracteresReem VARCHAR(100)   -- String of characters for replacement
    )
    RETURNS VARCHAR(500)
    AS
    BEGIN
        DECLARE @cadenaFinal VARCHAR(500), @longCad INT, @pos INT, @caracter CHAR(1), @posCarER INT;

        SELECT
            @cadenaFinal = '',
            @longCad = LEN(@cadena),
            @pos = 1;

        -- Both character lists must line up position by position
        IF LEN(@caracteresElim) <> LEN(@caracteresReem)
        BEGIN
            RETURN NULL;
        END

        WHILE @pos <= @longCad
        BEGIN
            SELECT
                @caracter = SUBSTRING(@cadena, @pos, 1),
                @pos = @pos + 1,
                @posCarER = CHARINDEX(@caracter, @caracteresElim);

            IF @posCarER <= 0
            BEGIN
                -- Not in the list: keep the character
                SET @cadenaFinal = @cadenaFinal + @caracter;
            END
            ELSE
            BEGIN
                -- In the list: take the replacement from the same position
                SET @cadenaFinal = @cadenaFinal + SUBSTRING(@caracteresReem, @posCarER, 1);
            END
        END

        RETURN @cadenaFinal;
    END

Here is an example using this function:

    SELECT dbo.Udf_ReplaceChars('This is a test', 'sat', 'Z47')

And the result is:

    7hiZ iZ 4 7eZ7

As you can see, each character of the @caracteresElim parameter is replaced by the character in the same position from the @caracteresReem parameter.

User · Answer

While this question was asked about SQL Server 2005, it's worth noting that as of SQL Server 2017, the request can be done with the new TRANSLATE function.

https://docs.microsoft.com/en-us/sql/t-sql/functions/translate-transact-sql

I hope this information helps people who get to this page in the future.
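TRANSLATE maps characters one-for-one and cannot delete them, so stripping takes one extra step. A minimal sketch, assuming SQL Server 2017 or later. The variable names and sample data are my own, chosen for illustration:

```sql
DECLARE @bad   varchar(20)  = '><()!?@';   -- characters to strip (illustrative list)
DECLARE @input varchar(100) = 'a<b>(c)!';  -- illustrative input

-- Step 1: map every unwanted character onto a single one of them ('>').
-- Step 2: remove that one character with a single REPLACE.
SELECT REPLACE(
           TRANSLATE(@input, @bad, REPLICATE('>', LEN(@bad))),
           '>',
           ''
       ) AS cleaned;  -- abc
```

The second and third arguments of TRANSLATE must have the same length, which is why the replacement string is built with REPLICATE.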

User · Answer

I had a one-off data migration issue where the source system could not correctly output some unusual technical characters, plus the ubiquitous extra commas in CSVs. We decided that for each such character the source extract should replace it with something that was recognisable to both the source system and the SQL Server that was loading the data, but which would not otherwise be in the data.

It did mean, however, that in various columns across various tables these replacement characters would appear and I would have to replace them. Nesting multiple REPLACE functions made the import code look scary and prone to errors in misjudging the placement and number of brackets, so I wrote the following function. I know it can process a column in a table of 3,000 rows in less than a second, though I'm not sure how well it will scale up to multi-million-row tables.

    create function [dbo].[udf_ReplaceMultipleChars]
    (
        @OriginalString nvarchar(4000)
      , @ReplaceTheseChars nvarchar(100)
      , @LengthOfReplacement int = 1
    )
    returns nvarchar(4000)
    begin
        declare @RevisedString nvarchar(4000) = N'';
        declare @lengthofinput int =
                (
                    select len(@OriginalString)
                );

        -- AllNumbers yields 1..len(@OriginalString), one row per character position
        with AllNumbers
        as (select 1 as Number
            union all
            select Number + 1
            from AllNumbers
            where Number < @lengthofinput)
        select @RevisedString += case
                                     -- replace only characters found at an odd position of @ReplaceTheseChars
                                     when (charindex(substring(@OriginalString, Number, 1), @ReplaceTheseChars, 1) - 1) % 2 = 0 then
                                         substring(
                                             @ReplaceTheseChars
                                           , charindex(
                                                 substring(@OriginalString, Number, 1)
                                               , @ReplaceTheseChars
                                               , 1
                                             ) + 1
                                           , @LengthOfReplacement
                                         )
                                     else
                                         substring(@OriginalString, Number, 1)
                                 end
        from AllNumbers
        option (maxrecursion 4000);

        return (@RevisedString)
    end

It works by submitting both the string to be evaluated and have its characters replaced (@OriginalString), along with a string of paired characters where the first character is to be replaced by the second, the third by the fourth, the fifth by the sixth and so on (@ReplaceTheseChars).

Here is the string of chars that I needed to replace and their replacements: [']"~, ... i.e. an opening square bracket denotes an apostrophe, a closing one a double quote. The full string also covered vulgar fractions as well as degree and diameter symbols.

There is a default @LengthOfReplacement that is included as a starting point if anyone needs to replace longer strings. I played around with that in my project, but single-character replacement was the main function.

The condition of the case statement is important. It ensures that a character is only replaced if it is found in your @ReplaceTheseChars variable and that it is found in an odd-numbered position (subtracting 1 from the charindex result ensures that anything NOT found returns a negative modulo value). For example, if it finds a tilde (~) in position 5 it will replace it with a comma, but if on a subsequent run it finds the comma in position 6 it will not replace it with a curly bracket.

This can be best demonstrated with an example:

    declare @ProductDescription nvarchar(20) = N'abc...def...123'

    select @ProductDescription = dbo.udf_ReplaceMultipleChars(
               @ProductDescription
               /* NB the doubling up of the apostrophe is necessary in the string but resolves to a single apostrophe when passed to the function */
             , N'['']"~,...'
             , default
           )

    select @ProductDescription = dbo.udf_ReplaceMultipleChars(
               @ProductDescription
               /* if you didn't know how to type those peculiar chars in then you can build a string like this: N'[' + nchar(0x0027) + ... + nchar(0x00D8) + ... + nchar(0x00B0) etc. */
             , N'['']"~,...'
             , default
           )

This will return the value after the first pass through the function and after the second.

A table update would just be:

    update a
    set a.Col1 = dbo.udf_ReplaceMultipleChars(a.Col1, N'['']"~,...', 1)
    from TestTable a

Finally (I hear you say!), although I've not had access to the TRANSLATE function, I believe that this function can process the example shown in the documentation quite easily. The TRANSLATE function demo is:

    SELECT TRANSLATE('2*[3+4]/{7-2}', '[]{}', '()()')

which returns 2*(3+4)/(7-2), although I understand it might not work on some variations of that input.

My function would approach this as follows, listing each char to be replaced followed by its replacement:

    -- [ (   ] )   { (   } )
    select dbo.udf_ReplaceMultipleChars('2*[3+4]/{7-2}', '[(]){(})', 1)

which will also work on such variations.

I hope someone finds this useful, and if you get to test its performance against larger tables do let us know one way or another.
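To make the odd-position rule concrete, here is a small usage sketch. It assumes the udf_ReplaceMultipleChars function above has been created in the dbo schema. The input string and the shortened three-pair list are my own illustrative values, not the author's data:

```sql
-- Pairs: [ -> '    ] -> "    ~ -> ,   (the apostrophe is doubled inside the literal)
DECLARE @pairs nvarchar(100) = N'['']"~,';
DECLARE @input nvarchar(20)  = N'abc~def[x]';

SELECT
    -- First pass: ~ [ ] sit at odd positions 5, 1 and 3 of @pairs, so they are replaced.
    dbo.udf_ReplaceMultipleChars(@input, @pairs, default) AS FirstPass,
    -- Second pass: , ' " sit at even positions 6, 2 and 4, so nothing changes.
    dbo.udf_ReplaceMultipleChars(
        dbo.udf_ReplaceMultipleChars(@input, @pairs, default),
        @pairs,
        default
    ) AS SecondPass;
```

FirstPass and SecondPass should come out identical, which is the point of checking that the match is at an odd position: a replacement character is never itself treated as something to replace.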

User · Answer

Here are the steps.

Create a CLR function. See the following code:

    using System;
    using System.Data.SqlTypes;
    using System.Text.RegularExpressions;

    public partial class UserDefinedFunctions
    {
        [Microsoft.SqlServer.Server.SqlFunction]
        public static SqlString Replace2(SqlString inputtext, SqlString filter, SqlString replacewith)
        {
            string str = inputtext.ToString();
            try
            {
                string pattern = (string)filter;
                string replacement = (string)replacewith;
                Regex rgx = new Regex(pattern);
                string result = rgx.Replace(str, replacement);
                return (SqlString)result;
            }
            catch (Exception s)
            {
                // On a bad pattern, return the error text instead of failing the query
                return (SqlString)s.Message;
            }
        }
    }

Deploy your CLR function.

Now test it. See the following code:

    create table dbo.test (dummydata varchar(255))
    Go
    INSERT INTO dbo.test values ('P@ssw1rd'), ('This 12is @test')
    Go
    Update dbo.test
    set dummydata = dbo.Replace2(dummydata, '[0-9@]', '')

    select * from dbo.test

    dummydata
    Psswrd
    This is test

booom!
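The deployment step is not spelled out above. One way to do it by hand is sketched below, assuming the class has been compiled into a DLL. The assembly name RegexFunctions and the file path are hypothetical, and newer versions of SQL Server may require additional security configuration (for example signing the assembly) that is not shown here:

```sql
-- Enable CLR integration on the instance (needs appropriate server permissions).
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- Load the compiled assembly; name and path are placeholders.
CREATE ASSEMBLY RegexFunctions
FROM 'C:\clr\RegexFunctions.dll'
WITH PERMISSION_SET = SAFE;
GO

-- Expose the static method as a T-SQL scalar function.
CREATE FUNCTION dbo.Replace2
(
    @inputtext   nvarchar(4000),
    @filter      nvarchar(4000),
    @replacewith nvarchar(4000)
)
RETURNS nvarchar(4000)
AS EXTERNAL NAME RegexFunctions.UserDefinedFunctions.Replace2;
GO
```

Deploying from Visual Studio produces the same objects. The manual route just makes the three pieces (instance setting, assembly, function wrapper) visible.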
