How do I do a case-insensitive string comparison?

Question

How can I do a case-insensitive string comparison in Python? I would like to encapsulate the comparison of a regular string to a repository string in a very simple and Pythonic way. I would also like to be able to look up values in a dict hashed by strings, using regular Python strings.

User · Accepted Answer

Assuming ASCII strings:

    string1 = 'Hello'
    string2 = 'hello'

    if string1.lower() == string2.lower():
        print("The strings are the same (case insensitive)")
    else:
        print("The strings are NOT the same (case insensitive)")

User · Answer

Section 3.13 of the Unicode standard defines algorithms for caseless matching.

X.casefold() == Y.casefold() in Python 3 implements the "default caseless matching" (D144).

Casefolding does not preserve the normalization of strings in all instances, and therefore the normalization needs to be done (precomposed 'å', U+00E5, vs. 'a' followed by the combining ring U+030A). D145 introduces "canonical caseless matching":

    import unicodedata

    def NFD(text):
        return unicodedata.normalize('NFD', text)

    def canonical_caseless(text):
        return NFD(NFD(text).casefold())

NFD() is called twice for very infrequent edge cases involving the U+0345 character.

Example:

    >>> '\u00e5'.casefold() == 'a\u030a'.casefold()
    False
    >>> canonical_caseless('\u00e5') == canonical_caseless('a\u030a')
    True

There is also "compatibility caseless matching" (D146) for cases such as '㎒' (U+3392), and "identifier caseless matching" to simplify and optimize caseless matching of identifiers.
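The compatibility caseless matching (D146) mentioned above can be sketched in the same style. This is a minimal sketch, assuming the D146 key is NFKD(casefold(NFKD(casefold(NFD(X))))); the function name compatibility_caseless is made up for illustration and is not part of unicodedata.

```python
import unicodedata


def compatibility_caseless(text):
    """Hypothetical helper: build a key for compatibility caseless matching."""
    nfd = unicodedata.normalize("NFD", text)
    # First pass: fold case, then apply the compatibility decomposition.
    first = unicodedata.normalize("NFKD", nfd.casefold())
    # Second pass: the decomposition can expose new cased letters
    # (U+3392 decomposes to "MHz"), so fold and decompose once more.
    return unicodedata.normalize("NFKD", first.casefold())


print(compatibility_caseless("\u3392") == compatibility_caseless("MHZ"))  # True
```

Two strings then match if their keys are equal, the same way canonical_caseless() is used in the example.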

User · Answer

The usual approach is to uppercase or lowercase the strings for the lookups and comparisons. For example:

    >>> "hello".upper() == "HELLO".upper()
    True
    >>>
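For the dict lookups the question asks about, the same idea can be applied to keys. The following is only a sketch: CaselessDict is a hypothetical class name, and it folds keys with casefold() (Python 3) rather than upper() or lower(). Note that it stores the folded key, so the original spelling of a key is not kept.

```python
from collections.abc import MutableMapping


class CaselessDict(MutableMapping):
    """Hypothetical mapping that folds string keys before storing them."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        self.update(dict(*args, **kwargs))

    @staticmethod
    def _fold(key):
        # Only strings are folded; other hashable keys pass through unchanged.
        return key.casefold() if isinstance(key, str) else key

    def __getitem__(self, key):
        return self._data[self._fold(key)]

    def __setitem__(self, key, value):
        self._data[self._fold(key)] = value

    def __delitem__(self, key):
        del self._data[self._fold(key)]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


repo = CaselessDict({"Hello": 1})
repo["WORLD"] = 2
print(repo["hello"], repo["World"], "HELLO" in repo)  # 1 2 True
```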

User · Answer

    def insenStringCompare(s1, s2):
        """Method that takes two strings and returns True or False, based
        on if they are equal, regardless of case."""
        try:
            return s1.lower() == s2.lower()
        except AttributeError:
            # Non-string arguments have no lower(); report and return None.
            print("Please only pass strings into this method.")
            print("You passed a %s and %s" % (s1.__class__, s2.__class__))

User · Answer

How about converting to lowercase first? You can use the string's lower() method, for example s1.lower() == s2.lower().

User · Answer

This is another regex approach, which I have learned to love/hate over the last week, so I usually import re under a name that reflects how I'm feeling (in this case, yes). Make a normal function that asks for input, then compile a pattern with something like yes.compile(r'foo|spam', yes.I). yes.I is re.I, the same flag as IGNORECASE, but you can't make as many mistakes writing it.

You then search the message with the regex (honestly, regexes deserve a few pages of their own). The point is that foo and spam are piped together as alternatives and case is ignored. If either is found, lost_n_found holds the match; if neither is found, lost_n_found is None. If it is not None, return the matched text in lower case with return lost_n_found.group().lower(). This makes it much easier to match it up against anything that is going to be case sensitive. Lastly, (NCS) stands for "no one cares seriously", or "not case sensitive", whichever. If anyone has any questions, get me on this.

    import re as yes

    def bar_or_spam():
        # input() is the Python 3 spelling of raw_input()
        message = input("\nEnter FoO for BaR or SpaM for EgGs (NCS): ")
        message_in_coconut = yes.compile(r'foo|spam', yes.I)
        # search() returns None when nothing matches, so check before group()
        lost_n_found = message_in_coconut.search(message)
        if lost_n_found is not None:
            return lost_n_found.group().lower()
        print("Make tea not love")
        return None

    whatz_for_breakfast = bar_or_spam()

    if whatz_for_breakfast == 'foo':
        print("BaR")
    elif whatz_for_breakfast == 'spam':
        print("EgGs")

User · Answer

I saw this solution using regex:

    import re
    if re.search('mandy', 'Mandy Pande', re.IGNORECASE):
        print("matched")  # the search succeeds

It works well with accents:

    In [42]: if re.search("ê", "Ê", re.IGNORECASE):
       ....:     print(1)
       ....:
    1

However, it doesn't handle every Unicode character case-insensitively. Thank you @Rhymoid for pointing that out; my understanding is that it needs the exact symbol for the match to succeed. The output is as follows:

    In [36]: "ß".lower()
    Out[36]: 'ß'

    In [37]: "ß".upper()
    Out[37]: 'SS'

    In [38]: "ß".upper().lower()
    Out[38]: 'ss'

    In [39]: if re.search("ß", "ßß", re.IGNORECASE):
       ....:     print(1)
       ....:
    1

    In [40]: if re.search("SS", "ßß", re.IGNORECASE):
       ....:     print(1)
       ....:

    In [41]: if re.search("ß", "SS", re.IGNORECASE):
       ....:     print(1)
       ....:
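Note that re.search() looks for the pattern anywhere in the string, so the first example is a substring test rather than an equality test. A minimal sketch of a whole-string comparison with the same flag follows; regex_caseless_equal is a hypothetical helper name, and the ß limitation shown in this answer still applies.

```python
import re


def regex_caseless_equal(left, right):
    """Hypothetical helper: whole-string, case-insensitive regex comparison."""
    # re.escape keeps characters such as "." or "*" in `left` literal.
    return re.fullmatch(re.escape(left), right, re.IGNORECASE) is not None


print(regex_caseless_equal("mandy", "MANDY"))        # True
print(regex_caseless_equal("mandy", "Mandy Pande"))  # False
print(regex_caseless_equal("SS", "\u00df"))          # False
```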

User · Answer

Using Python 2, calling .lower() on each string or Unicode object...

    string1.lower() == string2.lower()

...will work most of the time, but indeed doesn't work in the situations @tchrist has described.

Assume we have a file called unicode.txt containing the two strings Σίσυφος and ΣΊΣΥΦΟΣ. With Python 2:

    >>> utf8_bytes = open("unicode.txt", 'r').read()
    >>> print repr(utf8_bytes)
    '\xce\xa3\xce\xaf\xcf\x83\xcf\x85\xcf\x86\xce\xbf\xcf\x82\n\xce\xa3\xce\x8a\xce\xa3\xce\xa5\xce\xa6\xce\x9f\xce\xa3\n'
    >>> u = utf8_bytes.decode('utf8')
    >>> print u
    Σίσυφος
    ΣΊΣΥΦΟΣ

    >>> first, second = u.splitlines()
    >>> print first.lower()
    σίσυφος
    >>> print second.lower()
    σίσυφοσ
    >>> first.lower() == second.lower()
    False
    >>> first.upper() == second.upper()
    True

The Σ character has two lowercase forms, ς and σ, and .lower() won't help compare them case-insensitively.

However, as of Python 3, lower() maps a word-final Σ to ς, so calling lower() on both of these strings works correctly:

    >>> s = open('unicode.txt', encoding='utf8').read()
    >>> print(s)
    Σίσυφος
    ΣΊΣΥΦΟΣ

    >>> first, second = s.splitlines()
    >>> print(first.lower())
    σίσυφος
    >>> print(second.lower())
    σίσυφος
    >>> first.lower() == second.lower()
    True
    >>> first.upper() == second.upper()
    True

So if you care about edge cases like the three sigmas in Greek, use Python 3.

(For reference, Python 2.7.3 and Python 3.3.0b1 are shown in the interpreter printouts above.)
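Even in Python 3, lower() only helps when the sigma forms end up in the same position within a word. As a small illustrative sketch (the variable names are made up, and the strings are written as escapes so the code points are unambiguous), casefold() removes the distinction between all three sigmas:

```python
final_sigma = "\u03c2"    # GREEK SMALL LETTER FINAL SIGMA
small_sigma = "\u03c3"    # GREEK SMALL LETTER SIGMA
capital_sigma = "\u03a3"  # GREEK CAPITAL LETTER SIGMA

# lower() leaves the two lowercase forms distinct.
print(final_sigma.lower() == small_sigma.lower())  # False

# casefold() maps all three forms to the same character.
print(final_sigma.casefold() == small_sigma.casefold() == capital_sigma.casefold())  # True

# The two words from unicode.txt, written as escapes.
first = "\u03a3\u03af\u03c3\u03c5\u03c6\u03bf\u03c2"
second = "\u03a3\u038a\u03a3\u03a5\u03a6\u039f\u03a3"
print(first.casefold() == second.casefold())  # True
```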

User · Answer

Comparing strings in a case insensitive way seems trivial, but it's not. I will be using Python 3, since Python 2 is underdeveloped here.

The first thing to note is that case-removing conversions in Unicode aren't trivial. There is text for which text.lower() != text.upper().lower(), such as "ß":

    "ß".lower()
    #>>> 'ß'

    "ß".upper().lower()
    #>>> 'ss'

But let's say you wanted to caselessly compare "BUSSE" and "Buße". Heck, you probably also want "BUSSE" and "BUẞE" to compare equal - that's the newer capital form. The recommended way is to use casefold:

    str.casefold()

        Return a casefolded copy of the string. Casefolded strings may be used for
        caseless matching.

        Casefolding is similar to lowercasing but more aggressive because it is
        intended to remove all case distinctions in a string. [...]

Do not just use lower. If casefold is not available, doing .upper().lower() helps (but only somewhat).

Then you should consider accents. If your font renderer is good, you probably think "ê" == "ê" - but it doesn't when one side is precomposed and the other is not:

    "\u00ea" == "e\u0302"
    #>>> False

This is because the accent on the latter is a combining character.

    import unicodedata

    [unicodedata.name(char) for char in "\u00ea"]
    #>>> ['LATIN SMALL LETTER E WITH CIRCUMFLEX']

    [unicodedata.name(char) for char in "e\u0302"]
    #>>> ['LATIN SMALL LETTER E', 'COMBINING CIRCUMFLEX ACCENT']

The simplest way to deal with this is unicodedata.normalize. You probably want to use NFKD normalization, but feel free to check the documentation. Then one does

    unicodedata.normalize("NFKD", "\u00ea") == unicodedata.normalize("NFKD", "e\u0302")
    #>>> True

To finish up, here this is expressed in functions:

    import unicodedata

    def normalize_caseless(text):
        return unicodedata.normalize("NFKD", text.casefold())

    def caseless_equal(left, right):
        return normalize_caseless(left) == normalize_caseless(right)
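The normalized form is also usable as a key wherever a comparison is implied, for instance when removing duplicates. A minimal sketch, repeating normalize_caseless so the block is self-contained; unique_caseless is a hypothetical helper name, not something from the standard library.

```python
import unicodedata


def normalize_caseless(text):
    # Same approach as in this answer: fold case, then NFKD-normalize.
    return unicodedata.normalize("NFKD", text.casefold())


def unique_caseless(items):
    """Hypothetical helper: keep the first spelling of each caseless-equal string."""
    seen = set()
    result = []
    for item in items:
        key = normalize_caseless(item)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


print(unique_caseless(["BUSSE", "Bu\u00dfe", "busse", "Hello", "HELLO"]))
# ['BUSSE', 'Hello']
```

The same key function could be passed to sorted() or used for dict keys, so the folding logic lives in one place.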
