[c++] Differences between C++ string == and compare()?

I just read some recommendations on using

std::string s = get_string();
std::string t = another_string();

if( !s.compare(t) ) 
{

instead of

if( s == t )
{

I'm almost always using the last one because I'm used to it and it feels natural, more readable. I didn't even know that there was a separate comparison function. To be more precise, I thought == would call compare().

What are the differences? In which contexts should one way be favored to the other?

I'm considering only the cases where I need to know if a string is the same value as another string.

This question is related to c++ string

The answer is


compare has overloads for comparing substrings. If you're comparing whole strings you should just use == operator (and whether it calls compare or not is pretty much irrelevant).


If you just want to check string equality, use the == operator. Determining whether two strings are equal is simpler than finding an ordering (which is what compare() gives,) so it might be better performance-wise in your case to use the equality operator.

Longer answer: The API provides a method to check for string equality and a method to check string ordering. You want string equality, so use the equality operator (so that your expectations and those of the library implementors align.) If performance is important then you might like to test both methods and find the fastest.


Internally, string::operator==() is using string::compare(). Please refer to: CPlusPlus - string::operator==()

I wrote a small application to compare the performance, and apparently if you compile and run your code on debug environment the string::compare() is slightly faster than string::operator==(). However if you compile and run your code in Release environment, both are pretty much the same.

FYI, I ran 1,000,000 iteration in order to come up with such conclusion.

In order to prove why in debug environment the string::compare is faster, I went to the assembly and here is the code:

DEBUG BUILD

string::operator==()

        if (str1 == str2)
00D42A34  lea         eax,[str2]  
00D42A37  push        eax  
00D42A38  lea         ecx,[str1]  
00D42A3B  push        ecx  
00D42A3C  call        std::operator==<char,std::char_traits<char>,std::allocator<char> > (0D23EECh)  
00D42A41  add         esp,8  
00D42A44  movzx       edx,al  
00D42A47  test        edx,edx  
00D42A49  je          Algorithm::PerformanceTest::stringComparison_usingEqualOperator1+0C4h (0D42A54h)  

string::compare()

            if (str1.compare(str2) == 0)
00D424D4  lea         eax,[str2]  
00D424D7  push        eax  
00D424D8  lea         ecx,[str1]  
00D424DB  call        std::basic_string<char,std::char_traits<char>,std::allocator<char> >::compare (0D23582h)  
00D424E0  test        eax,eax  
00D424E2  jne         Algorithm::PerformanceTest::stringComparison_usingCompare1+0BDh (0D424EDh)

You can see that in string::operator==(), it has to perform extra operations (add esp, 8 and movzx edx,al)

RELEASE BUILD

string::operator==()

        if (str1 == str2)
008533F0  cmp         dword ptr [ebp-14h],10h  
008533F4  lea         eax,[str2]  
008533F7  push        dword ptr [ebp-18h]  
008533FA  cmovae      eax,dword ptr [str2]  
008533FE  push        eax  
008533FF  push        dword ptr [ebp-30h]  
00853402  push        ecx  
00853403  lea         ecx,[str1]  
00853406  call        std::basic_string<char,std::char_traits<char>,std::allocator<char> >::compare (0853B80h)  

string::compare()

            if (str1.compare(str2) == 0)
    00853830  cmp         dword ptr [ebp-14h],10h  
    00853834  lea         eax,[str2]  
    00853837  push        dword ptr [ebp-18h]  
    0085383A  cmovae      eax,dword ptr [str2]  
    0085383E  push        eax  
    0085383F  push        dword ptr [ebp-30h]  
    00853842  push        ecx  
00853843  lea         ecx,[str1]  
00853846  call        std::basic_string<char,std::char_traits<char>,std::allocator<char> >::compare (0853B80h)

Both assembly code are very similar as the compiler perform optimization.

Finally, in my opinion, the performance gain is negligible, hence I would really leave it to the developer to decide on which one is the preferred one as both achieve the same outcome (especially when it is release build).


std::string::compare() returns an int:

  • equal to zero if s and t are equal,
  • less than zero if s is less than t,
  • greater than zero if s is greater than t.

If you want your first code snippet to be equivalent to the second one, it should actually read:

if (!s.compare(t)) {
    // 's' and 't' are equal.
}

The equality operator only tests for equality (hence its name) and returns a bool.

To elaborate on the use cases, compare() can be useful if you're interested in how the two strings relate to one another (less or greater) when they happen to be different. PlasmaHH rightfully mentions trees, and it could also be, say, a string insertion algorithm that aims to keep the container sorted, a dichotomic search algorithm for the aforementioned container, and so on.

EDIT: As Steve Jessop points out in the comments, compare() is most useful for quick sort and binary search algorithms. Natural sorts and dichotomic searches can be implemented with only std::less.


compare() will return false (well, 0) if the strings are equal.

So don't take exchanging one for the other lightly.

Use whichever makes the code more readable.


Suppose consider two string s and t.
Give them some values.
When you compare them using (s==t) it returns a boolean value(true or false , 1 or 0).
But when you compare using s.compare(t) ,the expression returns a value
(i) 0 - if s and t are equal
(ii) <0 - either if the value of the first unmatched character in s is less than that of t or the length of s is less than that of t.
(iii) >0 - either if the value of the first unmatched character in t is less than that of s or the length of t is less than that of s.


compare() is equivalent to strcmp(). == is simple equality checking. compare() therefore returns an int, == is a boolean.


One thing that is not covered here is that it depends if we compare string to c string, c string to string or string to string.

A major difference is that for comparing two strings size equality is checked before doing the compare and that makes the == operator faster than a compare.

here is the compare as i see it on g++ Debian 7

// operator ==
  /**
   *  @brief  Test equivalence of two strings.
   *  @param __lhs  First string.
   *  @param __rhs  Second string.
   *  @return  True if @a __lhs.compare(@a __rhs) == 0.  False otherwise.
   */
  template<typename _CharT, typename _Traits, typename _Alloc>
    inline bool
    operator==(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
           const basic_string<_CharT, _Traits, _Alloc>& __rhs)
    { return __lhs.compare(__rhs) == 0; }

  template<typename _CharT>
    inline
    typename __gnu_cxx::__enable_if<__is_char<_CharT>::__value, bool>::__type
    operator==(const basic_string<_CharT>& __lhs,
           const basic_string<_CharT>& __rhs)
    { return (__lhs.size() == __rhs.size()
          && !std::char_traits<_CharT>::compare(__lhs.data(), __rhs.data(),
                            __lhs.size())); }

  /**
   *  @brief  Test equivalence of C string and string.
   *  @param __lhs  C string.
   *  @param __rhs  String.
   *  @return  True if @a __rhs.compare(@a __lhs) == 0.  False otherwise.
   */
  template<typename _CharT, typename _Traits, typename _Alloc>
    inline bool
    operator==(const _CharT* __lhs,
           const basic_string<_CharT, _Traits, _Alloc>& __rhs)
    { return __rhs.compare(__lhs) == 0; }

  /**
   *  @brief  Test equivalence of string and C string.
   *  @param __lhs  String.
   *  @param __rhs  C string.
   *  @return  True if @a __lhs.compare(@a __rhs) == 0.  False otherwise.
   */
  template<typename _CharT, typename _Traits, typename _Alloc>
    inline bool
    operator==(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
           const _CharT* __rhs)
    { return __lhs.compare(__rhs) == 0; }