[c] How does strcmp() work?

I've been looking around a fair bit for an answer. I'm going to make a series of my own string functions like my_strcmp(), my_strcat(), etc.

Does strcmp() work through each index of the two character arrays and, if the ASCII value is smaller at an identical index of the two strings, is that string therefore alphabetically greater, so that a 0 or 1 or 2 is returned? I guess what I'm asking is, does it use the ASCII values of characters to return these results?

Any help would be greatly appreciated.

[REVISED]

OK, so I have come up with this... it works for all cases except when the second string is greater than the first.

Any tips?

int my_strcmp(char s1[], char s2[])
{   
    int i = 0;
    while ( s1[i] != '\0' )
    {
        if( s2[i] == '\0' ) { return 1; }
        else if( s1[i] < s2[i] ) { return -1; }
        else if( s1[i] > s2[i] ) { return 1; }
        i++;
    }   
    return 0;
}


int main (int argc, char *argv[])
{
    int result = my_strcmp(argv[1], argv[2]);

    printf("Value: %d \n", result);

    return 0;

}
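
The loop above stops as soon as s1 ends, so when s2 is longer (for example "ab" against "abc") the function falls through to return 0. A minimal sketch of one possible fix, keeping the same index-based style, is to loop only while the characters match and then compare the final pair as unsigned char. The name my_strcmp_fixed is hypothetical.

```c
/* Hypothetical corrected variant of my_strcmp: returns -1, 0 or 1. */
int my_strcmp_fixed(const char s1[], const char s2[])
{
    int i = 0;

    /* Advance while the characters match and s1 has not ended.
       If s2 ends first, s1[i] != s2[i] stops the loop. */
    while (s1[i] != '\0' && s1[i] == s2[i])
    {
        i++;
    }

    /* Compare the first differing pair (or the two terminators). */
    unsigned char a = (unsigned char)s1[i];
    unsigned char b = (unsigned char)s2[i];
    if (a < b) { return -1; }
    if (a > b) { return 1; }
    return 0;
}
```

With this sketch, my_strcmp_fixed("ab", "abc") returns -1, my_strcmp_fixed("abc", "ab") returns 1, and my_strcmp_fixed("abc", "abc") returns 0.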


The answer is


It is just this:

int strcmp(char *str1, char *str2){
    while( (*str1 == *str2) && (*str1 != 0) ){
        ++str1;    /* advance the pointers, not the characters they point to */
        ++str2;
    }
    return (*str1-*str2);
}

If you want it to be faster, you can add "register" before the type, like this: register char. Note that this is only a hint, and modern compilers usually ignore it.

Then it looks like this:

int strcmp(register char *str1, register char *str2){
    while( (*str1 == *str2) && (*str1 != 0) ){
        ++str1;    /* advance the pointers, not the characters they point to */
        ++str2;
    }
    return (*str1-*str2);
}

This way, if possible, the pointers are kept in CPU registers.


Here is my version, written for small microcontroller applications, MISRA-C compliant. The main aim with this code was to write readable code, instead of the one-line goo found in most compiler libs.

int8_t strcmp (const uint8_t* s1, const uint8_t* s2)
{
  while ( (*s1 != '\0') && (*s1 == *s2) )
  {
    s1++; 
    s2++;
  }

  return (int8_t)( (int16_t)*s1 - (int16_t)*s2 );
}

Note: the code assumes 16 bit int type.


Here is the BSD implementation:

int
strcmp(s1, s2)
    register const char *s1, *s2;
{
    while (*s1 == *s2++)
        if (*s1++ == 0)
            return (0);
    return (*(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1));
}

Once there is a mismatch between two characters, it just returns the difference between those two characters.
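
The definition above uses the old K&R parameter style, which C23 no longer accepts. As a rough sketch, the same logic with a prototype-style signature could look like this (the name bsd_style_strcmp is hypothetical, chosen to avoid clashing with the library function):

```c
/* Same logic as the BSD code above, with a prototype-style signature. */
int bsd_style_strcmp(const char *s1, const char *s2)
{
    /* Walk both strings while the characters match; s2 is advanced in the test. */
    while (*s1 == *s2++)
    {
        if (*s1++ == '\0')
        {
            return 0;   /* both strings reached the terminator together */
        }
    }
    /* s2 was already advanced past the mismatch, so step back by one. */
    return *(const unsigned char *)s1 - *(const unsigned char *)(s2 - 1);
}
```

The casts to unsigned char matter when a byte has its high bit set: without them, a platform with signed char could report the wrong sign.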


I found this on the web.

http://www.opensource.apple.com/source/Libc/Libc-262/ppc/gen/strcmp.c

int strcmp(const char *s1, const char *s2)
{
    for ( ; *s1 == *s2; s1++, s2++)
        if (*s1 == '\0')
            return 0;
    return ((*(unsigned char *)s1 < *(unsigned char *)s2) ? -1 : +1);
}

It uses the byte values of the characters, returning a negative value if the first string appears before the second (ordered by byte values), zero if they are equal, and a positive value if the first appears after the second. Since it operates on bytes, it is not encoding-aware.

For example:

strcmp("abc", "def") < 0
strcmp("abc", "abcd") < 0 // null character is less than 'd'
strcmp("abc", "ABC") > 0 // 'a' > 'A' in ASCII
strcmp("abc", "abc") == 0

More precisely, as described in the strcmp Open Group specification:

The sign of a non-zero return value shall be determined by the sign of the difference between the values of the first pair of bytes (both interpreted as type unsigned char) that differ in the strings being compared.

Note that the return value may not be equal to this difference, but it will carry the same sign.
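
Because only the sign is guaranteed, portable code should compare the result against zero rather than against -1 or 1. A small sketch, using the hypothetical helpers sign_of and strcmp_sign, that reduces any strcmp result to -1, 0 or 1:

```c
#include <string.h>

/* Hypothetical helper: collapse any int to -1, 0 or 1. */
static int sign_of(int n)
{
    return (n > 0) - (n < 0);
}

/* Hypothetical wrapper: the library strcmp result reduced to its sign. */
int strcmp_sign(const char *a, const char *b)
{
    return sign_of(strcmp(a, b));
}
```

On an ASCII system, strcmp_sign("abc", "ABC") is 1 whether the library returns 1, 32, or any other positive number.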


This is how I implemented my strcmp. It compares the first letter of the two strings; if they are identical, it continues to the next letter. If not, it returns the corresponding value. It is very simple and easy to understand:

#include <stdio.h>
#include <conio.h>  //for getch(); non-standard, available on DOS/Windows compilers

//function declaration:
int strcmp(char string1[], char string2[]);

int main()
{
    char string1[]=" The San Antonio spurs";
    char string2[]=" will be champions again!";
    //calling the function- strcmp
    printf("\n number returned by the strcmp function: %d", strcmp(string1, string2));
    getch();
    return(0);
}

/**This function compares two strings character by character, in dictionary (character value) order.
it returns 1 if the first string is bigger than the second
it returns -1 if the second string is bigger than the first
it returns 0 if the two strings are equal
input: string1, string2
output: value- can be 1, 0 or -1 according to the case*/
int strcmp(char string1[], char string2[])
{
    int i=0;
    int value=2;    //sentinel: any number other than the ones the function can return
    while(value==2)
    {
        if (string1[i]>string2[i])
        {
            value=1;
        }
        else if (string1[i]<string2[i])
        {
            value=-1;
        }
        else if (string1[i]=='\0')
        {
            value=0;    //both strings ended at the same index, so they are equal
        }
        else
        {
            i++;
        }
    }
    return(value);
}

This code is equivalent, shorter, and more readable:

int8_t strcmp (const uint8_t* s1, const uint8_t* s2)
{
    while( (*s1!='\0') && (*s1==*s2) ){
        s1++; 
        s2++;
    }

    return (int8_t)*s1 - (int8_t)*s2;
}

We only need to test for end of s1, because if we reach the end of s2 before end of s1, the loop will terminate (since *s2 != *s1).

The return expression calculates the correct value in every case, provided we are only using 7-bit (pure ASCII) characters. Careful thought is needed to produce correct code for 8-bit characters, because of the risk of integer overflow.
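
One way to sidestep the overflow, sketched here under the assumption that the characters are compared as unsigned 8-bit values, is to return the result of two comparisons instead of a subtraction. The name strcmp_u8 is hypothetical.

```c
#include <stdint.h>

/* Hypothetical overflow-free variant: returns -1, 0 or 1 for any byte values. */
int8_t strcmp_u8(const uint8_t *s1, const uint8_t *s2)
{
    while ((*s1 != '\0') && (*s1 == *s2))
    {
        s1++;
        s2++;
    }

    /* Each comparison yields 0 or 1, so the difference is always in [-1, 1]. */
    return (int8_t)((*s1 > *s2) - (*s1 < *s2));
}
```

For the bytes 200 and 1, subtracting after a cast to int8_t typically gives -57 on two's complement platforms, which has the wrong sign. This sketch gives 1.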


This, from the masters themselves (K&R, 2nd ed., pg. 106):

// strcmp: return < 0 if s < t, 0 if s == t, > 0 if s > t
int strcmp(char *s, char *t) 
{
    int i;

    for (i = 0; s[i] == t[i]; i++)
        if (s[i] == '\0')
            return 0;
    return s[i] - t[i];
}