[c] Warning: comparison with string literals results in unspecified behaviour

I am starting a project of writing a simplified shell for linux in C. I am not at all proficient with C nor with Linux that's exactly the reason I decided it would be a good idea.

Starting with the parser, I have already encountered some problems.

The code should be straightforward that's why I didn't include any comments.

I am getting a warning with gcc: "comparison with string literals results in unspecified behaviour" at the lines commented with "WARNING HERE" (see code below).

I have no idea why this causes an warning, but the real problem is that even though I am comparing an "<" to an "<" is doesn't get inside the if...

I am looking for an answer for the problem explained, however if there's something that you see in the code that should be improved please say so. Just take in mind I am not that proficient and that this is still a work in progress (or better yet, a work in start).

Thanks in advance.

#include <stdio.h>
#include <unistd.h>
#include <string.h>

typedef enum {false, true} bool;

typedef struct {
    char **arg;
    char *infile;
    char *outfile;
    int background;
} Command_Info;

int parse_cmd(char *cmd_line, Command_Info *cmd_info)
{
    char *arg;
    char *args[100];    

    int i = 0;
    arg = strtok(cmd_line, " \n");
    while (arg != NULL) {
        args[i] = arg;
        arg = strtok(NULL, " \n");
        i++;
    }

    int num_elems = i;

    cmd_info->infile = NULL;
    cmd_info->outfile = NULL;
    cmd_info->background = 0;

    int iarg = 0;
    for (i = 0; i < num_elems; i++)
    {
        if (args[i] == "&") //WARNING HERE
            return -1;      
        else if (args[i] == "<") //WARNING HERE
            if (args[i+1] != NULL)
                cmd_info->infile = args[i+1];
            else
                return -1;

        else if (args[i] == ">") //WARNING HERE
            if (args[i+1] != NULL)
                cmd_info->outfile = args[i+1];
            else
                return -1;          

        else 
            cmd_info->arg[iarg++] = args[i];
    }

    cmd_info->arg[iarg] = NULL;

    return 0;   
}

void print_cmd(Command_Info *cmd_info)
{
    int i;  
    for (i = 0; cmd_info->arg[i] != NULL; i++)
        printf("arg[%d]=\"%s\"\n", i, cmd_info->arg[i]);
    printf("arg[%d]=\"%s\"\n", i, cmd_info->arg[i]);    
    printf("infile=\"%s\"\n", cmd_info->infile);
    printf("outfile=\"%s\"\n", cmd_info->outfile);
    printf("background=\"%d\"\n", cmd_info->background);
}

int main(int argc, char* argv[])
{
    char cmd_line[100];
    Command_Info cmd_info;

    printf(">>> ");

    fgets(cmd_line, 100, stdin);

    parse_cmd(cmd_line, &cmd_info);

    print_cmd(&cmd_info);

    return 0;
}

This question is related to c

The answer is


This an old question, but I have had to explain it to someone recently and I thought recording the answer here would be helpful at least in understanding how C works.

String literals like

"a"

or

"This is a string"

are put in the text or data segments of your program.

A string in C is actually a pointer to a char, and the string is understood to be the subsequent chars in memory up until a NUL char is encountered. That is, C doesn't really know about strings.

So if I have

char *s1 = "This is a string";

then s1 is a pointer to the first byte of the string.

Now, if I have

char *s2 = "This is a string";

this is also a pointer to the same first byte of that string in the text or data segment of the program.

But if I have

char *s3 = malloc( 17 );
strcpy(s3, "This is a string");

then s3 is a pointer to another place in memory into which I copy all the bytes of the other strings.

Illustrative examples:

Although, as your compiler rightly points out, you shouldn't do this, the following will evaluate to true:

s1 == s2 // True: we are comparing two pointers that contain the same address

but the following will evaluate to false

s1 == s3 // False: Comparing two pointers that don't hold the same address.

And although it might be tempting to have something like this:

struct Vehicle{
    char *type;
    // other stuff
}

if( type == "Car" )
   //blah1
else if( type == "Motorcycle )
   //blah2

You shouldn't do it because it's not something that is guarantied to work. Even if you know that type will always be set using a string literal.

I have tested it and it works. If I do

A.type = "Car";

then blah1 gets executed and similarly for "Motorcycle". And you'd be able to do things like

if( A.type == B.type )

but this is just terrible. I'm writing about it because I think it's interesting to know why it works, and it helps understand why you shouldn't do it.

Solutions:

In your case, what you want to do is use strcmp(a,b) == 0 to replace a == b

In the case of my example, you should use an enum.

enum type {CAR = 0, MOTORCYCLE = 1}

The preceding thing with string was useful because you could print the type, so you might have an array like this

char *types[] = {"Car", "Motorcycle"};

And now that I think about it, this is error prone since one must be careful to maintain the same order in the types array.

Therefore it might be better to do

char *getTypeString(int type)
{
    switch(type)
    case CAR: return "Car";
    case MOTORCYCLE: return "Motorcycle"
    default: return NULL;
}

You can't compare strings with == in C. For C, strings are just (zero-terminated) arrays, so you need to use string functions to compare them. See the man page for strcmp() and strncmp().

If you want to compare a character you need to compare to a character, not a string. "a" is the string a, which occupies two bytes (the a and the terminating null byte), while the character a is represented by 'a' in C.


  1. clang has advantages in error reporting & recovery.

    $ clang errors.c
    errors.c:36:21: warning: result of comparison against a string literal is unspecified (use strcmp instead)
            if (args[i] == "&") //WARNING HERE
                        ^~ ~~~
                strcmp( ,     ) == 0
    errors.c:38:26: warning: result of comparison against a string literal is unspecified (use strcmp instead)
            else if (args[i] == "<") //WARNING HERE
                             ^~ ~~~
                     strcmp( ,     ) == 0
    errors.c:44:26: warning: result of comparison against a string literal is unspecified (use strcmp instead)
            else if (args[i] == ">") //WARNING HERE
                             ^~ ~~~
                     strcmp( ,     ) == 0
    

    It suggests to replace x == y by strcmp(x,y) == 0.

  2. gengetopt writes command-line option parser for you.


There is a distinction between 'a' and "a":

  • 'a' means the value of the character a.
  • "a" means the address of the memory location where the string "a" is stored (which will generally be in the data section of your program's memory space). At that memory location, you will have two bytes -- the character 'a' and the null terminator for the string.

I ran across this issue today working with a clients program. The program works FINE in VS6.0 using the following: (I've changed it slightly)

//
// This is the one include file that every user-written Nextest programs needs.
// Patcom-generated files will also look for this file.
//
#include "stdio.h"
#define IS_NONE( a_key )   ( ( a_key == "none" || a_key == "N/A" ) ? TRUE : FALSE )

//
// Note in my environment we have output() which is printf which adds /n at the end
//
main {
    char *psNameNone = "none";
    char *psNameNA   = "N/A";
    char *psNameCAT  = "CAT";

    if (IS_NONE(psNameNone) ) {
        output("psNameNone Matches NONE");
        output("%s psNameNoneAddr 0x%x  \"none\" addr 0x%X",
            psNameNone,psNameNone,
            "none");
    } else {
        output("psNameNone Does Not Match None");
        output("%s psNameNoneAddr 0x%x  \"none\" addr 0x%X",
            psNameNone,psNameNone,
            "none");
    }

    if (IS_NONE(psNameNA) ) {
        output("psNameNA Matches N/A");
        output("%s psNameNA 0x%x  \"N/A\" addr 0x%X",
        psNameNA,psNameNA,
        "N/A");
    } else {
        output("psNameNone Does Not Match N/A");
        output("%s psNameNA 0x%x  \"N/A\" addr 0x%X",
        psNameNA,psNameNA,
        "N/A");
    }
    if (IS_NONE(psNameCAT)) {
        output("psNameNA Matches CAT");
        output("%s psNameNA 0x%x  \"CAT\" addr 0x%X",
        psNameNone,psNameNone,
        "CAT");
    } else {
        output("psNameNA does not match CAT");
        output("%s psNameNA 0x%x  \"CAT\" addr 0x%X",
        psNameNone,psNameNone,
        "CAT");
    }
}

If built in VS6.0 with Program Database with Edit and Continue. The compares APPEAR to work. With this setting STRING pooling is enabled, and the compiler optimizes all STRING pointers to POINT TO THE SAME ADDRESSS, so this can work. Any strings created on the fly after compile time will have DIFFERENT addresses so will fail the compare. Where Compiler settings are Changing the setting to Program Database only will build the program so that it will fail.


if (args[i] == "&")

Ok, let's disect what this does.

args is an array of pointers. So, here you are comparing args[i] (a pointer) to "&" (also a pointer). Well, the only way this will every be true is if somewhere you have args[i]="&" and even then, "&" is not guaranteed to point to the same place everywhere.

I believe what you are actually looking for is either strcmp to compare the entire string or your wanting to do if (*args[i] == '&') to compare the first character of the args[i] string to the & character