[c] printf, wprintf, %s, %S, %ls, char* and wchar*: Errors not announced by a compiler warning?

I have tried the following code:

wprintf(L"1 %s\n","some string"); //Good
wprintf(L"2 %s\n",L"some string"); //Not good -> print only first character of the string
printf("3 %s\n","some string"); //Good
//printf("4 %s\n",L"some string"); //Doesn't compile
printf("\n");
wprintf(L"1 %S\n","some string"); //Not good -> print some funny stuff
wprintf(L"2 %S\n",L"some string"); //Good
//printf("3 %S\n","some string"); //Doesn't compile
printf("4 %S\n",L"some string");  //Good

And I get the following output:

1 some string
2 s
3 some string

1 g1 %s

2 some string
4 some string

So: it seems that both wprintf and printf are able to print correctly both a char* and a wchar*, but only if the exact specifier is used. If the wrong specifier is used, you might not get a compiling error (nor warning!) and end up with wrong behavior. Do you experience the same behaviour?

Note: This was tested under Windows, compiled with MinGW and g++ 4.7.2 (I will check gcc later)

Edit: I also tried %ls (result is in the comments)

printf("\n");
wprintf(L"1 %ls\n","some string"); //Not good -> print funny stuff
wprintf(L"2 %ls\n",L"some string"); //Good
// printf("3 %ls\n","some string"); //Doesn't compile
printf("4 %ls\n",L"some string");  //Good

This question is related to c mingw

The answer is


The format specifers matter: "%s" says that the next string is a narrow string ("ascii" and typically 8 bits per character). "%S" means wide char string. Mixing the two will give "undefined behaviour", which includes printing garbage, just one character or nothing.

One character is printed because wide chars are, for example, 16 bits wide, and the first byte is non-zero, followed by a zero byte -> end of string in narrow strings. This depends on byte-order, in a "big endian" machine, you'd get no string at all, because the first byte is zero, and the next byte contains a non-zero value.


%S seems to conform to The Single Unix Specification v2 and is also part of the current (2008) POSIX specification.

Equivalent C99 conforming format specifiers would be %s and %ls.


Answer A

None of the answers above pointed out why you might not see some of your prints. This is also because here you are dealing with streams (I didn't know this) and stream has something called orientation. Let me cite something from this source:

Narrow and wide orientation

A newly opened stream has no orientation. The first call to any I/O function establishes the orientation.

A wide I/O function makes the stream wide-oriented, a narrow I/O function makes the stream narrow-oriented. Once set, the orientation can only be changed with freopen.

Narrow I/O functions cannot be called on a wide-oriented stream; wide I/O functions cannot be called on a narrow-oriented stream. Wide I/O functions convert between wide and multibyte characters as if by calling mbrtowc and wcrtomb. Unlike the multibyte character strings that are valid in a program, multibyte character sequences in the file may contain embedded nulls and do not have to begin or end in the initial shift state.

So once you use printf() your orientation becomes narrow and from this point on you can't get anything out of wprintf() and you realy don't. Unless you use freeopen() which is intended to be used on files.


Answer B

As it turns out you can use freeopen() like this:

freopen(NULL, "w", stdout);             

To make stream "not defined" again. Try this example:

#include <stdio.h>
#include <wchar.h>
#include <locale.h>

int main(void)
{
    // We set locale which is the same as the enviromental variable "LANG=en_US.UTF-8".
    setlocale(LC_ALL, "en_US.UTF-8");

    // We define array of wide characters. We indicate this on both sides of equal sign
    // with "wchar_t" on the left and "L" on the right.
    wchar_t y[100] = L"€? ???a??p???? e? a??? est??\n";

    // We print header in ASCII characters
    wprintf(L"content-type:text/html; charset:utf-8\n\n");

    // A newly opened stream has no orientation. The first call to any I/O function
    // establishes the orientation: a wide I/O function makes the stream wide-oriented,
    // a narrow I/O function makes the stream narrow-oriented. Once set, we must respect
    // this, so for the time being we are stuck with either printf() or wprintf().

    wprintf(L"%S\n", y);    // Conversion specifier %S is not standardized (!)
    wprintf(L"%ls\n", y);   // Conversion specifier %s with length modifier %l is 
                            // standardized (!)

    // At this point curent orientation of the stream is wide and this is why folowing
    // narrow function won't print anything! Whether we should use wprintf() or printf()
    // is primarily a question of how we want output to be encoded.

    printf("1\n");          // Print narrow string of characters with a narrow function
    printf("%s\n", "2");    // Print narrow string of characters with a narrow function
    printf("%ls\n",L"3");   // Print wide string of characters with a narrow function

    // Now we reset the stream to no orientation.
    freopen(NULL, "w", stdout);

    printf("4\n");          // Print narrow string of characters with a narrow function
    printf("%s\n", "5");    // Print narrow string of characters with a narrow function
    printf("%ls\n",L"6");   // Print wide string of characters with a narrow function

    return 0;
}

For s: When used with printf functions, specifies a single-byte or multi-byte character string; when used with wprintf functions, specifies a wide-character string. Characters are displayed up to the first null character or until the precision value is reached.

For S: When used with printf functions, specifies a wide-character string; when used with wprintf functions, specifies a single-byte or multi-byte character string. Characters are displayed up to the first null character or until the precision value is reached.

In Unix-like platform, s and S have the same meaning as windows platform.

Reference: https://msdn.microsoft.com/en-us/library/hf4y5e3w.aspx


At least in Visual C++: printf (and other ACSII functions): %s represents an ASCII string %S is a Unicode string wprintf (and other Unicode functions): %s is a Unicode string %S is an ASCII string

As far as no compiler warnings, printf uses a variable argument list, with only the first argument able to be type checked. The compiler is not designed to parse the format string and type check the parameters that match. In cases of functions like printf, that is up to the programmer