[c++] How do you convert between CString, std::string, and std::wstring?

CString is quite handy, while std::string is more compatible with STL containers. I am using hash_map; however, hash_map does not support CString as a key, so I want to convert the CString into a std::string.

Writing a hash function for CString seems like it would take a lot of time.

CString -----> std::string

How can I do this?

std::string -----> CString:

inline CString toCString(std::string const& str)
{
    return CString(str.c_str()); 
}

Am I right?


EDIT:

Here are more questions:

How can I convert from wstring to CString and vice versa?

// wstring -> CString
std::wstring src;
CString result(src.c_str());

// CString -> wstring
CString src;
std::wstring des(src.GetString());

Is there any problem with this?

Additionally, how can I convert from std::wstring to std::string and vice versa?


Answers:


Works for me:

std::wstring CStringToWString(const CString& s)
{
    // Assumes an ANSI/MBCS build, where LPCTSTR is const char*.
    // Widening element-by-element is only correct for 7-bit ASCII content.
    std::string s2((LPCTSTR)s);
    return std::wstring(s2.begin(), s2.end());
}

CString WStringToCString(const std::wstring& s)
{
    // Same caveat: element-wise narrowing is lossy for non-ASCII characters.
    std::string s2(s.begin(), s.end());
    return s2.c_str();
}

One interesting approach is to cast CString to CStringA inside a string constructor. Unlike std::string s((LPCTSTR)cs);, this will work even if _UNICODE is defined. However, in that case it performs a conversion from Unicode to ANSI, which is lossy for characters that cannot be represented in the target ANSI code page. Such conversion is subject to the _CSTRING_DISABLE_NARROW_WIDE_CONVERSION preprocessor definition. https://msdn.microsoft.com/en-us/library/5bzxfsea.aspx

        CString s1("SomeString");
        string s2((CStringA)s1);
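
The same cast works in the other direction via CStringW, with the same caveats; a minimal sketch (assuming _CSTRING_DISABLE_NARROW_WIDE_CONVERSION is not defined, so the conversion constructor is available):

CString s1("SomeString");
std::wstring s3((CStringW)s1); // widening conversion in an ANSI build; a plain copy in a Unicode build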

From this post (thank you, Mark Ransom):

Convert CString to string (VC6)

I have tested this and it works fine.

std::string Utils::CString2String(const CString& cString)
{
    std::string strStd;

    for (int i = 0; i < cString.GetLength(); ++i)
    {
        // Keep 7-bit ASCII characters as-is; replace everything else with '?'.
        if (cString[i] <= 0x7f)
            strStd.append(1, static_cast<char>(cString[i]));
        else
            strStd.append(1, '?');
    }

    return strStd;
}

It is more efficient to convert CString to std::string using the constructor where the length is specified:

CString someStr("Hello how are you");
std::string stdStr(someStr, someStr.GetLength());

In a tight loop this makes a significant performance improvement.


This is a follow-up to Sal's answer, where he/she provided the solution:

CString someStr("Hello how are you");
std::string stdStr(someStr, someStr.GetLength());

This is also useful when converting a non-typical C string to a std::string.

A use case for me was having a pre-allocated char array (like a C string) that is not NUL-terminated (e.g. a SHA digest). The syntax above lets me specify the length of the char array, so that std::string doesn't have to look for the terminating NUL character, which may or may not be there.

Such as:

unsigned char hashResult[SHA_DIGEST_LENGTH];
auto value = std::string(reinterpret_cast<char*>(hashResult), SHA_DIGEST_LENGTH);
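
For the reverse direction, CString has a matching (pointer, length) constructor, so a non-NUL-terminated buffer can go the other way too; a minimal sketch reusing the hypothetical hashResult buffer from above:

unsigned char hashResult[SHA_DIGEST_LENGTH];
// CStringA's (LPCSTR, int) constructor copies exactly SHA_DIGEST_LENGTH bytes,
// so no terminating NUL is required.
CStringA hashStr(reinterpret_cast<const char*>(hashResult), SHA_DIGEST_LENGTH);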

If you're looking to convert easily between other string types, perhaps the _bstr_t class would be more appropriate? It supports conversion between char, wchar_t and BSTR.
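
For illustration, a minimal _bstr_t sketch; note that the narrow view is an ANSI conversion, so the usual lossiness caveats apply:

#include <comdef.h> // _bstr_t

CString cs(_T("Hello"));
_bstr_t bstr(cs.GetString());                          // converts on construction
std::string narrow(static_cast<const char*>(bstr));    // operator const char* (ANSI)
std::wstring wide(static_cast<const wchar_t*>(bstr));  // operator const wchar_t*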


If you want something more C++-like, this is what I use. Although it depends on Boost, that's just for exceptions. You can easily remove those leaving it to depend only on the STL and the WideCharToMultiByte() Win32 API call.

#include <string>
#include <vector>
#include <cassert>
#include <stdexcept>

#include <windows.h>

#include <boost/system/system_error.hpp>
#include <boost/integer_traits.hpp>

/**
 * Convert a Windows wide string to a UTF-8 (multi-byte) string.
 */
std::string WideStringToUtf8String(const std::wstring& wide)
{
    if (wide.size() > boost::integer_traits<int>::const_max)
        throw std::length_error(
            "Wide string cannot be more than INT_MAX characters long.");
    if (wide.size() == 0)
        return "";

    // Calculate necessary buffer size
    int len = ::WideCharToMultiByte(
        CP_UTF8, 0, wide.c_str(), static_cast<int>(wide.size()), 
        NULL, 0, NULL, NULL);

    // Perform actual conversion
    if (len > 0)
    {
        std::vector<char> buffer(len);
        len = ::WideCharToMultiByte(
            CP_UTF8, 0, wide.c_str(), static_cast<int>(wide.size()),
            &buffer[0], static_cast<int>(buffer.size()), NULL, NULL);
        if (len > 0)
        {
            assert(len == static_cast<int>(buffer.size()));
            return std::string(&buffer[0], buffer.size());
        }
    }

    throw boost::system::system_error(
        ::GetLastError(), boost::system::system_category());
}
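
A sketch of the reverse direction under the same assumptions, using the MultiByteToWideChar() Win32 API call; the error handling mirrors the function above:

std::wstring Utf8StringToWideString(const std::string& utf8)
{
    if (utf8.size() > static_cast<std::size_t>(boost::integer_traits<int>::const_max))
        throw std::length_error(
            "UTF-8 string cannot be more than INT_MAX bytes long.");
    if (utf8.size() == 0)
        return L"";

    // Calculate necessary buffer size (in wide characters)
    int len = ::MultiByteToWideChar(
        CP_UTF8, 0, utf8.c_str(), static_cast<int>(utf8.size()),
        NULL, 0);

    // Perform actual conversion
    if (len > 0)
    {
        std::vector<wchar_t> buffer(len);
        len = ::MultiByteToWideChar(
            CP_UTF8, 0, utf8.c_str(), static_cast<int>(utf8.size()),
            &buffer[0], static_cast<int>(buffer.size()));
        if (len > 0)
        {
            assert(len == static_cast<int>(buffer.size()));
            return std::wstring(&buffer[0], buffer.size());
        }
    }

    throw boost::system::system_error(
        ::GetLastError(), boost::system::system_category());
}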

(Since VS2012 ...and at least until VS2017 v15.8.1)

Since it's an MFC project and CString is an MFC class, MS provides a Technical Note, TN059: Using MFC MBCS/Unicode Conversion Macros and Generic Conversion Macros:

A2CW      (LPCSTR)  -> (LPCWSTR)  
A2W       (LPCSTR)  -> (LPWSTR)  
W2CA      (LPCWSTR) -> (LPCSTR)  
W2A       (LPCWSTR) -> (LPSTR)  

Use:

void Example() // ** UNICODE case **
{
    USES_CONVERSION; // (1)

    // CString to std::string / std::wstring
    CString strMfc{ "Test" }; // strMfc = L"Test"
    std::string strStd = W2A(strMfc); // ** Conversion Macro: strStd = "Test" **
    std::wstring wstrStd = strMfc.GetString(); // wstrStd = L"Test"

    // std::string to CString / std::wstring
    strStd = "Test 2";
    strMfc = strStd.c_str(); // strMfc = L"Test 2"
    wstrStd = A2W(strStd.c_str()); // ** Conversion Macro: wstrStd = L"Test 2" **

    // std::wstring to CString / std::string 
    wstrStd = L"Test 3";
    strMfc = wstrStd.c_str(); // strMfc = L"Test 3"
    strStd = W2A(wstrStd.c_str()); // ** Conversion Macro: strStd = "Test 3" **
}

--

Footnotes:

(1) In order for the conversion macros to have space to store the temporary length, it is necessary to declare a local variable called _convert that does this in each function that uses the conversion macros. This is done by invoking the USES_CONVERSION macro. In VS2017 MFC code (atlconv.h) it looks like this:

#ifndef _DEBUG
    #define USES_CONVERSION int _convert; (_convert); UINT _acp = ATL::_AtlGetConversionACP() /*CP_THREAD_ACP*/; (_acp); LPCWSTR _lpw; (_lpw); LPCSTR _lpa; (_lpa)
#else
    #define USES_CONVERSION int _convert = 0; (_convert); UINT _acp = ATL::_AtlGetConversionACP() /*CP_THREAD_ACP*/; (_acp); LPCWSTR _lpw = NULL; (_lpw); LPCSTR _lpa = NULL; (_lpa)
#endif
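
As a side note, the newer ATL/MFC conversion classes (CA2W, CW2A, CT2A, CA2T, and friends, also declared in atlconv.h) manage their own storage and do not require USES_CONVERSION; a minimal sketch:

#include <atlconv.h>
#include <string>

void ExampleAtl() // no USES_CONVERSION needed
{
    CString strMfc(_T("Test"));
    std::string strStd(CT2A(strMfc.GetString())); // TCHAR -> ANSI
    CString strBack(CA2T(strStd.c_str()));        // ANSI  -> TCHAR
}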

You can cast CString freely to const char* and then assign it to an std::string like this (note this relies on an ANSI/MBCS build, where LPCTSTR is const char*):

CString cstring("MyCString");
std::string str = (const char*)cstring;

Solve that by using std::basic_string<TCHAR> instead of std::string; it should work fine regardless of your character setting.
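
For example, a minimal sketch of that suggestion:

#include <tchar.h>
#include <string>

typedef std::basic_string<TCHAR> tstring; // same character type as CString

CString cs(_T("Hello"));
tstring s(cs.GetString()); // CString -> tstring, no encoding conversion
CString back(s.c_str());   // tstring -> CString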


Is there any problem?

There are several issues:

  • CString is a template specialization of CStringT. Depending on the BaseType describing the character type, there are two concrete specializations: CStringA (using char) and CStringW (using wchar_t).
  • While wchar_t on Windows is ubiquitously used to store UTF-16 encoded code units, using char is ambiguous. The latter commonly stores ANSI encoded characters, but can also store ASCII, UTF-8, or even binary data.
  • We don't know the character encoding (or even character type) of CString (which is controlled through the _UNICODE preprocessor symbol), making the question ambiguous. We also don't know the desired character encoding of std::string.
  • Converting between Unicode and ANSI is inherently lossy: ANSI encoding can only represent a subset of the Unicode character set.

To address these issues, I'm going to assume that wchar_t will store UTF-16 encoded code units, and char will hold UTF-8 octet sequences. That's the only reasonable choice you can make to ensure that source and destination strings retain the same information, without limiting the solution to a subset of the source or destination domains.

The following implementations convert between CStringA/CStringW and std::string/std::wstring, mapping between UTF-8 and UTF-16 in both directions:

#include <string>
#include <atlconv.h>

std::string to_utf8(CStringW const& src_utf16)
{
    return { CW2A(src_utf16.GetString(), CP_UTF8).m_psz };
}

std::wstring to_utf16(CStringA const& src_utf8)
{
    return { CA2W(src_utf8.GetString(), CP_UTF8).m_psz };
}

The remaining two functions construct C++ string objects from MFC strings, leaving the encoding unchanged. Note that while the previous functions cannot cope with embedded NUL characters, these functions handle them correctly, since they pass the length explicitly.

#include <string>
#include <atlconv.h>

std::string to_std_string(CStringA const& src)
{
    return { src.GetString(), src.GetString() + src.GetLength() };
}

std::wstring to_std_wstring(CStringW const& src)
{
    return { src.GetString(), src.GetString() + src.GetLength() };
}
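
Usage, for illustration (assuming a Unicode build, where CString is CStringW):

CStringW original(L"Gr\u00FC\u00DFe");                     // "Grüße", UTF-16
std::string utf8 = to_utf8(original);                      // lossless UTF-16 -> UTF-8
std::wstring roundTrip = to_utf16(CStringA(utf8.c_str())); // back to UTF-16

std::wstring copy = to_std_wstring(original);              // same encoding, plain copy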

This works fine:

//Convert CString to std::string
inline std::string to_string(const CString& cst)
{
    // Direct-initialize from the CT2A temporary; its buffer lives
    // until the end of the full expression.
    return std::string(CT2A(cst.GetString()));
}
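
The opposite direction can be sketched the same way with CT2W (to_wstring here is just a hypothetical companion helper):

//Convert CString to std::wstring
inline std::wstring to_wstring(const CString& cst)
{
    return std::wstring(CT2W(cst.GetString()));
}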

You can use CT2CA to convert a CString to std::string:

CString datasetPath;
CT2CA st(datasetPath);
string dataset(st);

For an explicit UTF-8 conversion, you can use this form:

std::string sText(CW2A(CSText.GetString(), CP_UTF8));

None of the other answers quite addressed what I was looking for, which was to convert CString on the fly, as opposed to storing the result in a variable.

The solution is similar to the above, but we need one more step to instantiate a nameless object. I will illustrate with an example. Here is my function, which needs a std::string, but I have a CString.

void CStringsPlayDlg::writeLog(const std::string &text)
{
    std::string filename = "c:\\test\\test.txt";

    std::ofstream log_file(filename.c_str(), std::ios_base::out | std::ios_base::app);

    log_file << text << std::endl;
}

How to call it when you have a CString?

std::string firstName = "First";
CString lastName = _T("Last");

writeLog( firstName + ", " + std::string( CT2A( lastName ) ) );     

Note that the last line is not a direct typecast; we are creating a nameless std::string object and supplying the CString via its constructor.

