I came across these two methods to concatenate strings:
Common part:
char* first= "First";
char* second = "Second";
char* both = malloc(strlen(first) + strlen(second) + 2);
Method 1:
strcpy(both, first);
strcat(both, " "); // or space could have been part of one of the strings
strcat(both, second);
Method 2:
sprintf(both, "%s %s", first, second);
In both cases the content of both
would be "First Second"
.
I would like to know which one is more efficient (I have to perform several concatenation operations), or if you know a better way to do it.
This question is related to
c
string
performance
concatenation
sprintf() is designed to handle far more than just strings, strcat() is specialist. But I suspect that you are sweating the small stuff. C strings are fundamentally inefficient in ways that make the differences between these two proposed methods insignificant. Read "Back to Basics" by Joel Spolsky for the gory details.
This is an instance where C++ generally performs better than C. For heavy weight string handling using std::string is likely to be more efficient and certainly safer.
[edit]
[2nd edit]Corrected code (too many iterations in C string implementation), timings, and conclusion change accordingly
I was surprised at Andrew Bainbridge's comment that std::string was slower, but he did not post complete code for this test case. I modified his (automating the timing) and added a std::string test. The test was on VC++ 2008 (native code) with default "Release" options (i.e. optimised), Athlon dual core, 2.6GHz. Results:
C string handling = 0.023000 seconds
sprintf = 0.313000 seconds
std::string = 0.500000 seconds
So here strcat() is faster by far (your milage may vary depending on compiler and options), despite the inherent inefficiency of the C string convention, and supports my original suggestion that sprintf() carries a lot of baggage not required for this purpose. It remains by far the least readable and safe however, so when performance is not critical, has little merit IMO.
I also tested a std::stringstream implementation, which was far slower again, but for complex string formatting still has merit.
Corrected code follows:
#include <ctime>
#include <cstdio>
#include <cstring>
#include <string>
void a(char *first, char *second, char *both)
{
for (int i = 0; i != 1000000; i++)
{
strcpy(both, first);
strcat(both, " ");
strcat(both, second);
}
}
void b(char *first, char *second, char *both)
{
for (int i = 0; i != 1000000; i++)
sprintf(both, "%s %s", first, second);
}
void c(char *first, char *second, char *both)
{
std::string first_s(first) ;
std::string second_s(second) ;
std::string both_s(second) ;
for (int i = 0; i != 1000000; i++)
both_s = first_s + " " + second_s ;
}
int main(void)
{
char* first= "First";
char* second = "Second";
char* both = (char*) malloc((strlen(first) + strlen(second) + 2) * sizeof(char));
clock_t start ;
start = clock() ;
a(first, second, both);
printf( "C string handling = %f seconds\n", (float)(clock() - start)/CLOCKS_PER_SEC) ;
start = clock() ;
b(first, second, both);
printf( "sprintf = %f seconds\n", (float)(clock() - start)/CLOCKS_PER_SEC) ;
start = clock() ;
c(first, second, both);
printf( "std::string = %f seconds\n", (float)(clock() - start)/CLOCKS_PER_SEC) ;
return 0;
}
Here's some madness for you, I actually went and measured it. Bloody hell, imagine that. I think I got some meaningful results.
I used a dual core P4, running Windows, using mingw gcc 4.4, building with "gcc foo.c -o foo.exe -std=c99 -Wall -O2".
I tested method 1 and method 2 from the original post. Initially kept the malloc outside the benchmark loop. Method 1 was 48 times faster than method 2. Bizarrely, removing -O2 from the build command made the resulting exe 30% faster (haven't investigated why yet).
Then I added a malloc and free inside the loop. That slowed down method 1 by a factor of 4.4. Method 2 slowed down by a factor of 1.1.
So, malloc + strlen + free DO NOT dominate the profile enough to make avoiding sprintf worth while.
Here's the code I used (apart from the loops were implemented with < instead of != but that broke the HTML rendering of this post):
void a(char *first, char *second, char *both)
{
for (int i = 0; i != 1000000 * 48; i++)
{
strcpy(both, first);
strcat(both, " ");
strcat(both, second);
}
}
void b(char *first, char *second, char *both)
{
for (int i = 0; i != 1000000 * 1; i++)
sprintf(both, "%s %s", first, second);
}
int main(void)
{
char* first= "First";
char* second = "Second";
char* both = (char*) malloc((strlen(first) + strlen(second) + 2) * sizeof(char));
// Takes 3.7 sec with optimisations, 2.7 sec WITHOUT optimisations!
a(first, second, both);
// Takes 3.7 sec with or without optimisations
//b(first, second, both);
return 0;
}
They should be pretty much the same. The difference isn't going to matter. I would go with sprintf
since it requires less code.
I don't know that in case two there's any real concatenation done. Printing them back to back doesn't constitute concatenation.
Tell me though, which would be faster:
1) a) copy string A to new buffer b) copy string B to buffer c) copy buffer to output buffer
or
1)copy string A to output buffer b) copy string b to output buffer
Don't worry about efficiency: make your code readable and maintainable. I doubt the difference between these methods is going to matter in your program.
The difference is unlikely to matter:
As other posters have mentioned, this is a premature optimization. Concentrate on algorithm design, and only come back to this if profiling shows it to be a performance problem.
That said... I suspect method 1 will be faster. There is some---admittedly small---overhead to parse the sprintf format-string. And strcat is more likely "inline-able".
size_t lf = strlen(first);
size_t ls = strlen(second);
char *both = (char*) malloc((lf + ls + 2) * sizeof(char));
strcpy(both, first);
both[lf] = ' ';
strcpy(&both[lf+1], second);
Neither is terribly efficient since both methods have to calculate the string length or scan it each time. Instead, since you calculate the strlen()s of the individual strings anyway, put them in variables and then just strncpy() twice.
Source: Stackoverflow.com