All compilers I know will replace a simple std::copy with a memcpy when it is appropriate, or even better, vectorize the copy so that it would be even faster than a memcpy.
In any case: profile and find out yourself. Different compilers will do different things, and it's quite possible that yours won't do exactly what you ask.
See this presentation on compiler optimisations (pdf).
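As a starting point for that profiling, here is a minimal timing sketch. The names time_copy_us, copy_timings and compare_copies are made up for this illustration, and a single untimed-warmup-free run like this is only a rough indication; a real measurement should repeat the copies many times and be built with the same optimisation flags as your actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

struct foo
{
    int x, y;
};

// Hypothetical helper: run copy_fn once and return the elapsed microseconds.
template <typename CopyFn>
double time_copy_us(CopyFn copy_fn)
{
    auto start = std::chrono::steady_clock::now();
    copy_fn();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Hypothetical result type for the comparison below.
struct copy_timings
{
    double std_copy_us;
    double memcpy_us;
    bool results_match;
};

// Copy n foo objects once with std::copy and once with memcpy,
// timing each and checking that both destinations hold the same bytes.
copy_timings compare_copies(std::size_t n)
{
    std::vector<foo> src(n), dst_copy(n), dst_memcpy(n);
    for (std::size_t i = 0; i < n; ++i) {
        src[i].x = static_cast<int>(i);
        src[i].y = static_cast<int>(i) + 1;
    }
    assert(dst_copy.size() == src.size() && dst_memcpy.size() == src.size());

    copy_timings t{};
    t.std_copy_us = time_copy_us([&] {
        std::copy(src.begin(), src.end(), dst_copy.begin());
    });
    t.memcpy_us = time_copy_us([&] {
        if (n > 0) {  // avoid passing a possibly null data() to memcpy
            std::memcpy(dst_memcpy.data(), src.data(), n * sizeof(foo));
        }
    });
    // foo is two ints with no padding, so a bytewise comparison is meaningful.
    t.results_match =
        n == 0 ||
        std::memcmp(dst_copy.data(), dst_memcpy.data(), n * sizeof(foo)) == 0;
    return t;
}
```

Calling compare_copies with a few different sizes and printing the two timings gives a first impression; expect the numbers to vary with compiler, flags and standard library.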
Here's what GCC does for a simple std::copy of a POD type.
#include <algorithm>
#include <cstddef>

struct foo
{
    int x, y;
};

void bar(foo* a, foo* b, std::size_t n)
{
    // Copy n trivially copyable objects from a to b.
    std::copy(a, a + n, b);
}
Here's the disassembly (with only -O optimisation), showing the call to memmove:
bar(foo*, foo*, unsigned long):
salq $3, %rdx
sarq $3, %rdx
testq %rdx, %rdx
je .L5
subq $8, %rsp
movq %rsi, %rax
salq $3, %rdx
movq %rdi, %rsi
movq %rax, %rdi
call memmove
addq $8, %rsp
.L5:
rep
ret
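If you change the function signature to
void bar(foo* __restrict a, foo* __restrict b, std::size_t n)
then the memmove becomes a memcpy for a slight performance improvement: __restrict (a compiler extension, not standard C++) promises that the two ranges do not overlap, so the overlap-safe memmove is no longer needed. Note that memcpy itself will be heavily vectorised.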
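For reference, a complete version of the __restrict variant might look like the sketch below. The names bar_restrict, ranges_overlap and bar_checked are invented for this example, and __restrict is assumed to be available (it is in GCC and Clang; other compilers may spell it differently or reject it in this position). The static_assert spells out the "when it is appropriate" condition: the memcpy/memmove lowering is only valid for trivially copyable types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>

struct foo
{
    int x, y;
};

// A bytewise copy is only a legal replacement for trivially copyable types.
static_assert(std::is_trivially_copyable<foo>::value,
              "foo must be trivially copyable");

// __restrict (non-standard) tells the compiler that a and b never alias.
// Calling this with overlapping ranges is undefined behaviour.
void bar_restrict(foo* __restrict a, foo* __restrict b, std::size_t n)
{
    std::copy(a, a + n, b);
}

// Hypothetical helper: true if [a, a + n) and [b, b + n) share any element.
// std::less gives a well-defined ordering even for unrelated pointers.
bool ranges_overlap(const foo* a, const foo* b, std::size_t n)
{
    std::less<const foo*> before;
    return before(a, b + n) && before(b, a + n);
}

// Hypothetical debug wrapper: check the no-overlap promise before relying on it.
// The assert disappears when compiled with -DNDEBUG.
void bar_checked(foo* a, foo* b, std::size_t n)
{
    assert(!ranges_overlap(a, b, n));
    bar_restrict(a, b, n);
}
```

Whether the call actually becomes memcpy still depends on the compiler and flags, so it is worth checking the disassembly of bar_restrict on your own toolchain.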