Some more numbers and explanations for the C version. Apparently noone did it during all those years. Remember to upvote this answer so it can get on top for everyone to see and learn.
Laptop Specifications:
Commands:
compiling on VS x64 command prompt > `for /f %f in ('dir /b *.c') do cl /O2 /Ot /Ox %f -o %f_x64_vs2012.exe`
compiling on cygwin with gcc x64 > `for f in ./*.c; do gcc -m64 -O3 $f -o ${f}_x64_gcc.exe ; done`
time (unix tools) using cygwin > `for f in ./*.exe; do echo "----------"; echo $f ; time $f ; done`
.
----------
$ time python ./original.py
real 2m17.748s
user 2m15.783s
sys 0m0.093s
----------
$ time ./original_x86_vs2012.exe
real 0m8.377s
user 0m0.015s
sys 0m0.000s
----------
$ time ./original_x64_vs2012.exe
real 0m8.408s
user 0m0.000s
sys 0m0.015s
----------
$ time ./original_x64_gcc.exe
real 0m20.951s
user 0m20.732s
sys 0m0.030s
Filenames are: integertype_architecture_compiler.exe
VS is 250% faster than gcc. The two compilers should give a similar speed. Obviously, something is wrong with the code or the compiler options. Let's investigate!
The first point of interest is the integer types. Conversions can be expensive and consistency is important for better code generation & optimizations. All integers should be the same type.
It's a mixed mess of int
and long
right now. We're going to improve that. What type to use? The fastest. Gotta benchmark them'all!
----------
$ time ./int_x86_vs2012.exe
real 0m8.440s
user 0m0.016s
sys 0m0.015s
----------
$ time ./int_x64_vs2012.exe
real 0m8.408s
user 0m0.016s
sys 0m0.015s
----------
$ time ./int32_x86_vs2012.exe
real 0m8.408s
user 0m0.000s
sys 0m0.015s
----------
$ time ./int32_x64_vs2012.exe
real 0m8.362s
user 0m0.000s
sys 0m0.015s
----------
$ time ./int64_x86_vs2012.exe
real 0m18.112s
user 0m0.000s
sys 0m0.015s
----------
$ time ./int64_x64_vs2012.exe
real 0m18.611s
user 0m0.000s
sys 0m0.015s
----------
$ time ./long_x86_vs2012.exe
real 0m8.393s
user 0m0.015s
sys 0m0.000s
----------
$ time ./long_x64_vs2012.exe
real 0m8.440s
user 0m0.000s
sys 0m0.015s
----------
$ time ./uint32_x86_vs2012.exe
real 0m8.362s
user 0m0.000s
sys 0m0.015s
----------
$ time ./uint32_x64_vs2012.exe
real 0m8.393s
user 0m0.015s
sys 0m0.015s
----------
$ time ./uint64_x86_vs2012.exe
real 0m15.428s
user 0m0.000s
sys 0m0.015s
----------
$ time ./uint64_x64_vs2012.exe
real 0m15.725s
user 0m0.015s
sys 0m0.015s
----------
$ time ./int_x64_gcc.exe
real 0m8.531s
user 0m8.329s
sys 0m0.015s
----------
$ time ./int32_x64_gcc.exe
real 0m8.471s
user 0m8.345s
sys 0m0.000s
----------
$ time ./int64_x64_gcc.exe
real 0m20.264s
user 0m20.186s
sys 0m0.015s
----------
$ time ./long_x64_gcc.exe
real 0m20.935s
user 0m20.809s
sys 0m0.015s
----------
$ time ./uint32_x64_gcc.exe
real 0m8.393s
user 0m8.346s
sys 0m0.015s
----------
$ time ./uint64_x64_gcc.exe
real 0m16.973s
user 0m16.879s
sys 0m0.030s
Integer types are int
long
int32_t
uint32_t
int64_t
and uint64_t
from #include <stdint.h>
There are LOTS of integer types in C, plus some signed/unsigned to play with, plus the choice to compile as x86 or x64 (not to be confused with the actual integer size). That is a lot of versions to compile and run ^^
Definitive conclusions:
Trick question: "What are the sizes of int and long in C?"
The right answer is: The size of int and long in C are not well-defined!
From the C spec:
int is at least 32 bits
long is at least an int
From the gcc man page (-m32 and -m64 flags):
The 32-bit environment sets int, long and pointer to 32 bits and generates code that runs on any i386 system.
The 64-bit environment sets int to 32 bits and long and pointer to 64 bits and generates code for AMD’s x86-64 architecture.
From MSDN documentation (Data Type Ranges) https://msdn.microsoft.com/en-us/library/s3f49ktz%28v=vs.110%29.aspx :
int, 4 bytes, also knows as signed
long, 4 bytes, also knows as long int and signed long int
32 bits integers are faster than 64 bits integers.
Standard integers types are not well defined in C nor C++, they vary depending on compilers and architectures. When you need consistency and predictability, use the uint32_t
integer family from #include <stdint.h>
.
Speed issues solved. All other languages are back hundreds percent behind, C & C++ win again ! They always do. The next improvement will be multithreading using OpenMP :D