[c++] Is multiplication and division using shift operators in C actually faster?

Is it actually faster to use say (i<<3)+(i<<1) to multiply with 10 than using i*10 directly?

It might or might not be on your machine - if you care, measure in your real-world usage.

A case study - from 486 to core i7

Benchmarking is very difficult to do meaningfully, but we can look at a few facts. From http://www.penguin.cz/~literakl/intel/s.html#SAL and http://www.penguin.cz/~literakl/intel/i.html#IMUL we get an idea of x86 clock cycles needed for arithmetic shift and multiplication. Say we stick to "486" (the newest one listed), 32 bit registers and immediates, IMUL takes 13-42 cycles and IDIV 44. Each SAL takes 2, and adding 1, so even with a few of those together shifting superficially looks like a winner.

These days, with the core i7:

(from http://software.intel.com/en-us/forums/showthread.php?t=61481)

The latency is 1 cycle for an integer addition and 3 cycles for an integer multiplication. You can find the latencies and thoughput in Appendix C of the "IntelĀ® 64 and IA-32 Architectures Optimization Reference Manual", which is located on http://www.intel.com/products/processor/manuals/.

(from some Intel blurb)

Using SSE, the Core i7 can issue simultaneous add and multiply instructions, resulting in a peak rate of 8 floating-point operations (FLOP) per clock cycle

That gives you an idea of how far things have come. The optimisation trivia - like bit shifting versus * - that was been taken seriously even into the 90s is just obsolete now. Bit-shifting is still faster, but for non-power-of-two mul/div by the time you do all your shifts and add the results it's slower again. Then, more instructions means more cache faults, more potential issues in pipelining, more use of temporary registers may mean more saving and restoring of register content from the stack... it quickly gets too complicated to quantify all the impacts definitively but they're predominantly negative.

functionality in source code vs implementation

More generally, your question is tagged C and C++. As 3rd generation languages, they're specifically designed to hide the details of the underlying CPU instruction set. To satisfy their language Standards, they must support multiplication and shifting operations (and many others) even if the underlying hardware doesn't. In such cases, they must synthesize the required result using many other instructions. Similarly, they must provide software support for floating point operations if the CPU lacks it and there's no FPU. Modern CPUs all support * and <<, so this might seem absurdly theoretical and historical, but the significance thing is that the freedom to choose implementation goes both ways: even if the CPU has an instruction that implements the operation requested in the source code in the general case, the compiler's free to choose something else that it prefers because it's better for the specific case the compiler's faced with.

Examples (with a hypothetical assembly language)

source           literal approach         optimised approach
#define N 0
int x;           .word x                xor registerA, registerA
x *= N;          move x -> registerA
                 move x -> registerB
                 A = B * immediate(0)
                 store registerA -> x
  ...............do something more with x...............

Instructions like exclusive or (xor) have no relationship to the source code, but xor-ing anything with itself clears all the bits, so it can be used to set something to 0. Source code that implies memory addresses may not entail any being used.

These kind of hacks have been used for as long as computers have been around. In the early days of 3GLs, to secure developer uptake the compiler output had to satisfy the existing hardcore hand-optimising assembly-language dev. community that the produced code wasn't slower, more verbose or otherwise worse. Compilers quickly adopted lots of great optimisations - they became a better centralised store of it than any individual assembly language programmer could possibly be, though there's always the chance that they miss a specific optimisation that happens to be crucial in a specific case - humans can sometimes nut it out and grope for something better while compilers just do as they've been told until someone feeds that experience back into them.

So, even if shifting and adding is still faster on some particular hardware, then the compiler writer's likely to have worked out exactly when it's both safe and beneficial.

Maintainability

If your hardware changes you can recompile and it'll look at the target CPU and make another best choice, whereas you're unlikely to ever want to revisit your "optimisations" or list which compilation environments should use multiplication and which should shift. Think of all the non-power-of-two bit-shifted "optimisations" written 10+ years ago that are now slowing down the code they're in as it runs on modern processors...!

Thankfully, good compilers like GCC can typically replace a series of bitshifts and arithmetic with a direct multiplication when any optimisation is enabled (i.e. ...main(...) { return (argc << 4) + (argc << 2) + argc; } -> imull $21, 8(%ebp), %eax) so a recompilation may help even without fixing the code, but that's not guaranteed.

Strange bitshifting code implementing multiplication or division is far less expressive of what you were conceptually trying to achieve, so other developers will be confused by that, and a confused programmer's more likely to introduce bugs or remove something essential in an effort to restore seeming sanity. If you only do non-obvious things when they're really tangibly beneficial, and then document them well (but don't document other stuff that's intuitive anyway), everyone will be happier.

General solutions versus partial solutions

If you have some extra knowledge, such as that your int will really only be storing values x, y and z, then you may be able to work out some instructions that work for those values and get you your result more quickly than when the compiler's doesn't have that insight and needs an implementation that works for all int values. For example, consider your question:

Multiplication and division can be achieved using bit operators...

You illustrate multiplication, but how about division?

int x;
x >> 1;   // divide by 2?

According to the C++ Standard 5.8:

-3- The value of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 divided by the quantity 2 raised to the power E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

So, your bit shift has an implementation defined result when x is negative: it may not work the same way on different machines. But, / works far more predictably. (It may not be perfectly consistent either, as different machines may have different representations of negative numbers, and hence different ranges even when there are the same number of bits making up the representation.)

You may say "I don't care... that int is storing the age of the employee, it can never be negative". If you have that kind of special insight, then yes - your >> safe optimisation might be passed over by the compiler unless you explicitly do it in your code. But, it's risky and rarely useful as much of the time you won't have this kind of insight, and other programmers working on the same code won't know that you've bet the house on some unusual expectations of the data you'll be handling... what seems a totally safe change to them might backfire because of your "optimisation".

Is there any sort of input that can't be multiplied or divided in this way?

Yes... as mentioned above, negative numbers have implementation defined behaviour when "divided" by bit-shifting.

Examples related to c++

Method Call Chaining; returning a pointer vs a reference? How can I tell if an algorithm is efficient? Difference between opening a file in binary vs text How can compare-and-swap be used for a wait-free mutual exclusion for any shared data structure? Install Qt on Ubuntu #include errors detected in vscode Cannot open include file: 'stdio.h' - Visual Studio Community 2017 - C++ Error How to fix the error "Windows SDK version 8.1" was not found? Visual Studio 2017 errors on standard headers How do I check if a Key is pressed on C++

Examples related to c

conflicting types for 'outchar' Can't compile C program on a Mac after upgrade to Mojave Program to find largest and second largest number in array Prime numbers between 1 to 100 in C Programming Language In c, in bool, true == 1 and false == 0? How I can print to stderr in C? Visual Studio Code includePath "error: assignment to expression with array type error" when I assign a struct field (C) Compiling an application for use in highly radioactive environments How can you print multiple variables inside a string using printf?

Examples related to division

Division in Python 2.7. and 3.3 Python3 integer division How to do integer division in javascript (Getting division answer in int not float)? How can I do division with variables in a Linux shell? Python: Remove division decimal How to get a float result by dividing two integer values using T-SQL? Divide a number by 3 without using *, /, +, -, % operators Why does integer division in C# return an integer and not a float? How to check if number is divisible by a certain number? C++ Best way to get integer division and remainder

Examples related to multiplication

How to multiply all integers inside list How can I multiply all items in a list together with Python? Why does multiplication repeats the number several times? How to perform element-wise multiplication of two lists? How to multiply individual elements of a list with a number? Is multiplication and division using shift operators in C actually faster? How can a query multiply 2 cell for each row MySQL? Create list of single item repeated N times How can I multiply and divide using only bit shifting and adding?

Examples related to bit-shift

What is (x & 1) and (x >>= 1)? What does AND 0xFF do? How do shift operators work in Java? What does a bitwise shift (left or right) do and what is it used for? Is multiplication and division using shift operators in C actually faster? What are bitwise shift (bit-shift) operators and how do they work? Are the shift operators (<<, >>) arithmetic or logical in C?