How do the likely/unlikely macros in the Linux kernel work and what is their benefit?

Question

I've been digging through some parts of the Linux kernel, and found calls like this:

    if (unlikely(fd < 0))
    {
        /* Do something */
    }

or

    if (likely(!err))
    {
        /* Do something */
    }

I've found the definition of them:

    #define likely(x)       __builtin_expect((x), 1)
    #define unlikely(x)     __builtin_expect((x), 0)

I know that they are for optimization, but how do they work? And how much performance/size decrease can be expected from using them? And is it worth the hassle (and probably losing the portability), at least in bottleneck code (in userspace, of course)?

User · Answer

(General comment; other answers cover the details.)

There's no reason that you should lose portability by using them.

You always have the option of creating a simple nil-effect "inline" or macro that will allow you to compile on other platforms with other compilers.

You just won't get the benefit of the optimization if you're on other platforms.
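A minimal sketch of such a nil-effect fallback, assuming the hint is only wanted when the compiler identifies itself as GCC or Clang. The helper `check_fd` is hypothetical and exists only to show the macros in use; it is not taken from the kernel.

```c
/* Use the builtin where the compiler advertises GCC compatibility;
 * otherwise fall back to the plain condition with no hint at all. */
#if defined(__GNUC__) || defined(__clang__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
#  define likely(x)   (!!(x))
#  define unlikely(x) (!!(x))
#endif

/* Hypothetical helper, for illustration only: returns -1 for an
 * invalid descriptor and the descriptor itself otherwise. */
int check_fd(int fd)
{
    if (unlikely(fd < 0))
        return -1;  /* rare error path */
    return fd;      /* common path */
}
```

Either branch of the `#if` gives the same program behaviour; only the layout of the generated code may differ.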

User · Answer

As per the comment by Cody, this has nothing to do with Linux, but is a hint to the compiler. What happens will depend on the architecture and compiler version.

This particular feature in Linux is somewhat misused in drivers. As osgx points out in "semantics of hot attribute", any hot or cold function called within a block can automatically hint that the condition is likely or not. For instance, dump_stack() is marked cold, so this is redundant:

    if (unlikely(err)) {
        printk("Driver error found. %d\n", err);
        dump_stack();
    }

Future versions of gcc may selectively inline a function based on these hints. There have also been suggestions that it is not boolean, but a score, as in "most likely", etc. Generally, it should be preferred to use some alternate mechanism like cold. There is no reason to use it in any place but hot paths. What a compiler will do on one architecture can be completely different on another.
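A rough userspace sketch of the cold alternative, assuming GCC or Clang (the `cold` and `noinline` function attributes are compiler extensions). The functions `report_error` and `process` are hypothetical stand-ins for a driver's error reporter and its caller.

```c
#include <stdio.h>

/* Hypothetical error reporter. Marking it cold tells the compiler it is
 * rarely called, so branches leading to it can be treated as unlikely. */
__attribute__((cold, noinline))
void report_error(int err)
{
    fprintf(stderr, "Driver error found. %d\n", err);
}

/* Hypothetical caller: no unlikely() around the condition, because the
 * call to a cold function already carries the hint. */
int process(int err)
{
    if (err) {
        report_error(err);
        return -1;
    }
    return 0;
}
```

Whether the compiler actually moves the error path out of line depends on the compiler version, optimization level and target.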

User · Answer

These are macros that give hints to the compiler about which way a branch may go. The macros expand to GCC-specific extensions, if they're available.

GCC uses these to optimize for branch prediction. For example, if you have something like the following:

    if (unlikely(x)) {
        dosomething();
    }

    return x;

Then it can restructure this code to be something more like:

    if (!x) {
        return x;
    }

    dosomething();
    return x;

The benefit of this is that when the processor takes a branch the first time, there is significant overhead, because it may have been speculatively loading and executing code further ahead. When it determines it will take the branch, then it has to invalidate that, and start at the branch target.

Most modern processors now have some sort of branch prediction, but that only assists when you've been through the branch before, and the branch is still in the branch prediction cache.

There are a number of other strategies that the compiler and processor can use in these scenarios. You can find more details on how branch predictors work at Wikipedia: http://en.wikipedia.org/wiki/Branch_predictor

User · Answer

They are hints to the compiler to emit instructions that will cause branch prediction to favour the "likely" side of a jump instruction. This can be a big win: if the prediction is correct, it means that the jump instruction is basically free and will take zero cycles. On the other hand, if the prediction is wrong, then it means the processor pipeline needs to be flushed, and it can cost several cycles. So long as the prediction is correct most of the time, this will tend to be good for performance.

Like all such performance optimisations, you should only do it after extensive profiling to ensure the code really is in a bottleneck and, given the micro nature, probably that it is being run in a tight loop. Generally the Linux developers are pretty experienced, so I would imagine they would have done that. They don't really care too much about portability as they only target gcc, and they have a very close idea of the assembly they want it to generate.

User · Answer

In many Linux releases, you can find compiler.h in /usr/linux/, and you can simply include it for use. Another opinion: unlikely() is more useful than likely(), because

    if (likely(...)) {
        doSomething(...);
    }

can already be optimized well by many compilers.

And by the way, if you want to observe the detailed behavior of the code, you can simply do the following:

    gcc -c test.c
    objdump -d test.o > obj.s

Then open obj.s and you can find the answer.

User · Answer

These are GCC functions for the programmer to give a hint to the compiler about what the most likely branch condition will be in a given expression. This allows the compiler to build the branch instructions so that the most common case takes the fewest number of instructions to execute.

How the branch instructions are built is dependent upon the processor architecture.

User · Answer

They're hints to the compiler to generate the hint prefixes on branches. On x86/x64, they take up one byte, so you'll get at most a one-byte increase for each branch. As for performance, it entirely depends on the application -- in most cases, the branch predictor on the processor will ignore them, these days.

Edit: Forgot about one place they can actually really help with. It can allow the compiler to reorder the control-flow graph to reduce the number of branches taken for the "likely" path. This can have a marked improvement in loops where you're checking multiple exit cases.
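As an illustration of a loop with several exit cases, here is a small sketch assuming GCC or Clang. The function `sum_until_stop` and its conventions (a negative value is an error, zero is a terminator) are invented for this example. Marking both exits as unlikely lets the compiler lay out the accumulate-and-continue path as the straight-line one; whether that is measurably faster depends on the compiler and CPU.

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical example: sums values until a zero terminator is seen,
 * or returns -1 as soon as a negative (invalid) value appears. */
long sum_until_stop(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(v[i] < 0))   /* rare exit: invalid input */
            return -1;
        if (unlikely(v[i] == 0))  /* rare exit: terminator */
            break;
        total += v[i];            /* common path */
    }
    return total;
}
```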

User · Answer

    long __builtin_expect(long EXP, long C);

This construct tells the compiler that the expression EXP most likely will have the value C. The return value is EXP. __builtin_expect is meant to be used in a conditional expression. In almost all cases it will be used in the context of boolean expressions, in which case it is much more convenient to define two helper macros:

    #define unlikely(expr) __builtin_expect(!!(expr), 0)
    #define likely(expr)   __builtin_expect(!!(expr), 1)

These macros can then be used as in

    if (likely(a > 1))

Reference: https://www.akkadia.org/drepper/cpumemory.pdf
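The `!!` in these macros matters because `__builtin_expect` compares EXP against the exact value C. A short sketch, assuming GCC or Clang; `safe_len` is a hypothetical helper written only to show the macros applied to a pointer and to a non-boolean character value.

```c
#include <stddef.h>

#define unlikely(expr) __builtin_expect(!!(expr), 0)
#define likely(expr)   __builtin_expect(!!(expr), 1)

/* Hypothetical helper: string length that tolerates a NULL pointer. */
size_t safe_len(const char *s)
{
    size_t n = 0;
    if (unlikely(!s))     /* rare: caller passed NULL */
        return 0;
    /* s[n] can be any nonzero character; !! maps it to exactly 1,
     * which is the value the hint says to expect. */
    while (likely(s[n]))
        n++;
    return n;
}
```

Without the double negation, `__builtin_expect(5, 1)` still returns 5, so the branch behaves the same, but the hint would claim the value is usually 1, which is not what was meant.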

User · Answer

They cause the compiler to emit the appropriate branch hints where the hardware supports them. This usually just means twiddling a few bits in the instruction opcode, so code size will not change. The CPU will start fetching instructions from the predicted location, and flush the pipeline and start over if that turns out to be wrong when the branch is reached; in the case where the hint is correct, this will make the branch much faster - precisely how much faster will depend on the hardware, and how much this affects the performance of the code will depend on what proportion of the time the hint is correct.

For instance, on a PowerPC CPU an unhinted branch might take 16 cycles, a correctly hinted one 8 and an incorrectly hinted one 24. In innermost loops good hinting can make an enormous difference.

Portability isn't really an issue - presumably the definition is in a per-platform header; you can simply define "likely" and "unlikely" to nothing for platforms that do not support static branch hints.

User · Answer

Let's decompile to see what GCC 4.8 does with it

Without __builtin_expect

    #include "stdio.h"
    #include "time.h"

    int main() {
        /* Use time to prevent it from being optimized away. */
        int i = !time(NULL);
        if (i)
            printf("%d\n", i);
        puts("a");
        return 0;
    }

Compile and decompile with GCC 4.8.2 x86_64 Linux:

    gcc -c -O3 -std=gnu11 main.c
    objdump -dr main.o

Output:

    0000000000000000 <main>:
       0:       48 83 ec 08             sub    $0x8,%rsp
       4:       31 ff                   xor    %edi,%edi
       6:       e8 00 00 00 00          callq  b <main+0xb>
                        7: R_X86_64_PC32        time-0x4
       b:       48 85 c0                test   %rax,%rax
       e:       75 14                   jne    24 <main+0x24>
      10:       ba 01 00 00 00          mov    $0x1,%edx
      15:       be 00 00 00 00          mov    $0x0,%esi
                        16: R_X86_64_32 .rodata.str1.1
      1a:       bf 01 00 00 00          mov    $0x1,%edi
      1f:       e8 00 00 00 00          callq  24 <main+0x24>
                        20: R_X86_64_PC32       __printf_chk-0x4
      24:       bf 00 00 00 00          mov    $0x0,%edi
                        25: R_X86_64_32 .rodata.str1.1+0x4
      29:       e8 00 00 00 00          callq  2e <main+0x2e>
                        2a: R_X86_64_PC32       puts-0x4
      2e:       31 c0                   xor    %eax,%eax
      30:       48 83 c4 08             add    $0x8,%rsp
      34:       c3                      retq

The instruction order in memory was unchanged: first the printf and then puts and the retq return.

With __builtin_expect

Now replace if (i) with:

    if (__builtin_expect(i, 0))

and we get:

    0000000000000000 <main>:
       0:       48 83 ec 08             sub    $0x8,%rsp
       4:       31 ff                   xor    %edi,%edi
       6:       e8 00 00 00 00          callq  b <main+0xb>
                        7: R_X86_64_PC32        time-0x4
       b:       48 85 c0                test   %rax,%rax
       e:       74 11                   je     21 <main+0x21>
      10:       bf 00 00 00 00          mov    $0x0,%edi
                        11: R_X86_64_32 .rodata.str1.1+0x4
      15:       e8 00 00 00 00          callq  1a <main+0x1a>
                        16: R_X86_64_PC32       puts-0x4
      1a:       31 c0                   xor    %eax,%eax
      1c:       48 83 c4 08             add    $0x8,%rsp
      20:       c3                      retq
      21:       ba 01 00 00 00          mov    $0x1,%edx
      26:       be 00 00 00 00          mov    $0x0,%esi
                        27: R_X86_64_32 .rodata.str1.1
      2b:       bf 01 00 00 00          mov    $0x1,%edi
      30:       e8 00 00 00 00          callq  35 <main+0x35>
                        31: R_X86_64_PC32       __printf_chk-0x4
      35:       eb d9                   jmp    10 <main+0x10>

The printf (compiled to __printf_chk) was moved to the very end of the function, after puts and the return, to improve branch prediction as mentioned by other answers.

So it is basically the same as:

    int main() {
        int i = !time(NULL);
        if (i)
            goto printf;
    puts:
        puts("a");
        return 0;
    printf:
        printf("%d\n", i);
        goto puts;
    }

This optimization was not done with -O0.

But good luck on writing an example that runs faster with __builtin_expect than without; CPUs are really smart these days. My naive attempts are here.

C++20 [[likely]] and [[unlikely]]

C++20 has standardized those C++ built-ins: "How to use C++20's likely/unlikely attribute in if-else statement". They will likely (a pun!) do the same thing.
