They cause the compiler to emit the appropriate branch hints where the hardware supports them. This usually just means twiddling a few bits in the instruction opcode, so code size will not change. The CPU will start fetching instructions from the predicted location, and flush the pipeline and start over if that turns out to be wrong when the branch is reached; in the case where the hint is correct, this will make the branch much faster - precisely how much faster will depend on the hardware; and how much this affects the performance of the code will depend on what proportion of the time hint is correct.
For instance, on a PowerPC CPU an unhinted branch might take 16 cycles, a correctly hinted one 8 and an incorrectly hinted one 24. In innermost loops good hinting can make an enormous difference.
Portability isn't really an issue - presumably the definition is in a per-platform header; you can simply define "likely" and "unlikely" to nothing for platforms that do not support static branch hints.