Before implementing any algorithmic solution, check the assembly language for whatever CPU architecture you are using. Your architecture may include instructions which handle bitwise manipulations like this (and what could be simpler than a single assembly instruction?).
If such an instruction is not available, then I would suggest going with the lookup table route. You can write a script/program to generate the table for you, and the lookup operations would be faster than any of the bit-reversing algorithms here (at the cost of having to store the lookup table somewhere).