Since the exact question is "what does enable bitcode do", I'd like to give a few thin technical details I've figured out thus far. Most of this is practically impossible to figure out with 100% certainty until Apple releases the source code for this compiler
First, Apple's bitcode does not appear to be the same thing as LLVM bytecode. At least, I've not been able to figure out any resemblance between them. It appears to have a proprietary header (always starts with "xar!") and probably some link-time reference magic that prevents data duplications. If you write out a hardcoded string, this string will only be put into the data once, rather than twice as would be expected if it was normal LLVM bytecode.
Second, bitcode is not really shipped in the binary archive as a separate architecture as might be expected. It is not shipped in the same way as say x86 and ARM are put into one binary (FAT archive). Instead, they use a special section in the architecture specific MachO binary named "__LLVM" which is shipped with every architecture supported (ie, duplicated). I assume this is a short coming with their compiler system and may be fixed in the future to avoid the duplication.
C code (compiled with clang -fembed-bitcode hi.c -S -emit-llvm
):
#include <stdio.h>
int main() {
printf("hi there!");
return 0;
}
LLVM IR output:
; ModuleID = '/var/folders/rd/sv6v2_f50nzbrn4f64gnd4gh0000gq/T/hi-a8c16c.bc'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"
@.str = private unnamed_addr constant [10 x i8] c"hi there!\00", align 1
@llvm.embedded.module = appending constant [1600 x i8] c"\DE\C0\17\0B\00\00\00\00\14\00\00\00$\06\00\00\07\00\00\01BC\C0\DE!\0C\00\00\86\01\00\00\0B\82 \00\02\00\00\00\12\00\00\00\07\81#\91A\C8\04I\06\1029\92\01\84\0C%\05\08\19\1E\04\8Bb\80\10E\02B\92\0BB\84\102\148\08\18I\0A2D$H\0A\90!#\C4R\80\0C\19!r$\07\C8\08\11b\A8\A0\A8@\C6\F0\01\00\00\00Q\18\00\00\C7\00\00\00\1Bp$\F8\FF\FF\FF\FF\01\90\00\0D\08\03\82\1D\CAa\1E\E6\A1\0D\E0A\1E\CAa\1C\D2a\1E\CA\A1\0D\CC\01\1E\DA!\1C\C8\010\87p`\87y(\07\80p\87wh\03s\90\87ph\87rh\03xx\87tp\07z(\07yh\83r`\87th\07\80\1E\E4\A1\1E\CA\01\18\DC\E1\1D\DA\C0\1C\E4!\1C\DA\A1\1C\DA\00\1E\DE!\1D\DC\81\1E\CAA\1E\DA\A0\1C\D8!\1D\DA\A1\0D\DC\E1\1D\DC\A1\0D\D8\A1\1C\C2\C1\1C\00\C2\1D\DE\A1\0D\D2\C1\1D\CCa\1E\DA\C0\1C\E0\A1\0D\DA!\1C\E8\01\1D\00s\08\07v\98\87r\00\08wx\876p\87pp\87yh\03s\80\876h\87p\A0\07t\00\CC!\1C\D8a\1E\CA\01 \E6\81\1E\C2a\1C\D6\A1\0D\E0A\1E\DE\81\1E\CAa\1C\E8\E1\1D\E4\A1\0D\C4\A1\1E\CC\C1\1C\CAA\1E\DA`\1E\D2A\1F\CA\01\C0\03\80\A0\87p\90\87s(\07zh\83q\80\87z\00\C6\E1\1D\E4\A1\1C\E4\00 \E8!\1C\E4\E1\1C\CA\81\1E\DA\C0\1C\CA!\1C\E8\A1\1E\E4\A1\1C\E6\01X\83y\98\87y(\879`\835\18\07|\88\03;`\835\98\87y(\076X\83y\98\87r\90\036X\83y\98\87r\98\03\80\A8\07w\98\87p0\87rh\03s\80\876h\87p\A0\07t\00\CC!\1C\D8a\1E\CA\01 \EAa\1E\CA\A1\0D\E6\E1\1D\CC\81\1E\DA\C0\1C\D8\E1\1D\C2\81\1E\00s\08\07v\98\87r\006\C8\88\F0\FF\FF\FF\FF\03\C1\0E\E50\0F\F3\D0\06\F0 \0F\E50\0E\E90\0F\E5\D0\06\E6\00\0F\ED\10\0E\E4\00\98C8\B0\C3<\94\03@\B8\C3;\B4\819\C8C8\B4C9\B4\01<\BCC:\B8\03=\94\83<\B4A9\B0C:\B4\03@\0F\F2P\0F\E5\00\0C\EE\F0\0Em`\0E\F2\10\0E\EDP\0Em\00\0F\EF\90\0E\EE@\0F\E5 \0FmP\0E\EC\90\0E\ED\D0\06\EE\F0\0E\EE\D0\06\ECP\0E\E1`\0E\00\E1\0E\EF\D0\06\E9\E0\0E\E60\0Fm`\0E\F0\D0\06\ED\10\0E\F4\80\0E\809\84\03;\CCC9\00\84;\BCC\1B\B8C8\B8\C3<\B4\819\C0C\1B\B4C8\D0\03:\00\E6\10\0E\EC0\0F\E5\00\10\F3@\0F\E10\0E\EB\D0\06\F0 \0F\EF@\0F\E50\0E\F4\F0\0E\F2\D0\06\E2P\0F\E6`\0E\E5 \0Fm0\0F\E9\A0\0F\E5\00\E0\01@\D0C8\C8\C39\94\03=\B4\C18\C0C=\00\E3\F0\0E\F2P\0Er\00\10\F4\10\0E\F2p\0E\E5@\0Fm`\0E\E5\10\0E\F4P\0F\F2P\0E\F3\00\AC\C1<\CC\C3<\94\C3\1C\B0\C1\1A\8C\03>\C4\81\1D\B0\C1\1A\CC\C3<\94\03\1B\AC\C1<\CCC9\C8\01\1B\AC\C1<\CCC9\CC\01@\D4\83;\CCC8\98C9\B4\819\C0C\1B\B4C8\D0\03:\00\E6\10\0E\EC0\0F\E5\00\10\F50\0F\E5\D0\06\F3\F0\0E\E6@\0Fm`\0E\EC\F0\0E\E1@\0F\809\84\03;\CCC9\00\00I\18\00\00\02\00\00\00\13\82`B \00\00\00\89 \00\00\0D\00\00\002\22\08\09 d\85\04\13\22\A4\84\04\13\22\E3\84\A1\90\14\12L\88\8C\0B\84\84L\100s\04H*\00\C5\1C\01\18\94`\88\08\AA0F7\10@3\02\00\134|\C0\03;\F8\05;\A0\836\08\07x\80\07v(\876h\87p\18\87w\98\07|\88\038p\838\80\037\80\83\0DeP\0Em\D0\0Ez\F0\0Em\90\0Ev@\07z`\07t\D0\06\E6\80\07p\A0\07q \07x\D0\06\EE\80\07z\10\07v\A0\07s \07z`\07t\D0\06\B3\10\07r\80\07:\0FDH #EB\80\1D\8C\10\18I\00\00@\00\00\C0\10\A7\00\00 \00\00\00\00\00\00\00\868\08\10\00\02\00\00\00\00\00\00\90\05\02\00\00\08\00\00\002\1E\98\0C\19\11L\90\8C\09&G\C6\04C\9A\22(\01\0AM\D0i\10\1D]\96\97C\00\00\00y\18\00\00\1C\00\00\00\1A\03L\90F\02\134A\18\08&PIC Level\13\84a\D80\04\C2\C05\08\82\83c+\03ab\B2j\02\B1+\93\9BK{s\03\B9q\81q\81\01A\19c\0Bs;k\B9\81\81q\81q\A9\99q\99I\D9\10\14\8D\D8\D8\EC\DA\5C\DA\DE\C8\EA\D8\CA\5C\CC\D8\C2\CE\E6\A6\04C\1566\BB6\974\B227\BA)A\01\00y\18\00\002\00\00\003\08\80\1C\C4\E1\1Cf\14\01=\88C8\84\C3\8CB\80\07yx\07s\98q\0C\E6\00\0F\ED\10\0E\F4\80\0E3\0CB\1E\C2\C1\1D\CE\A1\1Cf0\05=\88C8\84\83\1B\CC\03=\C8C=\8C\03=\CCx\8Ctp\07{\08\07yH\87pp\07zp\03vx\87p \87\19\CC\11\0E\EC\90\0E\E10\0Fn0\0F\E3\F0\0E\F0P\0E3\10\C4\1D\DE!\1C\D8!\1D\C2a\1Ef0\89;\BC\83;\D0C9\B4\03<\BC\83<\84\03;\CC\F0\14v`\07{h\077h\87rh\077\80\87p\90\87p`\07v(\07v\F8\05vx\87w\80\87_\08\87q\18\87r\98\87y\98\81,\EE\F0\0E\EE\E0\0E\F5\C0\0E\EC\00q \00\00\05\00\00\00&`<\11\D2L\85\05\10\0C\804\06@\F8\D2\14\01\00\00a \00\00\0B\00\00\00\13\04A,\10\00\00\00\03\00\00\004#\00dC\19\020\18\83\01\003\11\CA@\0C\83\11\C1\00\00#\06\04\00\1CB\12\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00", section "__LLVM,__bitcode"
@llvm.cmdline = appending constant [67 x i8] c"-triple\00x86_64-apple-macosx10.10.0\00-emit-llvm\00-disable-llvm-optzns\00", section "__LLVM,__cmdline"
; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
%1 = alloca i32, align 4
store i32 0, i32* %1
%2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0))
ret i32 0
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"Apple LLVM version 7.0.0 (clang-700.0.53.3)"}
The data array that is in the IR also changes depending on the optimization and other code generation settings of clang. It's completely unknown to me what format or anything that this is in.
EDIT:
Following the hint on Twitter, I decided to revisit this and to confirm it. I followed this blog post and used his bitcode extractor tool to get the Apple Archive binary out of the MachO executable. And after extracting the Apple Archive with the xar utility, I got this (converted to text with llvm-dis of course)
; ModuleID = '1'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.10.0"
@.str = private unnamed_addr constant [10 x i8] c"hi there!\00", align 1
; Function Attrs: nounwind ssp uwtable
define i32 @main() #0 {
%1 = alloca i32, align 4
store i32 0, i32* %1
%2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i32 0, i32 0))
ret i32 0
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="core2" "target-features"="+ssse3,+cx16,+sse,+sse2,+sse3" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"Apple LLVM version 7.0.0 (clang-700.1.76)"}
The only notable difference really between the non-bitcode IR and the bitcode IR is that filenames have been stripped to just 1, 2, etc for each architecture.
I also confirmed that the bitcode embedded in a binary is generated after optimizations. If you compile with -O3 and extract out the bitcode, it'll be different than if you compile with -O0.
And just to get extra credit, I also confirmed that Apple does not ship bitcode to devices when you download an iOS 9 app. They include a number of other strange sections that I don't recognized like __LINKEDIT, but they do not include __LLVM.__bundle, and thus do not appear to include bitcode in the final binary that runs on a device. Oddly enough, Apple still ships fat binaries with separate 32/64bit code to iOS 8 devices though.