Understanding Bitcode for iOS Applications

Posted by Scott on June 20, 2017 in iOS

Before the iPhone, Apple’s predominant compiler technology used GCC to compile Objective-C applications down to native executable code that was specific to the computer’s processor. The compiler produced executable “fat binaries” (the counterpart of exe files on Windows and ELF files on Linux), but unlike those, a fat binary can contain multiple versions of the same program, so the same executable file can run on different processors. It is primarily this technology that allowed Apple to migrate from PowerPC to PowerPC64, and then on to Intel (and later Intel64). These were marketed under the “Universal Binary” name; while the term initially meant combined PowerPC and Intel support, it was later repurposed to mean combined Intel 32-bit and 64-bit support. In a fat binary, the system selects the right version of the code when the program is launched, but the file carries extra weight in case it is used on a different processor: multiple copies of the executable are stored in it, most of which will never be used on a given machine. Various thinning utilities (such as lipo) can remove the code for incompatible processors as a means of reducing the size of the executable. These don’t change the behaviour of the application, just its size.
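
The selection and thinning described above can be pictured with a toy model. This is only a sketch: the struct layout and the function names are invented for illustration and bear no relation to the real on-disk fat header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of a fat binary: one slice of machine code
 * per architecture. Not the real file format. */
struct slice {
    const char *arch;   /* e.g. "ppc", "i386", "x86_64" */
    size_t      size;   /* size of this slice's code in bytes */
};

struct fat_binary {
    const struct slice *slices;
    size_t              count;
};

/* At launch: pick the slice that matches the host processor. */
const struct slice *select_slice(const struct fat_binary *fat,
                                 const char *host_arch)
{
    for (size_t i = 0; i < fat->count; i++) {
        if (strcmp(fat->slices[i].arch, host_arch) == 0)
            return &fat->slices[i];
    }
    return NULL; /* no compatible code in this file */
}

/* Total weight of the file: every slice is stored, used or not. */
size_t total_size(const struct fat_binary *fat)
{
    size_t total = 0;
    for (size_t i = 0; i < fat->count; i++)
        total += fat->slices[i].size;
    return total;
}

/* Bytes saved by thinning the file down to a single architecture.
 * Returns 0 if the requested architecture is not present. */
size_t thinning_saves(const struct fat_binary *fat, const char *keep_arch)
{
    const struct slice *kept = select_slice(fat, keep_arch);
    return kept ? total_size(fat) - kept->size : 0;
}
```

On a Mac, lipo -info MyApp lists the architectures in a real file and lipo MyApp -thin x86_64 -output MyApp-thin keeps just one of them (MyApp is a placeholder name).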

With mobile devices the code size becomes more important, mainly because the device itself has much less storage space than a typical hard drive. As Apple moved from the original ARM processor to the custom A4 processor and onwards, the instruction set changed and different versions of code were used. These options are transparently set in Xcode based on the minimum level of iOS support and the resulting binaries will contain multiple variants.

The increasing importance of bitcode, and the migration towards LLVM, began several years ago, when Apple decided to move away from GCC and invest heavily in the LLVM tool chain and infrastructure. This initially took the form of compiling GPU-specific code for OpenGL but later extended to the Clang compiler. As Clang’s support for Objective-C grew, it first became the default in Xcode and then started driving improvements to the Objective-C language as well.

This unlocked the potential for a complete LLVM-based tool chain to compile iOS applications. LLVM provides a virtual instruction set that can be translated to (and optimised for) a specific processor architecture. This generic instruction set, known as the LLVM intermediate representation (IR), has several representation forms: it can be stored as human-readable text (like assembly) or translated to a compact binary format (like an object file). It is this binary format that is called bitcode.
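
As a rough illustration of the two forms, here is a one-function C file with the textual representation shown in a comment. The file name add.c is a placeholder, and the listing has been tidied by hand; what Clang actually prints carries extra attributes and metadata and varies by version and target.

```c
/* add.c: a hypothetical one-function source file. */
int add(int a, int b)
{
    return a + b;
}

/*
 * Textual form, approximately what
 *     clang -O1 -S -emit-llvm add.c -o add.ll
 * writes (value names and attributes simplified by hand):
 *
 *   define i32 @add(i32 %a, i32 %b) {
 *     %sum = add nsw i32 %a, %b
 *     ret i32 %sum
 *   }
 *
 * Binary form (bitcode) of the same function:
 *     clang -O1 -c -emit-llvm add.c -o add.bc
 *
 * With the LLVM tools installed, the two are interchangeable:
 *     llvm-as  add.ll -o add.bc     (text to bitcode)
 *     llvm-dis add.bc -o add.ll     (bitcode to text)
 */
```

Note that the function keeps its name, its parameter types and its return type in both forms, which is not true of a stripped native executable.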

Bitcode differs from a traditional executable instruction set in that it retains the types and signatures of functions. For example, a set of boolean fields could be compressed into a single byte in a traditional instruction set, but they are kept separate in bitcode. In addition, logical operations (such as setting a register to zero) keep their logical representation of $R=0; when this is translated to a specific instruction set it can be replaced with an optimised form such as xor eax,eax. (This has the same effect, setting the register’s value to zero, but encodes the operation in fewer bytes than a direct assignment would take.)
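
The gap between the logical view and a lowered view can be sketched in C. The struct, the bit positions and the function names below are assumptions made up for this example; they only illustrate the kind of information that is easy to read in one form and opaque in the other.

```c
#include <stdbool.h>
#include <stdint.h>

/* Logical view: three separate, individually typed boolean fields. */
struct options {
    bool verbose;
    bool recursive;
    bool force;
};

/* Bit positions chosen for this sketch (hypothetical layout). */
enum { OPT_VERBOSE = 1 << 0, OPT_RECURSIVE = 1 << 1, OPT_FORCE = 1 << 2 };

/* Lowered view: the same information compressed into a single byte.
 * Once packed, nothing in the byte says it used to be three fields. */
uint8_t pack_options(struct options o)
{
    return (uint8_t)((o.verbose   ? OPT_VERBOSE   : 0) |
                     (o.recursive ? OPT_RECURSIVE : 0) |
                     (o.force     ? OPT_FORCE     : 0));
}

/* Recovering the fields requires knowing the layout out of band. */
struct options unpack_options(uint8_t bits)
{
    struct options o;
    o.verbose   = (bits & OPT_VERBOSE)   != 0;
    o.recursive = (bits & OPT_RECURSIVE) != 0;
    o.force     = (bits & OPT_FORCE)     != 0;
    return o;
}

/* The logical operation "result = 0". How it is encoded is left to
 * the backend: on x86, xor eax,eax takes 2 bytes where mov eax,0
 * takes 5, so an optimiser will usually pick the former. */
int zero(void)
{
    return 0;
}
```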

However, bitcode is not completely independent of architecture or calling convention. The size of registers is a fairly important property of an instruction set; more data can be stored in a 64-bit register than in a 32-bit register. Bitcode generated for a 64-bit platform will therefore look different from bitcode generated for a 32-bit platform. In addition, calling conventions can be defined for both function calls and function definitions; these specify (for example) whether the arguments are passed on the stack or in registers. Some languages also have compile-time constructs such as sizeof(long), which are evaluated before they even reach the generated bitcode layer. In general, 64-bit platforms that support the fastcc convention will have compatible bitcode.
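
The sizeof(long) case is easy to demonstrate. In this sketch (function names are made up for the example) the compiler folds each expression to a constant for whichever target it is building, so the number is already fixed by the time bitcode is produced.

```c
#include <limits.h>
#include <stddef.h>

/* sizeof is evaluated by the compiler, not at run time. On 64-bit
 * iOS, long is 8 bytes; on 32-bit iOS it is 4. The generated code
 * simply returns that constant, so bitcode built for one word size
 * has the wrong answer baked in for the other. */
size_t bytes_in_long(void)
{
    return sizeof(long);
}

/* Pointer width in bits for the target this file was compiled for.
 * Even the return type's underlying width (size_t above) differs
 * between 32-bit and 64-bit targets. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```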

So why does Apple require bitcode uploads for watchOS and tvOS? Well, by moving the uploads to a centralised Apple server it is possible for Apple to optimise the binaries between compilation with Xcode and delivery to the target device. It’s also possible for developers to upload multiple variants and have only the relevant one delivered to each device, instead of packaging them all into a single delivery (which would take up more space on the device). Finally, it also allows Apple to perform the code-signing of the application on the server side, without exposing any keys to the end developer.

The other main advantage of performing server-side optimisation is the ability to use whole-module and inter-module optimisation. When using a statically compiled language without a dynamic runtime component, the target of a function or method call can often be proven directly, allowing the table indirection to be avoided and replaced with an equivalent direct call. This in turn opens up further optimisations of the function; for example, if it can be proven that a caller passes a non-null value then null checks in the called function can be optimised away. Such optimisations are typically enabled through the use of -O flags at compile time, but will often just optimise the content of private functions within the same file. Whole-module optimisation can consider optimisations across all functions within the same module, but will stop short of module boundaries (such as the dependencies on external frameworks). Inter-module optimisations allow the code from different modules to be in-lined and then optimised further.
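
Both steps, replacing a table lookup with a direct call and then dropping a null check that can never fire, can be shown by hand in C. The types and function names are invented for this sketch, and the "optimised" function is written manually to show what an optimiser would be entitled to produce, not what any particular compiler emits.

```c
#include <stddef.h>

struct shape;

/* A hand-rolled method table, standing in for dynamic dispatch. */
struct shape_ops {
    int (*area)(const struct shape *self);
};

struct shape {
    const struct shape_ops *ops;
    int width;
    int height;
};

/* The callee defends itself with a null check. */
static int rect_area(const struct shape *self)
{
    if (self == NULL)
        return -1;
    return self->width * self->height;
}

static const struct shape_ops rect_ops = { rect_area };

/* Indirect call: the target is loaded from the table at run time. */
int area_via_table(const struct shape *s)
{
    return s->ops->area(s);
}

/* Here the optimiser can see everything: ops is always rect_ops and
 * the argument is the address of a local, so it is never null. */
int area_of_new_rect(int width, int height)
{
    struct shape r = { &rect_ops, width, height };
    return area_via_table(&r);
}

/* The equivalent after devirtualisation, in-lining and removal of
 * the now-redundant null check (written by hand for comparison). */
int area_of_new_rect_optimised(int width, int height)
{
    return width * height;
}
```

If rect_area lived in a different module from its caller, a per-file optimiser could not make this substitution, which is the case that inter-module optimisation addresses.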

Each step up the optimisation chain provides more and more benefits but takes correspondingly more and more time to process. By offering optimisation-as-a-service and integrating it within the app store process, Apple allows developers to take advantage of compiler optimisations that may be prohibitively expensive to run at development time but can be batch processed by Apple’s servers at App Store provisioning time.

Perhaps more interestingly, it allows future optimisations to be developed after the application is uploaded and then have the application re-optimised to produce a faster or smaller application executable in future. Bitcode will provide Apple with a wealth of test cases for optimisation experiments; instead of having to construct examples from scratch they will be able to use real world code bases.

Finally, the bitcode on the server can be translated to support new architectures and instruction sets as they evolve. Provided that they maintain the calling convention, word size and alignment, a bitcode application might be translated into different architecture types and optimised specifically for a new processor. If standard libraries for math and vector routines are used, these can be optimised into processor-specific vector instructions to gain the best performance for a given application. The optimisers might even generate multiple different encodings and choose between them based on size or execution speed.
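
A simpler relative of that idea is a plain loop that a backend is free to turn into vector instructions. This is a sketch under that assumption rather than a use of any particular vector library; the function name is conventional but chosen here for illustration.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for each element.
 * The source says nothing about vector width. A backend may emit
 * scalar code for one processor and process several floats per
 * instruction on another. The restrict qualifiers promise that x
 * and y do not overlap, which makes that transformation easier
 * for the optimiser to justify. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Because the loop is stored in its logical form, the decision about which vector instructions to use (if any) can be deferred until the target processor is known.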

Bitcode also has some disadvantages. Developers can debug crash reports from applications by storing copies of the debug symbols corresponding to the binary that was shipped to Apple. When a crash happens, the developer can restore the original stack trace by symbolicating the crash report using these debug symbols. However, the symbols are a by-product of translating the intermediate form to the binary, so if that step is done on the server, that information is not available to the developer. Apple provides a crash reporting service (InfoQ covered the purchase of TestFlight last year) that can play the part of the debugger, provided that the developer has uploaded the debug symbols at the time of application publication. The fact that the developer never sees the exact binary also means that they may not be able to test for specific issues as new hardware evolves. There are also some concerns about ceding power to Apple to perform compilation (including the ability to inject additional routines or code snippets), but since Apple is in full control of the publication process, this is already possible whether the developer uploads bitcode or compiled binaries.
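
To see why the symbols must match the exact binary, it helps to look at what symbolication does at its core: mapping a raw address from a crash report back to the function that contains it. The following is a minimal sketch with an invented table format; real debug symbols also carry file names, line numbers and function end addresses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol table entry. */
struct symbol {
    uint64_t    start;  /* address of the function's first instruction */
    const char *name;
};

/* Returns the name of the last symbol starting at or before addr,
 * or NULL if addr lies before every symbol. The table must be
 * sorted by ascending start address. Simplification: with no end
 * addresses, anything past the last symbol maps to that symbol. */
const char *symbolicate(const struct symbol *table, size_t count,
                        uint64_t addr)
{
    const char *best = NULL;
    size_t lo = 0, hi = count;

    /* Binary search for the last entry with start <= addr. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].start <= addr) {
            best = table[mid].name;
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}
```

If the server recompiles the bitcode, functions may move, be in-lined or change size, so a table built from a local compile would map the same addresses to the wrong names.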

In addition, Apple’s initial roll-out of the bitcode and app thinning service was put on hold, because upgrading from one type of hardware to a different type didn’t restore the right versions of the binaries. This issue was subsequently fixed in iOS 9.0.2 and the feature was re-enabled.