TIL#02: What are Dex Files and Why Android Needs Them

TIL#02: What are Dex Files and Why Android Needs Them

TLDR; Dalvik

So I've been reading Androids: The Team That Built the Android Operating System. It's a book filled with stories from the team that built the Android OS from the time it was still intended to be an OS for cameras until their journey past Google's acquisition.

As you probably already know, Java is the main language of Android. Even though Java isn't my main language, I have a general idea of how Java codes are run, which can be illustrated by the following image:

Illustration of the process of how Java code are typically run

This model is quite common in the world of programming languages. Basically a language compiles to another simpler language (which is typically called "bytecode") and then that bytecode gets executed by a runtime (typically called VM).

In the case of Java, the runtime is the JVM. As you can see the JVM doesn't run the Java source code directly, rather it runs the bytecode that's produced by the Java compiler.

The implications of this nature is:

  1. As long as there's a runtime for a certain platform that can run JVM bytecode, Java is ensured to run on that platform.
  2. If someone's writing a new language, they have the option to leverage Java's popularity by making their language compiles down to JVM bytecode.

JVM is just a concept, there are many implementations of it by various organizations and individuals such as HotSpot, GraalVM, OpenJ9, Apache Harmony, JamVM, and many more.

Which Runtime does Android Use?

Given the nature of how Java codes are run, it's only natural to expect that Android also have a runtime that they use to run Java codes as Java is Android's main language.

However as you might have already guessed, they won't use some heavy runtimes such as OpenJ9 or HotSpot as these VMs didn't actually designed for mobile devices which have much less computing power and memory capacity as PCs (which is totally the case for Android devices in the early days).

According to the book, in the early days the Android team experimented using several existing runtimes, mainly those that are targeted for devices with constrained hardware. Waba was the first, but then got substituted with JamVM which was used until 2007 when their in-house VM was ready.

Waba and JamVM was enough for enabling the team's use of Java for prototyping and early development. However, both of those runtimes interpret JVM bytecode directly, but the team felt there's something to be gained (performance and memory-wise) by converting JVM bytecode into another more optimal format first before interpreting them.

Because they wanted to create a new bytecode format, that means they can't use any of the existing runtimes because obviously those runtimes are designed to run only JVM bytecodes. Thus begins the work on a new VM called Dalvik.

This resulted in a bit of change in the process of running Java code for Android compared to other platforms, which is illustrated by the following image:

Illustration of the process of Java code in Android

As you can see the JVM bytecode is compiled to another bytecode format -- Dalvik bytecode -- by the dx compiler, and that Dalvik bytecode is saved in the form of... dex files!!

But What About ART?

That was basically the gist of how dex files came about for Android.

But with version 4.4 Android brings a new runtime as technology preview which is called Android Runtime (ART). Dalvik was still the default runtime until the launch of Android 5.0 "Lollipop" when it was entirely replaced by ART.

To preserve backward-compatibility, ART still uses the same bytecode format so that apps developed for Dalvik can still work when running with ART. However it also came with a different approach on how programs are run, the difference mainly resides after the app installation phase, the dex files are compiled to native code upon app installation which essentially changes the diagram above to: