Wednesday, September 24, 2008

The Cost Of Indirection

Consider these two methods, identical except for that one takes an ArrayList<String>, while the other takes a List<String>.

static void direct(ArrayList list, String string) {
    for (int j = 10000; j > 0; j--) {
        list.clear();
        for (int i = 10000; i > 0; i--)
            list.add(string);
    }
}

static void indirect(List list, String string) {
    for (int j = 10000; j > 0; j--) {
        list.clear();
        for (int i = 10000; i > 0; i--)
            list.add(string);
    }
}

Believe it or not, the method which takes an ArrayList will run 30% faster than the method which takes a List. Even more interestingly, switching from List<String> to the Collection<String> interface would make no difference. Why is this? Perhaps the JVM is skipping vtable lookups within the method that uses ArrayList...

If our theory is correct, then the JVM will skip vtable lookups in code that it can prove to be using a specific, concrete method. Therefore, if we were to extend the ArrayList class, we should see a slowdown in the performance of the direct method above. Indeed, we can demonstrate this as follows:

private static class MyArrayList extends ArrayList {
    public boolean add(T e) {
        return super.add(e);
    }
};

public static void main(String[] args) {
    ArrayList list = new ArrayList();
    String string = "dummy";

    for (int j = 0; j < 2; j++) {
        for (int i = 0; i < 3; i++) {
            long directTime = 0;
            long indirectTime = 0;

            // Do a few runs to increase precision
            for (int k = 0; k < 10; k++) {

                indirectTime -= System.nanoTime();
                indirect(list, string);
                indirectTime += System.nanoTime();

                directTime -= System.nanoTime();
                direct(list, string);
                directTime += System.nanoTime();
            }

            // Print the results
            print(directTime/1e9, indirectTime/1e9);
        }

        System.out.println(">> Loading MyArrayList");
        System.out.println();
        MyArrayList.class.getName();
    }
}

static void print(double directTime, double indirectTime) {
    double delta = 100.0 * (1.0 - directTime/indirectTime);
    System.out.printf("%.3gms for List\n", indirectTime);
    System.out.printf("%.3gms for ArrayList\n", directTime);
    System.out.printf("Difference is %.3g%%\n", delta);
    System.out.println();
}

The theory is correct! The output of the above program looks like this:

31.7ms for List
21.7ms for ArrayList
Difference is 31.6%

31.7ms for List
21.3ms for ArrayList
Difference is 32.7%

31.4ms for List
21.4ms for ArrayList
Difference is 31.7%

>> Loading MyArrayList into memory

31.2ms for List
31.5ms for ArrayList
Difference is -0.782%

31.4ms for List
31.1ms for ArrayList
Difference is 1.20%

31.3ms for List
31.5ms for ArrayList
Difference is -0.726%

So what rules can we learn from this?

  1. Try to avoid method indirection in performance critical code. If your code needs to be modular then you must use indirection, but try to limit it to as few indirections as necessary. Overly general code does have a performance hit.
  2. Extending a class in one piece of code can have surprising performance effects on another piece of code. You're not in C++ land anymore folks!

I do wonder though: is there scope for runtime templates in the JVM? Could the JVM perhaps produce multiple instances of the indirect method (one instance for each concrete subclass of List) and jump directly to the most appropriate method? It seems to have potential for improving JVM performance.

No comments: