ClassDynamic

June 2014: Initial draft

Brian Goetz

This is an informal sketch of proposed enhancements to the Java Virtual Machine to support structural descriptions of dynamically generated, pattern-instantiating classes. The precipitating motivation for this feature is generic specialization, but it has much broader applicability.

Many hand-written classes amount to mechanical transformations on other classes. For example, the class implementing Collections.synchronizedList() instantiates the "proxy" pattern, where each method is the instantiation of a template of the form below, for all the methods in the List interface:

R methodName(ARGS) {
    synchronized(this) { 
        return underlying.methodName(ARGS);
    }
}

Ideally, we would specify this pattern once as a metaprogram and apply it to interfaces as needed, rather than macro-expanding it by hand every time the pattern is needed.

This document does not propose a language-level metaprogramming facility; it is restricted to VM features for representing dynamically generated classes.

Just as invokedynamic extends the semantics of method invocation to include a dynamic, customizable linkage step, this proposal (called classdynamic) extends the semantics of class resolution to include a dynamic, customizable linkage step.

Nominal class description, loading and loader constraints

Currently, classes are described in a class file using one of two mechanisms. For language elements that refer to classes directly, such as naming a superclass or cast target, a CONSTANT_Class_info structure is used:

CONSTANT_Class_info {
    u1 tag;
    u2 name_index;
}

For member signatures, a class name may be embedded in a descriptor string, referred to via a CONSTANT_NameAndType_info:

CONSTANT_NameAndType_info {
    u1 tag;
    u2 name_index;
    u2 descriptor_index;
}

where descriptor_index refers to a string such as (Ljava/lang/Object;)V. In both cases, a class is referred to by name, and that name is provided to a ClassLoader. (Typically, a class loader uses that name to search for a file containing the class bytecode.)
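As a small illustration of nominal resolution, the existing java.lang.invoke API can turn a descriptor string into classes by handing each embedded name to a class loader. The class name NominalDescriptorDemo and the helper resolve are made up for this sketch; only MethodType.fromMethodDescriptorString is a real JDK method.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class; the only JDK API exercised here is
// MethodType.fromMethodDescriptorString(String, ClassLoader).
final class NominalDescriptorDemo {
    private NominalDescriptorDemo() {}

    // Resolves every class name embedded in the descriptor through the given loader.
    static MethodType resolve(String descriptor, ClassLoader loader) {
        return MethodType.fromMethodDescriptorString(descriptor, loader);
    }

    public static void main(String[] args) {
        ClassLoader loader = NominalDescriptorDemo.class.getClassLoader();
        MethodType type = resolve("(Ljava/lang/Object;)V", loader);
        // The name java/lang/Object was looked up by name via the loader.
        System.out.println(type.parameterType(0) == Object.class);
        System.out.println(type.returnType() == void.class);
    }
}
```

The point of the sketch is that the descriptor carries nothing but names; which class each name denotes is decided entirely by the loader that is asked.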

Within a class loader, all occurrences of a given class name should resolve to the same class. Across class loaders, the VM imposes loading constraints: when a class member is referenced from a class loaded by a different class loader than the one that loaded the referenced class, both loaders must agree on the resolution of all type names appearing in that member's signature.

Proposed: structural class description

We propose to describe pattern-instantiating classes declaratively: "the result of applying the XYZ pattern to input classes A and B". The pattern is described by a bootstrap method (e.g., "forwarding proxy"); a dynamic class description is a combination of a bootstrap method and a set of static arguments to that method.

This would be accomplished in part by a new constant pool type:

CONSTANT_ClassDynamic_info {
    u1 tag;
    u2 bootstrap_method_attr_index;
}

Here, we reuse the bootstrap method table (introduced for invokedynamic) to represent the structural description of a dynamic class; each entry in the bootstrap method table contains a MethodHandle for the bootstrap plus a number of static bootstrap arguments (which must be constants).

Just as two nominal classes are considered to be the same class (modulo using the same class loader) if they have the same name, two dynamically described classes are considered to be the same class if they have the same bootstrap and static arguments.

For purposes of exposition, we will denote the constant pool structures corresponding to a dynamic class with bootstrap B and static arguments A0, A1, ... as:

{ B(A0, A1, ...) }

Because class names can also appear in method and field signatures using the L convention, there would also need to be a means of denoting a dynamic class when it is used as a method argument type, method return type, or field type. To address this limitation (that all types must have a nominal, embeddable representation), we could add a new constant pool type to describe signatures using references to class constants:

CONSTANT_NameAndTypeExt_info {
    u1 tag;
    u1 count;
    u2 name_index;
    u2 descriptor_index;
    u2 type_index[count];
}

where the descriptor would look like (I##)V, and each # symbol stands for the class referred to by the corresponding element of the type_index array, in order. This allows a NameAndType to refer to dynamic class constants (and also provides a form of compression of the substantial redundancy that is currently present in class files).
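The placeholder substitution can be sketched as follows. This is only an illustration under assumptions: the helper ExtDescriptors.expand is hypothetical, the resolved entries are shown as display strings (a nominal class as an L...; descriptor, a dynamic class in the { B(A0, ...) } notation), and '#' is assumed to occur in the descriptor only as a placeholder.

```java
import java.util.List;

// Hypothetical helper, not part of any JVM or JDK API. It pairs the i-th '#'
// in an extended descriptor with the i-th entry of a resolved type_index list.
final class ExtDescriptors {
    private ExtDescriptors() {}

    // Returns a display form of the descriptor with every '#' replaced, in order.
    static String expand(String descriptor, List<String> resolvedTypes) {
        StringBuilder out = new StringBuilder();
        int next = 0; // index of the next unused type_index entry
        for (int i = 0; i < descriptor.length(); i++) {
            char c = descriptor.charAt(i);
            if (c != '#') {
                out.append(c);
            } else if (next < resolvedTypes.size()) {
                out.append(resolvedTypes.get(next++));
            } else {
                throw new IllegalArgumentException("more '#' symbols than type_index entries");
            }
        }
        if (next != resolvedTypes.size()) {
            throw new IllegalArgumentException("unused type_index entries");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // One nominal class and one dynamic class, in type_index order.
        System.out.println(expand("(I##)V",
                List.of("Ljava/lang/String;", "{ ForwardingProxy(I) }")));
    }
}
```

A real implementation would keep the type_index entries as resolved class constants rather than strings; the string form is used here only so that the pairing of placeholders and entries is visible.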

Example: forwarding proxy

Consider a forwarding proxy for interface I. This is a class whose constructor takes an instance of I, and for each method of I, forwards that method invocation to the underlying instance. For example, a forwarding proxy for Function would look like:

class ForwardingProxyForFunction<T,U> implements Function<T,U> { 
    private final Function<T,U> underlying;

    ForwardingProxyForFunction(Function<T,U> underlying) {
        this.underlying = underlying;
    }

    public U apply(T t) {
        return underlying.apply(t);
    }
}

(Forwarding proxies can easily be generalized to more than a single interface.) Forwarding proxies are useful on their own for narrowing the set of methods supported by an instance; they are also useful as supertypes, since it is common to want to override a small number of methods and forward the remaining ones to the proxied instance.

Such a pattern is amenable to implementation as a classdynamic bootstrap. The arguments would be the interface(s) to be proxied; the result would be a class with a one-argument constructor (whose argument must implement all the interfaces) and which implements the specified interfaces according to the forwarding proxy pattern.
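For comparison, the closest existing facility is java.lang.reflect.Proxy, which applies a single handler to arbitrary interfaces at run time. The sketch below is not the proposed mechanism: it takes the instance at creation time rather than producing a class with a one-argument constructor, and the names ForwardingProxies and forwardingProxy are invented for illustration. Defining the proxy in the demo class's own loader is a simplifying assumption.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.RandomAccess;

// Hypothetical helper built on the existing java.lang.reflect.Proxy API.
final class ForwardingProxies {
    private ForwardingProxies() {}

    // Returns an object that implements exactly the given interfaces and
    // forwards every call to the underlying instance.
    static Object forwardingProxy(Object underlying, Class<?>... interfaces) {
        Objects.requireNonNull(underlying, "underlying");
        for (Class<?> itf : interfaces) {
            // Mirrors the constraint that the argument must implement all interfaces.
            if (!itf.isInstance(underlying)) {
                throw new IllegalArgumentException(
                        underlying.getClass().getName() + " does not implement " + itf.getName());
            }
        }
        InvocationHandler forward = (proxy, method, args) -> {
            try {
                return method.invoke(underlying, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the underlying method threw
            }
        };
        // Assumption: the loader of this helper class can see all the interfaces.
        return Proxy.newProxyInstance(
                ForwardingProxies.class.getClassLoader(), interfaces, forward);
    }

    public static void main(String[] args) {
        Object proxy = forwardingProxy(new ArrayList<>(List.of("a", "b")), List.class);
        System.out.println(proxy instanceof List);         // the proxied interface is kept
        System.out.println(proxy instanceof RandomAccess); // other ArrayList interfaces are hidden
    }
}
```

A synchronized variant would differ only in the handler, which would wrap the method.invoke call in a synchronized block on the chosen mutex.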

Example: synchronized proxy

Similarly, the JDK has several classes which do nothing but synchronize on this and then forward the method to an underlying instance:

class SyncProxyForList<T> implements List<T> {
    private final List<T> underlying;
    private final Object mutex;

    SyncProxyForList(List<T> underlying) {
        this.underlying = underlying;
        this.mutex = this;
    }

    public int size() {
        synchronized (mutex) { 
            return underlying.size();
        }
    }
    ...
}

Again, this pattern can be automatically expanded by a classdynamic bootstrap, whose static argument is the interface to be proxied.

Even more interesting examples (e.g., tuples) arise when a classdynamic bootstrap can generate value types.

Caching and identity

With nominal identification of classes, two classes are considered the same if they have the same name and same class loader; the mapping from (class loader, name) to class is cached so that subsequent references to the same class do not re-trigger class loading. Structurally defined classes need the same treatment: a straightforward structural equality can be defined on structural class descriptors, and the mapping from descriptor to class must be cached in the same way.
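A minimal sketch of such a cache, assuming a descriptor is just a bootstrap identifier plus a list of static arguments. The names DynamicClassCache, Descriptor, and resolve are hypothetical, the bootstrap is modeled as a plain function from descriptor to class, and the record syntax requires Java 16 or later. How the class loader enters the key is left out here.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache keyed by a structural class description.
final class DynamicClassCache {

    // Structural descriptor: equal bootstrap and equal static arguments
    // mean equal descriptors (records derive equals/hashCode from components).
    record Descriptor(String bootstrap, List<Object> staticArgs) {
        Descriptor {
            staticArgs = List.copyOf(staticArgs); // immutable snapshot, so the key cannot change
        }
    }

    private final Map<Descriptor, Class<?>> cache = new ConcurrentHashMap<>();

    // Runs the bootstrap at most once per distinct descriptor; later requests
    // with an equal descriptor get the cached class.
    // The bootstrap function must not call back into this cache.
    Class<?> resolve(Descriptor descriptor, Function<Descriptor, Class<?>> bootstrap) {
        return cache.computeIfAbsent(descriptor, bootstrap);
    }
}
```

For example, two requests for { ForwardingProxy(List) } built independently would compare equal as Descriptor values, so the second request would not re-run the bootstrap.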

Interaction with class loaders

When describing a dynamic class like { ForwardingProxy(I) }, there are two points of interaction with class loaders: with which class loader should the class I be loaded, and in which class loader should the resulting forwarding proxy be defined?

For a simple class description like { ForwardingProxy(I) }, there are three obvious choices: the class loader of the class initiating the request, the class loader for I, or the class loader for ForwardingProxy (subject to the constraint that, whatever class loader is chosen, I must be visible to that class loader). When more types are involved, such as { ForwardingProxy(I1, I2) }, the possibility exists that I1 and I2 do not come from the same loader.
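The visibility constraint can be checked with existing reflection APIs: a class is visible to a loader if asking that loader for the class's name yields the very same Class object. The sketch below is illustrative only; LoaderVisibility and its methods are hypothetical names, and a null loader denotes the bootstrap loader, as in Class.forName.

```java
// Hypothetical helper for checking whether candidate loaders can see a set of types.
final class LoaderVisibility {
    private LoaderVisibility() {}

    // True if the loader resolves the type's name to this exact Class object.
    static boolean isVisible(Class<?> type, ClassLoader loader) {
        try {
            // initialize = false: we only care about resolution, not static initializers
            return Class.forName(type.getName(), false, loader) == type;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // True if every given type is visible to the loader.
    static boolean allVisible(ClassLoader loader, Class<?>... types) {
        for (Class<?> type : types) {
            if (!isVisible(type, loader)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ClassLoader own = LoaderVisibility.class.getClassLoader();
        // The loader of this class sees both a JDK interface and this class itself.
        System.out.println(allVisible(own, java.util.List.class, LoaderVisibility.class));
        // The bootstrap loader (null) sees java.util.List but not this class.
        System.out.println(allVisible(null, java.util.List.class, LoaderVisibility.class));
    }
}
```

With { ForwardingProxy(I1, I2) }, a check of this kind would rule out any candidate loader that cannot see both I1 and I2, which may leave no candidate at all when the interfaces come from unrelated loaders.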

Further, the "right" answer will likely depend on what the bootstrap does. For a bootstrap that generates a primitive specialization of a class, the right answer is probably "the same loader that loaded the generic class." However, this is almost certainly not the right answer for all cases (and it raises significant security issues). Significant additional analysis is needed in this area.