Removing deadlocks from supertype initialization: A proposal

John Rose, 10/2023

Context: https://docs.oracle.com/javase/specs/jvms/se21/html/jvms-5.html#jvms-5.5

5.5. Initialization

Initialization of a class or interface consists of executing its class or interface initialization method (§2.9.2)…

(Note: Within any given thread which has acquired the lock object LC of a class or interface C, the Class object for C indicates that C is in exactly one of four initialization states: uninitialized (no initialization action has been taken yet), being initialized in the current thread, being initialized in another thread, fully initialized, or erroneous. Within a thread performing initialization of C, the initialization procedure moves this state from uninitialized to being initialized by the current thread, to either fully initialized or erroneous. Other threads may observe any of these states, except that it will be seen as being initialized in another thread if it is neither uninitialized nor fully initialized nor erroneous. The state of being initialized in this thread or another thread is further split, in this new proposal, into two substates, a new state pending being initialized… and the legacy state, called called currently being initialized….)

… The procedure for initializing C is then as follows:

  1. Synchronize on the initialization lock, LC, for C. This involves waiting until the current thread can acquire LC.

  2. If the Class object for C indicates that initialization is in progress for C by some other thread, then release LC and block the current thread until informed that the in-progress initialization has completed (or, if pending, as been canceled), at which time repeat this procedure.

    Thread interrupt status is unaffected by execution of the initialization procedure.

  3. If the Class object for C indicates that initialization is in progress for C by the current thread, then this must be a recursive request for initialization.

    (3a.) If the indication shows that C is currently being initialized in the current thread, then

    release LC and complete normally.

    (3b.) Otherwise, if C is pending being initialized in the current thread, record the new fact that C is currently being initialized in the current thread, release LC, and skip forward to step 7, to complete the initialization of the Class object of C in the current thread.

    (Note: C in fact must be a superclass or superinterface for another class currently being initialized in the current thread.)

  4. If the Class object for C indicates that C has already been initialized, then no further action is required. Release LC and complete normally.

  5. If the Class object for C is in an erroneous state, then initialization is not possible. Release LC and throw a NoClassDefFoundError.

  6. (Note: At this point, because the state is neither fully initialized, initialized in another thread, nor in an erroneous state, the current thread is observing C in an uninitialized state, under the lock LC. At this point the procedure prepares to initialize C in the current thread, and perhaps its supers, subject to the detection of certain potential deadlocks involving supertype initialization.)

    Otherwise, if C is a class rather than an interface, determine its superclass SC and its superinterfaces SI1, …, SIn (as in step 7 below, recalling that these are superinterfaces that declare at least one non-abstract, non-static method). For this step, the order in which this class and these interfaces are considered is defined by the implementation, in such a way that the class comes before the interfaces, sub-interfaces come before their super-interfaces, and interfaces not related to each other are considered in a relative order that is fixed by the implementation and not changed for distinct applications of this initialization procedure. As defined for this step only, this ordered list is the super-list of C. In what follows, if C is an interface, define its super-list as the empty list. Optionally, the super-list can be defined as empty even for classes, skipping the logic that follows.

    Optionally, before associating the initialization of C with the current thread, scan all elements of the super-list for all classes or interfaces in either two states, uninitialized or being initialized in another thread, detecting and processing these two states as a single transaction for the whole list, as follows.

    Specifically, in the implementation-defined order, for each superclass or superinterface UC in the super-list, acquire the lock LUC of UC. (6a.) If UC is seen to be initializing in another thread, release LC, LUC, and all locks previously acquired in this step, and block the current thread until informed that the in-progress initialization of UC has completed, at which time repeat this procedure.

    (6b.) Otherwise, with LUC held for this UC, if UC is not simply uninitialized, release LUC and continue processing the super-list. (6c.) Otherwise, with LUC still held for this UC, enter UC temporarily in a local list UCL of uninitialized supers, created for this specific instance of the initialization procedure.

    (6d.) After the super-list UCL is processed, for each uninitialized superclass or superinterface UC temporarily entered on the local list UCL in step (6c.), record the fact that the Class object for UC is pending being initialized by the current thread, and release the lock LUC.

    (Note: The optional pass over the super-list looks ahead to supertypes of C so that, in the rare event that one of the supertype initializations nakes a recursive request to initialize C, then that other thread is allowed to complete the initialization of C, instead of the current thread. Marking each uninitialized supertype UC as being initialized by the current thread causes any other thread that attempts to initialize UC to stop and wait at step 2 above, until the current thread either finishes initializing UC (as it normally must) or else until the current thread places C into an erroneous state, at which point another thread may possibly initialize UC. The pass is optional to allow simplified implementations to conform to the specification. The optional pass is also designed so that it is, in practice, impossible to detect if it is present or not, because in most cases the decision to initialize supers UC in the same thread will be made even without the optional pass.)

    In any case, after optionally processing the super-list, unless a superclass or superinterface being initialized in another thread was detected by that processing, and after recording any previously uninitialized superclass or superinterface (from the super-list) as being initialized by the current thread,

    record the fact that initialization of the Class object for C is currently in progress by the current thread, and release LC.

    Then, initialize each final static field of C with the constant value in its ConstantValue attribute (§4.7.2), in the order the fields appear in the ClassFile structure.

  7. Next, if C is a class rather than an interface, then let SC be its superclass and let SI1, …, SIn be all superinterfaces of C (whether direct or indirect) that declare at least one non-abstract, non-static method. The order of superinterfaces is given by a recursive enumeration over the superinterface hierarchy of each interface directly implemented by C. For each interface I directly implemented by C (in the order of the interfaces array of C), the enumeration recurs on I’s superinterfaces (in the order of the interfaces array of I) before returning I.

    For each S in the list [ SC, SI1, …, SIn ], if S has not yet been initialized, then recursively perform this entire procedure for S. If necessary, verify and prepare S first.

    (Note: If S was marked pending being initialized by the current thread in step 6d., then its initialization will proceed immediately in the current thread.)

    If the initialization of S completes abruptly because of a thrown exception, then acquire LC, label the Class object for C as erroneous, notify all waiting threads, and release LC.

    For each uninitialized superclass or superinterface UC temporarily entered on the local list UCL in step (6c.), acquire its lock LUC, label the Class object for UC as uninitialized if it is recorded as being initialized in the current thread, record instead that it is uninitialized, release LUC, and notify all waiting threads.

    Then,

    complete abruptly, throwing the same exception that resulted from initializing SC.

  8. Next, determine whether assertions are enabled for C by querying its defining loader.

  9. Next, execute the class or interface initialization method of C.

  10. If the execution of the class or interface initialization method completes normally, then acquire LC, label the Class object for C as fully initialized, notify all waiting threads, release LC, and complete this procedure normally.

  11. Otherwise, the class or interface initialization method must have completed abruptly by throwing some exception E. If the class of E is not Error or one of its subclasses, then create a new instance of the class ExceptionInInitializerError with E as the argument, and use this object in place of E in the following step. If a new instance of ExceptionInInitializerError cannot be created because an OutOfMemoryError occurs, then use an OutOfMemoryError object in place of E in the following step.

  12. Acquire LC, label the Class object for C as erroneous, notify all waiting threads and release LC.

    Then, just as in the similar case of abrupt completion in step 7, for each uninitialized superclass or superinterface UC temporarily entered on the local list UCL in step (6c.), acquire its lock LUC, label the Class object for UC as uninitialized if it is recorded as being initialized in the current thread, record instead that it is uninitialized, notify all waiting threads, and release LUC.

    Then,

    complete this procedure abruptly with reason E or its replacement as determined in the previous step.

A Java Virtual Machine implementation may optimize this procedure by eliding the lock acquisition in step 1 (and release in step 4/5) when it can determine that the initialization of the class has already completed, provided that, in terms of the Java memory model, all happens-before orderings (JLS §17.4.5) that would exist if the lock were acquired, still exist when the optimization is performed.

Implementation Note: The states of being initialized (whether pending or currently) in some thread T would seem to require allocation of a machine word to point to the thread T, in each class metaobject. However, this pointer can be inverted (as with some lock structures) so that each such thread T holds a short list of classes being initialized in T. This will save metaspace, at the cost of a short search when processing the being initialized states in the above procedure. That may be a good tradeoff. (As with some lock algorithms, only the current thread can safely test if it has recorded some class on its own list; threads cannot inspect the lists of other threads. But that is enough to distinction between being initialized in the current thread, and in another thread.) Apart from the indication of such a thread T, the encoding of the various initialization states only requires a few bits per class metaobject.

Implementation Note: The temporary local list UCL of uninitialized supers is likely to be very short, since typically most or all of the supers will be initialized already, before the above procedure is applied to C. Therefore, the super-list should not be actually materialized as a whole. Rather, the supers should be traversed in any order convenient to the implementation, pruning the supertype graph traversal when encoutering fully initialized supers (since all of their supers must also be fully initialized). When an unimplemented super is encountered, then it must be added to the temporary local list UCL, and it seems likely that a simple insertion sort of that list will suffice (since it is almost always very short), or in the unlikely worst cse, an unordered insertion followed by a separate sorting pass. It is only on that sort list that the problem of ordering the items correctly applies, not on the whole super-list. Ordering can use simple metaspace addresses (if metadata never moves), or else class loaders (which load classes) can be assigned successive counters (or the addresses of their metadata, if that does not move). If at least the class loaders can be ordered in some way, then string comparisons can further sort the interfaces. It is suggested that each interface be given a depth number which is one more than the maximum depth number of all its super-interfaces (or zero if none). In that way, using the depth number as a primary sort key will fulfill the requirement that subinterfaces precede superinterfaces on the UCL list. Then the metadata address or class loader counter plus string can be used to complete the total ordering of the UCL list.