Using Computed Constants to Manage Static State in Leyden

John Rose, 7/2023

I very much like the proposal for ComputedConstant, as found at https://openjdk.org/jeps/8312611. (I will call them CC’s here.) Let’s adopt these nifty CC’s. And let’s make use of them in bundled form, as CC-lists, when possible, since that reduces the per-value footprint to something like a single memory word. Cool stuff!

Here’s a link into the initial public conversation about CC: https://mail.openjdk.org/pipermail/leyden-dev/2023-July/000209.html.

As a follow-up investigation, I’d like to try to parlay CC’s, and specifically CC-lists, into a mechanism that scales so well that users can routinely empty out their <clinit>s. This will help Leyden, Graal/NI, and other static analyzers to make more granular observations about program dependencies, at the field/method level (instead of today’s class level, as mandated by <clinit> side effects).

Evacuating <clinit> almost certainly requires changes to source code. Programmers know more about the initialization states of their classes than any static analyzer can know. (And vice versa, for some analyses!) So we need to convince users to refactor their normal <clinit>-based statics as explicit CC’s.

Doing that effectively might require translation strategy modifications and even JVM & language changes, which we try to avoid because of their great cost. It makes sense to settle on CC semantics first, and then consider, maybe, settling them into the language/TS/JVM. (Note: TS is translation strategy, an engineering artifact between the JLS and JVMS which often goes unseen.) That will only happen if the changes solve an important problem not solvable by some other means. Scalable replacement of <clinit> by per-field dependencies might be that important problem. Maybe, maybe not.

If it is (after all those “maybes”), then I have prototyped some VM and runtime support for lazy static finals which will be illuminating. (I don’t believe the language design is nearly done yet though; it’s not even started!)

As a rough sketch of a path for programmers to evacuate <clinit>, I’d like to propose several steps:

First, define a best practice for refactoring single CCs.
Then, extend that to a best practice for bulk refactoring of many fields (in one class at a time, of course). It would use CC-lists to reduce footprint.
Then, a translation (or condensation) strategy for making that scale well, using tricks like BSMs and reflection.
Finally, some kind of support in the language and/or VM to make this more cost-effective (at the expense of JLS and JVMS changes).

First, a refactoring for individual fields. There would be a way to opt into CC-based state management for a static field. Today you would refactor both uses and defs of the field, replacing each field definition by a no-argument method with a special body, or else replacing each use of X.F by X.F_CC.get(). Maybe there’s a more declarative way in the future, and one that leaves use sites less disturbed. I’m holding that lightly.
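As a rough illustration of this first step, here is a minimal sketch in today’s Java. The ComputedConstant API is not assumed to be available, so LazyConstant below is a hypothetical stand-in written just for this example (it does not support null values), and Config, HOME_CC, home() and initCount are likewise made-up names. It shows both options: a use of X.F becomes X.F_CC.get(), or a call to a no-argument method.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the proposed ComputedConstant; NOT the JEP's API.
// Computes its value at most once, on first get(). Null values are unsupported.
final class LazyConstant<T> implements Supplier<T> {
    private final Supplier<? extends T> initializer;
    private volatile T value;              // null means "not yet computed"

    LazyConstant(Supplier<? extends T> initializer) {
        this.initializer = initializer;
    }

    @Override
    public T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {           // re-check under the lock
                    v = initializer.get();
                    value = v;
                }
            }
        }
        return v;
    }
}

final class Config {
    static int initCount = 0;              // instrumentation for the example only

    // Before: static final String HOME = computeHome();   (runs inside <clinit>)
    // After:  <clinit> only allocates the holder; computeHome() runs on first use.
    static final LazyConstant<String> HOME_CC = new LazyConstant<>(Config::computeHome);

    // Alternative shape: replace the field by a no-argument method.
    static String home() {
        return HOME_CC.get();
    }

    private static String computeHome() {
        initCount++;
        return System.getProperty("user.home", "/");
    }
}
```

Note that Config still has a small <clinit> here, because the holder object itself is created eagerly; only the expensive computation has been moved out.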

Second, a “bulk” refactoring move for multiple fields, perhaps all non-constant final fields in a given class (excepting an opt-out for single fields, of course). This begins to look more like a new declarative notation. But maybe there’s a clever way to express it in today’s Java.
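One way the bulk move might look in today’s Java is sketched below, under stated assumptions: LazyList is a hypothetical stand-in for a CC-list (one shared array of slots, one shared initializer indexed by slot number), and Palette with its slot constants is invented for the example. The point is only that several statics share a single holder, so the per-field cost is roughly one array element.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.IntFunction;

// Hypothetical stand-in for a CC-list; NOT the JEP's API.
// Each slot is computed at most once, on first access. Nulls are unsupported.
final class LazyList<T> {
    private final AtomicReferenceArray<T> slots;
    private final IntFunction<? extends T> initializer;

    LazyList(int size, IntFunction<? extends T> initializer) {
        this.slots = new AtomicReferenceArray<>(size);
        this.initializer = initializer;
    }

    T get(int index) {
        T v = slots.get(index);            // volatile read, no lock on the fast path
        if (v == null) {
            synchronized (slots) {         // one coarse lock; reentrant, so a slot
                v = slots.get(index);      // may depend on another slot
                if (v == null) {
                    v = initializer.apply(index);
                    slots.set(index, v);
                }
            }
        }
        return v;
    }
}

final class Palette {
    static final int[] initCounts = new int[3];   // instrumentation only

    private static final int RED = 0, GREEN = 1, BLUE = 2;

    // One holder for all refactored statics of this class.
    private static final LazyList<String> STATICS = new LazyList<>(3, Palette::init);

    static String red()   { return STATICS.get(RED); }
    static String green() { return STATICS.get(GREEN); }
    static String blue()  { return STATICS.get(BLUE); }

    // All former initializer expressions live in one shared method.
    private static String init(int index) {
        initCounts[index]++;
        switch (index) {
            case RED:   return "#ff0000";
            case GREEN: return "#00ff00";
            case BLUE:  return "#0000ff";
            default:    throw new IllegalArgumentException("no such slot: " + index);
        }
    }
}
```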

Third, and toward the end of a more clever way to express bulk refactoring, a single magic utility method which is caller sensitive and can sense which class is asking for initialization of which field. I don’t think the pieces are there yet today, but they are close. (A bootstrap method would seem to be necessary here, since it gets exactly the right information.) This utility method would be somehow attached to all refactored fields in a given class.

The contract of this magic method would be to detect (or be told) the requesting class, detect (or be told) the field being requested for initialization, and then reflectively look for a static private method named $fieldinit$ or some such fixed name. That method would take the name of the static and quickly string-switch over to some shared code. I don’t know how to do this gracefully without language support, but maybe there’s an annotation-driven transform that could work, as a Leyden condenser, working from a cruder input, an input with many lambdas. The condenser would consolidate all of the lambda bodies into a single $fieldinit$ method, and require all CC constructions to instead use elements in a common CC-list. This would bring footprint down much closer to what <clinit> uses.
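To make that contract concrete, here is a minimal sketch of the “be told” variant, written with plain reflection. It is not the prototype’s code: FieldInits.resolve, its cache, and the Geometry class are hypothetical names for illustration, and only the $fieldinit$ name comes from the text above. The utility is handed the requesting class and the field name, finds the private static $fieldinit$ method, and lets it string-switch to the right initializer.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical "magic" utility: told the class and field, it calls $fieldinit$.
final class FieldInits {
    // Cache keyed by "class#field". A real design would use per-class CC-list
    // slots instead; this map is only a simple substitute for the sketch.
    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

    static Object resolve(Class<?> requester, String fieldName) {
        String key = requester.getName() + "#" + fieldName;
        Object v = CACHE.get(key);
        if (v == null) {
            Object computed = invokeFieldInit(requester, fieldName);
            // Under a race the initializer may run twice, but the first
            // published value wins. Null values are not supported.
            Object prior = CACHE.putIfAbsent(key, computed);
            v = (prior != null) ? prior : computed;
        }
        return v;
    }

    private static Object invokeFieldInit(Class<?> requester, String fieldName) {
        try {
            Method m = requester.getDeclaredMethod("$fieldinit$", String.class);
            m.setAccessible(true);         // the method is private by convention
            return m.invoke(null, fieldName);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "cannot initialize " + requester.getName() + "." + fieldName, e);
        }
    }
}

final class Geometry {
    static int initCalls = 0;              // instrumentation only

    static double tau()  { return (Double) FieldInits.resolve(Geometry.class, "TAU"); }
    static String unit() { return (String) FieldInits.resolve(Geometry.class, "UNIT"); }

    // Single shared initializer: one switch case per refactored static.
    private static Object $fieldinit$(String name) {
        initCalls++;
        switch (name) {
            case "TAU":  return 2 * Math.PI;
            case "UNIT": return "radians";
            default:     throw new IllegalArgumentException("unknown static: " + name);
        }
    }
}
```

Writing the accessors by hand, as Geometry does, is exactly the boilerplate a condenser or language support would be expected to remove.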

I have prototyped this $fieldinit$ based protocol, and a BSM that connects to it. I did the dumb thing and added a keyword to javac to enable it, so it immediately (without Leyden) takes the initializer and moves it into a switch case of $fieldinit$ instead of into the big basic block of <clinit>. I also prototyped just enough field adaptation logic in the VM, so that a DynamicValue field attribute (much like today’s ConstantValue attribute) points at a CP entry (a condy) whose BSM is the magic method I mentioned. I think it hangs together well. And I think the approach would work with a Leyden condenser as well, and without so many language and VM changes.

My prototype uses a stable array for storing CC-like states, and I would replace that by a real CC-array when we have that in the bag.

Fourth, maybe we want some sort of field-bridging mechanism which instructs the VM to call the $fieldinit$ guy more directly (via a BSM which naturally sees class and field name, unlike the reflective magic I mentioned above). That I also prototyped, and it is simple to add it to the existing one-time constant pool linkage paths for the getstatic opcode. But it feels ad hoc to me, not the right graceful solution. And VM changes are expensive, as I’ve already mentioned.
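For comparison, a bootstrap method for a dynamic constant receives a Lookup on the requesting class, the constant’s name, and its type, so it needs no caller sensing. The sketch below shows a method of that shape forwarding to $fieldinit$. It is an illustration only, not the prototype: FieldBootstraps, Units, and selfLookup are hypothetical names, the sketch handles reference types only, and since javac will not emit the condy or the field attribute for us, the example calls the bootstrap method by hand.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class FieldBootstraps {
    // Shaped like a condy bootstrap method: (Lookup, name, type) -> value.
    // The Lookup carries the requesting class and its private access, so the
    // private $fieldinit$ method can be found without setAccessible.
    static Object fieldInit(MethodHandles.Lookup lookup, String name, Class<?> type) {
        try {
            MethodHandle init = lookup.findStatic(
                lookup.lookupClass(), "$fieldinit$",
                MethodType.methodType(Object.class, String.class));
            Object value = init.invoke(name);
            return type.cast(value);       // reference types only in this sketch
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException("cannot initialize " + name, t);
        }
    }
}

final class Units {
    // Stands in for the Lookup the JVM would pass during constant linkage.
    static MethodHandles.Lookup selfLookup() {
        return MethodHandles.lookup();
    }

    private static Object $fieldinit$(String name) {
        switch (name) {
            case "DEFAULT_UNIT": return "meters";
            default: throw new IllegalArgumentException("unknown static: " + name);
        }
    }
}
```

In a real linkage path the JVM would call the bootstrap method once per constant and cache the result, which is what makes the one-time getstatic linkage idea plausible.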

I’m writing all of the above to show you where my head is at personally about the problem of <clinit> unscrambling, and why I think ComputedConstant is a necessary step in that direction. Maybe it is even a sufficient step, if we can persuade users to stop using public static final fields and refactor their APIs to be based on zero-argument methods instead. That will be the case for some users, maybe enough to matter for Leyden. If not, the above thoughts are my take (just mine, for now) on how to “rescue” public fields as API points that are friendly to Leyden. But maybe we don’t need to rescue them; maybe we just consign them to the ash-heap of bad Java styles.

For the record, here is my prototype of the stuff mentioned above. It preceded ComputedConstant, and so does not use it, but it anticipates such a thing. I haven’t touched it for a while, so it’s just sketchy code.

https://github.com/rose00/jdk/tree/auto-static