the isthmus in the VM

John Rose

Tuesday, March 18, 2014

https://blogs.oracle.com/jrose/the-isthmus-in-the-vm

This is a good time to consider new options for a “native interconnect” between code managed by the JVM and APIs for libraries not managed by the JVM.

Notably, Charles Nutter has followed up on his JVM Language Summit talk (video on this page) by proposing JEP 191, to provide a new foreign function interface for Java. To access native data formats (and/or native-like ones inside the JVM), there are several projects under way including David Chase’s data layout package, Marcel Mitran’s packed object proposal, and Gil Tene’s object layout project.

This article describes some of the many questions related to native interconnect, along with some approaches for solving them. We will start Project Panama in OpenJDK to air out these questions thoroughly, and do some serious engineering to address them, for the JDK.

Let us use the term native interconnect for connections between the JVM and “native” libraries and their APIs. By “native” libraries I simply mean those routinely used by programmers of statically compiled languages outside the JVM.

the big goal

I think the general, basic, idealistic goal is something like this:

If non-Java programmers find some library useful and easy to access, it should be similarly accessible to Java programmers.

That ideal is easy to state but hard to carry out. The fundamental reason is simple—the languages are different. C++ programmers use the #include statement for pulling in APIs, but it would be deeply misguided to try to add #includes to the Java language. For more details on how language differences affect interconnection, see the discussion below. Happily, this is not completely new ground, since managed languages (including Lisp, Smalltalk, Haskell, Python, Lua, and more) have a rich history of support for native interconnect.

Most subtly, even if all the superficial differences could be adjusted, the rules for safe and secure usage of Java differ from those of the native, statically-compiled languages. There is a range of choices for ensuring that a native library gets safely used. The main two requirements are to make VM-damaging errors very rare, and (as a corollary) to make intentional attacks very difficult. We will get into more details below.

Besides safety, Java has a distinctive constellation of “cultural” values and practices, notably the features which provide safety and error management. So, the access to C APIs must be adapted to the client language (Java) by means of numerous delicate compromises and engineering choices to preserve not only the “look and feel” of Java expressions but also their deeper cultural norms. By using the metaphor of culture, I don’t imagine a “Java way of life”, but I observe that there are “Java ways” of coding, which differ interestingly from other ways of coding. Cultural awareness becomes salient when cultures meet and mix.

Anyway, to get this done, we need to build a number of different artifacts, including Java libraries, JVM support mechanisms, tools, and format specifications. A number of possibilities are enumerated below.

why this is difficult

First, let’s survey some of the main challenges to full native interconnect.

syntax: Since the languages differ, Java user code for a native API will differ in syntax from the corresponding native user code, sometimes surprisingly. For example, Java 8 lambdas are very different in detail from C function pointers, although they sometimes have corresponding uses. Java has no general notions corresponding to C macros or C++ templates.
naming: Different languages have different rules for identifier formation, API scoping (packages vs. namespaces), and API element naming. Languages even have differing kinds of names: Java has distinct name spaces for fields and methods, while C++ has just members.
data types: Basic data types differ. Booleans, characters, strings, and arrays differ between the languages. C++ uses pointers, sometimes for information hiding, sometimes for structurally transparent data. Java uses managed references, which always have some hidden structure (the object header). And so on. A user-friendly Java interconnect to a native API needs to adjust the types of API arguments and return values to reduce surprises.
storage management: Many native libraries operate through pointers to memory, and they provide rules for managing that memory’s lifetime. Java and native languages have very distinct tactics for this. Java uses garbage collection and C++ libraries usually require manual storage management. Even if C++ were to add garbage collection, the details would probably be difficult to reconcile. A safe Java interconnect to a native API needs to manage native storage in a way that cannot crash the JVM.
exceptions: As with storage management, languages differ in how they handle error conditions. C++ and Java both have exceptions, but they are used (and behave) in very different ways. For example, C++ does not mandate null pointer exceptions. C APIs sometimes require ad hoc polling for errors. A user-friendly Java interconnect to a native API needs a clear story for producing exceptions, which is somehow derived from the native library’s notion of error reporting.
other semantics: Java’s strings are persistent (used to be called “immutable”) while C’s strings are directly addressable character arrays which can sometimes change. (And C++ strings are yet another thing.)
performance: Code which uses Java primitives performs on a par with corresponding C code, but if an API exchanges information using other types, including strings, boxing or copying can cause performance “potholes”. I expect that value types will narrow the gap eventually for other C types, but they are not here yet.
safety: I’m putting this last, but it is the most difficult and important thing to get right. It deserves its own list of issues, but the gist of it is the JVM as a whole must continue to operate correctly even in the face of errors or abuse of any single API. The next section examines this requirement in detail.
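To make the exceptions item above concrete, here is a minimal sketch of converting a C-style "return -1 and poll an error code" convention into a Java exception. The names RawFileApi and SafeFileApi, the error code value, and the simulated open call are all hypothetical stand-ins; no real native library is called.

```java
import java.io.IOException;

// Hypothetical stand-in for a raw binding to a C-style API that reports
// failure by returning -1 and recording an error code to be polled later.
final class RawFileApi {
    static final int ENOENT = 2;      // illustrative error code (assumption)
    private static int lastError = 0; // a real errno would be per-thread

    // Simulated "native" call: returns a descriptor, or -1 on failure.
    static int open(String path) {
        if (path.isEmpty()) {
            lastError = ENOENT;
            return -1;
        }
        lastError = 0;
        return 3;
    }

    static int lastError() {
        return lastError;
    }
}

// Wrapper that polls the error state right after the call and turns a
// failure into a Java exception, so the error cannot be silently dropped.
final class SafeFileApi {
    static int open(String path) throws IOException {
        int fd = RawFileApi.open(path);
        if (fd < 0) {
            throw new IOException(
                "open failed for '" + path + "' with error code " + RawFileApi.lastError());
        }
        return fd;
    }
}
```

A caller of SafeFileApi.open either receives a usable descriptor or sees an IOException; the ad hoc polling stays inside the wrapper.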

safety first

The JVM as a whole must continue to operate correctly when native APIs are in use by various kinds of users.

no attacks from untrusted code: Untrusted code must not be allowed to subvert the correct operation of the JVM, even if it makes very unusual requests of native APIs available to it. This implies that many native APIs must be made inaccessible to untrusted code.
no privilege escalation from untrusted code: Untrusted users should not be able to access files, resources, or Java APIs via native APIs, if they would not already have access to them via Java code.
no crashes: It must be difficult for ordinary user code, and impossible for untrusted code, to crash the JVM using a native API. Native API calls which might lead to unpredictable behavior must be detected and prevented in Java code, preferably by throwing exceptions. Pointers to native memory must be checked for null before all use, and discarded (e.g., set to null) when freed.
no leaks: It must be difficult or impossible for ordinary user code to use a native API to use memory or other system resources in such a way that they cannot be recovered when the user code exits. Native resources must be used in a manner that is scoped.
no hangs: It must be difficult or impossible for ordinary user code to cause deadlocks or long pauses in system execution. Pauses for JVM housekeeping, like garbage collection, must not be noticeably lengthened because of waits for threads running native code.
rare outages: Even if code is partially or fully trusted, errors that might lead to crashes, leaks, or hangs must be detected before they cause the outage, almost always.
no unguarded casts: If privileged Java code must use cast-like operators to adjust its view of native data or functions, the casting must be done only after some kind of check has proven that the cast will be valid. This implies that native data and functions must be accessed through Java APIs that fully describe the native APIs and can mechanically check their use.
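The "no unguarded casts" requirement can be sketched as follows, under loud assumptions: Layout and NativeView are hypothetical names, a direct ByteBuffer stands in for a block of native memory, and the only property checked is size. A real design would check much more (alignment, field types, ownership).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical description of a native struct layout: a name and a size in bytes.
final class Layout {
    final String name;
    final int size;

    Layout(String name, int size) {
        if (size < 0) throw new IllegalArgumentException("negative size");
        this.name = name;
        this.size = size;
    }
}

// Hypothetical checked view of a block of "native" memory.
final class NativeView {
    private final ByteBuffer memory;
    private final Layout layout;

    private NativeView(ByteBuffer memory, Layout layout) {
        this.memory = memory;
        this.layout = layout;
    }

    static NativeView allocate(Layout layout) {
        ByteBuffer block = ByteBuffer.allocateDirect(layout.size).order(ByteOrder.nativeOrder());
        return new NativeView(block, layout);
    }

    // Cast-like operation: permitted only after checking that the target
    // layout fits inside the underlying block.
    NativeView castTo(Layout target) {
        if (target.size > memory.capacity()) {
            throw new ClassCastException("layout " + target.name + " needs " + target.size
                + " bytes, but the block has only " + memory.capacity());
        }
        return new NativeView(memory, target);
    }

    // Accessors are bounded by the current layout, not by the whole block.
    int getInt(int offset) {
        checkRange(offset);
        return memory.getInt(offset);
    }

    void putInt(int offset, int value) {
        checkRange(offset);
        memory.putInt(offset, value);
    }

    private void checkRange(int offset) {
        if (offset < 0 || offset > layout.size - Integer.BYTES) {
            throw new IndexOutOfBoundsException(
                "offset " + offset + " is outside layout " + layout.name);
        }
    }
}
```

The point of the sketch is only that the reinterpretation goes through one checkable place, rather than being a silent pointer cast.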

From these observations, it is evident that there are at least three trust levels that are relevant to native interconnect: untrusted, normal, and privileged.

Java enforces configurable security policies on untrusted code, using APIs like the security manager. This ensures that untrusted code cannot break the system (or elevate privileges) even if APIs are abused.

Normal code is the sort of code which can run in a JVM without a security manager set. Such code might be able to damage the JVM, using APIs like sun.misc.Unsafe, but will not do so by accident. As a practical way to reduce risk, we can search normal code for risky operations, which should be isolated, and review their use for safety.

I think many of the tricky details of native interconnect are related to this concept of privileged code. Any system like the JVM that enforces safety invariants or access restrictions has trusted, privileged code that performs unsafe or all-access operations, such as file system access, on behalf of other kinds of code.

Put another way, privileged code is expected to be in the risky business. It is engineered with great care to conform to safety and security policies. It supports requests from non-privileged code—even untrusted code—after access checks on behalf of the requester. Privileged code needs maximum access to native APIs of the underlying system, and must use them in a way that does not propagate that access to other requesters.

engineering privileged wrapper code

In the present discussion, we can identify at least two levels of binding from Java code to native APIs: a privileged “raw access” to most or all API features, and a wrapped access that provides safety guarantees that match the cultural expectation of Java programmers.

So let’s examine the process of engineering the wrapper code that stands between normal Java users and native APIs.

In current implementations of the JDK, native APIs are wrapped in hand-written JNI wrapper code, written in C. In particular, all C function calls are initiated from JNI wrappers.

(There is plenty of other privileged code written both in Java and C++. Much Java code in packages under java.lang and sun is privileged in some way. Most of it is not relevant to the present subject.)

Ideally, wrapper code should be constructed or checked mechanically when possible. In the present system, the javah tool assists, slightly, in bridging between Java APIs and JNI code. JNI wrapper code is checked by the native C compiler. And that is about all. Surely Java-centered tools could do more.

On the other hand, as we saw above, bringing the languages together is hard. No tool can erase the cultural differences between Java and native languages. There will always be ad hoc adjustment to reduce or remove hazards from native APIs. Such adjustments will usually be engineered by hand in privileged code, as they are today in JNI wrapper code.

We must ask ourselves, why bother to build new mechanisms for native interconnect when JNI wrappers already do the job? If manual coding will always be required, perhaps it is better to do the coding in the native language, where (obviously) the native APIs are most handy. In that case, there would be no need for Java code ever to perform unsafe operations. Isn’t this desirable?

I think the general answer is that we can improve on the trade-offs provided by the present set of tools and procedures. Specifically, by using more Java-centered tools and procedures, we can improve performance. Independently of performance, we can also decrease the engineering costs of safety.

better performance without compromising safety

Safety will always trade against performance, but—as Java has proven over its lifetime—it is possible with care to formulate and optimize safety checks that do not interfere unacceptably with performance.

Classic JNI performance is relatively poor, and some of the reasons are inherent in its design. JNI wrappers are created and maintained by hand, which means that the JVM cannot “see into” them for optimizing them.

If the JNI wrappers were recoded in Java (or some other transparent representation) then the JVM could much better optimize the enforcement of safety checks. For example, a program containing many JNI calls could be reorganized as one which grouped the required safety checks (and other housekeeping) into a smaller number of common blocks of code. These blocks could then be optimized, amortizing the cost of safety checks across many JNI calls.
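The grouping of safety checks described above can be illustrated by hand. This is a sketch of the shape of the transformation, not of what any JIT actually emits: CheckedSum is a hypothetical name, a heap ByteBuffer stands in for native memory, and the ByteBuffer accessors still perform their own internal checks.

```java
import java.nio.ByteBuffer;

final class CheckedSum {
    // Per-call checking: every access validates its own index.
    static long sumPerElement(ByteBuffer buf, int start, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            int index = start + i;
            if (index < 0 || index >= buf.limit()) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            sum += buf.get(index);
        }
        return sum;
    }

    // Grouped checking: one range check covers every access in the loop,
    // so its cost is amortized across all the element reads.
    static long sumGrouped(ByteBuffer buf, int start, int count) {
        if (start < 0 || count < 0 || start > buf.limit() - count) {
            throw new IndexOutOfBoundsException("range " + start + "+" + count);
        }
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.get(start + i);
        }
        return sum;
    }
}
```

Note one behavioral difference: the grouped form rejects a bad range before reading anything, while the per-element form fails partway through.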

Analogous optimizations of lock coarsening or boxing elimination are possible because all the operations are fully transparent to the JVM. By comparison, there is much unnecessary overhead around native calls today.

This sort of optimization is routine when the thing being called can be broken down into analyzable parts by the JIT compiler. But C-coded JNI wrappers are totally opaque to it. The same is currently true of the wrappers created by JNR, but they are regular enough in structure that the JIT can begin to optimize them.

In my opinion, a good goal is to continue opening up the representation of native API calls until the optimized JIT code for a native API call is, well, optimal. That is, it can and should consist of a direct call to the native API, surrounded by a modest amount of housekeeping, and all inlined and optimized with the client Java code.

Making this happen in the compiler will require certain design adjustments. Specifically, the metadata for the native API must be provided in a form suitable for both the JVM interpreter and compiler. More precisely, it must support both execution by the JVM interpreter and/or first-level JIT, and also optimizing compilation by the full JIT. This implies that the native API metadata must contain some of the same kind of information about function and data shape that a C compiler uses to compile calls within C code.
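As a rough illustration of what "function shape" metadata could look like on the Java side, the sketch below reuses java.lang.invoke.MethodType as the shape descriptor. NativeFunctionInfo is a hypothetical class, and the idea of modeling C pointers and size_t as Java long is an assumption that holds only for typical 64-bit platforms.

```java
import java.lang.invoke.MethodType;

// Hypothetical metadata record for one native function: where it lives,
// what it is called, and the shape of its arguments and return value.
final class NativeFunctionInfo {
    final String library;
    final String symbol;
    final MethodType shape;
    private final MethodType boxedShape; // same shape with primitives boxed

    NativeFunctionInfo(String library, String symbol, MethodType shape) {
        this.library = library;
        this.symbol = symbol;
        this.shape = shape;
        this.boxedShape = shape.wrap();
    }

    // Mechanically checks an argument list against the declared shape,
    // as an interpreter might do before performing the call.
    void checkArguments(Object... args) {
        if (args.length != boxedShape.parameterCount()) {
            throw new IllegalArgumentException(symbol + " expects "
                + boxedShape.parameterCount() + " arguments, got " + args.length);
        }
        for (int i = 0; i < args.length; i++) {
            if (!boxedShape.parameterType(i).isInstance(args[i])) {
                throw new IllegalArgumentException(symbol + " argument " + i
                    + " must be " + shape.parameterType(i).getName());
            }
        }
    }
}
```

For example, a C function of the form write(int, const void*, size_t) might be described as MethodType.methodType(long.class, int.class, long.class, long.class). The same descriptor could serve the interpreter (for checking) and an optimizing compiler (for generating the call sequence).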

lower engineering costs for safety

I also think that coding more wrapper logic in Java instead of C will provide more correctness at a lower engineering cost. Although wrapper code in C has the advantage of direct access to native APIs, the code itself is difficult to write and to review for correctness. C programmers can create errors such as unsafe casts in a few benign-looking keystrokes. C-oriented tools can flag potential errors, but they are not designed to enforce Java safety norms.

If direct access to C APIs were available to Java code, all other aspects of wrapper engineering would be simpler and easier to verify as correct. Java code is safer and more verifiable than C code. If written by hand, it is often more compact and simple than corresponding C code. Routine aspects of wrapper engineering could be specified declaratively, using specialized tools to generate Java code or bytecode automatically. Whether Java wrapper code is created manually or automatically, it is subject to layers of safety checking (verifying and dynamic linking) that C code does not enjoy. And Java code (both source files and class files) can be easily inspected by tools such as FindBugs.

The strength of such an automated approach can be seen in the work noted by JEP 191, the excellent JNR project. For a quick look at a “hello world” type example from JNR, see Getpid.java. Although the emphasis of JNR is on function calling, integrated native interconnect to functions, data, and types is also possible.

Side note: My personal favorite example of automated language integration is an old project that integrated C++ and Scheme on Solaris. The native interconnect was strong enough in that system to allow full interactive exploration of C++ APIs using the Scheme interpreter. That was fun.

One way we can improve on the safe use of these prior technologies is to provide more mechanical infrastructure for reasoning about the safety of Java application components. It should be possible to create wrapper libraries that internally use unsafe native APIs but reliably block their users from accessing those APIs. To me this feels like a module system design problem. In any case, it must be possible to correctly label, track, review, and control both unsafe code and the wrapper code that secures it.

wrapper tactics

A likely advantage of Java-based wrappers is easier access to good engineering tactics for wrapping native APIs. Here are a few examples of such tactics:

exception conversion: Error reporting conventions specific to native languages or APIs can be converted to Java exceptions.
pointer handles: Native pointers which can or must be freed can be stored in Java wrapper objects which nullify the saved pointer when it is freed, and check for this state as needed.
wrapper objects: Native data can be encapsulated inside Java objects to mediate access by providing a safe view. The object can use an internal handle field to manage native lifetime.
(Future wrapper values: In cases where stateless wrappers can do the job, value types are likely to provide cheaper encapsulation in the future. This would be the case with primitive types not in Java, such as unsigned long or platform specific vectors. When native lifetime is not an issue, value types could also provide encapsulating views of native pointers, structs, and arrays.)
resource scoping: APIs which require critical sections or paired primitives can be mapped to the Java try-with-resources syntax or refactored into a callback driven style (using lambdas).
language feature mapping: Corresponding types and operations can usually be mapped according to simple conventional rules. For example, a C char* can usually be represented by a Java String object at an API boundary. (But, these mappings must be tunable on a case-by-case basis.)
static typing: The Java type system can represent a wide variety of type shapes.
design rule checking: Ad hoc usage rules for native APIs can be enforced as executable assertions in code wrapped around the unchecked native API.
interfaces: Every transfer of control or data into or out of a native API can (and should) be mediated through a Java interface. In this way fully abstract API shapes can be presented directly to the (unprivileged) end user without exposing sensitive implementations.
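The pointer handle, wrapper object, and resource scoping tactics above combine naturally. Here is a minimal single-threaded sketch, assuming a direct ByteBuffer as a stand-in for a pointer returned by a native allocator; NativeBlock and NativeBlockDemo are hypothetical names, and a real wrapper would call the native free routine in close().

```java
import java.nio.ByteBuffer;

// Hypothetical wrapper object owning a block of "native" memory.
final class NativeBlock implements AutoCloseable {
    private ByteBuffer pointer; // set to null once the block is freed

    NativeBlock(int size) {
        pointer = ByteBuffer.allocateDirect(size);
    }

    // Every access goes through this check, so use-after-free becomes
    // a catchable exception instead of a crash.
    private ByteBuffer checked() {
        ByteBuffer p = pointer;
        if (p == null) {
            throw new IllegalStateException("native block already freed");
        }
        return p;
    }

    byte get(int index) {
        return checked().get(index);
    }

    void put(int index, byte value) {
        checked().put(index, value);
    }

    boolean isFreed() {
        return pointer == null;
    }

    @Override
    public void close() {
        // Assumption: the native free call would go here. Discarding the
        // saved pointer makes a second close() harmless.
        pointer = null;
    }
}

final class NativeBlockDemo {
    // Resource scoping: try-with-resources pairs allocation with release.
    static byte roundTrip(byte value) {
        try (NativeBlock block = new NativeBlock(16)) {
            block.put(0, value);
            return block.get(0);
        } // close() runs here, even if the body throws
    }
}
```

This sketch ignores concurrency: if two threads could race on close() and get(), the null check alone would not be enough.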

Most of these tactics can be made automatic or semi-automatic within a code generation tool, and apply routinely unless manually disabled. This will further reduce the need for tricky hand-maintained code.
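As an example of a mapping that a tool could apply by default, the char* to String convention mentioned in the list can be sketched like this. CStrings is a hypothetical helper, UTF-8 is an assumed encoding (a real API might specify another), and a direct ByteBuffer again stands in for native memory.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class CStrings {
    // Copies a Java String into a NUL-terminated byte sequence,
    // the form a C API would expect for a char* argument.
    static ByteBuffer toCString(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length + 1);
        buf.put(bytes).put((byte) 0);
        buf.flip();
        return buf;
    }

    // Reads bytes from the buffer's position up to the first NUL (or the
    // buffer limit, if no NUL is found) and decodes them into a String.
    static String fromCString(ByteBuffer buf) {
        int start = buf.position();
        int end = start;
        while (end < buf.limit() && buf.get(end) != 0) {
            end++;
        }
        byte[] bytes = new byte[end - start];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = buf.get(start + i);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Both directions copy, which is exactly the kind of performance "pothole" noted earlier, and a Java string containing an embedded NUL character would be truncated on the way back. This is why such mappings need to be tunable case by case.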

Interfaces are particularly useful for expressing groups of methods, since they express (mostly) pure behavior rather than Java object implementation. Also, interfaces are easy to compose and adapt, allowing flexible application of many of the above tactics.

As used to represent an extracted native API, an interface would be unique to that API. Uses of such interfaces would tend to be in one-to-one correspondence with their implementations. In that case JVMs are routinely able to remove the overhead of method selection and invocation by inlining the only relevant implementation.
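A small sketch of the interface tactic, with a design rule check folded in. Stdlib and StdlibBindings are hypothetical names, and rawAbs is a pure-Java stand-in for an unchecked native call to the C abs function, whose behavior is undefined when the result cannot be represented.

```java
// Hypothetical interface extracted for one native API.
// Unprivileged users see only this type.
interface Stdlib {
    int abs(int value);
}

final class StdlibBindings {
    private StdlibBindings() {}

    // Only the interface escapes; the implementation class stays private.
    static Stdlib load() {
        return new Impl();
    }

    // The single implementation of the interface. With exactly one
    // implementation loaded, a JVM can typically inline calls through it.
    private static final class Impl implements Stdlib {
        @Override
        public int abs(int value) {
            // Design rule check: C leaves abs(INT_MIN) undefined,
            // so reject it before reaching the "native" call.
            if (value == Integer.MIN_VALUE) {
                throw new ArithmeticException("abs(INT_MIN) is undefined in C");
            }
            return rawAbs(value);
        }

        // Stand-in for the raw, unchecked native call (assumption).
        private static int rawAbs(int value) {
            return value < 0 ? -value : value;
        }
    }
}
```

User code would write Stdlib lib = StdlibBindings.load(); and then call lib.abs(-5), never touching the raw entry point.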

questions to answer, artifacts to build

A native interconnect story will supply answers to a number of related questions:

How do we simplify the user experience for Java programmers who use C and C++ APIs? (The benchmark is the corresponding experiences of C and C++ programmers, as well as the experiences of today’s JNI programmers.)
What appropriate tools, APIs, and data formats support these experiences? Specifically, how is API metadata produced, stored, loaded, and used? How are native libraries named and loaded?
What appropriate JVM and JDK infrastructure works with native API elements (layouts, functions, etc.) from Java code (interpreter and JIT)?
How performant are calls and data access to native libraries? (Again, the benchmark is the corresponding experience enjoyed by the libraries’ primary users, who are programmers of C, C++, Fortran, etc., as well as the experiences of today’s JNI programmers.)
What are the definite, reliable safety levels available for using native libraries from Java? This includes the question: What is the range of options between automatic, perhaps unsafe import, and engineered hand-adjustments?
What are the options for managing portability? This includes the use of platform-specific libraries, and a story for switching between platform-specific bindings and portable backup implementations.

Answering these questions affirmatively will require us to build some interesting technology, including discrete and separable projects to enable these functions:

native function calling from JVM (C, C++)
native data access from JVM or inside JVM heap
new data layouts in JVM heap
native metadata definition for JVM
header file API extraction tools (see below)
native library management APIs
native-oriented interpreter and runtime “hooks”
class and method resolution “hooks”
native-oriented JIT optimizations
tooling or wrapper interposition for safety
exploratory work with difficult-to-integrate native libraries

Project Panama in OpenJDK will provide a venue for exploring these projects. Some of them will be closely aligned with OpenJDK JEPs, notably JEP 191, allowing the Project to incubate early work on them.

Other inspiration and/or implementation starting points include:

the Java Native Runtime package and the libffi native call binder
Java data layout packages
JVM support for new layouts (IBM packed objects, Sun Labs Maxine hybrids, Arrays 2.0)
metadata-based native API extractors (WinRT metadata)
existing JVM infrastructure (class files, SA, JNI, sun.misc.Unsafe)

A native header file import tool scans C or C++ header files and provides raw native bindings for privileged Java code. Such tools exist already for other languages, and can get colorful names like SWIG or Groveller.

For the present purposes, I suggest a simpler name like jextract. A high-quality implementation for Java could start with an off-the-shelf front end like libclang. It would apply Java-oriented rules (with hand-tunable defaults) and produce some form of metadata, such as loadable class files.

A toolchain that embodies many of these ideas could look something like this:

 /-----------|    /-----------|
|  stdio.h   |   | stdio.java |
|------------|   |------------|
      |               |
      v               |
|------------|        |
|  jextract  |  <-----/
|------------|
      |
      v
 /-----------|
| stdio.jar  |     /------------|
|------------|     | userapp.jar|
      |            |------------|
      v                  |
|------------|           |
|    jvm     |  <--------/      /---------|
|            |  <--------------| libc.dll |
|------------|                 |----------|

The stdio.java file would contain hand-written adjustments to the raw API from the header file. The stdio.jar file would contain automatically gathered metadata from the header file, plus the results of compiling stdio.java. The contents of stdio.java could be straight Java code for the user-level API, but could also be annotations to be expanded by a code generation step in the extraction process.

The code in userapp.jar would access the features it needs from stdio.jar. The implementations of these interfaces would avoid C code as much as possible, so that the JVM’s JIT can optimize them suitably.

Side note: The familiar header file I am picking on is actually unlikely to need this full treatment. In a more typical case, a whole suite of header files would be extracted and wrapped.

For bootstrapping or pure interpretation, a minimum set of trusted primitives is required in the JVM to perform data access and function calls. These would be coded in C and also known to the JIT as intrinsics. They can be made general enough to implement once in the JVM, rather than loaded (as JNI wrappers are loaded today) separately for each native API. For example, JNR uses a set of fewer than 100 specially designed JNI methods to perform all native calls; these methods are collectively called jffi.

Building such toolchains will allow cheaper, faster commerce between Java applications and native APIs, much as the famous Panama Canal cuts through the rocky isthmus that separates the Atlantic and Pacific Oceans.

Let’s keep digging.

Appendix: preserving Java culture

Let’s go back to the metaphor of culture as it applies to the world of Java programming.

Here is a list of benefits of Java that programmers rely on, which any design for native interconnect must preserve. As a group, these features support a set of basic programming practices and styles which allow programmers great freedom to create good code. They can be viewed as the basis of a programming “culture”, peculiar to Java, which fosters safe, useful, performant, maintainable code.

Side note: This list contains many truisms and will be unsurprising to Java users. Remember that culture is often overlooked until two cultures meet. I am writing this list in hopes it will prove useful as a checklist to help analyze design problems with native interconnect, and to evaluate solutions. Also, I am claiming that the sum total of these items underlies a programming culture or ecosystem unique to Java, but not that they are individually unique to Java.

basic type safety: Pointers, integers, and floats must not be confused; conversions must be explicit and must preserve VM integrity. This applies to values of all kinds, in memory and elsewhere.
basic operation safety: Any basic VM operation either completes according to its specification, or produces a catchable exception. It cannot corrupt memory or any other VM state.
class safety: Pointer conversions must be explicit and checked. There are exceptions for conversion to a Java superclass (which is always safe), to a Java interface (which is always checked later at any use point), and to an erased generic type (which is checked implicitly).
storage lifetime safety: No block of memory can be accessed after it has been deallocated. This is why we have automatic storage management.
variable domain safety: There is no way to obtain “garbage” or indeterminately initialized values of any type (especially pointers, of course).
API type checking: Every use of an API, such as a method call, is fully type-consistent with its definition (such as a method definition). This requirement serves the earlier ones, of course; it shows up in detail in the operation of Java’s dynamic linkage rules.
late linking: All uses of names, including class, method, and field names, are resolved and access-checked not only at compile time but also at run time. Separately compiled modules (classes) cannot observe the implementation details of other modules.
concurrency safety: Race conditions between threads can be prevented, or their effects can be predicted usefully, or (at worst) they cannot violate the other safety invariants.
error manifestation: Exceptional or erroneous conditions are not discarded. They are manifested as thrown exceptions, which will be caught and/or displayed.
access control: Non-public or otherwise restricted API points cannot be accessed except by their specified users. Access is enforced at all phases of compilation and execution. System internals cannot be touched except by highly trusted code.
appropriately concise: Typically, Java code does not pay for any of Java’s built-in safety features by unnecessary verbosity. Safe and sane practices are encouraged by simpler notations. The “semantic payload” of a bit of code is not obscured by any necessary ceremony. (But note next points.)
predictably explicit: Typically, complex or potentially expensive features of Java are made explicit by a visible syntax, such as a method call. (This point is in tension with the previous point, and reasonable people differ on the proper resolution.)
explicit types: Java code has reasonably strong static typing, with many types explicitly written in the source code. (Notably, declaration types are explicit on the left, despite type inference elsewhere.) This feature catches errors early and gives IDEs helpful context for each name.
transparent code: Programs are represented using bytecode, which automated tools can inspect, verify, and transform. User-written annotations can help guide these tasks. There are easy to use, open source implementations of offline processors for both source code and bytecode, as well as the VM itself. Multiple good IDEs exist.
transparent data: Data can be inspected using reflection and other ubiquitous self-description machinery such as toString and debuggers. (Transparency of data is balanced with access control, of course.)
robust performance: With moderate programmer care and experience, simple single-threaded programs tend to not show surprising performance “potholes”, not even when they are composed together. Multi-threaded programs preserve and scale up throughput with additional CPUs, in the absence of algorithmic bottlenecks.
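Three of the items above (basic operation safety, class safety, and variable domain safety) are easy to observe directly in plain Java. The sketch below uses a hypothetical SafetyDemo class; each method returns true when the guarantee holds. A wrapped native API would be expected to behave the same way from the user's point of view.

```java
final class SafetyDemo {
    // Basic operation safety: an out-of-bounds store produces a
    // catchable exception instead of corrupting memory.
    static boolean outOfBoundsIsCaught() {
        int[] data = new int[4];
        int index = data.length; // one past the last valid index
        try {
            data[index] = 1;
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Class safety: a downcast is explicit and checked at run time.
    static boolean badCastIsCaught() {
        Object value = "text";
        try {
            Integer number = (Integer) value;
            return number == null; // not reached: the cast above throws
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Variable domain safety: freshly allocated storage holds
    // well-defined default values, never leftover "garbage".
    static boolean freshStorageIsDefined() {
        int[] ints = new int[3];
        Object[] refs = new Object[3];
        return ints[0] == 0 && ints[2] == 0 && refs[0] == null && refs[2] == null;
    }
}
```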

All of these benefits are familiar to Java programmers, perhaps even taken for granted. The corresponding benefits for a native language like C++ are often more complex, and require more work and care from the native programmer to achieve.

A good native interconnect story will provide ways to reliably dispose of this work and care before it gets to the end user coding Java to a native API.

This requires native APIs to be acculturated to Java by the artful creation of wrapper code, as noted above.