Changes to the Java® Virtual Machine Specification • Version 14-internal+0-adhoc.dlsmith.20190618
This document describes changes to the Java Virtual Machine Specification to clean up its treatment of types. Changes mainly fall into one of the following categories:
Consistently recognizing that a CONSTANT_Class_info may reference a class, an interface, or an array type. Eliminating attempts to model array types as if they were classes.
Clarifying the treatment of boolean, byte, short, and char: they are legitimate types, but when placed on the stack, values of these types are implicitly converted to int.
Revisions to verification: distinguishing between, and making appropriate use of, class/interface types and loaded classes; refinements to subtyping and instruction encoding.
Centralizing rules for identifying classes in descriptors that are subject to loader constraints.
Eliminating unnecessary references to java.lang.Class; instead, the representation of classes and interfaces is usually an implementation detail.
Removing unnecessary references to the Java language.
These changes are presentational: no change in the behavior of a JVM implementation is intended.
Changes are described with respect to existing sections of the JVM Specification. New text is indicated like this and deleted text is indicated like this. Explanation and discussion, as needed, is set aside in grey boxes.
The Java Virtual Machine operates on two kinds of types: primitive types and reference types. There are, correspondingly, two kinds of values that can be stored in variables, passed as arguments, returned by methods, and operated upon: primitive values and reference values.
The Java Virtual Machine expects that nearly all type checking is done prior to run time, typically by a compiler, and does not have to be done by the Java Virtual Machine itself. Values of primitive types need not be tagged or otherwise be inspectable to determine their types at run time, or to be distinguished from values of reference types. Instead, the instruction set of the Java Virtual Machine distinguishes its operand types using instructions intended to operate on values of specific types. For instance, iadd, ladd, fadd, and dadd are all Java Virtual Machine instructions that add two numeric values and produce numeric results, but each is specialized for its operand type: int, long, float, and double, respectively. For a summary of type support in the Java Virtual Machine instruction set, see 2.11.1.
The Java Virtual Machine contains explicit support for objects. An object is either a dynamically allocated class instance or an array. A reference to an object is considered to have Java Virtual Machine type reference. References are polymorphic: a single reference may also be a value of multiple class types, interface types, or array types. Values of type reference can be thought of as pointers to objects. More than one reference to an object may exist. Objects are always operated on, passed, and tested via values of type reference.
Each frame (2.6) contains an array of variables known as its local variables. The length of the local variable array of a frame is determined at compile-time and supplied in the binary representation of a class or interface along with the code for the method associated with the frame (4.7.3).
A single local variable can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress. A pair of local variables can hold a value of type long or double.
Local variables are addressed by indexing. The index of the first local variable is zero. An integer is considered to be an index into the local variable array if and only if that integer is between zero and one less than the size of the local variable array.
A value of type long or type double occupies two consecutive local variables. Such a value may only be addressed using the lesser index. For example, a value of type double stored in the local variable array at index n actually occupies the local variables with indices n and n+1; however, the local variable at index n+1 cannot be loaded from. It can be stored into. However, doing so invalidates the contents of local variable n.
The Java Virtual Machine does not require n to be even. In intuitive terms, values of types long and double need not be 64-bit aligned in the local variables array. Implementors are free to decide the appropriate way to represent such values using the two local variables reserved for the value.
The Java Virtual Machine uses local variables to pass parameters on method invocation. On class method invocation, any parameters are passed in consecutive local variables starting from local variable 0. On instance method invocation, local variable 0 is always used to pass a reference to the object on which the instance method is being invoked (this in the Java programming language). Any parameters are subsequently passed in consecutive local variables starting from local variable 1.
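The slot assignment described above can be sketched in Java. This is an illustrative helper only: the class name LocalSlots and the method parameterIndices are hypothetical, and parameter types are modeled with java.lang.Class objects purely for convenience.

```java
import java.util.ArrayList;
import java.util.List;

final class LocalSlots {
    /** Returns the local variable index at which each parameter starts. */
    static List<Integer> parameterIndices(boolean isClassMethod, Class<?>... parameterTypes) {
        List<Integer> indices = new ArrayList<>();
        // An instance method receives the receiver reference in local variable 0.
        int next = isClassMethod ? 0 : 1;
        for (Class<?> type : parameterTypes) {
            indices.add(next);
            // long and double occupy two consecutive local variables.
            next += (type == long.class || type == double.class) ? 2 : 1;
        }
        return indices;
    }

    public static void main(String[] args) {
        // Class method taking (int, long, Object)
        System.out.println(parameterIndices(true, int.class, long.class, Object.class));
        // Instance method taking (double, int)
        System.out.println(parameterIndices(false, double.class, int.class));
    }
}
```

Under these assumptions, the class method's parameters start at indices 0, 1, and 3, and the instance method's parameters start at indices 1 and 3.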
Most of the instructions in the Java Virtual Machine instruction set encode type information about the operations they perform. For instance, the iload instruction (6.5.iload) loads the contents of a local variable, which must be an int, onto the operand stack. The fload instruction (6.5.fload) does the same with a float value. The two instructions may have identical implementations, but have distinct opcodes.
For the majority of typed instructions, the instruction type is represented explicitly in the opcode mnemonic by a letter: i for an int operation, l for long, s for short, b for byte, c for char, f for float, d for double, and a for reference. Some instructions for which the type is unambiguous do not have a type letter in their mnemonic. For instance, arraylength always operates on an object that is an array. Some instructions, such as goto, an unconditional control transfer, do not operate on typed operands.
Given the Java Virtual Machine's one-byte opcode size, encoding types into opcodes places pressure on the design of its instruction set. If each typed instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than could be represented in a byte. Instead, the instruction set of the Java Virtual Machine provides a reduced level of type support for certain operations. In other words, the instruction set is intentionally not orthogonal. Separate instructions can be used to convert between unsupported and supported data types as necessary.
Table 2.11.1-A summarizes the type support in the instruction set of the Java Virtual Machine. A specific instruction, with type information, is built by replacing the T in the instruction template in the opcode column by the letter in the type column. If the type column for some instruction template and type is blank, then no instruction exists supporting that type of operation. For instance, there is a load instruction for type int, iload, but there is no load instruction for type byte.
Note that most instructions in Table 2.11.1-A do not have forms for the integral types byte, char, and short. None have forms for the boolean type. Whenever values of types byte and short are loaded onto the operand stack, they are implicitly converted by sign extension to values of type int. Similarly, whenever values of types boolean and char are loaded onto the operand stack, they are implicitly converted by zero extension to values of type int. Thus, most operations on values originally of types boolean, byte, char, and short are correctly performed by instructions operating on values of computational type int.
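The effect of sign extension and zero extension can be observed from Java source code, because the Java language's widening conversions to int produce the same results. The following is a minimal sketch; the class IntPromotion and its methods are hypothetical names, and the boolean case assumes the usual encoding of true as 1 and false as 0, since the Java language itself has no boolean-to-int conversion.

```java
final class IntPromotion {
    // byte and short widen to int by sign extension.
    static int fromByte(byte value) { return value; }
    static int fromShort(short value) { return value; }

    // char widens to int by zero extension.
    static int fromChar(char value) { return value; }

    // Assumption: true is encoded as 1 and false as 0.
    static int fromBoolean(boolean value) { return value ? 1 : 0; }

    public static void main(String[] args) {
        System.out.println(fromByte((byte) 0xFF));      // prints -1
        System.out.println(fromShort((short) 0x8000));  // prints -32768
        System.out.println(fromChar((char) 0xFFFF));    // prints 65535
        System.out.println(fromBoolean(true));          // prints 1
    }
}
```

The same 16 one-bits yield -1 when they start as a short and 65535 when they start as a char, which is the difference between the two kinds of extension.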
Table 2.11.1-A. Type support in the Java Virtual Machine instruction set
opcode | byte | short | int | long | float | double | char | reference |
---|---|---|---|---|---|---|---|---|
Tipush | bipush | sipush | | | | | | |
Tconst | | | iconst | lconst | fconst | dconst | | aconst |
Tload | | | iload | lload | fload | dload | | aload |
Tstore | | | istore | lstore | fstore | dstore | | astore |
Tinc | | | iinc | | | | | |
Taload | baload | saload | iaload | laload | faload | daload | caload | aaload |
Tastore | bastore | sastore | iastore | lastore | fastore | dastore | castore | aastore |
Tadd | | | iadd | ladd | fadd | dadd | | |
Tsub | | | isub | lsub | fsub | dsub | | |
Tmul | | | imul | lmul | fmul | dmul | | |
Tdiv | | | idiv | ldiv | fdiv | ddiv | | |
Trem | | | irem | lrem | frem | drem | | |
Tneg | | | ineg | lneg | fneg | dneg | | |
Tshl | | | ishl | lshl | | | | |
Tshr | | | ishr | lshr | | | | |
Tushr | | | iushr | lushr | | | | |
Tand | | | iand | land | | | | |
Tor | | | ior | lor | | | | |
Txor | | | ixor | lxor | | | | |
i2T | i2b | i2s | | i2l | i2f | i2d | | |
l2T | | | l2i | | l2f | l2d | | |
f2T | | | f2i | f2l | | f2d | | |
d2T | | | d2i | d2l | d2f | | | |
Tcmp | | | | lcmp | | | | |
Tcmpl | | | | | fcmpl | dcmpl | | |
Tcmpg | | | | | fcmpg | dcmpg | | |
if_TcmpOP | | | if_icmpOP | | | | | if_acmpOP |
Treturn | | | ireturn | lreturn | freturn | dreturn | | areturn |
The mapping between Java Virtual Machine original types and Java Virtual Machine computational types is summarized by Table 2.11.1-B.
Certain Java Virtual Machine instructions such as pop and swap operate on the operand stack without regard to type; however, such instructions are constrained to use only on values of certain categories of computational types, also given in Table 2.11.1-B.
Table 2.11.1-B. Original and computational types in the Java Virtual Machine
Original type | Computational type | Category |
---|---|---|
boolean | int | 1 |
byte | int | 1 |
char | int | 1 |
short | int | 1 |
int | int | 1 |
float | float | 1 |
reference | reference | 1 |
returnAddress | returnAddress | 1 |
long | long | 2 |
double | double | 2 |
The load and store instructions transfer values between the local variables (2.6.1) and the operand stack (2.6.2) of a Java Virtual Machine frame (2.6):
Load a local variable onto the operand stack: iload, iload_<n>, lload, lload_<n>, fload, fload_<n>, dload, dload_<n>, aload, aload_<n>.
Store a value from the operand stack into a local variable: istore, istore_<n>, lstore, lstore_<n>, fstore, fstore_<n>, dstore, dstore_<n>, astore, astore_<n>.
Load a constant on to the operand stack: bipush, sipush, ldc, ldc_w, ldc2_w, aconst_null, iconst_m1, iconst_<i>, lconst_<l>, fconst_<f>, dconst_<d>.
Gain access to more local variables using a wider index, or to a larger immediate operand: wide.
Instructions that access fields of objects and elements of arrays ([2.11.5]) also transfer data to and from the operand stack.
Instruction mnemonics shown above with trailing letters between angle brackets (for instance, iload_<n>) denote families of instructions (with members iload_0, iload_1, iload_2, and iload_3 in the case of iload_<n>). Such families of instructions are specializations of an additional generic instruction (iload) that takes one operand. For the specialized instructions, the operand is implicit and does not need to be stored or fetched. The semantics are otherwise the same (iload_0 means the same thing as iload with the operand 0). The letter between the angle brackets specifies the type of the implicit operand for that family of instructions: for <n>, a nonnegative integer; for <i>, an int; for <l>, a long; for <f>, a float; and for <d>, a double. Forms for type int are used in many cases to perform operations on values of type byte, char, and short (2.11.1).
This notation for instruction families is used throughout this specification.
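As an illustration of how a code generator might choose between a specialized family member and the generic form, consider the following sketch. The class LoadMnemonic and its method are hypothetical. The operand widths it relies on (a one-byte index for iload, a two-byte index under wide) come from the instruction definitions in chapter 6 rather than from this section.

```java
final class LoadMnemonic {
    /** Picks a textual form of an int load for the given local variable index. */
    static String intLoad(int index) {
        if (index < 0 || index > 0xFFFF) {
            throw new IllegalArgumentException("local variable index out of range: " + index);
        }
        if (index <= 3) {
            return "iload_" + index;      // implicit operand, no index byte needed
        }
        if (index <= 0xFF) {
            return "iload " + index;      // generic form with a one-byte index
        }
        return "wide iload " + index;     // wide supplies a two-byte index
    }

    public static void main(String[] args) {
        System.out.println(intLoad(0));    // prints iload_0
        System.out.println(intLoad(7));    // prints iload 7
        System.out.println(intLoad(300));  // prints wide iload 300
    }
}
```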
The following five instructions invoke methods:
invokevirtual invokes an instance method of an object, dispatching on the (virtual) type of the object. This is the normal method dispatch in the Java programming language.
invokeinterface invokes an interface method, searching the methods implemented by the particular run-time object to find the appropriate method.
invokespecial invokes an instance method requiring special handling, either an instance initialization method (2.9.1) or a method of the current class or its supertypes.
invokestatic invokes a class (static) method in a named class.
invokedynamic invokes the method which is the target of the call site object bound to the invokedynamic instruction. The call site object was bound to a specific lexical occurrence of the invokedynamic instruction by the Java Virtual Machine as a result of running a bootstrap method before the first execution of the instruction. Therefore, each occurrence of an invokedynamic instruction has a unique linkage state, unlike the other instructions which invoke methods.
The method return instructions, which are distinguished by return type, are ireturn (used to return values of type boolean, byte, char, short, or int), lreturn, freturn, dreturn, and areturn. In addition, the return instruction is used to return from methods declared to be void, instance initialization methods, and class or interface initialization methods.
The class File Format
The ClassFile Structure
A class file consists of a single ClassFile structure:
ClassFile {
u4 magic;
u2 minor_version;
u2 major_version;
u2 constant_pool_count;
cp_info constant_pool[constant_pool_count-1];
u2 access_flags;
u2 this_class;
u2 super_class;
u2 interfaces_count;
u2 interfaces[interfaces_count];
u2 fields_count;
field_info fields[fields_count];
u2 methods_count;
method_info methods[methods_count];
u2 attributes_count;
attribute_info attributes[attributes_count];
}
The items in the ClassFile structure are as follows:
The magic item supplies the magic number identifying the class file format; it has the value 0xCAFEBABE.
The values of the minor_version and major_version items are the minor and major version numbers of this class file. Together, a major and a minor version number determine the version of the class file format. If a class file has major version number M and minor version number m, we denote the version of its class file format as M.m.
A Java Virtual Machine implementation which conforms to Java SE N must support exactly the major versions of the class file format specified for Java SE N in [Table 4.1-A]. The notation A .. B means major versions A through B inclusive. The column "Corresponding major version" denotes the major version introduced by each Java SE release, that is, the first release that could have accepted a class file containing that major_version item. For very early releases, the JDK version is shown instead of the Java SE release.
Table 4.1-A. class file format major versions
Java SE | Corresponding major version | Supported major versions |
---|---|---|
1.0.2 | 45 | 45 |
1.1 | 45 | 45 |
1.2 | 46 | 45 .. 46 |
1.3 | 47 | 45 .. 47 |
1.4 | 48 | 45 .. 48 |
5.0 | 49 | 45 .. 49 |
6 | 50 | 45 .. 50 |
7 | 51 | 45 .. 51 |
8 | 52 | 45 .. 52 |
9 | 53 | 45 .. 53 |
10 | 54 | 45 .. 54 |
11 | 55 | 45 .. 55 |
12 | 56 | 45 .. 56 |
For a class file whose major_version is 56 or above, the minor_version must be 0 or 65535.
For a class file whose major_version is between 45 and 55 inclusive, the minor_version may be any value.
A historical perspective on versions of the class file format is warranted. JDK 1.0.2 supported versions 45.0 through 45.3 inclusive. JDK 1.1 supported versions 45.0 through 45.65535 inclusive. When JDK 1.2 introduced support for major version 46, the only minor version supported under that major version was 0. Later JDKs continued the practice of introducing support for a new major version (47, 48, etc) but supporting only a minor version of 0 under the new major version. Finally, the introduction of preview features in Java SE 12 (see below) motivated a standard role for the minor version of the class file format, so JDK 12 supported minor versions of 0 and 65535 under major version 56. Subsequent JDKs introduce support for N.0 and N.65535 where N is the corresponding major version of the implemented Java SE Platform.
The Java SE Platform may define preview features. A Java Virtual Machine implementation which conforms to Java SE N (N ≥ 12) must support all the preview features of Java SE N, and none of the preview features of any other Java SE release. The implementation must by default disable the supported preview features, and must provide a way to enable all of them, and must not provide a way to enable only some of them.
A class file is said to depend on the preview features of Java SE N (N ≥ 12) if it has a major_version that corresponds to Java SE N (according to [Table 4.1-A]) and a minor_version of 65535.
A Java Virtual Machine implementation which conforms to Java SE N (N ≥ 12) must behave as follows:
A class file that depends on the preview features of Java SE N may be loaded only when the preview features of Java SE N are enabled.
A class file that depends on the preview features of another Java SE release must never be loaded.
A class file that does not depend on the preview features of any Java SE release may be loaded regardless of whether the preview features of Java SE N are enabled.
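The version and preview rules above can be condensed into a short Java sketch. Everything named here (ClassFileVersion, read, loadableBy, and the jvmMajor parameter standing for the major version corresponding to the implemented Java SE release) is hypothetical; a real implementation performs many more checks before loading a class file.

```java
import java.nio.ByteBuffer;

final class ClassFileVersion {
    final int major;
    final int minor;

    ClassFileVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Reads magic, minor_version, and major_version from the start of a class file. */
    static ClassFileVersion read(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes); // big-endian by default
        if (in.remaining() < 8 || in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = in.getShort() & 0xFFFF; // u2 minor_version
        int major = in.getShort() & 0xFFFF; // u2 major_version
        return new ClassFileVersion(major, minor);
    }

    /** For major versions 56 and above, the minor version must be 0 or 65535. */
    boolean hasValidMinor() {
        return major < 56 || minor == 0 || minor == 65535;
    }

    /** Preview features exist only from Java SE 12 (major version 56) onward. */
    boolean dependsOnPreview() {
        return major >= 56 && minor == 65535;
    }

    /** Applies the loading rules for a JVM whose own major version is jvmMajor. */
    boolean loadableBy(int jvmMajor, boolean previewEnabled) {
        if (major < 45 || major > jvmMajor || !hasValidMinor()) {
            return false;
        }
        if (!dependsOnPreview()) {
            return true;
        }
        // Preview class files load only on the matching release, with preview enabled.
        return major == jvmMajor && previewEnabled;
    }
}
```

For example, a class file of version 56.65535 would be accepted by a Java SE 12 implementation only with preview features enabled, and rejected by a Java SE 13 implementation in either mode.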
The value of the constant_pool_count item is equal to the number of entries in the constant_pool table plus one. A constant_pool index is considered valid if it is greater than zero and less than constant_pool_count, with the exception for constants of type long and double noted in 4.4.5.
The constant_pool is a table of structures (4.4) representing various string constants, class and interface names, field names, and other constants that are referred to within the ClassFile structure and its substructures. The format of each constant_pool table entry is indicated by its first "tag" byte.
The constant_pool table is indexed from 1 to constant_pool_count - 1.
The value of the access_flags item is a mask of flags used to denote access permissions to and properties of this class or interface. The interpretation of each flag, when set, is specified in [Table 4.1-B].
Table 4.1-B. Class access and property modifiers
Flag Name | Value | Interpretation |
---|---|---|
ACC_PUBLIC | 0x0001 | Declared public; may be accessed from outside its package. |
ACC_FINAL | 0x0010 | Declared final; no subclasses allowed. |
ACC_SUPER | 0x0020 | Treat superclass methods specially when invoked by the invokespecial instruction. |
ACC_INTERFACE | 0x0200 | Is an interface, not a class. |
ACC_ABSTRACT | 0x0400 | Declared abstract; must not be instantiated. |
ACC_SYNTHETIC | 0x1000 | Declared synthetic; not present in the source code. |
ACC_ANNOTATION | 0x2000 | Declared as an annotation type. |
ACC_ENUM | 0x4000 | Declared as an enum type. |
ACC_MODULE | 0x8000 | Is a module, not a class or interface. |
The ACC_MODULE flag indicates that this class file defines a module, not a class or interface. If the ACC_MODULE flag is set, then special rules apply to the class file which are given at the end of this section. If the ACC_MODULE flag is not set, then the rules immediately below the current paragraph apply to the class file.
An interface is distinguished by the ACC_INTERFACE flag being set. If the ACC_INTERFACE flag is not set, this class file defines a class, not an interface or module.
If the ACC_INTERFACE flag is set, the ACC_ABSTRACT flag must also be set, and the ACC_FINAL, ACC_SUPER, ACC_ENUM, and ACC_MODULE flags must not be set.
If neither the ACC_MODULE flag nor the ACC_INTERFACE flag is set, the ClassFile structure represents a class, and any of the other flags in [Table 4.1-B] may be set except ACC_ANNOTATION.
If the ACC_FINAL flag is set, the ACC_ABSTRACT flag must not be set.
The ACC_SUPER flag indicates which of two alternative semantics is to be expressed by the invokespecial instruction ([6.5.invokespecial]) if it appears in this class or interface. Compilers to the instruction set of the Java Virtual Machine should set the ACC_SUPER flag. In Java SE 8 and above, the Java Virtual Machine considers the ACC_SUPER flag to be set in every class file, regardless of the actual value of the flag in the class file and the version of the class file.
The ACC_SUPER flag exists for backward compatibility with code compiled by older compilers for the Java programming language. Prior to JDK 1.0.2, the compiler generated access_flags in which the flag now representing ACC_SUPER had no assigned meaning, and Oracle's Java Virtual Machine implementation ignored the flag if it was set.
The ACC_SYNTHETIC flag indicates that this class or interface was generated by a compiler and does not appear in source code.
An annotation type (JLS §9.6) must have its ACC_ANNOTATION flag set. If the ACC_ANNOTATION flag is set, the ACC_INTERFACE flag must also be set.
The ACC_ENUM flag indicates that this class or its superclass is declared as an enumerated type (JLS §8.9).
All bits of the access_flags item not assigned in [Table 4.1-B] are reserved for future use. They should be set to zero in generated class files and should be ignored by Java Virtual Machine implementations.
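The flag combination rules in this section, together with the ACC_MODULE rule given at the end of the section, can be expressed as a small checker. This is a sketch under stated assumptions: the class ClassAccessFlags and the method violations are hypothetical, the constants copy the values from Table 4.1-B, and reserved bits are masked out because implementations should ignore them.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassAccessFlags {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;
    static final int ACC_SYNTHETIC = 0x1000;
    static final int ACC_ANNOTATION = 0x2000;
    static final int ACC_ENUM = 0x4000;
    static final int ACC_MODULE = 0x8000;

    private static final int ALL_ASSIGNED = ACC_PUBLIC | ACC_FINAL | ACC_SUPER
            | ACC_INTERFACE | ACC_ABSTRACT | ACC_SYNTHETIC | ACC_ANNOTATION
            | ACC_ENUM | ACC_MODULE;

    /** Returns a description of each rule the given access_flags value breaks. */
    static List<String> violations(int accessFlags) {
        int flags = accessFlags & ALL_ASSIGNED; // reserved bits are ignored
        List<String> problems = new ArrayList<>();
        if ((flags & ACC_MODULE) != 0) {
            if (flags != ACC_MODULE) {
                problems.add("ACC_MODULE excludes every other flag");
            }
            return problems;
        }
        if ((flags & ACC_INTERFACE) != 0) {
            if ((flags & ACC_ABSTRACT) == 0) {
                problems.add("an interface must also set ACC_ABSTRACT");
            }
            if ((flags & (ACC_FINAL | ACC_SUPER | ACC_ENUM)) != 0) {
                problems.add("an interface must not set ACC_FINAL, ACC_SUPER, or ACC_ENUM");
            }
        } else {
            if ((flags & ACC_ANNOTATION) != 0) {
                problems.add("ACC_ANNOTATION requires ACC_INTERFACE");
            }
            if ((flags & ACC_FINAL) != 0 && (flags & ACC_ABSTRACT) != 0) {
                problems.add("ACC_FINAL and ACC_ABSTRACT must not both be set");
            }
        }
        return problems;
    }
}
```

Returning a list of messages instead of a boolean is a design choice made for the sketch, so that a caller can report every broken rule at once.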
The value of the this_class item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure (4.4.1) representing the class or interface defined by this class file.
For a class, the value of the super_class item either must be zero or must be a valid index into the constant_pool table. If the value of the super_class item is nonzero, the constant_pool entry at that index must be a CONSTANT_Class_info structure representing the direct superclass of the class defined by this class file. Neither the direct superclass nor any of its superclasses may have the ACC_FINAL flag set in the access_flags item of its ClassFile structure.
If the value of the super_class item is zero, then this class file must represent the class Object, the only class or interface without a direct superclass.
For an interface, the value of the super_class item must always be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Class_info structure representing the class Object.
The value of the interfaces_count item gives the number of direct superinterfaces of this class or interface type.
Each value in the interfaces array must be a valid index into the constant_pool table. The constant_pool entry at each value of interfaces[i], where 0 ≤ i < interfaces_count, must be a CONSTANT_Class_info structure representing an interface that is a direct superinterface of this class or interface type, in the left-to-right order given in the source for the class or interface.
The value of the fields_count item gives the number of field_info structures in the fields table. The field_info structures represent all fields, both class variables and instance variables, declared by this class or interface type.
Each value in the fields table must be a field_info structure ([4.5]) giving a complete description of a field in this class or interface. The fields table includes only those fields that are declared by this class or interface. It does not include items representing fields that are inherited from superclasses or superinterfaces.
The value of the methods_count item gives the number of method_info structures in the methods table.
Each value in the methods table must be a method_info structure (4.6) giving a complete description of a method in this class or interface. If neither of the ACC_NATIVE and ACC_ABSTRACT flags are set in the access_flags item of a method_info structure, the Java Virtual Machine instructions implementing the method are also supplied.
The method_info structures represent all methods declared by this class or interface type, including instance methods, class methods, instance initialization methods (2.9.1), and any class or interface initialization method (2.9.2). The methods table does not include items representing methods that are inherited from superclasses or superinterfaces.
The value of the attributes_count item gives the number of attributes in the attributes table of this class.
Each value of the attributes table must be an attribute_info structure ([4.7]).
The attributes defined by this specification as appearing in the attributes table of a ClassFile structure are listed in [Table 4.7-C].
The rules concerning attributes defined to appear in the attributes table of a ClassFile structure are given in [4.7].
The rules concerning non-predefined attributes in the attributes table of a ClassFile structure are given in [4.7.1].
If the ACC_MODULE flag is set in the access_flags item, then no other flag in the access_flags item may be set, and the following rules apply to the rest of the ClassFile structure:
major_version, minor_version: ≥ 53.0 (i.e., Java SE 9 and above)
this_class: module-info
super_class, interfaces_count, fields_count, methods_count: zero
attributes: One Module attribute must be present. Except for Module, ModulePackages, ModuleMainClass, InnerClasses, SourceFile, SourceDebugExtension, RuntimeVisibleAnnotations, and RuntimeInvisibleAnnotations, none of the pre-defined attributes ([4.7]) may appear.
A descriptor is a string representing the type of a field or method. Descriptors are represented in the class file format using modified UTF-8 strings (4.4.7) and thus may be drawn, where not further constrained, from the entire Unicode codespace.
Descriptors are specified using a grammar. The grammar is a set of productions that describe how sequences of characters can form syntactically correct descriptors of various kinds. Terminal symbols of the grammar are shown in fixed width font, and should be interpreted as ASCII characters. Nonterminal symbols are shown in italic type. The definition of a nonterminal is introduced by the name of the nonterminal being defined, followed by a colon. One or more alternative definitions for the nonterminal then follow on succeeding lines.
The syntax {x} on the right-hand side of a production denotes zero or more occurrences of x.
The phrase (one of) on the right-hand side of a production signifies that each of the terminal symbols on the following line or lines is an alternative definition.
A field descriptor represents the type of a class, instance, or local variable.
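FieldDescriptor:
  FieldType
FieldType:
  BaseType
  ObjectType
  ArrayType
BaseType:
  (one of)
  B C D F I J S Z
ObjectType:
  L ClassName ;
ArrayType:
  [ ComponentType
ComponentType:
  FieldType
The characters of BaseType, the L and ; of ObjectType, and the [ of ArrayType are all ASCII characters.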
ClassName represents a binary class or interface name encoded in internal form (4.2.1).
A field descriptor mentions a class or interface name if the name appears as a ClassName in the descriptor. This includes a ClassName nested in the ComponentType of an ArrayType.
This definition of mentions allows us to eliminate boilerplate in a handful of other sections and gives us a single location to identify the classes that are subject to loading constraints, resolution, etc. The definition is flexible enough to support new kinds of types that may be added in the future.
The interpretation of field descriptors as types is shown in Table 4.3-A. See 2.2, 2.3, and 2.4 for the meaning of these types.
A field descriptor representing an array type is valid only if it represents a type with 255 or fewer dimensions.
Table 4.3-A. Interpretation of field descriptors
FieldType term | Type |
---|---|
B | byte |
C | char |
D | double |
F | float |
I | int |
J | long |
L ClassName ; | reference |
S | short |
Z | boolean |
[ | reference |
The field descriptor of an instance variable of type int is simply I.
The field descriptor of an instance variable of type Object is Ljava/lang/Object;. Note that the internal form of the binary name for class Object is used.
The field descriptor of an instance variable of the multidimensional array type double[][][] is [[[D.
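The field descriptor grammar, the 255-dimension limit, and the notion of a descriptor "mentioning" a class or interface name can be illustrated with a small Java sketch. The class FieldDescriptors and its methods are hypothetical, and the check on ClassName is deliberately simplified: it rejects only empty names and names containing '.', ';', or '[', which is an assumption standing in for the full rules of 4.2.1.

```java
import java.util.Optional;

final class FieldDescriptors {
    /** Counts the leading '[' characters, one per array dimension. */
    static int dimensions(String descriptor) {
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++;
        }
        return dims;
    }

    /** Checks the descriptor against the grammar and the 255-dimension limit. */
    static boolean isValid(String descriptor) {
        int dims = dimensions(descriptor);
        if (dims > 255) {
            return false;
        }
        String element = descriptor.substring(dims);
        if (element.length() == 1) {
            return "BCDFIJSZ".indexOf(element.charAt(0)) >= 0; // BaseType
        }
        if (element.length() < 3 || !element.startsWith("L") || !element.endsWith(";")) {
            return false;
        }
        // Simplified ClassName check (assumption, see the text above).
        String name = element.substring(1, element.length() - 1);
        return name.indexOf('.') < 0 && name.indexOf(';') < 0 && name.indexOf('[') < 0;
    }

    /** Returns the class or interface name the descriptor mentions, if any. */
    static Optional<String> mentionedClassName(String descriptor) {
        if (!isValid(descriptor)) {
            return Optional.empty();
        }
        String element = descriptor.substring(dimensions(descriptor));
        return element.startsWith("L")
                ? Optional.of(element.substring(1, element.length() - 1))
                : Optional.empty();
    }
}
```

With this sketch, [Ljava/lang/Thread; mentions java/lang/Thread even though the name is nested in the component type of an array type, while [[[D mentions no name at all.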
The "Interpretation" column of Table 4.3-A is redundant; these details are better left to sections 2.2, 2.3 and 2.4.
A method descriptor contains zero or more parameter descriptors, representing the types of parameters that the method takes, and a return descriptor, representing the type of the value (if any) that the method returns.
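MethodDescriptor:
  ( {ParameterDescriptor} ) ReturnDescriptor
ParameterDescriptor:
  FieldType
ReturnDescriptor:
  FieldType
  VoidDescriptor
VoidDescriptor:
  V
The character V indicates that the method returns no value (its result is void).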
A method descriptor mentions a class or interface name if the name appears as a ClassName in the FieldType of a parameter descriptor or return descriptor.
The method descriptor for the method:
Object m(int i, double d, Thread t) {...}
is:
(IDLjava/lang/Thread;)Ljava/lang/Object;
Note that the internal forms of the binary names of Thread and Object are used.
A method descriptor is valid only if it represents method parameters with a total length of 255 or less, where that length includes the contribution for this in the case of instance or interface method invocations. The total length is calculated by summing the contributions of the individual parameters, where a parameter of type long or double contributes two units to the length and a parameter of any other type contributes one unit.
A method descriptor is the same whether the method it describes is a class method or an instance method. Although an instance method is passed this, a reference to the object on which the method is being invoked, in addition to its intended arguments, that fact is not reflected in the method descriptor. The reference to this is passed implicitly by the Java Virtual Machine instructions which invoke instance methods (2.6.1, 4.11).
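The length calculation for method parameters can be sketched as follows. The class MethodDescriptors and its methods are hypothetical names, and the parser checks only as much of the descriptor syntax as it needs in order to walk the parameter list. Because the descriptor itself does not record whether this is passed, the caller supplies that as a flag.

```java
final class MethodDescriptors {
    /** Sums the parameter contributions, counting one extra unit for this if present. */
    static int parameterLength(String descriptor, boolean hasThis) {
        if (!descriptor.startsWith("(")) {
            throw new IllegalArgumentException("not a method descriptor: " + descriptor);
        }
        int length = hasThis ? 1 : 0;
        int i = 1;
        while (true) {
            if (i >= descriptor.length()) {
                throw new IllegalArgumentException("missing ')' in " + descriptor);
            }
            char c = descriptor.charAt(i);
            if (c == ')') {
                return length;
            }
            boolean isArray = false;
            while (c == '[') { // skip array dimensions
                isArray = true;
                i++;
                if (i >= descriptor.length()) {
                    throw new IllegalArgumentException("dangling '[' in " + descriptor);
                }
                c = descriptor.charAt(i);
            }
            if (c == 'L') {
                int semicolon = descriptor.indexOf(';', i);
                if (semicolon < 0) {
                    throw new IllegalArgumentException("unterminated class name in " + descriptor);
                }
                i = semicolon + 1;
                length += 1; // a reference contributes one unit
            } else if ("BCDFIJSZ".indexOf(c) >= 0) {
                i++;
                // Only a bare long or double contributes two units; arrays are references.
                length += (!isArray && (c == 'J' || c == 'D')) ? 2 : 1;
            } else {
                throw new IllegalArgumentException("bad parameter type in " + descriptor);
            }
        }
    }

    /** A descriptor is valid only if the total parameter length is 255 or less. */
    static boolean hasValidLength(String descriptor, boolean hasThis) {
        return parameterLength(descriptor, hasThis) <= 255;
    }
}
```

For the descriptor (IDLjava/lang/Thread;)Ljava/lang/Object; from the example above, the sketch yields a length of 4 for a class method and 5 for an instance method.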
The CONSTANT_Class_info Structure
The CONSTANT_Class_info structure is used to represent a class, an interface, or an array type:
CONSTANT_Class_info {
u1 tag;
u2 name_index;
}
The items of the CONSTANT_Class_info structure are as follows:
The tag item has the value CONSTANT_Class (7).
The value of the name_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info structure (4.4.7) representing either a valid binary class or interface name encoded in internal form (4.2.1) or an ArrayType descriptor (4.3.2).
Because arrays are objects, the opcodes anewarray and multianewarray - but not the opcode new - can reference array "classes" via CONSTANT_Class_info structures in the constant_pool table. For such array classes, the name of the class is the descriptor of the array type (4.3.2).
For example, the class name representing the two-dimensional array type int[][] is [[I, while the class name representing the type Thread[] is [Ljava/lang/Thread;.
For example, the name_index string representing the class String is java/lang/String. The name_index string representing the interface Runnable is java/lang/Runnable. The name_index string representing the array type Thread[] is [Ljava/lang/Thread;. The name_index string representing the array type int[][] is [[I.
Note that a CONSTANT_Class_info cannot represent a primitive type. For example, the name_index string D represents a class or interface named D, not the primitive type double.
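The way a name_index string is interpreted can be summarized in a few lines of Java. The class ClassInfoNames and the enum Kind are hypothetical, and the sketch assumes the string has already been validated as either a binary name in internal form or an ArrayType descriptor.

```java
final class ClassInfoNames {
    enum Kind { CLASS_OR_INTERFACE, ARRAY_TYPE }

    /** Classifies an already-validated name_index string. */
    static Kind kindOf(String name) {
        // An ArrayType descriptor always starts with '['; a class or interface
        // name in internal form never does. There is no primitive case.
        return name.startsWith("[") ? Kind.ARRAY_TYPE : Kind.CLASS_OR_INTERFACE;
    }

    public static void main(String[] args) {
        System.out.println(kindOf("java/lang/String"));    // prints CLASS_OR_INTERFACE
        System.out.println(kindOf("[Ljava/lang/Thread;")); // prints ARRAY_TYPE
        System.out.println(kindOf("D"));                   // prints CLASS_OR_INTERFACE
    }
}
```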
An array type descriptor is valid only if it represents 255 or fewer dimensions.
This rule is already stated in 4.3.2.
The CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and CONSTANT_InterfaceMethodref_info Structures
Fields, methods, and interface methods are represented by similar structures:
CONSTANT_Fieldref_info {
u1 tag;
u2 class_index;
u2 name_and_type_index;
}
CONSTANT_Methodref_info {
u1 tag;
u2 class_index;
u2 name_and_type_index;
}
CONSTANT_InterfaceMethodref_info {
u1 tag;
u2 class_index;
u2 name_and_type_index;
}
The items of these structures are as follows:
The tag
item of a CONSTANT_Fieldref_info
structure has the value CONSTANT_Fieldref
(9).
The tag
item of a CONSTANT_Methodref_info
structure has the value CONSTANT_Methodref
(10).
The tag
item of a CONSTANT_InterfaceMethodref_info
structure has the value CONSTANT_InterfaceMethodref
(11).
The value of the class_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_Class_info
structure (4.4.1) representing a class or interface type that has the field or method as a member.
In a CONSTANT_Fieldref_info
structure, the class_index
item may be either a class type or an interface type.
In a CONSTANT_Methodref_info
structure, the class_index
item must be a class type, not an interface type.
In a CONSTANT_InterfaceMethodref_info
structure, the class_index
item must be an interface type, not a class type.
The value of the name_and_type_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_NameAndType_info
structure (4.4.6). This constant_pool
entry indicates the name and descriptor of the field or method.
In a CONSTANT_Fieldref_info
structure, the indicated descriptor must be a field descriptor (4.3.2). Otherwise, the indicated descriptor must be a method descriptor (4.3.3).
If the name of the method in a CONSTANT_Methodref_info
structure begins with a '<
' ('\u003c
'), then the name must be the special name <init>
, representing an instance initialization method (2.9.1). The return type of such a method must be void
.
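Because the three structures share one layout, a reader can decode them with the same routine and branch only on the tag. The following Java sketch shows one way this might look; MemberRef and its fields are hypothetical names, and the sketch does not resolve class_index or name_and_type_index against a constant pool.

```java
import java.nio.ByteBuffer;

// Sketch only: MemberRef is a hypothetical holder type, not a JVM API.
final class MemberRef {
    static final int CONSTANT_FIELDREF = 9;
    static final int CONSTANT_METHODREF = 10;
    static final int CONSTANT_INTERFACE_METHODREF = 11;

    final int tag;
    final int classIndex;
    final int nameAndTypeIndex;

    private MemberRef(int tag, int classIndex, int nameAndTypeIndex) {
        this.tag = tag;
        this.classIndex = classIndex;
        this.nameAndTypeIndex = nameAndTypeIndex;
    }

    // Reads u1 tag, u2 class_index, u2 name_and_type_index.
    // ByteBuffer is big-endian by default, matching the class file format.
    static MemberRef read(ByteBuffer buf) {
        int tag = buf.get() & 0xFF;
        if (tag < CONSTANT_FIELDREF || tag > CONSTANT_INTERFACE_METHODREF) {
            throw new IllegalArgumentException("not a member reference tag: " + tag);
        }
        int classIndex = buf.getShort() & 0xFFFF;
        int nameAndTypeIndex = buf.getShort() & 0xFFFF;
        return new MemberRef(tag, classIndex, nameAndTypeIndex);
    }

    // Only a CONSTANT_Fieldref_info entry requires a field descriptor;
    // the other two require a method descriptor.
    boolean expectsFieldDescriptor() {
        return tag == CONSTANT_FIELDREF;
    }

    public static void main(String[] args) {
        MemberRef ref = read(ByteBuffer.wrap(new byte[] {10, 0, 2, 0, 5}));
        System.out.println(ref.tag + " " + ref.classIndex + " " + ref.nameAndTypeIndex);
    }
}
```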
CONSTANT_NameAndType_info Structure
The CONSTANT_NameAndType_info structure is used to represent a field or method, without indicating which class or interface type it belongs to:
CONSTANT_NameAndType_info {
u1 tag;
u2 name_index;
u2 descriptor_index;
}
The items of the CONSTANT_NameAndType_info
structure are as follows:
The tag
item has the value CONSTANT_NameAndType
(12).
The value of the name_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_Utf8_info
structure (4.4.7) representing either a valid unqualified name denoting a field or method (4.2.2), or the special method name <init> (2.9.1).
The value of the descriptor_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_Utf8_info
structure (4.4.7) representing a valid field descriptor or method descriptor (4.3.2, 4.3.3).
StackMapTable Attribute
...
A verification type specifies the type of either one or two locations, where a location is either a single local variable or a single operand stack entry. A verification type is represented by a discriminated union, verification_type_info
, that consists of a one-byte tag, indicating which item of the union is in use, followed by zero or more bytes, giving more information about the tag.
union verification_type_info {
Top_variable_info;
Integer_variable_info;
Float_variable_info;
Long_variable_info;
Double_variable_info;
Null_variable_info;
UninitializedThis_variable_info;
Object_variable_info;
Uninitialized_variable_info;
}
A verification type that specifies one location in the local variable array or in the operand stack is represented by the following items of the verification_type_info
union:
The Top_variable_info
item indicates that the local variable has the verification type top
.
Top_variable_info {
u1 tag = ITEM_Top; /* 0 */
}
The Integer_variable_info
item indicates that the location has the verification type int
.
Integer_variable_info {
u1 tag = ITEM_Integer; /* 1 */
}
The Float_variable_info
item indicates that the location has the verification type float
.
Float_variable_info {
u1 tag = ITEM_Float; /* 2 */
}
The Null_variable_info
type indicates that the location has the verification type null
.
Null_variable_info {
u1 tag = ITEM_Null; /* 5 */
}
The UninitializedThis_variable_info
item indicates that the location has the verification type uninitializedThis
.
UninitializedThis_variable_info {
u1 tag = ITEM_UninitializedThis; /* 6 */
}
The Object_variable_info
item indicates that the location has the verification type which is the class, interface, or array type represented by the CONSTANT_Class_info
structure (4.4.1) found in the constant_pool
table at the index given by cpool_index
.
Object_variable_info {
u1 tag = ITEM_Object; /* 7 */
u2 cpool_index;
}
The Uninitialized_variable_info
item indicates that the location has the verification type uninitialized(Offset)
. The Offset
item indicates the offset, in the code
array of the Code
attribute that contains this StackMapTable
attribute, of the new instruction (6.5.new) that created the object being stored in the location.
Uninitialized_variable_info {
u1 tag = ITEM_Uninitialized; /* 8 */
u2 offset;
}
A verification type that specifies two locations in the local variable array or in the operand stack is represented by the following items of the verification_type_info
union:
The Long_variable_info
item indicates that the first of two locations has the verification type long
.
Long_variable_info {
u1 tag = ITEM_Long; /* 4 */
}
The Double_variable_info
item indicates that the first of two locations has the verification type double
.
Double_variable_info {
u1 tag = ITEM_Double; /* 3 */
}
The Long_variable_info
and Double_variable_info
items indicate the verification type of the second of two locations as follows:
If the first of the two locations is a local variable, then:
It must not be the local variable with the highest index.
The next higher numbered local variable has the verification type top
.
If the first of the two locations is an operand stack entry, then:
It must not be the topmost location of the operand stack.
The next location closer to the top of the operand stack has the verification type top
.
...
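The tag values and item sizes listed above can be summarized in a small lookup. The following Java sketch is illustrative only; VerificationTypeTags and its methods are hypothetical names, and the tag numbers are taken from the structure definitions in this section.

```java
// Sketch only: the class and method names are illustrative, not a JVM API.
final class VerificationTypeTags {
    // Item names indexed by tag value (0 through 8).
    private static final String[] ITEMS = {
        "ITEM_Top", "ITEM_Integer", "ITEM_Float", "ITEM_Double", "ITEM_Long",
        "ITEM_Null", "ITEM_UninitializedThis", "ITEM_Object", "ITEM_Uninitialized"
    };

    static String itemName(int tag) {
        if (tag < 0 || tag >= ITEMS.length) {
            throw new IllegalArgumentException("unknown tag: " + tag);
        }
        return ITEMS[tag];
    }

    // Bytes following the tag: a u2 cpool_index for ITEM_Object,
    // a u2 offset for ITEM_Uninitialized, nothing for the other items.
    static int extraBytes(int tag) {
        itemName(tag); // rejects unknown tags
        return (tag == 7 || tag == 8) ? 2 : 0;
    }

    // Locations described: two for ITEM_Double and ITEM_Long, one otherwise.
    static int locations(int tag) {
        itemName(tag); // rejects unknown tags
        return (tag == 3 || tag == 4) ? 2 : 1;
    }

    public static void main(String[] args) {
        for (int tag = 0; tag <= 8; tag++) {
            System.out.println(itemName(tag) + " extraBytes=" + extraBytes(tag)
                    + " locations=" + locations(tag));
        }
    }
}
```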
Exceptions Attribute
The Exceptions attribute is a variable-length attribute in the attributes table of a method_info structure (4.6). The Exceptions attribute indicates which checked exceptions a method may throw.
There may be at most one Exceptions
attribute in the attributes
table of a method_info
structure.
The Exceptions
attribute has the following format:
Exceptions_attribute {
u2 attribute_name_index;
u4 attribute_length;
u2 number_of_exceptions;
u2 exception_index_table[number_of_exceptions];
}
The items of the Exceptions_attribute
structure are as follows:
The value of the attribute_name_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be the CONSTANT_Utf8_info
structure (4.4.7) representing the string "Exceptions
".
The value of the attribute_length
item indicates the length of the attribute, excluding the initial six bytes.
The value of the number_of_exceptions
item indicates the number of entries in the exception_index_table
.
Each value in the exception_index_table
array must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_Class_info
structure (4.4.1) representing a class type that this method is declared to throw.
A method should throw an exception only if at least one of the following three criteria is met:
The exception is an instance of RuntimeException or one of its subclasses.
The exception is an instance of Error or one of its subclasses.
The exception is an instance of one of the exception classes specified in the exception_index_table just described, or one of their subclasses.
These requirements are not enforced in the Java Virtual Machine; they are enforced only at compile time.
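To make the layout of the Exceptions_attribute structure concrete, the following Java sketch reads the raw bytes of one attribute and returns the entries of its exception_index_table. ExceptionsAttributeSketch is a hypothetical name; the sketch assumes the buffer is positioned at attribute_name_index and does not resolve any constant pool entries.

```java
import java.nio.ByteBuffer;

// Sketch only: a hypothetical reader for the byte layout shown above.
final class ExceptionsAttributeSketch {
    // Returns the constant_pool indexes stored in exception_index_table.
    static int[] exceptionIndexes(ByteBuffer buf) {
        buf.getShort(); // attribute_name_index (not resolved in this sketch)
        long attributeLength = buf.getInt() & 0xFFFFFFFFL; // excludes the initial six bytes
        int count = buf.getShort() & 0xFFFF; // number_of_exceptions
        // The remaining bytes are the u2 count plus one u2 per table entry.
        if (attributeLength != 2L + 2L * count) {
            throw new IllegalArgumentException("attribute_length does not match number_of_exceptions");
        }
        int[] indexes = new int[count];
        for (int i = 0; i < count; i++) {
            indexes[i] = buf.getShort() & 0xFFFF;
        }
        return indexes;
    }

    public static void main(String[] args) {
        byte[] raw = {0, 7, 0, 0, 0, 6, 0, 2, 0, 3, 0, 9};
        for (int index : exceptionIndexes(ByteBuffer.wrap(raw))) {
            System.out.println(index);
        }
    }
}
```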
RuntimeVisibleAnnotations Attribute
element_value structure
The element_value structure is a discriminated union representing the value of an element-value pair. It has the following format:
element_value {
u1 tag;
union {
u2 const_value_index;
{ u2 type_name_index;
u2 const_name_index;
} enum_const_value;
u2 class_info_index;
annotation annotation_value;
{ u2 num_values;
element_value values[num_values];
} array_value;
} value;
}
The tag
item uses a single ASCII character to indicate the type of the value of the element-value pair. This determines which item of the value
union is in use. [Table 4.7.16.1-A] shows the valid characters for the tag
item, the type indicated by each character, and the item used in the value
union for each character. The table's fourth column is used in the description below of one item of the value
union.
Table 4.7.16.1-A. Interpretation of tag values as types
| tag Item | Type | value Item | Constant Type |
|---|---|---|---|
| B | byte | const_value_index | CONSTANT_Integer |
| C | char | const_value_index | CONSTANT_Integer |
| D | double | const_value_index | CONSTANT_Double |
| F | float | const_value_index | CONSTANT_Float |
| I | int | const_value_index | CONSTANT_Integer |
| J | long | const_value_index | CONSTANT_Long |
| S | short | const_value_index | CONSTANT_Integer |
| Z | boolean | const_value_index | CONSTANT_Integer |
| s | String | const_value_index | CONSTANT_Utf8 |
| e | Enum type | enum_const_value | Not applicable |
| c | Class | class_info_index | Not applicable |
| @ | Annotation type | annotation_value | Not applicable |
| [ | Array type | array_value | Not applicable |
The value
item represents the value of an element-value pair. The item is a union, whose own items are as follows:
The const_value_index
item denotes either a primitive constant value or a String
literal as the value of this element-value pair.
The value of the const_value_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be of a type appropriate to the tag
item, as specified in the fourth column of [Table 4.7.16.1-A].
The enum_const_value
item denotes an enum constant as the value of this element-value pair.
The enum_const_value
item consists of the following two items:
The value of the type_name_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_Utf8_info
structure (4.4.7) representing a field descriptor (4.3.2). The constant_pool
entry gives the internal form of the binary name of the type of the enum constant represented by this element_value
structure (4.2.1).
The value of the const_name_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_Utf8_info
structure (4.4.7). The constant_pool
entry gives the simple name of the enum constant represented by this element_value
structure.
The class_info_index
item denotes a class literal as the value of this element-value pair.
The class_info_index
item must be a valid index into the constant_pool
table. The constant_pool
entry at that index must be a CONSTANT_Utf8_info
structure (4.4.7) representing a return descriptor (4.3.3). The return descriptor gives the type corresponding to the class literal represented by this element_value
structure. Types correspond to class literals as follows:
For a class literal C.class
, where C is the name of a class, or interface, or array type, the corresponding type is C. The return descriptor in the constant_pool
will be an ObjectType or an ArrayType a ClassType.
For a class literal T[].class
, where T[] is an array type, the corresponding type is T[]. The return descriptor in the constant_pool
will be an ArrayType.
For a class literal p.class
, where p is the name of a primitive type, the corresponding type is p. The return descriptor in the constant_pool
will be a BaseType character.
For a class literal void.class
, the corresponding type is void
. The return descriptor in the constant_pool
will be V.
For example, the class literal Object.class corresponds to the type Object, so the constant_pool entry is Ljava/lang/Object;, whereas the class literal int.class corresponds to the type int, so the constant_pool entry is I.
The class literal void.class corresponds to void, so the constant_pool entry is V, whereas the class literal Void.class corresponds to the type Void, so the constant_pool entry is Ljava/lang/Void;.
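The correspondence between class literals and return descriptors can be sketched in Java using reflection. This is an illustration only: ClassLiteralDescriptors is a hypothetical helper, and it derives the descriptor from a loaded java.lang.Class object rather than from anything in the class file.

```java
// Sketch only: a hypothetical helper that maps a class literal to the
// return descriptor described above.
final class ClassLiteralDescriptors {
    static String returnDescriptor(Class<?> c) {
        if (c == void.class) return "V";
        if (c == boolean.class) return "Z";
        if (c == byte.class) return "B";
        if (c == char.class) return "C";
        if (c == short.class) return "S";
        if (c == int.class) return "I";
        if (c == long.class) return "J";
        if (c == float.class) return "F";
        if (c == double.class) return "D";
        if (c.isArray()) {
            // An array descriptor is '[' followed by the component type's descriptor.
            return "[" + returnDescriptor(c.getComponentType());
        }
        // Class or interface: internal form of the binary name, wrapped in L...;
        return "L" + c.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(returnDescriptor(Object.class));
        System.out.println(returnDescriptor(int.class));
        System.out.println(returnDescriptor(void.class));
        System.out.println(returnDescriptor(Void.class));
    }
}
```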
The annotation_value
item denotes a "nested" annotation as the value of this element-value pair.
The value of the annotation_value
item is an annotation
structure (4.7.16) that gives the annotation represented by this element_value
structure.
The array_value
item denotes an array as the value of this element-value pair.
The array_value
item consists of the following two items:
The value of the num_values
item gives the number of elements in the array represented by this element_value
structure.
Each value in the values
table gives the corresponding element of the array represented by this element_value
structure.
The structural constraints on the code
array specify constraints on relationships between Java Virtual Machine instructions. The structural constraints are as follows:
Each instruction must only be executed with the appropriate type and number of arguments in the operand stack and local variable array, regardless of the execution path that leads to its invocation.
An instruction operating on values of type int
is also permitted to operate on values of type boolean
, byte
, char
, and short
.
(As noted in 2.3.4 and 2.11.1, the Java Virtual Machine internally implicitly converts values of types boolean, byte, short, and char to type int, allowing instructions expecting values of type int to operate on them.)
If an instruction can be executed along several different execution paths, the operand stack must have the same depth (2.6.2) prior to the execution of the instruction, regardless of the path taken.
At no point during execution can the operand stack grow to a depth greater than that implied by the max_stack
item.
At no point during execution can more values be popped from the operand stack than it contains.
At no point during execution can the order of the local variable pair holding a value of type long
or double
be reversed or the pair split up. At no point can the local variables of such a pair be operated on individually.
No local variable (or local variable pair, in the case of a value of type long
or double
) can be accessed before it is assigned a value.
Each invokespecial instruction must name one of the following:
an instance initialization method (2.9.1)
a method in the current class or interface
a method in a superclass of the current class
a method in a direct superinterface of the current class or interface
a method in Object
If an invokespecial instruction names an instance initialization method, then the target reference on the operand stack must be an uninitialized class instance. An instance initialization method must never be invoked on an initialized class instance. In addition:
If the target reference on the operand stack is an uninitialized class instance for the current class, then invokespecial must name an instance initialization method from the current class or its direct superclass.
If an invokespecial instruction names an instance initialization method and the target reference on the operand stack is a class instance created by an earlier new instruction, then invokespecial must name an instance initialization method from the class of that class instance.
If an invokespecial instruction names a method which is not an instance initialization method, then the target reference on the operand stack must be a class instance whose type is assignment compatible with the current class (JLS §5.2).
The general rule for invokespecial is that the class or interface named by invokespecial must be "above" the caller class or interface, while the receiver object targeted by invokespecial must be "at" or "below" the caller class or interface. The latter clause is especially important: a class or interface can only perform invokespecial on its own objects. See 4.10.1.9.invokespecial for an explanation of how the latter clause is implemented in Prolog.
Each instance initialization method, except for the instance initialization method derived from the constructor of class Object
, must call either another instance initialization method of this
or an instance initialization method of its direct superclass super
before its instance members are accessed.
However, instance fields of this
that are declared in the current class may be assigned by putfield before calling any instance initialization method.
When any instance method is invoked or when any instance variable is accessed, the class instance that contains the instance method or instance variable must already be initialized.
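If there is an uninitialized class instance in a local variable in code protected by an exception handler, then:
if the handler is inside an <init> method, the handler must throw an exception or loop forever; and
if the handler is outside an <init> method, the uninitialized class instance must remain uninitialized.
There must never be an uninitialized class instance on the operand stack or in a local variable when a jsr or jsr_w instruction is executed.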
The When an invokevirtual or invokespecial instruction references a method of a class, the type of every class instance that is the target of a method invocation the instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class or interface type specified in the instruction the class type of the referenced class or one of its subclasses.
The types of the arguments to each method invocation must be method invocation compatible with subtypes of the types given by the method descriptor (JLS §5.3 4.10.1.2, 4.3.3), where descriptor types boolean
, byte
, char
, and short
are interpreted as type int
.
Each return instruction must match its method's return type:
If the method returns a boolean
, byte
, char
, short
, or int
, only the ireturn instruction may be used.
If the method returns a float
, long
, or double
, only an freturn, lreturn, or dreturn instruction, respectively, may be used.
If the method returns a reference
type, only an areturn instruction may be used, and the type of the returned value must be assignment compatible with a subtype of the return descriptor of the method (4.3.3).
All instance initialization methods, class or interface initialization methods, and methods declared to return void
must use only the return instruction.
The type of every class instance accessed by a getfield instruction or modified by a putfield instruction (that is, the type of the target reference on the operand stack) must be assignment compatible with the class type specified in the instruction the class type of the class specified in the instruction or one of its subclasses.
The type of every value stored by a putfield or putstatic instruction must be compatible with the descriptor of the field (4.3.2) of the class instance or class being stored into:
If the descriptor type is boolean
, byte
, char
, short
, or int
, then the value must be an int
.
If the descriptor type is float
, long
, or double
, then the value must be a float
, long
, or double
, respectively.
If the descriptor type is a reference
type, then the value must be of a type that is assignment compatible with a subtype of the descriptor type.
The type of every value stored into an array by an aastore instruction must be a reference
type.
The component type of the array being stored into by the aastore instruction must also be a reference
type.
Each athrow instruction must throw only values that are instances of class Throwable
or of subclasses of Throwable
.
Each class mentioned in a catch_type
item of the exception_table
array of the method's Code_attribute
structure must be Throwable
or a subclass of Throwable
.
If getfield or putfield is used to access a protected
field declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class the class type of the current class or one of its subclasses.
If invokevirtual or invokespecial is used to access a protected
method declared in a superclass that is a member of a different run-time package than the current class, then the type of the class instance being accessed (that is, the type of the target reference on the operand stack) must be assignment compatible with the current class the class type of the current class or one of its subclasses.
Execution never falls off the bottom of the code
array.
No return address (a value of type returnAddress
) may be loaded from a local variable.
The instruction following each jsr or jsr_w instruction may be returned to only by a single ret instruction.
No jsr or jsr_w instruction that is returned to may be used to recursively call a subroutine if that subroutine is already present in the subroutine call chain. (Subroutines can be nested when using try
-finally
constructs from within a finally
clause.)
Each instance of type returnAddress
can be returned to at most once.
If a ret instruction returns to a point in the subroutine call chain above the ret instruction corresponding to a given instance of type returnAddress
, then that instance can never be used as a return address.
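The constraint in the list above that each return instruction must match its method's return type can be sketched as a lookup from return descriptor to instruction mnemonic. ReturnInstructions is a hypothetical name, and the sketch inspects only the first character of a return descriptor (4.3.3) that is assumed to be well formed.

```java
// Sketch only: maps a return descriptor to the single return instruction
// that the structural constraints permit for it.
final class ReturnInstructions {
    static String requiredReturnInstruction(String returnDescriptor) {
        if (returnDescriptor.isEmpty()) {
            throw new IllegalArgumentException("empty return descriptor");
        }
        switch (returnDescriptor.charAt(0)) {
            case 'Z': case 'B': case 'C': case 'S': case 'I':
                return "ireturn"; // boolean, byte, char, short, and int all use ireturn
            case 'F':
                return "freturn";
            case 'J':
                return "lreturn";
            case 'D':
                return "dreturn";
            case 'L': case '[':
                return "areturn"; // any reference type
            case 'V':
                return "return";  // void, including initialization methods
            default:
                throw new IllegalArgumentException("not a return descriptor: " + returnDescriptor);
        }
    }

    public static void main(String[] args) {
        System.out.println(requiredReturnInstruction("Z"));
        System.out.println(requiredReturnInstruction("Ljava/lang/String;"));
        System.out.println(requiredReturnInstruction("V"));
    }
}
```

The sketch covers only the choice of instruction. For areturn, the additional requirement on the type of the returned value is a property of the verifier's type system and is not modeled here.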
class Files
Many of the changes in this section are made to reduce reliance on class types when what is really wanted is classes. The verifier already has a first-class notion of a class, a black-box "live" representation of a loaded class file. Using an extra layer of indirection to encode these classes as verification type structures is unnecessary and risks problems when the type system evolves.
A class
file whose version number is 50.0 or above (4.1) must be verified using the type checking rules given in this section.
If, and only if, a class
file's version number equals 50.0, then if the type checking fails, a Java Virtual Machine implementation may choose to attempt to perform verification by type inference (4.10.2).
This is a pragmatic adjustment, designed to ease the transition to the new verification discipline. Many tools that manipulate
class
files may alter the bytecodes of a method in a manner that requires adjustment of the method's stack map frames. If a tool does not make the necessary adjustments to the stack map frames, type checking may fail even though the bytecode is in principle valid (and would consequently verify under the old type inference scheme). To allow implementors time to adapt their tools, Java Virtual Machine implementations may fall back to the older verification discipline, but only for a limited time.
In cases where type checking fails but type inference is invoked and succeeds, a certain performance penalty is expected. Such a penalty is unavoidable. It also should serve as a signal to tool vendors that their output needs to be adjusted, and provides vendors with additional incentive to make these adjustments.
In summary, failover to verification by type inference supports both the gradual addition of stack map frames to the Java SE Platform (if they are not present in a version 50.0 class file, failover is allowed) and the gradual removal of the jsr and jsr_w instructions from the Java SE Platform (if they are present in a version 50.0 class file, failover is allowed).
If a Java Virtual Machine implementation ever attempts to perform verification by type inference on version 50.0 class files, it must do so in all cases where verification by type checking fails.
This means that a Java Virtual Machine implementation cannot choose to resort to type inference in one case and not in another. It must either reject class files that do not verify via type checking, or else consistently fail over to the type inferencing verifier whenever type checking fails.
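The failover rule can be sketched as a small decision procedure. In the following Java sketch, VerificationPolicy, its failoverEnabled flag, and the two BooleanSupplier parameters are all hypothetical stand-ins for an implementation's type checker and type inferencing verifier. Making the flag a fixed property of the policy object is one way to express the requirement that failover be applied consistently.

```java
import java.util.function.BooleanSupplier;

// Sketch only: hypothetical names; the suppliers stand in for the two verifiers.
final class VerificationPolicy {
    // Chosen once per implementation, so failover happens in every case or in none.
    private final boolean failoverEnabled;

    VerificationPolicy(boolean failoverEnabled) {
        this.failoverEnabled = failoverEnabled;
    }

    // Returns true if the class file verifies; false corresponds to a VerifyError.
    boolean verify(int major, int minor,
                   BooleanSupplier typeChecker, BooleanSupplier typeInferencer) {
        if (major < 50) {
            throw new IllegalArgumentException("this sketch covers versions 50.0 and above only");
        }
        if (typeChecker.getAsBoolean()) {
            return true;
        }
        // Failover is permitted if, and only if, the version equals 50.0.
        boolean isVersion50_0 = (major == 50 && minor == 0);
        if (isVersion50_0 && failoverEnabled) {
            return typeInferencer.getAsBoolean();
        }
        return false;
    }

    public static void main(String[] args) {
        VerificationPolicy policy = new VerificationPolicy(true);
        System.out.println(policy.verify(50, 0, () -> false, () -> true));
        System.out.println(policy.verify(51, 0, () -> false, () -> true));
    }
}
```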
The type checker enforces type rules that are specified by means of Prolog clauses. English language text is used to describe the type rules in an informal way, while the Prolog clauses provide a formal specification.
The type checker requires a list of stack map frames for each method with a Code
attribute (4.7.3). A list of stack map frames is given by the StackMapTable
attribute (4.7.4) of a Code
attribute. The intent is that a stack map frame must appear at the beginning of each basic block in a method. The stack map frame specifies the verification type of each operand stack entry and of each local variable at the start of each basic block. The type checker reads the stack map frames for each method with a Code
attribute and uses these maps to generate a proof of the type safety of the instructions in the Code
attribute.
A class is type safe if all its methods are type safe, and it does not subclass a final
class.
classIsTypeSafe(Class) :-
classClassName(Class, Name),
classDefiningLoader(Class, L),
superclassChain(Name, L, Chain),
Chain \= [],
classSuperClassName(Class, SuperclassName),
classDefiningLoader(Class, L),
loadedClass(SuperclassName, L, Superclass),
classIsNotFinal(Superclass),
classMethods(Class, Methods),
checklist(methodIsTypeSafe(Class), Methods).
classIsTypeSafe(Class) :-
loadedSuperclasses(Class, [ Superclass | Rest ]),
classIsNotFinal(Superclass),
classMethods(Class, Methods),
checklist(methodIsTypeSafe(Class), Methods).
classIsTypeSafe(Class) :-
classClassName(Class, 'java/lang/Object'),
classDefiningLoader(Class, L),
isBootstrapLoader(L),
classMethods(Class, Methods),
checklist(methodIsTypeSafe(Class), Methods).
The Prolog predicate classIsTypeSafe
assumes that Class
is a Prolog term representing a binary class that has been successfully parsed and loaded. This specification does not mandate the precise structure of this term, but does require that certain predicates be defined upon it.
For example, we assume a predicate
classMethods(Class, Methods)
that, given a term representing a class as described above as its first argument, binds its second argument to a list comprising all the methods of the class, represented in a convenient form described later.
Iff the predicate classIsTypeSafe
is not true, the type checker must throw the exception VerifyError
to indicate that the class
file is malformed. Otherwise, the class
file has type checked successfully and bytecode verification has completed successfully.
The rest of this section explains the process of type checking in detail:
First, we give Prolog predicates for core Java Virtual Machine artifacts like classes and methods (4.10.1.1).
Second, we specify the type system known to the type checker (4.10.1.2).
Third, we specify the Prolog representation of instructions and stack map frames (4.10.1.3, 4.10.1.4).
Fourth, we specify how a method is type checked, for methods without code (4.10.1.5) and methods with code (4.10.1.6).
Fifth, we discuss type checking issues common to all load and store instructions (4.10.1.7), and also issues of access to protected
members (4.10.1.8).
Finally, we specify the rules to type check each instruction (4.10.1.9).
We stipulate the existence of 28 Prolog predicates ("accessors") that have certain expected behavior but whose formal definitions are not given in this specification.
Extracts the name, ClassName
, of the class Class
.
True iff the class, Class
, is an interface.
True iff the class, Class
, is not a final
class.
Extracts the name, SuperClassName
, of the superclass of class Class
.
Extracts a list, Interfaces
, of the direct superinterfaces of the class Class
.
Extracts a list, Methods
, of the methods declared in the class Class
.
Extracts a list, Attributes
, of the attributes of the class Class
.
Each attribute is represented as a functor application of the form attribute(AttributeName, AttributeContents)
, where AttributeName
is the name of the attribute. The format of the attribute's contents is unspecified.
Extracts the defining class loader, Loader
, of the class Class
.
True iff the class loader Loader
is the bootstrap class loader.
True iff there exists a class named Name
whose representation (in accordance with this specification) when loaded by the class loader InitiatingLoader
is ClassDefinition
.
Extracts the name, Name
, of the method Method
.
Extracts the access flags, AccessFlags
, of the method Method
.
Extracts the descriptor, Descriptor
, of the method Method
.
Extracts a list, Attributes
, of the attributes of the method Method
.
True iff Method
(regardless of class) is <init>
.
True iff Method
(regardless of class) is not <init>
.
True iff Method
in class Class
is not final
.
True iff Method
in class Class
is static
.
True iff Method
in class Class
is not static
.
True iff Method
in class Class
is private
.
True iff Method
in class Class
is not private
.
True iff there is a member named MemberName
with descriptor MemberDescriptor
in the class MemberClass
and it is protected
.
True iff there is a member named MemberName
with descriptor MemberDescriptor
in the class MemberClass
and it is not protected
.
Converts a field descriptor, Descriptor
, into the corresponding verification type Type
(4.10.1.2).
The verification type derived from descriptor types byte
, short
, boolean
, and char
is int
.
Converts a method descriptor, Descriptor
, into a list of verification types, ArgTypeList
, corresponding to the method argument types, and a verification type, ReturnType
, corresponding to the return type.
The verification type derived from descriptor types byte
, short
, boolean
, and char
is int
. A void return is represented with the special symbol void
.
Extracts the instruction stream, ParsedCode
, of the method Method
in Class
, as well as the maximum operand stack size, MaxStack
, the maximal number of local variables, FrameSize
, the exception handlers, Handlers
, and the stack map StackMap
.
The representation of the instruction stream and stack map attribute must be as specified in 4.10.1.3 and 4.10.1.4.
True iff the package names of Class1
and Class2
are the same.
True iff the package names of Class1
and Class2
are different.
The above accessors are used to define loadedSuperclasses
, which produces a list of a class's superclasses.
loadedSuperclasses(Class, [ Superclass | Rest ]) :-
classSuperClassName(Class, SuperclassName),
classDefiningLoader(Class, L),
loadedClass(SuperclassName, L, Superclass),
loadedSuperclasses(Superclass, Rest).
loadedSuperclasses(Class, []) :-
classClassName(Class, 'java/lang/Object'),
classDefiningLoader(Class, BL),
isBootstrapLoader(BL).
The loadedSuperclasses
predicate replaces superclassChain
(4.10.1.2), which had the same effect, but produced a list of class types rather than loaded classes. (Despite the change in representation, all superclasses are loaded in either case.)
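For readers more familiar with Java than Prolog, the recursion in loadedSuperclasses is loosely analogous to walking Class.getSuperclass() until it returns null. The following sketch is an analogy only: Superclasses is a hypothetical name, reflection on java.lang.Class is not the representation used by the verifier, and the sketch is meant for ordinary classes (for interfaces and primitive types getSuperclass() returns null, which does not mirror the Prolog rules).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a reflective analogy to the loadedSuperclasses predicate.
final class Superclasses {
    // Lists the superclasses of c, direct superclass first, ending with Object.
    // For Object itself the list is empty, matching the base case of the predicate.
    static List<Class<?>> loadedSuperclasses(Class<?> c) {
        List<Class<?>> chain = new ArrayList<>();
        for (Class<?> s = c.getSuperclass(); s != null; s = s.getSuperclass()) {
            chain.add(s);
        }
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(loadedSuperclasses(Integer.class));
        System.out.println(loadedSuperclasses(Object.class));
    }
}
```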
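When type checking a method's body, it is convenient to access information about the method. For this purpose, we define an environment, a six-tuple consisting of:
a class
a method
the declared return type of the method (or void)
the instructions in a method
the maximal size of the operand stack
a list of exception handlers
We specify accessors to extract information from the environment.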
allInstructions(Environment, Instructions) :-
Environment = environment(_Class, _Method, _ReturnType,
Instructions, _, _).
exceptionHandlers(Environment, Handlers) :-
Environment = environment(_Class, _Method, _ReturnType,
_Instructions, _, Handlers).
maxOperandStackLength(Environment, MaxStack) :-
Environment = environment(_Class, _Method, _ReturnType,
_Instructions, MaxStack, _Handlers).
currentClassLoader(Environment, Loader) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, Loader).
thisClass(Environment, class(ClassName, L)) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
thisClass(Environment, Class) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _).
thisType(Environment, class(ClassName, L)) :-
Environment = environment(Class, _Method, _ReturnType,
_Instructions, _, _),
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
thisMethodReturnType(Environment, ReturnType) :-
Environment = environment(_Class, _Method, ReturnType,
_Instructions, _, _).
We specify additional predicates to extract higher-level information from the environment.
offsetStackFrame(Environment, Offset, StackFrame) :-
allInstructions(Environment, Instructions),
member(stackMap(Offset, StackFrame), Instructions).
currentClassLoader(Environment, Loader) :-
thisClass(Environment, class(_, Loader)).
Finally, we specify a general predicate used throughout the type rules:
notMember(_, []).
notMember(X, [A | More]) :- X \= A, notMember(X, More).
The principle guiding the determination as to which accessors are stipulated and which are fully specified is that we do not want to over-specify the representation of the class file. Providing specific accessors to the Class or Method term would force us to completely specify the format for a Prolog term representing the class file.
The type checker enforces a type system based upon a hierarchy of verification types, illustrated below.
Verification type hierarchy:
                             top
                 ____________/\____________
                /                          \
               /                            \
            oneWord                       twoWord
           /   |   \                     /       \
          /    |    \                   /         \
        int  float  reference        long        double
                     /     \
                    /       \_____________
                   /                      \
                  /                        \
           uninitialized                    +---------------------+
            /         \                     |      reference      |
           /           \                    |   type hierarchy    |
uninitializedThis  uninitialized(Offset)    +---------------------+
                                                       |
                                                       |
                                                      null
The "reference type hierarchy" was previously referred to as the "Java reference type hierarchy". But the reference type subtyping graph doesn't rely on the Java language at all, and in fact, as of Java 5, differs significantly from it.
Most verification types have a direct correspondence with the primitive and reference types described in 2.2 and represented by field descriptors in Table 4.3-A:
The primitive types double, float, int, and long (field descriptors D, F, I, J) each correspond to the verification type of the same name.
The primitive types byte, char, short, and boolean (field descriptors B, C, S, Z) all correspond to the verification type int.
Class and interface types (field descriptors beginning L) correspond to verification types that use the functor class. The verification type class(*N*, *L*) represents the type of the class or interface whose binary name is *N* as loaded by the loader *L*. Note that *L* is an initiating loader (5.3) of the class represented by class(*N*, *L*) and may, or may not, be the class's defining loader.
For example, the class type Object would be represented as class('java/lang/Object', ~~BL~~ L), where the defining loader of class ~~BL~~ 'java/lang/Object', as loaded by L, is the bootstrap loader.
Array types (field descriptors beginning [) correspond to verification types that use the functor arrayOf. Note that the primitive types byte, char, short, and boolean do not correspond to verification types, but an array type whose element type is byte, char, short, or boolean does correspond to a verification type; such verification types support the baload, bastore, caload, castore, saload, sastore, and newarray instructions.
The verification type arrayOf(*T*) represents the array type whose component type is the verification type *T*.
The verification type arrayOf(byte) represents the array type whose ~~element~~ component type is byte.
The verification type arrayOf(char) represents the array type whose ~~element~~ component type is char.
The verification type arrayOf(short) represents the array type whose ~~element~~ component type is short.
The verification type arrayOf(boolean) represents the array type whose ~~element~~ component type is boolean.
For example, the array types int[] and Object[] would be represented by the verification types arrayOf(int) and arrayOf(class('java/lang/Object', BL)) respectively. The array types byte[] and boolean[][] would be represented by the verification types arrayOf(byte) and arrayOf(arrayOf(boolean)) respectively.
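The correspondence between field descriptors and verification types can be sketched in Python. This is a rough illustration, not the specification's parseFieldDescriptor: the function names and the tuple encoding of class and arrayOf terms are assumptions made for the example.

```python
def verification_type(descriptor, loader):
    """Map a field descriptor to a verification type term (a string or nested tuple)."""
    same_name = {"D": "double", "F": "float", "I": "int", "J": "long"}
    if descriptor in same_name:
        return same_name[descriptor]
    if descriptor in ("B", "C", "S", "Z"):
        return "int"  # byte, char, short, boolean all map to int
    if descriptor.startswith("L") and descriptor.endswith(";"):
        return ("class", descriptor[1:-1], loader)
    if descriptor.startswith("["):
        return ("arrayOf", component_type(descriptor[1:], loader))
    raise ValueError(f"not a field descriptor: {descriptor!r}")


def component_type(descriptor, loader):
    """Array components keep byte, char, short, and boolean distinct."""
    small = {"B": "byte", "C": "char", "S": "short", "Z": "boolean"}
    if descriptor in small:
        return small[descriptor]
    return verification_type(descriptor, loader)


print(verification_type("Z", "L"))
print(verification_type("[[Z", "L"))
print(verification_type("[Ljava/lang/Object;", "L"))
```

Splitting the component case into its own function reflects the asymmetry in the text: boolean on its own becomes int, but boolean as an array component stays boolean.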
The remaining verification types are described as follows:
The verification types top, oneWord, twoWord, and reference describe abstract unions of other types, as illustrated above, and are represented in Prolog as atoms whose name denotes the verification type in question.
The verification types uninitialized, uninitializedThis, and uninitialized(Offset) describe references to objects created with new that have not yet been initialized (2.9.2). uninitialized and uninitializedThis are represented with an atom. The verification type uninitialized(Offset) is represented by applying the functor uninitialized to an argument representing the numerical value of the Offset.
The verification type null describes the result of the aconst_null instruction, and is represented in Prolog as an atom.
The subtyping rules for verification types are as follows.
Subtyping is reflexive.
isAssignable(X, X).
The verification types which are not reference types in the Java programming language have subtype rules of the form:
isAssignable(v, X) :- isAssignable(the_direct_supertype_of_v, X).
That is, v
is a subtype of X
if the direct supertype of v
is a subtype of X
. The rules are:
The type top
is a supertype of all other types.
isAssignable(oneWord, top).
isAssignable(twoWord, top).
A type is a subtype of some other type, X, if its direct supertype is a subtype of X.
isAssignable(int, X) :- isAssignable(oneWord, X).
isAssignable(float, X) :- isAssignable(oneWord, X).
isAssignable(long, X) :- isAssignable(twoWord, X).
isAssignable(double, X) :- isAssignable(twoWord, X).
isAssignable(reference, X) :- isAssignable(oneWord, X).
isAssignable(class(_, _), X) :- isAssignable(reference, X).
isAssignable(arrayOf(_), X) :- isAssignable(reference, X).
isAssignable(null, X) :- isAssignable(reference, X).
isAssignable(uninitialized, X) :- isAssignable(reference, X).
isAssignable(uninitializedThis, X) :- isAssignable(uninitialized, X).
isAssignable(uninitialized(_), X) :- isAssignable(uninitialized, X).
The type null
is a subtype of all reference types.
isAssignable(null, class(_, _)).
isAssignable(null, arrayOf(_)).
isAssignable(null, X) :- isAssignable(class('java/lang/Object', BL), X),
isBootstrapLoader(BL).
These subtype rules are not necessarily the most obvious formulation of subtyping. There is a clear split between subtyping rules ~~for reference types in the Java programming language~~ among reference types, and rules for the remaining verification types. The split allows us to state general subtyping relations between ~~Java programming language~~ reference types and other verification types. These relations hold independently of a ~~Java~~ reference type's position in the type hierarchy, and help to prevent excessive class loading by a Java Virtual Machine implementation. For example, we do not want to start climbing the ~~Java~~ superclass hierarchy in response to a query of the form class(foo, L) <: twoWord.
We also have a rule that says subtyping is reflexive, so together these rules cover most verification types that are not reference types ~~in the Java programming language~~.
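The structural half of the split can be illustrated with a small Python sketch. It is a paraphrase under assumptions, not the specification's rules: terms are encoded as strings or tuples, the function names are hypothetical, and the rules that make null a subtype of every class and array type are deliberately left out.

```python
DIRECT_SUPERTYPE = {
    "oneWord": "top", "twoWord": "top",
    "int": "oneWord", "float": "oneWord", "reference": "oneWord",
    "long": "twoWord", "double": "twoWord",
    "null": "reference", "uninitialized": "reference",
    "uninitializedThis": "uninitialized",
}


def direct_supertype(t):
    """Direct supertype of a verification type term, or None for top."""
    if isinstance(t, tuple):
        functor = t[0]
        if functor in ("class", "arrayOf"):
            return "reference"
        if functor == "uninitialized":  # uninitialized(Offset)
            return "uninitialized"
        return None
    return DIRECT_SUPERTYPE.get(t)


def is_assignable_structural(frm, to):
    """Reflexivity plus climbing direct supertypes; no class is ever loaded."""
    current = frm
    while current is not None:
        if current == to:
            return True
        current = direct_supertype(current)
    return False


print(is_assignable_structural(("class", "foo", "L"), "twoWord"))
print(is_assignable_structural(("uninitialized", 7), "reference"))
```

The query for class(foo, L) against twoWord is answered by walking reference, oneWord, and top only, so the superclass hierarchy of foo is never consulted.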
Subtype rules for the reference types in the Java programming language are specified recursively with ~~isJavaAssignable~~ isWideningReference.
isAssignable(class(X, Lx), class(Y, Ly)) :-
isJavaAssignable(class(X, Lx), class(Y, Ly)).
isAssignable(arrayOf(X), class(Y, L)) :-
isJavaAssignable(arrayOf(X), class(Y, L)).
isAssignable(arrayOf(X), arrayOf(Y)) :-
isJavaAssignable(arrayOf(X), arrayOf(Y)).
isAssignable(From, To) :- isWideningReference(From, To).
The isWideningReference predicate is only defined for reference types, and will fail to match any non-reference inputs. So it's unnecessary to restrict the form of the inputs to avoid testing non-reference types.
~~For assignments, interfaces are treated like Object.~~ The verifier allows any reference type to be widened to an interface type.
isJavaAssignable(class(_, _), class(To, L)) :-
loadedClass(To, L, ToClass),
classIsInterface(ToClass).
isWideningReference(class(_, _), class(To, L)) :-
loadedClass(To, L, ToClass),
classIsInterface(ToClass).
isWideningReference(arrayOf(_), class(To, L)) :-
loadedClass(To, L, ToClass),
classIsInterface(ToClass).
This approach is less strict than the Java Programming Language, which will not allow an assignment to an interface unless the value is statically known to implement or extend the interface. The Java Virtual Machine instead uses a run-time check to ensure that invocations of interface methods actually operate on objects that implement the interface (6.5.invokeinterface). But there is no requirement that a reference stored by a local variable of an interface type refers to an object that actually implements that interface.
A class type can be widened to another class type if that type refers to the loaded class or one of its superclasses.
isJavaAssignable(From, To) :-
isJavaSubclassOf(From, To).
isWideningReference(class(ClassName, L1), class(ClassName, L2)) :-
L1 \= L2,
loadedClass(ClassName, L1, Class),
loadedClass(ClassName, L2, Class).
isWideningReference(class(From, L1), class(To, L2)) :-
From \= To,
loadedClass(From, L1, FromClass),
loadedClass(To, L2, ToClass),
loadedSuperclasses(FromClass, Supers),
member(ToClass, Supers).
A bug in the previous rules failed to allow the same class to be treated as a subtype of itself when referenced in the context of different initiating class loaders. It's not clear if this has any practical impact (are subtype tests ever performed between types referenced from different classes?), but the first rule above addresses it.
In the case in which two class types have the same name and the same initiating class loader, neither of these rules applies. If the types are the same, that's an identity, not a widening. The reflexive isAssignable rule applies, and the class should not be loaded.
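The interface rule and the two class-to-class widening rules can be paraphrased in Python as a rough sketch. Everything here is an assumption made for illustration: the `Loaded` record, the `table` of (name, initiating loader) pairs, and the `lookup` function standing in for loadedClass.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare loaded classes by identity
class Loaded:
    name: str
    is_interface: bool = False
    superclass: Optional["Loaded"] = None


def superclasses(cls):
    """Yield the loaded superclasses of cls, nearest first."""
    while cls.superclass is not None:
        cls = cls.superclass
        yield cls


def is_widening_class(frm, to, loaded_class):
    """frm and to are (name, initiating_loader) pairs."""
    (from_name, l1), (to_name, l2) = frm, to
    to_class = loaded_class(to_name, l2)
    if to_class.is_interface:
        return True  # any class type widens to an interface type
    from_class = loaded_class(from_name, l1)
    if from_name == to_name:
        # Same name, different initiating loaders, same loaded class.
        return l1 != l2 and from_class is to_class
    return any(s is to_class for s in superclasses(from_class))


obj = Loaded("java/lang/Object")
runnable = Loaded("java/lang/Runnable", is_interface=True)
thread = Loaded("java/lang/Thread", superclass=obj)
table = {
    ("java/lang/Object", "app"): obj,
    ("java/lang/Object", "boot"): obj,
    ("java/lang/Thread", "app"): thread,
    ("java/lang/Runnable", "app"): runnable,
}


def lookup(name, loader):
    return table[(name, loader)]


print(is_widening_class(("java/lang/Object", "app"), ("java/lang/Object", "boot"), lookup))
```

A pair with the same name and the same initiating loader returns False here, matching the note above that such a pair is an identity handled by the reflexive rule rather than a widening.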
Array types are subtypes of Object
. The intent is also that array types are subtypes of Cloneable
and java.io.Serializable
.
isJavaAssignable(arrayOf(_), class('java/lang/Object', BL)) :-
isBootstrapLoader(BL).
isJavaAssignable(arrayOf(_), X) :-
isArrayInterface(X).
isArrayInterface(class('java/lang/Cloneable', BL)) :-
isBootstrapLoader(BL).
isArrayInterface(class('java/io/Serializable', BL)) :-
isBootstrapLoader(BL).
isWideningReference(arrayOf(_), class('java/lang/Object', L)) :-
loadedClass('java/lang/Object', L, ObjectClass),
classDefiningLoader(ObjectClass, BL),
isBootstrapLoader(BL).
A bug in the previous rules fails to treat array types as subtypes of class('java/lang/Object', L)
unless L
is the bootstrap loader. Since L
is the initiating loader, that rule failed to support the common case of java/lang/Object
being referenced outside of bootstrap classes.
The previous rules also fail to allow an array type to be treated as a subtype of an arbitrary interface type. In practice, it is possible to, say, pass an array as an argument to a method expecting a Runnable
. The earlier rules for interface types address this, making it unnecessary to single out Cloneable
and Serializable
for special treatment.
Subtyping between arrays of primitive type is the identity relation.
isJavaAssignable(arrayOf(X), arrayOf(Y)) :-
atom(X),
atom(Y),
X = Y.
Subtyping between arrays of reference type is covariant.
isJavaAssignable(arrayOf(X), arrayOf(Y)) :-
compound(X), compound(Y), isJavaAssignable(X, Y).
isWideningReference(arrayOf(X), arrayOf(Y)) :-
isWideningReference(X, Y).
The subtyping rule for arrays of primitive types is an identity conversion, not a widening; and it is already covered by the reflexive rule for isAssignable
.
The subtyping rule for arrays of reference types does not need to check that the inputs are reference types—if not, isWideningReference
will not succeed.
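The interaction between the reflexive rule and covariant array widening can be sketched in Python. This is only an illustration under assumptions: terms are encoded as strings or tuples, and `class_widens` is a hypothetical callback standing in for the non-array isWideningReference rules.

```python
def is_array(t):
    return isinstance(t, tuple) and t[0] == "arrayOf"


def is_widening_reference(frm, to, class_widens):
    """Covariant array widening; primitive components never match here."""
    if is_array(frm) and is_array(to):
        return is_widening_reference(frm[1], to[1], class_widens)
    if isinstance(frm, tuple) and isinstance(to, tuple) and not is_array(to):
        return class_widens(frm, to)  # class or array widened to a class type
    return False


def is_assignable(frm, to, class_widens):
    """Identity is handled by the reflexive rule, not by widening."""
    return frm == to or is_widening_reference(frm, to, class_widens)


OBJECT = ("class", "java/lang/Object", "L")
STRING = ("class", "java/lang/String", "L")


def toy_widens(frm, to):
    # Toy stand-in: every other reference type widens to Object, nothing else.
    return to == OBJECT and frm != to


print(is_assignable(("arrayOf", "int"), ("arrayOf", "int"), toy_widens))
print(is_assignable(("arrayOf", STRING), ("arrayOf", OBJECT), toy_widens))
print(is_assignable(("arrayOf", "int"), ("arrayOf", "long"), toy_widens))
```

No explicit check that the components are reference types is needed: a primitive component is a plain string, so the widening function simply fails on it.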
Subclassing is reflexive.
isJavaSubclassOf(class(SubclassName, L), class(SubclassName, L)).
isJavaSubclassOf(class(SubclassName, LSub), class(SuperclassName, LSuper)) :-
superclassChain(SubclassName, LSub, Chain),
member(class(SuperclassName, L), Chain),
loadedClass(SuperclassName, L, Sup),
loadedClass(SuperclassName, LSuper, Sup).
This relation is expressed directly with isWideningReference
, above. No need to introduce another predicate.
superclassChain(ClassName, L, [class(SuperclassName, Ls) | Rest]) :-
loadedClass(ClassName, L, Class),
classSuperClassName(Class, SuperclassName),
classDefiningLoader(Class, Ls),
superclassChain(SuperclassName, Ls, Rest).
superclassChain('java/lang/Object', L, []) :-
loadedClass('java/lang/Object', L, Class),
classDefiningLoader(Class, BL),
isBootstrapLoader(BL).
This predicate is moved to 4.10.1.1 and renamed loadedSuperclasses
.
Individual bytecode instructions are represented in Prolog as terms whose functor is the name of the instruction and whose arguments are its parsed operands.
For example, an aload instruction is represented as the term aload(N), which includes the index N that is the operand of the instruction.
The instructions as a whole are represented as a list of terms of the form:
instruction(Offset, AnInstruction)
For example,
instruction(21, aload(1))
.
The order of instructions in this list must be the same as in the class
file.
Some instructions have operands that refer to entries in the constant_pool table representing fields, methods, and dynamically-computed call sites. ~~Such entries are represented as functor applications of the form:~~ If the constant_pool index of an operand is invalid, or if the constant pool entry at that index does not have a supported form, as described below, the code attribute cannot be parsed, and verification will fail.
This assertion about operand well-formedness doesn't seem to have been made anywhere else. But, of course, if it's impossible to encode an instruction, the parseCodeAttribute
predicate can't succeed, nor can methodWithCodeIsTypeSafe
, etc.
The presentation below is restructured to be explicit about which constant pool forms are supported by which instructions.
Each checkcast, instanceof, anewarray, and multianewarray instruction must have an operand that refers to a CONSTANT_Class_info
constant pool entry (4.4.1). This entry is represented with a functor application, where the functor is the instruction name, and the operand is a class
or arrayOf
verification type (4.10.1.2) representing the referenced class, interface, or array type.
For example, a checkcast instruction whose operand refers to a constant pool entry representing the class String would be represented as checkcast(class('java/lang/String', L)), where L is the class loader of the class containing the instruction.
Each new instruction must have an operand that refers to a CONSTANT_Class_info
constant pool entry. This entry is represented with a functor application of the form new(ClassName)
, where ClassName
is the name of the referenced class. (The CONSTANT_Class_info
must not reference an array type.)
For example, a new instruction whose operand refers to a constant pool entry representing the class Object would be represented as new('java/lang/Object').
It's better to encode new
with a class name than a type, because new
is a special feature of classes. It doesn't work on arbitrary types.
~~field(FieldClassName, FieldName, FieldDescriptor) for a constant pool entry that is a CONSTANT_Fieldref_info structure (4.4.2).~~ Each getfield, putfield, getstatic, and putstatic instruction must have an operand that refers to a CONSTANT_Fieldref_info constant pool entry (4.4.2). This entry is represented with a functor application of the form field(FieldClassName, FieldName, FieldDescriptor).
FieldClassName
is the name of the class referenced by the class_index
item in the structure. FieldName
and FieldDescriptor
correspond to the name and field descriptor referenced by the name_and_type_index
item of the structure.
For example, a getfield instruction whose operand refers to a constant pool entry representing a field foo of type float in class Bar would be represented as getfield(field('Bar', 'foo', 'F')).
Each invokevirtual, invokeinterface, invokespecial, and invokestatic instruction must have an operand that refers to a constant pool entry that references a method. This entry is represented with a functor application, as follows:
method(MethodClassName, MethodName, MethodDescriptor)
for a constant pool entry that is a CONSTANT_Methodref_info
structure (4.4.2).
MethodClassName
is the name of the class referenced by the class_index
item of the structure. MethodName
and MethodDescriptor
correspond to the name and method descriptor referenced by the name_and_type_index
item of the structure.
imethod(MethodIntfName, MethodName, MethodDescriptor)
for a constant pool entry that is a CONSTANT_InterfaceMethodref_info
structure (4.4.2).
MethodIntfName
is the name of the interface referenced by the class_index
item of the structure. MethodName
and MethodDescriptor
correspond to the name and method descriptor referenced by the name_and_type_index
item of the structure.
For example, an invokevirtual instruction whose operand refers to a constant pool entry representing the hashCode method in class Object would be represented as invokevirtual(method('java/lang/Object', 'hashCode', '()I')).
~~dmethod(CallSiteName, MethodDescriptor) for a constant pool entry that is a CONSTANT_InvokeDynamic_info structure (4.4.10).~~ Each invokedynamic instruction must have an operand that refers to a CONSTANT_InvokeDynamic_info constant pool entry (4.4.10). This entry is represented with a functor application of the form dmethod(CallSiteName, MethodDescriptor).
CallSiteName
and MethodDescriptor
correspond to the name and method descriptor referenced by the name_and_type_index
item of the structure. (The bootstrap_method_attr_index
item is irrelevant to verification.)
For clarity, we assume that field and method descriptors (4.3.2, 4.3.3) are mapped into more readable names: the leading L
and trailing ;
are dropped from class names, and the BaseType characters used for primitive types are mapped to the names of those types.
The descriptor should always be processed with parseFieldDescriptor, so its format doesn't need to be specified.
For example, a getfield instruction whose operand refers to a constant pool entry representing a field foo of type F in class Bar would be represented as getfield(field('Bar', 'foo', 'F')).
~~The ldc instruction, among others, has~~ Each ldc, ldc_w, and ldc2_w instruction must have an operand that refers to a loadable entry in the constant_pool table. There are nine kinds of loadable entry (see Table 4.4-C), represented by functor applications of the following forms:
int(Value)
for a constant pool entry that is a CONSTANT_Integer_info
structure (4.4.4).
Value
is the int
constant represented by the bytes
item of the structure.
For example, an ldc instruction for loading the int constant 91 would be represented as ldc(int(91)).
float(Value)
for a constant pool entry that is a CONSTANT_Float_info
structure (4.4.4).
Value
is the float
constant represented by the bytes
item of the structure.
long(Value)
for a constant pool entry that is a CONSTANT_Long_info
structure (4.4.5).
Value
is the long
constant represented by the high_bytes
and low_bytes
items of the structure.
double(Value)
for a constant pool entry that is a CONSTANT_Double_info
structure (4.4.5).
Value
is the double
constant represented by the high_bytes
and low_bytes
items of the structure.
class(ClassName, Loader) or arrayOf(Component) for a constant pool entry that is a CONSTANT_Class_info structure (4.4.1).
~~ClassName is the name of the class or interface referenced by the name_index item in the structure.~~ The class and arrayOf functors are defined in 4.10.1.2.
string(Value)
for a constant pool entry that is a CONSTANT_String_info
structure (4.4.3).
Value
is the string referenced by the string_index
item of the structure.
methodHandle(Kind, Reference)
for a constant pool entry that is a CONSTANT_MethodHandle_info
structure (4.4.8).
Kind
is the value of the reference_kind
item of the structure. Reference
is the value of the reference_index
item of the structure.
methodType(MethodDescriptor)
for a constant pool entry that is a CONSTANT_MethodType_info
structure (4.4.9).
MethodDescriptor
is the method descriptor referenced by the descriptor_index
item of the structure.
dconstant(ConstantName, FieldDescriptor)
for a constant pool entry that is a CONSTANT_Dynamic_info
structure (4.4.10).
ConstantName
and FieldDescriptor
correspond to the name and field descriptor referenced by the name_and_type_index
item of the structure. (The bootstrap_method_attr_index
item is irrelevant to verification.)
Non-abstract, non-native methods are type correct if they have code and the code is type correct.
methodIsTypeSafe(Class, Method) :-
doesNotOverrideFinalMethod(Class, Method),
methodAccessFlags(Method, AccessFlags),
methodAttributes(Method, Attributes),
notMember(native, AccessFlags),
notMember(abstract, AccessFlags),
member(attribute('Code', _), Attributes),
methodWithCodeIsTypeSafe(Class, Method).
A method with code is type safe if it is possible to merge the code and the stack map frames into a single stream such that each stack map frame precedes the instruction it corresponds to, and the merged stream is type correct. The method's exception handlers, if any, must also be legal.
methodWithCodeIsTypeSafe(Class, Method) :-
parseCodeAttribute(Class, Method, FrameSize, MaxStack,
ParsedCode, Handlers, StackMap),
mergeStackMapAndCode(StackMap, ParsedCode, MergedCode),
methodInitialStackFrame(Class, Method, FrameSize, StackFrame, ReturnType),
Environment = environment(Class, Method, ReturnType, MergedCode,
MaxStack, Handlers),
handlersAreLegal(Environment),
mergedCodeIsTypeSafe(Environment, MergedCode, StackFrame).
Let us consider exception handlers first.
An exception handler is represented by a functor application of the form:
handler(Start, End, Target, ClassName)
whose arguments are, respectively, the start and end of the range of instructions covered by the handler, the first instruction of the handler code, and the name of the exception class that this handler is designed to handle.
An exception handler is legal if its start (Start) is less than its end (End), there exists an instruction whose offset is equal to Start, there exists an instruction whose offset equals End, and the handler's exception class is assignable to the class Throwable. The exception class of a handler is Throwable if the handler's class entry is 0, otherwise it is the class named in the handler.
An additional requirement exists for a handler inside an <init> method if one of the instructions covered by the handler is invokespecial of an <init> method. In this case, the fact that a handler is running means the object under construction is likely broken, so it is important that the handler does not swallow the exception and allow the enclosing <init> method to return normally to the caller. Accordingly, the handler is required to either complete abruptly by throwing an exception to the caller of the enclosing <init> method, or to loop forever.
handlersAreLegal(Environment) :-
exceptionHandlers(Environment, Handlers),
checklist(handlerIsLegal(Environment), Handlers).
handlerIsLegal(Environment, Handler) :-
Handler = handler(Start, End, Target, _),
Start < End,
allInstructions(Environment, Instructions),
member(instruction(Start, _), Instructions),
offsetStackFrame(Environment, Target, _),
instructionsIncludeEnd(Instructions, End),
currentClassLoader(Environment, CurrentLoader),
handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
isBootstrapLoader(BL),
isAssignable(ExceptionClass, class('java/lang/Throwable', BL)),
initHandlerIsLegal(Environment, Handler).
instructionsIncludeEnd(Instructions, End) :-
member(instruction(End, _), Instructions).
instructionsIncludeEnd(Instructions, End) :-
member(endOfCode(End), Instructions).
handlerExceptionClass(handler(_, _, _, 0),
class('java/lang/Throwable', BL), _) :-
isBootstrapLoader(BL).
handlerExceptionClass(handler(_, _, _, Name),
class(Name, L), L) :-
Name \= 0.
initHandlerIsLegal(Environment, Handler) :-
notInitHandler(Environment, Handler).
notInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isNotInit(Method).
notInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isInit(Method),
member(instruction(_, invokespecial(CP)), Instructions),
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>'.
initHandlerIsLegal(Environment, Handler) :-
isInitHandler(Environment, Handler),
sublist(isApplicableInstruction(Target), Instructions,
HandlerInstructions),
noAttemptToReturnNormally(HandlerInstructions).
isInitHandler(Environment, Handler) :-
Environment = environment(_Class, Method, _, Instructions, _, _),
isInit(Method),
member(instruction(_, invokespecial(CP)), Instructions),
CP = method(MethodClassName, '<init>', Descriptor).
isApplicableInstruction(HandlerStart, instruction(Offset, _)) :-
Offset >= HandlerStart.
noAttemptToReturnNormally(Instructions) :-
notMember(instruction(_, return), Instructions).
noAttemptToReturnNormally(Instructions) :-
member(instruction(_, athrow), Instructions).
Let us now turn to the stream of instructions and stack map frames.
Merging instructions and stack map frames into a single stream involves four cases:
Merging an empty StackMap
and a list of instructions yields the original list of instructions.
mergeStackMapAndCode([], CodeList, CodeList).
Given a list of stack map frames beginning with the type state for the instruction at Offset
, and a list of instructions beginning at Offset
, the merged list is the head of the stack map frame list, followed by the head of the instruction list, followed by the merge of the tails of the two lists.
mergeStackMapAndCode([stackMap(Offset, Map) | RestMap],
[instruction(Offset, Parse) | RestCode],
[stackMap(Offset, Map),
instruction(Offset, Parse) | RestMerge]) :-
mergeStackMapAndCode(RestMap, RestCode, RestMerge).
Otherwise, given a list of stack map frames beginning with the type state for the instruction at OffsetM
, and a list of instructions beginning at OffsetP
, then, if OffsetP < OffsetM
, the merged list consists of the head of the instruction list, followed by the merge of the stack map frame list and the tail of the instruction list.
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
[instruction(OffsetP, Parse) | RestCode],
[instruction(OffsetP, Parse) | RestMerge]) :-
OffsetP < OffsetM,
mergeStackMapAndCode([stackMap(OffsetM, Map) | RestMap],
RestCode, RestMerge).
Otherwise, the merge of the two lists is undefined. Since the instruction list has monotonically increasing offsets, the merge of the two lists is not defined unless every stack map frame offset has a corresponding instruction offset and the stack map frames are in monotonically increasing order.
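The four merge cases can be sketched as a single Python loop. This is an illustrative paraphrase, not the specification's definition: the tuple encodings and the function name are assumptions, the trailing endOfCode marker is not modelled, and the "undefined" case is reported by raising an exception.

```python
def merge_stack_map_and_code(stack_map, code):
    """Merge (offset, frame) pairs into (offset, instruction) pairs.

    Both lists are assumed to be ordered by offset; the instruction offsets
    are strictly increasing.
    """
    merged = []
    next_frame = 0
    for offset, instruction in code:
        if next_frame < len(stack_map):
            frame_offset, frame = stack_map[next_frame]
            if frame_offset == offset:
                # The frame precedes the instruction it describes.
                merged.append(("stackMap", offset, frame))
                next_frame += 1
            elif frame_offset < offset:
                # The frame names an offset with no instruction, or is out of order.
                raise ValueError(f"no instruction at stack map offset {frame_offset}")
        merged.append(("instruction", offset, instruction))
    if next_frame < len(stack_map):
        raise ValueError("stack map frame beyond the last instruction")
    return merged


code = [(0, "aload_0"), (1, "ifnull"), (4, "return")]
print(merge_stack_map_and_code([(4, "F")], code))
```

An empty stack map falls out naturally: the loop never enters the frame branch and the result is the instruction list with each entry tagged.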
To determine if the merged stream for a method is type correct, we first infer the method's initial type state.
The initial type state of a method consists of an empty operand stack and local variable types derived from the type of this
and the arguments, as well as the appropriate flag, depending on whether this is an <init>
method.
methodInitialStackFrame(Class, Method, FrameSize, frame(Locals, [], Flags),
ReturnType):-
methodDescriptor(Method, Descriptor),
parseMethodDescriptor(Descriptor, RawArgs, ReturnType),
expandTypeList(RawArgs, Args),
methodInitialThisType(Class, Method, ThisList),
flags(ThisList, Flags),
append(ThisList, Args, ThisArgs),
expandToLength(ThisArgs, FrameSize, top, Locals).
Given a list of types, the following clause produces a list where every type of size 2 has been substituted by two entries: one for itself, and one top
entry. The result then corresponds to the representation of the list as 32-bit words in the Java Virtual Machine.
expandTypeList([], []).
expandTypeList([Item | List], [Item | Result]) :-
sizeOf(Item, 1),
expandTypeList(List, Result).
expandTypeList([Item | List], [Item, top | Result]) :-
sizeOf(Item, 2),
expandTypeList(List, Result).
flags([uninitializedThis], [flagThisUninit]).
flags(X, []) :- X \= [uninitializedThis].
expandToLength(List, Size, _Filler, List) :-
length(List, Size).
expandToLength(List, Size, Filler, Result) :-
length(List, ListLength),
ListLength < Size,
Delta is Size - ListLength,
length(Extra, Delta),
checklist(=(Filler), Extra),
append(List, Extra, Result).
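The two list helpers can be illustrated in Python. This is a minimal sketch under assumptions: types are plain strings, only long and double are treated as size 2 (standing in for the sizeOf predicate), and the function names are hypothetical.

```python
TWO_WORD_TYPES = ("long", "double")  # assumed stand-in for sizeOf(Item, 2)


def expand_type_list(types):
    """Follow every two-word type with a top entry."""
    result = []
    for t in types:
        result.append(t)
        if t in TWO_WORD_TYPES:
            result.append("top")
    return result


def expand_to_length(items, size, filler="top"):
    """Pad items with filler up to size; fail if items is already longer."""
    if len(items) > size:
        raise ValueError("list is longer than the requested length")
    return items + [filler] * (size - len(items))


args = expand_type_list(["int", "long", "float"])
print(expand_to_length(args, 6))
```

For a method with arguments int, long, float and a frame of six local variables, the long occupies two slots and the remaining slot is filled with top.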
For the initial type state of an instance method, we compute the type of this
and put it in a list. The type of this
in the <init>
method of Object
is Object
; in other <init>
methods, the type of this
is uninitializedThis
; otherwise, the type of this
in an instance method is class(N, L)
where N
is the name of the class containing the method and L
is its defining class loader.
For the initial type state of a static method, this
is irrelevant, so the list is empty.
methodInitialThisType(_Class, Method, []) :-
methodAccessFlags(Method, AccessFlags),
member(static, AccessFlags),
methodName(Method, MethodName),
MethodName \= '<init>'.
methodInitialThisType(Class, Method, [This]) :-
methodAccessFlags(Method, AccessFlags),
notMember(static, AccessFlags),
instanceMethodInitialThisType(Class, Method, This).
instanceMethodInitialThisType(Class, Method, class('java/lang/Object', L)) :-
methodName(Method, '<init>'),
classDefiningLoader(Class, L),
isBootstrapLoader(L),
classClassName(Class, 'java/lang/Object').
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
methodName(Method, '<init>'),
classClassName(Class, ClassName),
classDefiningLoader(Class, CurrentLoader),
superclassChain(ClassName, CurrentLoader, Chain),
Chain \= [].
instanceMethodInitialThisType(Class, Method, uninitializedThis) :-
methodName(Method, '<init>'),
loadedSuperclasses(Class, Supers),
Supers \= [].
instanceMethodInitialThisType(Class, Method, class(ClassName, L)) :-
methodName(Method, MethodName),
MethodName \= '<init>',
classDefiningLoader(Class, L),
classClassName(Class, ClassName).
We now compute whether the merged stream for a method is type correct, using the method's initial type state:
If we have a stack map frame and an incoming type state, the type state must be assignable to the one in the stack map frame. We may then proceed to type check the rest of the stream with the type state given in the stack map frame.
mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode],
frame(Locals, OperandStack, Flags)) :-
frameIsAssignable(frame(Locals, OperandStack, Flags), MapFrame),
mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).
A merged code stream is type safe relative to an incoming type state T
if it begins with an instruction I
that is type safe relative to T
, and I
satisfies its exception handlers (see below), and the tail of the stream is type safe given the type state following that execution of I
.
NextStackFrame
indicates what falls through to the following instruction. For an unconditional branch instruction, it will have the special value afterGoto
. ExceptionStackFrame
indicates what is passed to exception handlers.
mergedCodeIsTypeSafe(Environment, [instruction(Offset, Parse) | MoreCode],
frame(Locals, OperandStack, Flags)) :-
instructionIsTypeSafe(Parse, Environment, Offset,
frame(Locals, OperandStack, Flags),
NextStackFrame, ExceptionStackFrame),
instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame),
mergedCodeIsTypeSafe(Environment, MoreCode, NextStackFrame).
After an unconditional branch (indicated by an incoming type state of afterGoto
), if we have a stack map frame giving the type state for the following instructions, we can proceed and type check them using the type state provided by the stack map frame.
mergedCodeIsTypeSafe(Environment, [stackMap(Offset, MapFrame) | MoreCode],
afterGoto) :-
mergedCodeIsTypeSafe(Environment, MoreCode, MapFrame).
It is illegal to have code after an unconditional branch without a stack map frame being provided for it.
mergedCodeIsTypeSafe(_Environment, [instruction(_, _) | _MoreCode],
afterGoto) :-
write_ln('No stack frame after unconditional branch'),
fail.
If we have an unconditional branch at the end of the code, stop.
mergedCodeIsTypeSafe(_Environment, [endOfCode(Offset)],
afterGoto).
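The four stream cases above can be exercised with a deliberately simplified model. The following is a sketch only, assuming SWI-Prolog: the type state is reduced to a bare operand stack, assignability is plain equality, branch targets and exception handlers are not checked, and the names toySafe, toyStep, and toyAssignable are hypothetical and are not part of the verifier.

```prolog
% Toy type state: state(Stack) or the special value afterGoto.
% Assumption: assignability is plain equality in this sketch.
toyAssignable(State, State).

% Hypothetical instruction effects on the toy type state.
toyStep(push(Type), state(Stack), state([Type | Stack])).
toyStep(pop, state([_ | Stack]), state(Stack)).
toyStep(goto(_Target), state(_), afterGoto).
toyStep(return, state(_), afterGoto).

% Stack map frame with an incoming type state: check, then continue with the map.
toySafe([stackMap(Map) | MoreCode], state(Stack)) :-
    toyAssignable(state(Stack), Map),
    toySafe(MoreCode, Map).
% Ordinary instruction: check it, then continue with its outgoing state.
toySafe([instruction(I) | MoreCode], state(Stack)) :-
    toyStep(I, state(Stack), Next),
    toySafe(MoreCode, Next).
% After an unconditional branch, a stack map frame restarts checking.
toySafe([stackMap(Map) | MoreCode], afterGoto) :-
    toySafe(MoreCode, Map).
% An unconditional branch at the end of the code: stop.
toySafe([endOfCode], afterGoto).
% There is no clause for an instruction with incoming state afterGoto,
% so code that follows a branch without a stack map frame is rejected.
```

For instance, the goal toySafe([instruction(push(int)), instruction(goto(9)), stackMap(state([])), instruction(return), endOfCode], state([])) succeeds, whereas toySafe([instruction(goto(1)), instruction(pop), endOfCode], state([int])) fails because no stack map frame follows the branch.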
Branching to a target is type safe if the target has an associated stack frame, Frame
, and the current stack frame, StackFrame
, is assignable to Frame
.
targetIsTypeSafe(Environment, StackFrame, Target) :-
offsetStackFrame(Environment, Target, Frame),
frameIsAssignable(StackFrame, Frame).
An instruction satisfies its exception handlers if it satisfies every exception handler that is applicable to the instruction.
instructionSatisfiesHandlers(Environment, Offset, ExceptionStackFrame) :-
exceptionHandlers(Environment, Handlers),
sublist(isApplicableHandler(Offset), Handlers, ApplicableHandlers),
checklist(instructionSatisfiesHandler(Environment, ExceptionStackFrame),
ApplicableHandlers).
An exception handler is applicable to an instruction if the offset of the instruction is greater than or equal to the start of the handler's range and less than the end of the handler's range.
isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
Offset >= Start,
Offset < End.
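The range test is half-open: the start offset is covered and the end offset is not. A small sketch, assuming SWI-Prolog, makes the boundary behavior visible. The handler table exampleHandlers and the helper applicableAt are hypothetical, and include/3 from library(apply) stands in for the sublist/3 call used above.

```prolog
:- use_module(library(apply)).

% Same rule as in the text, repeated so that this file loads on its own.
isApplicableHandler(Offset, handler(Start, End, _Target, _ClassName)) :-
    Offset >= Start,
    Offset < End.

% Hypothetical handler table: two adjacent ranges, [0,10) and [10,20).
exampleHandlers([
    handler(0, 10, 40, 'java/lang/Exception'),
    handler(10, 20, 50, 'java/io/IOException')
]).

% Collect the handlers that cover the instruction at Offset.
applicableAt(Offset, Applicable) :-
    exampleHandlers(Handlers),
    include(isApplicableHandler(Offset), Handlers, Applicable).
```

Under these assumptions, offset 9 is covered only by the first handler, offset 10 only by the second, and offset 20 by neither.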
An instruction satisfies an exception handler if the instruction's outgoing type state is ExcStackFrame
, and the handler's target (the initial instruction of the handler code) is type safe assuming an incoming type state T
. The type state T
is derived from ExcStackFrame
by replacing the operand stack with a stack whose sole element is the handler's exception class.
instructionSatisfiesHandler(Environment, ExcStackFrame, Handler) :-
Handler = handler(_, _, Target, _),
currentClassLoader(Environment, CurrentLoader),
handlerExceptionClass(Handler, ExceptionClass, CurrentLoader),
/* The stack consists of just the exception. */
ExcStackFrame = frame(Locals, _, Flags),
TrueExcStackFrame = frame(Locals, [ ExceptionClass ], Flags),
operandStackHasLegalLength(Environment, TrueExcStackFrame),
targetIsTypeSafe(Environment, TrueExcStackFrame, Target).
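The derivation of the handler's incoming type state can be isolated in a one-clause sketch. The predicate name handlerEntryFrame and the loader atom boot are hypothetical; the clause only mirrors the two unifications on ExcStackFrame and TrueExcStackFrame above and performs none of the other checks.

```prolog
% Keep the locals and flags, discard the operand stack, and leave the
% exception class as the sole stack element.
handlerEntryFrame(frame(Locals, _OperandStack, Flags), ExceptionClass,
                  frame(Locals, [ExceptionClass], Flags)).

% Example query:
% ?- handlerEntryFrame(frame([int, top], [float, int], []),
%                      class('java/lang/Exception', boot), Frame).
% Frame = frame([int, top], [class('java/lang/Exception', boot)], []).
```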
protected
Members
All instructions that access members must contend with the rules concerning protected
members. This section describes the protected
check that corresponds to JLS §6.6.2.1.
The protected
check applies only to protected
members of superclasses of the current class. protected
members in other classes will be caught by the access checking done at resolution (5.4.4). There are four cases:
If the name of a class is not the name of any superclass, it cannot be a superclass, and so it can safely be ignored.
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor, StackFrame) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
notMember(class(MemberClassName, _), Chain).
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor, StackFrame) :-
thisClass(Environment, CurrentClass),
\+ hasSuperclassWithName(CurrentClass, MemberClassName).
hasSuperclassWithName(Class, SuperclassName) :-
loadedSuperclasses(Class, Supers),
member(Super, Supers),
classClassName(Super, SuperclassName).
If the MemberClassName
is the same as the name of a superclass, the class being resolved may indeed be a superclass. In this case, if no superclass named MemberClassName
in a different run-time package has a protected
member named MemberName
with descriptor MemberDescriptor
, the protected
check does not apply.
This is because the actual class being resolved will either be one of these superclasses, in which case we know that it is either in the same run-time package, and the access is legal; or the member in question is not
protected
and the check does not apply; or it will be a subclass, in which case the check would succeed anyway; or it will be some other class in the same run-time package, in which case the access is legal and the check need not take place; or the verifier need not flag this as a problem, since it will be caught anyway because resolution will perforce fail.
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor, StackFrame) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
member(class(MemberClassName, _), Chain),
classesInOtherPkgWithProtectedMember(
class(CurrentClassName, CurrentLoader),
MemberName, MemberDescriptor, MemberClassName, Chain, []).
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor, StackFrame) :-
thisClass(Environment, CurrentClass),
hasSuperclassWithName(CurrentClass, MemberClassName),
loadedSuperclasses(CurrentClass, Chain),
member(class(MemberClassName, _), Chain),
classesInOtherPkgWithProtectedMember(
CurrentClass, MemberName, MemberDescriptor, MemberClassName,
Chain, []).
If there does exist a protected
superclass member in a different run-time package, then load MemberClassName
; if the member in question is not protected
, the check does not apply. (Using a superclass member that is not protected
is trivially correct.)
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor,
frame(_Locals, [Target | Rest], _Flags)) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
member(class(MemberClassName, _), Chain),
classesInOtherPkgWithProtectedMember(
class(CurrentClassName, CurrentLoader),
MemberName, MemberDescriptor, MemberClassName, Chain, List),
List \= [],
loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
isNotProtected(ReferencedClass, MemberName, MemberDescriptor).
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor,
frame(_Locals, [Target | Rest], _Flags)) :-
thisClass(Environment, CurrentClass),
hasSuperclassWithName(CurrentClass, MemberClassName),
loadedSuperclasses(CurrentClass, Chain),
classesInOtherPkgWithProtectedMember(
CurrentClass, MemberName, MemberDescriptor, MemberClassName,
Chain, List),
List \= [],
currentClassLoader(Environment, CurrentLoader),
loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
isNotProtected(ReferencedClass, MemberName, MemberDescriptor).
Otherwise, use of a member of an object of type Target
requires that Target
be assignable to the type of the current class.
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor,
frame(_Locals, [Target | Rest], _Flags)) :-
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
superclassChain(CurrentClassName, CurrentLoader, Chain),
member(class(MemberClassName, _), Chain),
classesInOtherPkgWithProtectedMember(
class(CurrentClassName, CurrentLoader),
MemberName, MemberDescriptor, MemberClassName, Chain, List),
List \= [],
loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
isProtected(ReferencedClass, MemberName, MemberDescriptor),
isAssignable(Target, class(CurrentClassName, CurrentLoader)).
passesProtectedCheck(Environment, MemberClassName, MemberName,
MemberDescriptor,
frame(_Locals, [Target | Rest], _Flags)) :-
thisClass(Environment, CurrentClass),
currentClassLoader(Environment, CurrentLoader),
hasSuperclassWithName(CurrentClass, MemberClassName),
loadedSuperclasses(CurrentClass, Chain),
classesInOtherPkgWithProtectedMember(
CurrentClass, MemberName, MemberDescriptor, MemberClassName,
Chain, List),
List \= [],
loadedClass(MemberClassName, CurrentLoader, ReferencedClass),
isProtected(ReferencedClass, MemberName, MemberDescriptor),
thisType(Environment, ThisType),
isAssignable(Target, ThisType).
The predicate classesInOtherPkgWithProtectedMember(Class, MemberName, MemberDescriptor, MemberClassName, Chain, List)
is true if List
is the set of classes in Chain
with name MemberClassName
that are in a different run-time package than Class
which have a protected
member named MemberName
with descriptor MemberDescriptor
.
classesInOtherPkgWithProtectedMember(_, _, _, _, [], []).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
[class(MemberClassName, L) | T]) :-
differentRuntimePackage(Class, class(MemberClassName, L)),
loadedClass(MemberClassName, L, Super),
isProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail],
[Super | T]) :-
classClassName(Super, MemberClassName),
differentRuntimePackage(Class, Super),
isProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
T) :-
differentRuntimePackage(Class, class(MemberClassName, L)),
loadedClass(MemberClassName, L, Super),
isNotProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail],
T) :-
classClassName(Super, MemberClassName),
differentRuntimePackage(Class, Super),
isNotProtected(Super, MemberName, MemberDescriptor),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[class(MemberClassName, L) | Tail],
T) :-
sameRuntimePackage(Class, class(MemberClassName, L)),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
classesInOtherPkgWithProtectedMember(Class, MemberName,
MemberDescriptor, MemberClassName,
[Super | Tail],
T) :-
classClassName(Super, MemberClassName),
sameRuntimePackage(Class, Super),
classesInOtherPkgWithProtectedMember(
Class, MemberName, MemberDescriptor, MemberClassName, Tail, T).
sameRuntimePackage(Class1, Class2) :-
classDefiningLoader(Class1, L),
classDefiningLoader(Class2, L),
samePackageName(Class1, Class2).
differentRuntimePackage(Class1, Class2) :-
classDefiningLoader(Class1, L1),
classDefiningLoader(Class2, L2),
L1 \= L2.
differentRuntimePackage(Class1, Class2) :-
differentPackageName(Class1, Class2).
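The two run-time package predicates can be tried out against a concrete class representation. The following is a sketch only, assuming SWI-Prolog: the term loaded(BinaryName, DefiningLoader), the helper packageName, and the definitions of samePackageName and differentPackageName are assumptions made for illustration, since this section leaves those predicates and the representation of loaded classes unspecified.

```prolog
:- use_module(library(lists)).

% Hypothetical representation of a loaded class.
classDefiningLoader(loaded(_Name, Loader), Loader).
classClassName(loaded(Name, _Loader), Name).

% Assumption: the package name is the binary name without its last
% '/'-separated segment (uses SWI-Prolog's atomic_list_concat/3).
packageName(Class, Package) :-
    classClassName(Class, Name),
    atomic_list_concat(Segments, '/', Name),
    append(PackageSegments, [_SimpleName], Segments),
    atomic_list_concat(PackageSegments, '/', Package).

samePackageName(Class1, Class2) :-
    packageName(Class1, Package),
    packageName(Class2, Package).

differentPackageName(Class1, Class2) :-
    packageName(Class1, Package1),
    packageName(Class2, Package2),
    Package1 \= Package2.

% The rules from the text, repeated so that this file loads on its own.
sameRuntimePackage(Class1, Class2) :-
    classDefiningLoader(Class1, L),
    classDefiningLoader(Class2, L),
    samePackageName(Class1, Class2).

differentRuntimePackage(Class1, Class2) :-
    classDefiningLoader(Class1, L1),
    classDefiningLoader(Class2, L2),
    L1 \= L2.
differentRuntimePackage(Class1, Class2) :-
    differentPackageName(Class1, Class2).
```

Under these assumptions, loaded('lib/A', l1) and loaded('lib/B', l1) share a run-time package, while loaded('lib/A', l1) and loaded('lib/B', l2) do not, even though their package names are equal.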
An anewarray instruction with operand CP
is type safe iff CP
refers to a constant pool entry denoting a class, an interface, or an array type, and one can legally replace a type matching int
on the incoming operand stack with an array with component type CP
yielding the outgoing type state.
instructionIsTypeSafe(anewarray(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = class(_, _) ; CP = arrayOf(_)),
validTypeTransition(Environment, [int], arrayOf(CP),
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(anewarray(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
validTypeTransition(Environment, [int], arrayOf(CP),
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
This instruction works on all valid CONSTANT_Class_info
structures. There's no need to make assertions about CP
.
A checkcast instruction with operand CP
is type safe iff CP
refers to a constant pool entry denoting ~~either a class or an array~~ a class, an interface, or an array type, and one can validly replace the type Object
on top of the incoming operand stack with the type denoted by CP
yielding the outgoing type state.
instructionIsTypeSafe(checkcast(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = class(_, _) ; CP = arrayOf(_)),
isBootstrapLoader(BL),
validTypeTransition(Environment, [class('java/lang/Object', BL)], CP,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(checkcast(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
isBootstrapLoader(BL),
validTypeTransition(Environment, [class('java/lang/Object', BL)], CP,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
A getfield instruction with operand CP
is type safe iff CP
refers to a constant pool entry denoting a field whose declared type is FieldType
, declared in a class FieldClassName
, and one can validly replace a type matching FieldClassName
with type FieldType
on the incoming operand stack yielding the outgoing type state. ~~ FieldClassName
must not be an array type.~~ protected
fields are subject to additional checks (4.10.1.8).
instructionIsTypeSafe(getfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
passesProtectedCheck(Environment, FieldClassName, FieldName,
FieldDescriptor, StackFrame),
currentClassLoader(Environment, CurrentLoader),
validTypeTransition(Environment,
[class(FieldClassName, CurrentLoader)], FieldType,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
An instanceof instruction with operand CP
is type safe iff CP
refers to a constant pool entry denoting ~~either a class or an array~~ a class, an interface, or an array type, and one can validly replace the type Object
on top of the incoming operand stack with type int
yielding the outgoing type state.
instructionIsTypeSafe(instanceof(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
(CP = class(_, _) ; CP = arrayOf(_)),
isBootstrapLoader(BL),
validTypeTransition(Environment, [class('java/lang/Object', BL)], int,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(instanceof(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
isBootstrapLoader(BL),
validTypeTransition(Environment, [class('java/lang/Object', BL)], int,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
An invokespecial instruction is type safe iff all of the following are true:
Its first operand, CP
, refers to a constant pool entry denoting a method named MethodName
with descriptor Descriptor
that is a member of a class MethodClassName
.
Either:
MethodName
is not <init>
.
MethodName
is not <clinit>
.
One can validly replace types matching the current class and the argument types given in Descriptor
on the incoming operand stack with the return type given in Descriptor
, yielding the outgoing type state.
One can validly replace types matching the class MethodClassName
and the argument types given in Descriptor
on the incoming operand stack with the return type given in Descriptor
.
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisClass(Environment, class(CurrentClassName, CurrentLoader)),
isAssignable(class(CurrentClassName, CurrentLoader),
class(MethodClassName, CurrentLoader)),
reverse([class(CurrentClassName, CurrentLoader) | OperandArgList],
StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList2),
validTypeTransition(Environment, StackArgList2, ReturnType,
StackFrame, _ResultStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, MethodName, Descriptor),
MethodName \= '<init>',
MethodName \= '<clinit>',
parseMethodDescriptor(Descriptor, OperandArgList, ReturnType),
thisType(Environment, ThisType),
currentClassLoader(Environment, CurrentLoader),
isAssignable(ThisType, class(MethodClassName, CurrentLoader)),
reverse([ThisType | OperandArgList], StackArgList),
validTypeTransition(Environment, StackArgList, ReturnType,
StackFrame, NextStackFrame),
reverse([class(MethodClassName, CurrentLoader) | OperandArgList],
StackArgList2),
validTypeTransition(Environment, StackArgList2, ReturnType,
StackFrame, _ResultStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The
isAssignable
clause enforces the structural constraint that invokespecial, for other than an instance initialization method, must name a method in the current class/interface or a superclass/superinterface.
The first
validTypeTransition
clause enforces the structural constraint that invokespecial, for other than an instance initialization method, targets a receiver object of the current class or deeper. To see why, consider that StackArgList
simulates the list of types on the operand stack expected by the method, starting with the current class (the class performing invokespecial). The actual types on the operand stack are in StackFrame
. The effect of validTypeTransition
is to pop the first type from the operand stack in StackFrame
and check it is a subtype of the first term of StackArgList
, namely the current class. Thus, the actual receiver type is compatible with the current class.
A sharp-eyed reader might notice that enforcing this structural constraint supersedes the structural constraint pertaining to invokespecial of a
protected
method. Thus, the Prolog code above makes no reference to passesProtectedCheck
(4.10.1.8), whereas the Prolog code for invokespecial of an instance initialization method uses passesProtectedCheck
to ensure the actual receiver type is compatible with the current class when certain protected
instance initialization methods are named.
The second
validTypeTransition
clause enforces the structural constraint that any method invocation instruction must target a receiver object whose type is compatible with the type named by the instruction. To see why, consider that StackArgList2
simulates the list of types on the operand stack expected by the method, starting with the type named by the instruction. Again, the actual types on the operand stack are in StackFrame
, and the effect of validTypeTransition
is to check the actual receiver type in StackFrame
is compatible with the type named by the instruction in StackArgList2
.
Or:
MethodName is <init>
.
Descriptor
specifies a void
return type.
One can validly pop types matching the argument types given in Descriptor
and an uninitialized type, UninitializedArg
, off the incoming operand stack, yielding OperandStack
.
The outgoing type state is derived from the incoming type state by first replacing the incoming operand stack with OperandStack
and then replacing all instances of UninitializedArg
with the type of instance being initialized.
If the instruction calls an instance initialization method on a class instance created by an earlier new instruction, and the method is protected
, the usage conforms to the special rules governing access to protected
members (4.10.1.8).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '<init>', Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitializedThis | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitializedThis, Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitializedThis, Flags, NextFlags),
substitute(uninitializedThis, This, OperandStack, NextOperandStack),
substitute(uninitializedThis, This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags).
instructionIsTypeSafe(invokespecial(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = method(MethodClassName, '<init>', Descriptor),
parseMethodDescriptor(Descriptor, OperandArgList, void),
reverse(OperandArgList, StackArgList),
canPop(StackFrame, StackArgList, TempFrame),
TempFrame = frame(Locals, [uninitialized(Address) | OperandStack], Flags),
currentClassLoader(Environment, CurrentLoader),
rewrittenUninitializedType(uninitialized(Address), Environment,
class(MethodClassName, CurrentLoader), This),
rewrittenInitializationFlags(uninitialized(Address), Flags, NextFlags),
substitute(uninitialized(Address), This, OperandStack, NextOperandStack),
substitute(uninitialized(Address), This, Locals, NextLocals),
NextStackFrame = frame(NextLocals, NextOperandStack, NextFlags),
ExceptionStackFrame = frame(Locals, [], Flags),
passesProtectedCheck(Environment, MethodClassName, '<init>',
Descriptor, NextStackFrame).
To compute what type the uninitialized argument's type needs to be rewritten to, there are two cases:
If we are initializing an object within its constructor, its type is initially uninitializedThis
. This type will be rewritten to the type of the class of the <init>
method.
The second case arises from initialization of an object created by new. The uninitialized arg type is rewritten to MethodClass
, the type of the method holder of <init>
. We check whether there really is a new instruction at Address
.
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, MethodClass).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClassType, MethodClassType) :-
thisType(Environment, MethodClassType).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClass, MethodClass) :-
MethodClass = class(MethodClassName, CurrentLoader),
thisClass(Environment, class(ThisClassName, ThisLoader)),
superclassChain(ThisClassName, ThisLoader, [MethodClass | Rest]).
rewrittenUninitializedType(uninitializedThis, Environment,
MethodClassType, MethodClassType) :-
MethodClassType = class(MethodClassName, _),
thisClass(Environment, ThisClass),
classSuperClassName(ThisClass, MethodClassName).
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClass, MethodClass) :-
allInstructions(Environment, Instructions),
member(instruction(Address, new(MethodClass)), Instructions).
rewrittenUninitializedType(uninitialized(Address), Environment,
MethodClassType, MethodClassType) :-
MethodClassType = class(MethodClassName, _),
allInstructions(Environment, Instructions),
member(instruction(Address, new(MethodClassName)), Instructions).
rewrittenInitializationFlags(uninitializedThis, _Flags, []).
rewrittenInitializationFlags(uninitialized(_), Flags, Flags).
substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
From1 \= Old,
substitute(Old, New, FromRest, ToRest).
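A worked query shows how substitute and rewrittenInitializationFlags reshape a frame once an initializer returns. This is a sketch only: the two predicates are repeated from above so that the file loads on its own, while the class name 'Point', the loader atom app, and the offsets 4 and 7 are made up for illustration.

```prolog
substitute(_Old, _New, [], []).
substitute(Old, New, [Old | FromRest], [New | ToRest]) :-
    substitute(Old, New, FromRest, ToRest).
substitute(Old, New, [From1 | FromRest], [From1 | ToRest]) :-
    From1 \= Old,
    substitute(Old, New, FromRest, ToRest).

rewrittenInitializationFlags(uninitializedThis, _Flags, []).
rewrittenInitializationFlags(uninitialized(_), Flags, Flags).

% Every copy of the object created at offset 4 becomes initialized;
% the object created at offset 7 is left untouched.
% ?- substitute(uninitialized(4), class('Point', app),
%               [uninitialized(4), int, uninitialized(7), uninitialized(4)],
%               Locals).
% Locals = [class('Point', app), int, uninitialized(7), class('Point', app)].
```

Only the uninitializedThis case clears the flags; for an object created by new, the flags pass through unchanged.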
The rule for invokespecial of an
<init>
method is the sole motivation for passing back a distinct exception stack frame. The concern is that when initializing an object within its constructor, invokespecial can cause a superclass <init>
method to be invoked, and that invocation could fail, leaving this
uninitialized. This situation cannot be created using source code in the Java programming language, but can be created by programming in bytecode directly.
In this situation, the original frame holds an uninitialized object in local variable 0 and has flag
flagThisUninit
. Normal termination of invokespecial initializes the uninitialized object and turns off the flagThisUninit
flag. But if the invocation of an <init>
method throws an exception, the uninitialized object might be left in a partially initialized state, and needs to be made permanently unusable. This is represented by an exception frame containing the broken object (the new value of the local) and the flagThisUninit
flag (the old flag). There is no way to get from an apparently-initialized object bearing the flagThisUninit
flag to a properly initialized object, so the object is permanently unusable.
If not for this situation, the flags of the exception stack frame would always be the same as the flags of the input stack frame.
An ldc instruction with operand CP
is type safe iff CP
refers to a constant pool entry denoting an entity of type Type
, where Type
is loadable (4.4), but not long
or double
, and one can validly push Type
onto the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(ldc(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
loadableConstant(CP, Type),
Type \= long,
Type \= double,
validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
loadableConstant(CP, Type) :-
member([CP, Type], [
[int(_), int],
[float(_), float],
[long(_), long],
[double(_), double]
]).
loadableConstant(CP, Type) :-
isBootstrapLoader(BL),
member([CP, Type], [
[class(_), class('java/lang/Class', BL)],
[string(_), class('java/lang/String', BL)],
[methodHandle(_,_), class('java/lang/invoke/MethodHandle', BL)],
[methodType(_,_), class('java/lang/invoke/MethodType', BL)]
]).
loadableConstant(CP, Type) :-
isBootstrapLoader(BL),
member([CP, Type], [
[class(_,_), class('java/lang/Class', BL)],
[arrayOf(_), class('java/lang/Class', BL)],
[string(_), class('java/lang/String', BL)],
[methodHandle(_,_), class('java/lang/invoke/MethodHandle', BL)],
[methodType(_,_), class('java/lang/invoke/MethodType', BL)]
]).
loadableConstant(CP, Type) :-
CP = dconstant(_, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, Type).
An ldc_w instruction is type safe iff the equivalent ldc instruction is type safe.
instructionHasEquivalentTypeRule(ldc_w(CP), ldc(CP)).
An ldc2_w instruction with operand CP
is type safe iff CP
refers to a constant pool entry denoting an entity of type Type
, where Type
is either long
or double
, and one can validly push Type
onto the incoming operand stack yielding the outgoing type state.
instructionIsTypeSafe(ldc2_w(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
loadableConstant(CP, Type),
(Type = long ; Type = double),
validTypeTransition(Environment, [], Type, StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
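The ldc and ldc2_w rules partition the loadable constants by the category of their type. A reduced sketch of that partition follows. The predicates ldcAccepts and ldc2wAccepts are hypothetical names, the constant table is cut down to five entries, the loader atom boot is a placeholder, and the operand stack transition is omitted entirely.

```prolog
% Cut-down table of loadable constants (illustrative subset only).
loadableConstant(int(_), int).
loadableConstant(float(_), float).
loadableConstant(long(_), long).
loadableConstant(double(_), double).
loadableConstant(string(_), class('java/lang/String', boot)).

% ldc and ldc_w: any loadable constant except long and double.
ldcAccepts(CP) :-
    loadableConstant(CP, Type),
    Type \= long,
    Type \= double.

% ldc2_w: only long and double.
ldc2wAccepts(CP) :-
    loadableConstant(CP, Type),
    ( Type = long ; Type = double ).
```

In this sketch ldcAccepts(int(1)) and ldc2wAccepts(double(1.0)) succeed, while ldcAccepts(long(1)) and ldc2wAccepts(string(s)) fail.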
A multianewarray instruction with operands CP
and Dim
is type safe iff CP
refers to a constant pool entry denoting an array type whose dimension is greater than or equal to Dim
, Dim
is strictly positive, and one can validly replace Dim
int
types on the incoming operand stack with the type denoted by CP
yielding the outgoing type state.
instructionIsTypeSafe(multianewarray(CP, Dim), Environment, _Offset,
StackFrame, NextStackFrame, ExceptionStackFrame) :-
CP = arrayOf(_),
classDimension(CP, Dimension),
Dimension >= Dim,
Dim > 0,
/* Make a list of Dim ints */
findall(int, between(1, Dim, _), IntList),
validTypeTransition(Environment, IntList, CP,
StackFrame, NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The ~~dimension~~ dimensions of an array type whose component type is also an array type is one more than the ~~dimension~~ dimensions of its component type.
classDimension(arrayOf(X), Dimension) :-
classDimension(X, Dimension1),
Dimension is Dimension1 + 1.
classDimension(_, Dimension) :-
Dimension = 0.
arrayDimensions(arrayOf(X), XDimensions + 1) :-
arrayDimensions(X, XDimensions).
arrayDimensions(Type, 0) :-
Type \= arrayOf(_).
Renamed this predicate, since the element type is not necessarily a class type. Also addressed a bug: the second rule previously would match array types as well as non-array types.
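The dimension count can be checked with a short sketch. The variant below is an assumption made for illustration rather than the rule given above: it evaluates the count eagerly with is/2 and binds the count of a non-array type to 0.

```prolog
% An array type has one more dimension than its component type.
arrayDimensions(arrayOf(X), Dimensions) :-
    arrayDimensions(X, XDimensions),
    Dimensions is XDimensions + 1.
% Any type that is not an array type has zero dimensions.
arrayDimensions(Type, 0) :-
    Type \= arrayOf(_).

% Example queries:
% ?- arrayDimensions(arrayOf(arrayOf(int)), D).
% D = 2.
% ?- arrayDimensions(int, D).
% D = 0.
```

The guard in the second clause keeps it from matching array types, which is the behavior the renamed predicate is meant to guarantee.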
A new instruction with operand ~~CP~~
ClassName
at offset Offset
is type safe iff the type ~~CP
refers to a constant pool entry denoting a class or interface type,~~ uninitialized(Offset)
does not appear in the incoming operand stack, and one can validly push uninitialized(Offset)
onto the incoming operand stack and replace uninitialized(Offset)
with top
in the incoming local variables yielding the outgoing type state.
instructionIsTypeSafe(new(CP), Environment, Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, OperandStack, Flags),
CP = class(_, _),
NewItem = uninitialized(Offset),
notMember(NewItem, OperandStack),
substitute(NewItem, top, Locals, NewLocals),
validTypeTransition(Environment, [], NewItem,
frame(NewLocals, OperandStack, Flags),
NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
instructionIsTypeSafe(new(ClassName), Environment, Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
StackFrame = frame(Locals, OperandStack, Flags),
NewItem = uninitialized(Offset),
notMember(NewItem, OperandStack),
substitute(NewItem, top, Locals, NewLocals),
validTypeTransition(Environment, [], NewItem,
frame(NewLocals, OperandStack, Flags),
NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The
substitute
predicate is defined in the rule for invokespecial (4.10.1.9.invokespecial).
A putfield instruction with operand CP
is type safe iff all of the following are true:
Its first operand, CP
, refers to a constant pool entry denoting a field whose declared type is FieldType
, declared in a class FieldClassName
. ~~ FieldClassName
must not be an array type.~~
Either:
One can validly pop types matching FieldType
and FieldClassName
off the incoming operand stack yielding the outgoing type state.
protected
fields are subject to additional checks (4.10.1.8).
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
canPop(StackFrame, [FieldType], PoppedFrame),
passesProtectedCheck(Environment, FieldClassName, FieldName,
FieldDescriptor, PoppedFrame),
currentClassLoader(Environment, CurrentLoader),
canPop(StackFrame, [FieldType, class(FieldClassName, CurrentLoader)],
NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
Or:
If the instruction occurs in an instance initialization method of the class FieldClassName
, then one can validly pop types matching FieldType
and uninitializedThis
off the incoming operand stack yielding the outgoing type state. This allows instance fields of this
that are declared in the current class to be assigned prior to complete initialization of this
.
instructionIsTypeSafe(putfield(CP), Environment, _Offset, StackFrame,
NextStackFrame, ExceptionStackFrame) :-
CP = field(FieldClassName, _FieldName, FieldDescriptor),
parseFieldDescriptor(FieldDescriptor, FieldType),
Environment = environment(CurrentClass, CurrentMethod, _, _, _, _),
CurrentClass = class(FieldClassName, _),
isInit(CurrentMethod),
canPop(StackFrame, [FieldType, uninitializedThis], NextStackFrame),
exceptionStackFrame(StackFrame, ExceptionStackFrame).
The Java Virtual Machine maintains a run-time constant pool for each class and interface (2.5.5). This data structure serves many of the purposes of the symbol table of a conventional programming language implementation. The constant_pool
table in the binary representation of a class or interface (4.4) is used to construct the run-time constant pool upon class or interface creation (5.3).
There are two kinds of entry in the run-time constant pool: symbolic references, which may later be resolved (5.4.3), and static constants, which require no further processing.
The symbolic references in the run-time constant pool are derived from entries in the constant_pool
table in accordance with the structure of each entry:
A symbolic reference to ~~a class or interface~~ a class, interface, or array type is derived from a CONSTANT_Class_info
structure (4.4.1). ~~Such a reference gives the name of the class or interface in the following form:~~ Such a reference gives the name of a class or interface, or the field descriptor representation of an array type, as described in 4.4.1.
For a nonarray class or an interface, the name is the binary name (4.2.1) of the class or interface.
For an array class of n dimensions, the name begins with n occurrences of the ASCII [
character followed by a representation of the element type:
If the element type is a primitive type, it is represented by the corresponding field descriptor (4.3.2).
Otherwise, if the element type is a reference type, it is represented by the ASCII L
character followed by the binary name of the element type followed by the ASCII ;
character.
Whenever this chapter refers to the name of a class or interface, the name should be understood to be in the form above. (This is also the form returned by the Class.getName
method.)
This extended description of the form of the CONSTANT_Class_info
structure is redundant and better left to section 4.4.1.
A symbolic reference to a field of a class or an interface is derived from a CONSTANT_Fieldref_info
structure (4.4.2). Such a reference gives the name and descriptor of the field, as well as a symbolic reference to the class or interface in which the field is to be found.
A symbolic reference to a method of a class is derived from a CONSTANT_Methodref_info
structure (4.4.2). Such a reference gives the name and descriptor of the method, as well as a symbolic reference to the class in which the method is to be found.
A symbolic reference to a method of an interface is derived from a CONSTANT_InterfaceMethodref_info
structure (4.4.2). Such a reference gives the name and descriptor of the interface method, as well as a symbolic reference to the interface in which the method is to be found.
A symbolic reference to a method handle is derived from a CONSTANT_MethodHandle_info
structure (4.4.8). Such a reference gives a symbolic reference to a field of a class or interface, or a method of a class, or a method of an interface, depending on the kind of the method handle.
A symbolic reference to a method type is derived from a CONSTANT_MethodType_info
structure (4.4.9). Such a reference gives a method descriptor (4.3.3).
A symbolic reference to a dynamically-computed constant is derived from a CONSTANT_Dynamic_info
structure (4.4.10). Such a reference gives:
a symbolic reference to a method handle, which will be invoked to compute the constant's value;
a sequence of symbolic references and static constants, which will serve as static arguments when the method handle is invoked;
an unqualified name and a field descriptor.
A symbolic reference to a dynamically-computed call site is derived from a CONSTANT_InvokeDynamic_info
structure (4.4.10). Such a reference gives:
a symbolic reference to a method handle, which will be invoked in the course of an invokedynamic instruction (6.5.invokedynamic) to compute an instance of java.lang.invoke.CallSite
;
a sequence of symbolic references and static constants, which will serve as static arguments when the method handle is invoked;
an unqualified name and a method descriptor.
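The array-type name form described above (n occurrences of [ followed by a representation of the element type) can be sketched in Java as follows. This is only an illustration: ArrayTypeNames, primitiveDescriptor, and arrayName are hypothetical names, not part of any API. It compares against Class.getName, which separates package components with '.', whereas names in a class file use '/'.

```java
final class ArrayTypeNames {
    /** Field descriptor character for a primitive type (4.3.2). Hypothetical helper. */
    static char primitiveDescriptor(Class<?> primitive) {
        if (primitive == boolean.class) return 'Z';
        if (primitive == byte.class) return 'B';
        if (primitive == char.class) return 'C';
        if (primitive == short.class) return 'S';
        if (primitive == int.class) return 'I';
        if (primitive == long.class) return 'J';
        if (primitive == float.class) return 'F';
        if (primitive == double.class) return 'D';
        throw new IllegalArgumentException("not a primitive value type: " + primitive);
    }

    /**
     * Builds the name of an array type with the given non-array element type
     * and number of dimensions, in the form returned by Class.getName.
     */
    static String arrayName(Class<?> elementType, int dimensions) {
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < dimensions; i++) {
            name.append('[');
        }
        if (elementType.isPrimitive()) {
            name.append(primitiveDescriptor(elementType));
        } else {
            // 'L' + name of the element type + ';'
            name.append('L').append(elementType.getName()).append(';');
        }
        return name.toString();
    }
}
```

For example, arrayName(int.class, 2) yields the same string as int[][].class.getName().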
The static constants in the run-time constant pool are also derived from entries in the constant_pool
table in accordance with the structure of each entry:
A string constant is a reference
to an instance of class String
, and is derived from a CONSTANT_String_info
structure (4.4.3). To derive a string constant, the Java Virtual Machine examines the sequence of code points given by the CONSTANT_String_info
structure:
If the method String.intern
has previously been invoked on an instance of class String
containing a sequence of Unicode code points identical to that given by the CONSTANT_String_info
structure, then the string constant is a reference
to that same instance of class String
.
Otherwise, a new instance of class String
is created containing the sequence of Unicode code points given by the CONSTANT_String_info
structure. The string constant is a reference
to the new instance. Finally, the method String.intern
is invoked on the new instance.
Numeric constants are derived from CONSTANT_Integer_info
, CONSTANT_Float_info
, CONSTANT_Long_info
, and CONSTANT_Double_info
structures (4.4.4, 4.4.5).
Note that CONSTANT_Float_info
structures represent values in IEEE 754 single format and CONSTANT_Double_info
structures represent values in IEEE 754 double format. The numeric constants derived from these structures must thus be values that can be represented using IEEE 754 single and double formats, respectively.
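The derivation of string constants described above has an observable consequence in Java code: a string literal and the result of String.intern on any string with the same code points are the same instance. A minimal sketch (StringConstantDemo is a hypothetical name chosen for illustration):

```java
final class StringConstantDemo {
    /**
     * Returns true if a freshly built string is a different instance from
     * the literal, while its interned form is the same instance as the literal.
     */
    static boolean literalMatchesInterned() {
        // A new String instance created at run time, not derived from a CONSTANT_String_info.
        String built = new String(new char[] {'j', 'v', 'm'});
        boolean distinctBeforeIntern = built != "jvm";
        boolean sameAfterIntern = built.intern() == "jvm";
        return distinctBeforeIntern && sameAfterIntern;
    }
}
```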
The remaining structures in the constant_pool
table - the descriptive structures CONSTANT_NameAndType_info
, CONSTANT_Module_info
, and CONSTANT_Package_info
, and the foundational structure CONSTANT_Utf8_info
- are only used indirectly when constructing the run-time constant pool. No entries in the run-time constant pool correspond directly to these structures.
Some entries in the run-time constant pool are loadable, which means:
They may be pushed onto the stack by the ldc family of instructions (6.5.ldc, 6.5.ldc_w, 6.5.ldc2_w).
They may be static arguments to bootstrap methods for dynamically-computed constants and call sites (5.4.3.6).
An entry in the run-time constant pool is loadable if it is derived from an entry in the constant_pool
table that is loadable (see Table 4.4-C). Accordingly, the following entries in the run-time constant pool are loadable:
Symbolic references to classes, and interfaces, and array types
Symbolic references to method handles
Symbolic references to method types
Symbolic references to dynamically-computed constants
Static constants
Creation of a class or interface C denoted by the name N consists of the construction in the method area of the Java Virtual Machine (2.5.4) of an implementation-specific internal representation of C. Class or interface creation is triggered by another class or interface D, which references C through its run-time constant pool. Class or interface creation may also be triggered by D invoking methods in certain Java SE Platform class libraries (2.12) such as reflection.
If C is not an array class, it A class or interface C is created by loading a binary representation of C (4) using a class loader. Array classes do not have an external binary representation; they are created by the Java Virtual Machine rather than by a class loader.
There are two kinds of class loaders: the bootstrap class loader supplied by the Java Virtual Machine, and user-defined class loaders. Every user-defined class loader is an instance of a subclass of the abstract class ClassLoader
. Applications employ user-defined class loaders in order to extend the manner in which the Java Virtual Machine dynamically loads and thereby creates classes. User-defined class loaders can be used to create classes that originate from user-defined sources. For example, a class could be downloaded across a network, generated on the fly, or extracted from an encrypted file.
A class loader L may create C by defining it directly or by delegating to another class loader. If L creates C directly, we say that L defines C or, equivalently, that L is the defining loader of C.
When one class loader delegates to another class loader, the loader that initiates the loading is not necessarily the same loader that completes the loading and defines the class. If L creates C, either by defining it directly or by delegation, we say that L initiates loading of C or, equivalently, that L is an initiating loader of C.
At run time, a class or interface is determined not by its name alone, but by a pair: its binary name (4.2.1) and its defining class loader. Each such class or interface belongs to a single run-time package. The run-time package of a class or interface is determined by the package name and defining class loader of the class or interface.
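The distinction between initiating and defining loaders can be observed from Java code. The sketch below assumes a hypothetical DelegatingLoader that has no loading logic of its own and relies on the default parent delegation of ClassLoader.loadClass. Calling loadClass directly from Java code only approximates what the Java Virtual Machine does when it invokes loadClass itself, but the outcome is the same: the class is defined by a loader other than the one that was asked.

```java
final class LoaderRoles {
    /** Hypothetical loader that only delegates to its parent. */
    static final class DelegatingLoader extends ClassLoader {
        DelegatingLoader(ClassLoader parent) {
            super(parent);
        }
    }

    /** Asks the given loader for a class, converting the checked exception. */
    static Class<?> loadThrough(ClassLoader initiating, String name) {
        try {
            return initiating.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Two distinct DelegatingLoader instances asked for java.lang.String both return String.class, because the name and the defining loader are the same in each case; neither delegating loader is the defining loader.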
The Java Virtual Machine uses one of three two procedures to create class or interface C denoted by N:
If N denotes a nonarray class or an interface, one of the two following methods is used to load and thereby create C:
If D was defined by the bootstrap class loader, then the bootstrap class loader initiates loading of C (5.3.1).
If D was defined by a user-defined class loader, then that same user-defined class loader initiates loading of C (5.3.2).
Otherwise N denotes an array class. An array class is created directly by the Java Virtual Machine (5.3.3), not by a class loader. However, the defining class loader of D is used in the process of creating array class C.
References to loading of "array classes" are removed to clarify that types, such as array types, need not be loaded. Only classes and interfaces are loaded. (An array type named by a CONSTANT_Class_info
may still be resolved—see 5.4.3.1.)
If an error occurs during class loading, then an instance of a subclass of LinkageError
must be thrown at a point in the program that (directly or indirectly) uses the class or interface being loaded.
If the Java Virtual Machine ever attempts to load a class C during verification (5.4.1) or resolution (5.4.3) (but not initialization (5.5)), and the class loader that is used to initiate loading of C throws an instance of ClassNotFoundException
, then the Java Virtual Machine must throw an instance of NoClassDefFoundError
whose cause is the instance of ClassNotFoundException
.
(A subtlety here is that recursive class loading to load superclasses is performed as part of resolution (5.3.5, step 3). Therefore, a ClassNotFoundException
that results from a class loader failing to load a superclass must be wrapped in a NoClassDefFoundError
.)
A well-behaved class loader should maintain three properties:
Given the same name, a good class loader should always return the same Class object.
If a class loader L1 delegates loading of a class C to another loader L2, then for any type T that occurs as the direct superclass or a direct superinterface of C, or as the type of a field in C, or as the type of a formal parameter of a method or constructor in C, or as a return type of a method in C, L1 and L2 should return the same Class object.
If a user-defined class loader prefetches binary representations of classes and interfaces, or loads a group of related classes together, then it must reflect loading errors only at points in the program where they could have arisen without prefetching or group loading.
We will sometimes represent a class or interface using the notation <
N, Ld>
, where N denotes the name of the class or interface and Ld denotes the defining loader of the class or interface.
We will also represent a class or interface using the notation NLi, where N denotes the name of the class or interface and Li denotes an initiating loader of the class or interface.
The following steps are used to load and thereby create the nonarray class or interface C denoted by N using the bootstrap class loader.
First, the Java Virtual Machine determines whether the bootstrap class loader has already been recorded as an initiating loader of a class or interface denoted by N. If so, this class or interface is C, and no class creation is necessary.
Otherwise, the Java Virtual Machine passes the argument N to an invocation of a method on the bootstrap class loader to search for a purported representation of C in a platform-dependent manner. Typically, a class or interface will be represented using a file in a hierarchical file system, and the name of the class or interface will be encoded in the pathname of the file.
Note that there is no guarantee that a purported representation found is valid or is a representation of C. This phase of loading must detect the following error:
If no purported representation of C is found, loading throws an instance of ClassNotFoundException.
Then the Java Virtual Machine attempts to derive a class denoted by N using the bootstrap class loader from the purported representation using the algorithm found in 5.3.5. That class is C.
The following steps are used to load and thereby create the nonarray class or interface C denoted by N using a user-defined class loader L.
First, the Java Virtual Machine determines whether L has already been recorded as an initiating loader of a class or interface denoted by N. If so, this class or interface is C, and no class creation is necessary.
Otherwise, the Java Virtual Machine invokes loadClass(N)
on L. The value returned by the invocation is the created class or interface C. The Java Virtual Machine then records that L is an initiating loader of C (5.3.4). The remainder of this section describes this process in more detail.
When the loadClass
method of the class loader L is invoked with the name N of a class or interface C to be loaded, L must perform one of the following two operations in order to load C:
The class loader L can create an array of bytes representing C as the bytes of a ClassFile
structure (4.1); it then must invoke the method defineClass
of class ClassLoader
. Invoking defineClass
causes the Java Virtual Machine to derive a class or interface denoted by N using L from the array of bytes using the algorithm found in 5.3.5.
The class loader L can delegate the loading of C to some other class loader L'. This is accomplished by passing the argument N directly or indirectly to an invocation of a method on L' (typically the loadClass
method). The result of the invocation is C.
In either (1) or (2), if the class loader L is unable to load a class or interface denoted by N for any reason, it must throw an instance of ClassNotFoundException
.
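The two operations above can be sketched with a small user-defined loader. Everything here is an assumption made for illustration: GeneratedClassLoaderDemo, SingleClassLoader, and minimalClassFile are hypothetical names, and the bytes are a hand-built minimal ClassFile structure (a public class with no members) rather than compiler output. The loader relies on the default ClassLoader.loadClass, which delegates to the parent first (operation 2) and calls findClass only if the parent fails; findClass then calls defineClass (operation 1).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class GeneratedClassLoaderDemo {
    /** Builds a minimal ClassFile structure (4.1) for a class with the given internal name. */
    static byte[] minimalClassFile(String internalName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(52);                                  // major_version (Java SE 8)
            out.writeShort(5);                                   // constant_pool_count (#1..#4)
            out.writeByte(1); out.writeUTF(internalName);        // #1 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 CONSTANT_Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 CONSTANT_Class -> #3
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces_count
            out.writeShort(0);                                   // fields_count
            out.writeShort(0);                                   // methods_count
            out.writeShort(0);                                   // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Defines exactly one class itself; every other name is left to the parent. */
    static final class SingleClassLoader extends ClassLoader {
        private final String ownName;

        SingleClassLoader(ClassLoader parent, String ownName) {
            super(parent);
            this.ownName = ownName;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (!name.equals(ownName)) {
                throw new ClassNotFoundException(name);  // unable to load: required by 5.3.2
            }
            byte[] classBytes = minimalClassFile(name.replace('.', '/'));
            return defineClass(name, classBytes, 0, classBytes.length);
        }
    }

    /** Asks the loader for a class, converting the checked exception. */
    static Class<?> load(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Loading the same name twice through one SingleClassLoader returns the same Class object, while two separate SingleClassLoader instances produce two distinct classes that share a name, since each has a different defining loader.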
Since JDK 1.1, Oracle’s Java Virtual Machine implementation has invoked the
loadClass
method of a class loader in order to cause it to load a class or interface. The argument to loadClass
is the name of the class or interface to be loaded. There is also a two-argument version of the loadClass
method, where the second argument is a boolean
that indicates whether the class or interface is to be linked or not. Only the two-argument version was supplied in JDK 1.0.2, and Oracle’s Java Virtual Machine implementation relied on it to link the loaded class or interface. From JDK 1.1 onward, Oracle’s Java Virtual Machine implementation links the class or interface directly, without relying on the class loader.
The following steps are used to create the array class C denoted by N using class loader L. Class loader L may be either the bootstrap class loader or a user-defined class loader.
If L has already been recorded as an initiating loader of an array class with the same component type as N, that class is C, and no array class creation is necessary.
Otherwise, the following steps are performed to create C:
If the component type is a reference
type, the algorithm of this section (5.3) is applied recursively using class loader L in order to load and thereby create the component type of C.
The Java Virtual Machine creates a new array class with the indicated component type and number of dimensions.
If the component type is a reference
type, C is marked as having been defined by the defining class loader of the component type. Otherwise, C is marked as having been defined by the bootstrap class loader.
In any case, the Java Virtual Machine then records that L is an initiating loader for C (5.3.4).
If the component type is a reference
type, the accessibility of the array class is determined by the accessibility of its component type (5.4.4). Otherwise, the array class is accessible to all classes and interfaces.
Deleting this section will necessitate a renumbering or restructuring of the subsequent sections.
Ensuring type safe linkage in the presence of class loaders requires special care. It is possible that when two different class loaders initiate loading of a class or interface denoted by N, the name N may denote a different class or interface in each loader.
When a class or interface C = <
N1, L1>
makes a symbolic reference to a field or method of another class or interface D = <
N2, L2>
, the symbolic reference includes a descriptor specifying the type of the field, or the return and argument types of the method. It is essential that any type class or interface name N mentioned in the field or method descriptor ([4.3.1], 4.3.2) denote the same class or interface when loaded by L1 and when loaded by L2.
To ensure this, the Java Virtual Machine imposes loading constraints of the form NL1 = NL2 during preparation (5.4.2) and resolution (5.4.3).
To enforce these constraints, the Java Virtual Machine will, at certain prescribed times (see 5.3.1, 5.3.2, 5.3.3, and 5.3.5), record that a particular loader is an initiating loader of a particular class. After recording that a loader is an initiating loader of a class, the Java Virtual Machine must immediately check to see if any loading constraints are violated. If so, the record is retracted, the Java Virtual Machine throws a LinkageError
, and the loading operation that caused the recording to take place fails.
Similarly, after imposing a loading constraint (see 5.4.2, 5.4.3.2, 5.4.3.3, and 5.4.3.4), the Java Virtual Machine must immediately check to see if any loading constraints are violated. If so, the newly imposed loading constraint is retracted, the Java Virtual Machine throws a LinkageError
, and the operation that caused the constraint to be imposed (either resolution or preparation, as the case may be) fails.
The situations described here are the only times at which the Java Virtual Machine checks whether any loading constraints have been violated. A loading constraint is violated if, and only if, all the following four conditions hold:
There exists a loader L such that L has been recorded by the Java Virtual Machine as an initiating loader of a class C named N.
There exists a loader L' such that L' has been recorded by the Java Virtual Machine as an initiating loader of a class C' named N.
The equivalence relation defined by the (transitive closure of the) set of imposed constraints implies NL = NL'.
C ≠ C'.
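The four conditions above can be modelled in a few lines of Java. This is a toy model under stated assumptions, not how any Java Virtual Machine implements the check: LoadingConstraintModel and its methods are hypothetical, loaders and classes are represented by plain strings, and a false return value stands in for throwing LinkageError after retracting the record or constraint.

```java
import java.util.ArrayList;
import java.util.List;

final class LoadingConstraintModel {
    // Each entry is {loader, name, classIdentity}: "loader is an initiating loader of class".
    private final List<String[]> records = new ArrayList<>();
    // Each entry is {name, loader1, loader2}: the imposed constraint N^loader1 = N^loader2.
    private final List<String[]> constraints = new ArrayList<>();

    /** Records an initiating loader, then checks; retracts the record on violation. */
    boolean recordInitiatingLoader(String loader, String name, String classIdentity) {
        String[] record = {loader, name, classIdentity};
        records.add(record);
        if (violated()) {
            records.remove(record);
            return false;  // stands in for LinkageError
        }
        return true;
    }

    /** Imposes a constraint, then checks; retracts the constraint on violation. */
    boolean imposeConstraint(String name, String loader1, String loader2) {
        String[] constraint = {name, loader1, loader2};
        constraints.add(constraint);
        if (violated()) {
            constraints.remove(constraint);
            return false;  // stands in for LinkageError
        }
        return true;
    }

    /** True if two records for the same name, related by the constraints, name different classes. */
    private boolean violated() {
        for (String[] a : records) {
            for (String[] b : records) {
                boolean sameName = a[1].equals(b[1]);
                boolean differentClass = !a[2].equals(b[2]);
                if (sameName && differentClass && related(a[1], a[0], b[0])) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Reachability over the constraints on 'name': the closure of the imposed equalities. */
    private boolean related(String name, String from, String to) {
        List<String> reached = new ArrayList<>();
        reached.add(from);
        for (int i = 0; i < reached.size(); i++) {
            String current = reached.get(i);
            if (current.equals(to)) {
                return true;
            }
            for (String[] c : constraints) {
                if (!c[0].equals(name)) {
                    continue;
                }
                String next = c[1].equals(current) ? c[2] : c[2].equals(current) ? c[1] : null;
                if (next != null && !reached.contains(next)) {
                    reached.add(next);
                }
            }
        }
        return false;
    }
}
```

In this model, two loaders may be recorded for different classes named N as long as no chain of constraints relates them; imposing such a constraint, directly or transitively, is rejected.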
A full discussion of class loaders and type safety is beyond the scope of this specification. For a more comprehensive discussion, readers are referred to Dynamic Class Loading in the Java Virtual Machine by Sheng Liang and Gilad Bracha (Proceedings of the 1998 ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages and Applications).
class File Representation
The following steps are used to derive a class or interface C denoted by N using loader L from a purported representation in class file format.
This section says nothing about Class
objects, which are only auxiliary to creation and loading. Instead, these rules describe how to derive an abstract "class or interface" from a name and class loader.
First, the Java Virtual Machine determines whether it has already recorded that L is an initiating loader of a class or interface denoted by N. If so, this creation attempt is invalid and loading throws a LinkageError
.
Otherwise, the Java Virtual Machine attempts to parse the purported representation. However, the purported representation may not in fact be a valid representation of C.
This phase of loading must detect the following errors:
If the purported representation is not a ClassFile
structure (4.1, 4.8), loading throws an instance of ClassFormatError
.
Otherwise, if the purported representation is not of a supported major or minor version (4.1), loading throws an instance of UnsupportedClassVersionError
.
UnsupportedClassVersionError
, a subclass of ClassFormatError
, was introduced to enable easy identification of a ClassFormatError
caused by an attempt to load a class whose representation uses an unsupported version of the class
file format. In JDK 1.1 and earlier, an instance of NoClassDefFoundError
or ClassFormatError
was thrown in case of an unsupported version, depending on whether the class was being loaded by the system class loader or a user-defined class loader.
Otherwise, if the purported representation does not actually represent a class named N, loading throws an instance of NoClassDefFoundError
or an instance of one of its subclasses.
This occurs when the purported representation has either a this_class
item which specifies a name other than N, or an access_flags
item which has the ACC_MODULE
flag set.
If C has a direct superclass, the symbolic reference from C to its direct superclass is resolved using the algorithm of 5.4.3.1. Note that if C is an interface it must have Object
as its direct superclass, which must already have been loaded. Only Object
has no direct superclass.
Any exceptions that can be thrown due to class or interface resolution can be thrown as a result of this phase of loading. In addition, this phase of loading must detect the following errors:
If the class or interface named as the direct superclass of C is in fact an interface, loading throws an IncompatibleClassChangeError
.
Otherwise, if any of the superclasses of C is C itself, loading throws a ClassCircularityError
.
If C has any direct superinterfaces, the symbolic references from C to its direct superinterfaces are resolved using the algorithm of 5.4.3.1.
Any exceptions that can be thrown due to class or interface resolution can be thrown as a result of this phase of loading. In addition, this phase of loading must detect the following errors:
If any of the classes or interfaces named as direct superinterfaces of C is not in fact an interface, loading throws an IncompatibleClassChangeError
.
Otherwise, if any of the superinterfaces of C is C itself, loading throws a ClassCircularityError
.
The Java Virtual Machine marks C as having L as its defining class loader and records that L is an initiating loader of C (5.3.4).
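Several of the errors listed in these steps can be provoked through ClassLoader.defineClass. The sketch below is illustrative only: DerivationErrors, OpenLoader, tryDefine, and classFile are hypothetical names, the class file bytes are hand-built rather than compiler output, and the expected outcomes noted afterwards are assumptions based on the rules above and the documented behavior of defineClass.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class DerivationErrors {
    /** Builds a minimal ClassFile structure (4.1) with the given name and major version. */
    static byte[] classFile(String internalName, int majorVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor_version
            out.writeShort(majorVersion);                        // major_version
            out.writeShort(5);                                   // constant_pool_count (#1..#4)
            out.writeByte(1); out.writeUTF(internalName);        // #1 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 CONSTANT_Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 CONSTANT_Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 CONSTANT_Class -> #3
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces_count
            out.writeShort(0);                                   // fields_count
            out.writeShort(0);                                   // methods_count
            out.writeShort(0);                                   // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Hypothetical loader exposing the protected defineClass for experimentation. */
    static final class OpenLoader extends ClassLoader {
        /** Returns the LinkageError raised by defineClass, or null if derivation succeeded. */
        LinkageError tryDefine(String name, byte[] classBytes) {
            try {
                defineClass(name, classBytes, 0, classBytes.length);
                return null;
            } catch (LinkageError e) {
                return e;
            }
        }
    }
}
```

With a single OpenLoader, defining demo.Hello from valid bytes succeeds once; a second attempt with the same name is expected to yield a LinkageError, bytes whose this_class names a different class a NoClassDefFoundError, arbitrary bytes a ClassFormatError, and an unsupported major version an UnsupportedClassVersionError.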
The Java Virtual Machine supports the organization of classes and interfaces into modules. The membership of a class or interface C in a module M is used to control access to C from classes and interfaces in modules other than M (5.4.4).
Module membership is defined in terms of run-time packages (5.3). A program determines the names of the packages in each module, and the class loaders that will create the classes and interfaces of the named packages; it then specifies the packages and class loaders to an invocation of the defineModules
method of the class ModuleLayer
. Invoking defineModules
causes the Java Virtual Machine to create new run-time modules that are associated with the run-time packages of the class loaders.
Every run-time module indicates the run-time packages that it exports, which influences access to the public
classes and interfaces in those run-time packages. Every run-time module also indicates the other run-time modules that it reads, which influences access by its own code to the public
types and interfaces in those run-time modules.
We say that a class is in a run-time module iff the class's run-time package is associated (or will be associated, if the class is actually created) with that run-time module.
A class created by a class loader is in exactly one run-time package and therefore exactly one run-time module, because the Java Virtual Machine does not support a run-time package being associated with (or more evocatively, "split across") multiple run-time modules.
A run-time module is implicitly bound to exactly one class loader, by the semantics of defineModules
. On the other hand, a class loader may create classes in more than one run-time module, because the Java Virtual Machine does not require all the run-time packages of a class loader to be associated with the same run-time module.
In other words, the relationship between class loaders and run-time modules need not be 1:1. For a given set of modules to be loaded, if a program can determine that the names of the packages in each module are found only in that module, then the program may specify only one class loader to the invocation of
defineModules
. This class loader will create classes across multiple run-time modules.
Every run-time module created by defineModules
is part of a layer. A layer represents a set of class loaders that jointly serve to create classes in a set of run-time modules. There are two kinds of layers: the boot layer supplied by the Java Virtual Machine, and user-defined layers. The boot layer is created at Java Virtual Machine startup in an implementation-dependent manner. It associates the standard run-time module java.base
with standard run-time packages defined by the bootstrap class loader, such as java.lang
. User-defined layers are created by programs in order to construct sets of run-time modules that depend on java.base
and other standard run-time modules.
A run-time module is implicitly part of exactly one layer, by the semantics of defineModules
. However, a class loader may create classes in the run-time modules of different layers, because the same class loader may be specified to multiple invocations of defineModules
. Access control is governed by a class's run-time module, not by the class loader which created the class or by the layer(s) which the class loader serves.
The set of class loaders specified for a layer, and the set of run-time modules which are part of a layer, are immutable after the layer is created. However, the ModuleLayer
class affords programs a degree of dynamic control over the relationships between the run-time modules in a user-defined layer.
If a user-defined layer contains more than one class loader, then any delegation between the class loaders is the responsibility of the program that created the layer. The Java Virtual Machine does not check that the layer's class loaders delegate to each other in accordance with how the layer's run-time modules read each other. Moreover, if the layer's run-time modules are modified via the ModuleLayer
class to read additional run-time modules, then the Java Virtual Machine does not check that the layer's class loaders are modified by some out-of-band mechanism to delegate in a corresponding fashion.
There are similarities and differences between class loaders and layers. On the one hand, a layer is similar to a class loader in that each may delegate to, respectively, one or more parent layers or class loaders that created, respectively, modules or classes at an earlier time. That is, the set of modules specified to a layer may depend on modules not specified to the layer, and instead specified previously to one or more parent layers. On the other hand, a layer may be used to create new modules only once, whereas a class loader may be used to create new classes or interfaces at any time via multiple invocations of the
defineClass
method.
It is possible for a class loader to define a class or interface in a run-time package that was not associated with a run-time module by any of the layers which the class loader serves. This may occur if the run-time package embodies a named package that was not specified to defineModules
, or if the class or interface has a simple binary name (4.2.1) and thus is a member of a run-time package that embodies an unnamed package (JLS §7.4.2). In either case, the class or interface is treated as a member of a special run-time module which is implicitly bound to the class loader. This special run-time module is known as the unnamed module of the class loader. The run-time package of the class or interface is associated with the unnamed module of the class loader. There are special rules for unnamed modules, designed to maximize their interoperation with other run-time modules, as follows:
A class loader's unnamed module is distinct from all other run-time modules bound to the same class loader.
A class loader's unnamed module is distinct from all run-time modules (including unnamed modules) bound to other class loaders.
Every unnamed module reads every run-time module.
Every unnamed module exports, to every run-time module, every run-time package associated with itself.
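Some of these properties are visible through the java.lang.Module API (Java SE 9 and later). In this sketch, UnnamedModuleDemo and its method are hypothetical names, and the anonymous ClassLoader subclass is a stand-in for any user-defined loader.

```java
final class UnnamedModuleDemo {
    /** Returns the unnamed module of a newly created loader that defines nothing. */
    static Module unnamedModuleOfFreshLoader() {
        ClassLoader loader = new ClassLoader() { };
        return loader.getUnnamedModule();
    }

    /** The run-time module of java.lang.Object, which is java.base in the boot layer. */
    static Module baseModule() {
        return Object.class.getModule();
    }
}
```

Each call to unnamedModuleOfFreshLoader returns a distinct module for which isNamed() is false and canRead(baseModule()) is true; baseModule() is named java.base, is found in ModuleLayer.boot(), and exports java.lang.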
Preparation involves creating the static fields for a class or interface and initializing such fields to their default values (2.3, 2.4). This does not require the execution of any Java Virtual Machine code; explicit initializers for static fields are executed as part of initialization (5.5), not preparation.
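The split between preparation and initialization is observable from Java code: a static initializer that runs early can read another static field before that field's own initializer has executed, and it sees the default value assigned during preparation. A minimal sketch (PreparationDemo is a hypothetical class):

```java
final class PreparationDemo {
    // Initializers run in textual order during initialization (5.5).
    // When peekLate() runs, 'late' still holds its default value from preparation.
    static int early = peekLate();
    static int late = 42;

    static int peekLate() {
        return late;
    }
}
```

After initialization completes, early is 0 and late is 42.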
During preparation of a class or interface C, the Java Virtual Machine also imposes loading constraints (5.3.4):
Let L1 be the defining loader of C. For each instance method m declared in C that can override (5.4.5) an instance method declared in a superclass or superinterface <
D, L2>
, the Java Virtual Machine imposes loading constraints as follows. For each class or interface name N mentioned by the descriptor of m (4.3.3), the Java Virtual Machine imposes the loading constraint NL1 = NL2.
Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
Then TiL1 = TiL2 for i = 0 to n.
For each instance method m declared in a superinterface <
I, L3>
of C, if C does not itself declare an instance method that can override m, then a method is selected (5.4.6) with respect to C and the method m in <
I, L3>
. Let <
D, L2>
be the class or interface that declares the selected method. The Java Virtual Machine imposes loading constraints as follows. For each class or interface name N mentioned by the descriptor of m, the Java Virtual Machine imposes the loading constraint NL2 = NL3.
Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
Then TiL2 = TiL3 for i = 0 to n.
Preparation may occur at any time following creation but must be completed prior to initialization.
Many Java Virtual Machine instructions - anewarray, checkcast, getfield, getstatic, instanceof, invokedynamic, invokeinterface, invokespecial, invokestatic, invokevirtual, ldc, ldc_w, ldc2_w, multianewarray, new, putfield, and putstatic - rely on symbolic references in the run-time constant pool. Execution of any of these instructions requires resolution of the symbolic reference.
Resolution is the process of dynamically determining one or more concrete values from a symbolic reference in the run-time constant pool. Initially, all symbolic references in the run-time constant pool are unresolved.
Resolution of an unresolved symbolic reference to (i) a class, or interface, or array type, (ii) a field, (iii) a method, (iv) a method type, (v) a method handle, or (vi) a dynamically-computed constant, proceeds in accordance with the rules given in 5.4.3.1 through 5.4.3.5. In the first three of those sections, the class or interface in whose run-time constant pool the symbolic reference appears is labeled D. Then:
If no error occurs during resolution of the symbolic reference, then resolution succeeds.
Subsequent attempts to resolve the symbolic reference always succeed trivially and result in the same entity produced by the initial resolution. If the symbolic reference is to a dynamically-computed constant, the bootstrap method is not re-executed for these subsequent attempts.
If an error occurs during resolution of the symbolic reference, then it is either (i) an instance of IncompatibleClassChangeError (or a subclass); (ii) an instance of Error (or a subclass) that arose from resolution or invocation of a bootstrap method; or (iii) an instance of LinkageError (or a subclass) that arose because class loading failed or a loader constraint was violated. The error must be thrown at a point in the program that (directly or indirectly) uses the symbolic reference.
Subsequent attempts to resolve the symbolic reference always fail with the same error that was thrown as a result of the initial resolution attempt. If the symbolic reference is to a dynamically-computed constant, the bootstrap method is not re-executed for these subsequent attempts.
Because errors occurring on an initial attempt at resolution are thrown again on subsequent attempts, a class in one module that attempts to access, via resolution of a symbolic reference in its run-time constant pool, an unexported
public
type in a different module will always receive the same error indicating an inaccessible type (5.4.4), even if the Java SE Platform API is used to dynamically export the public
type's package at some time after the class's first attempt.
Resolution of an unresolved symbolic reference to a dynamically-computed call site proceeds in accordance with the rules given in 5.4.3.6. Then:
If no error occurs during resolution of the symbolic reference, then resolution succeeds solely for the instruction in the class file that required resolution. This instruction necessarily has an opcode of invokedynamic.
Subsequent attempts to resolve the symbolic reference by that instruction in the class file always succeed trivially and result in the same entity produced by the initial resolution. The bootstrap method is not re-executed for these subsequent attempts.
The symbolic reference is still unresolved for all other instructions in the class file, of any opcode, which indicate the same entry in the run-time constant pool as the invokedynamic instruction above.
If an error occurs during resolution of the symbolic reference, then it is either (i) an instance of IncompatibleClassChangeError (or a subclass); (ii) an instance of Error (or a subclass) that arose from resolution or invocation of a bootstrap method; or (iii) an instance of LinkageError (or a subclass) that arose because class loading failed or a loader constraint was violated. The error must be thrown at a point in the program that (directly or indirectly) uses the symbolic reference.
Subsequent attempts by the same instruction in the class file to resolve the symbolic reference always fail with the same error that was thrown as a result of the initial resolution attempt. The bootstrap method is not re-executed for these subsequent attempts.
The symbolic reference is still unresolved for all other instructions in the class file, of any opcode, which indicate the same entry in the run-time constant pool as the invokedynamic instruction above.
Certain of the instructions above require additional linking checks when resolving symbolic references. For instance, in order for a getfield instruction to successfully resolve the symbolic reference to the field on which it operates, it must not only complete the field resolution steps given in 5.4.3.2 but also check that the field is not static. If it is a static field, a linking exception must be thrown.
Linking exceptions generated by checks that are specific to the execution of a particular Java Virtual Machine instruction are given in the description of that instruction and are not covered in this general discussion of resolution. Note that such exceptions, although described as part of the execution of Java Virtual Machine instructions rather than resolution, are still properly considered failures of resolution.
To resolve an unresolved symbolic reference from D to a class or interface C denoted by N, the following steps are performed:
The defining class loader of D is used to create a class or interface denoted by N. This class or interface is C. The details of the process are given in 5.3.
Any exception that can be thrown as a result of failure of class or interface creation can thus be thrown as a result of failure of class and interface resolution.
If C is an array class and its element type is a reference type, then a symbolic reference to the class or interface representing the element type is resolved by invoking the algorithm in 5.4.3.1 recursively.
Access control is applied for the access from D to C (5.4.4).
To resolve an unresolved symbolic reference from D to an array type T, the following steps are performed:
If the element type of the array type is a class or interface type, the named class or interface is resolved, as if by resolution of an unresolved symbolic reference to the named class or interface.
A representation of the array type denoted by the symbolic reference is created.
If steps 1 and 2 succeed but step 3 fails, C is still valid and usable. Nevertheless, resolution fails, and D is prohibited from accessing C. If resolution of a class, interface, or array type successfully loads a class or interface, but a subsequent step (such as access checking) fails, the class or interface is still valid and usable. Nevertheless, resolution fails, and the symbolic reference that was being resolved is invalid.
To resolve an unresolved symbolic reference from D to a field in a class or interface C, the symbolic reference to C given by the field reference must first be resolved (5.4.3.1). Therefore, any exception that can be thrown as a result of failure of resolution of a class or interface reference can be thrown as a result of failure of field resolution. If the reference to C can be successfully resolved, an exception relating to the failure of resolution of the field reference itself can be thrown.
When resolving a field reference, field resolution first attempts to look up the referenced field in C and its superclasses:
If C declares a field with the name and descriptor specified by the field reference, field lookup succeeds. The declared field is the result of the field lookup.
Otherwise, field lookup is applied recursively to the direct superinterfaces of the specified class or interface C.
Otherwise, if C has a superclass S, field lookup is applied recursively to S.
Otherwise, field lookup fails.
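The lookup order above (the class itself, then its direct superinterfaces, then its superclass) can be sketched as a short recursive function. This is an illustrative model only: `Klass` and its members are hypothetical stand-ins for loaded classes and interfaces, and a field is identified by a `name:descriptor` string.

```java
import java.util.List;
import java.util.Optional;

final class FieldLookupSketch {
    /** Hypothetical stand-in for a loaded class or interface. */
    static final class Klass {
        final String name;
        final Klass superclass;             // null if there is none
        final List<Klass> superinterfaces;  // direct superinterfaces only
        final List<String> fields;          // declared fields as "name:descriptor"

        Klass(String name, Klass superclass, List<Klass> superinterfaces, List<String> fields) {
            this.name = name;
            this.superclass = superclass;
            this.superinterfaces = superinterfaces;
            this.fields = fields;
        }
    }

    /** Returns the class or interface declaring the field, or empty if lookup fails. */
    static Optional<Klass> lookup(Klass c, String nameAndDescriptor) {
        if (c.fields.contains(nameAndDescriptor)) {
            return Optional.of(c);                        // declared in C itself
        }
        for (Klass i : c.superinterfaces) {               // then direct superinterfaces
            Optional<Klass> found = lookup(i, nameAndDescriptor);
            if (found.isPresent()) {
                return found;
            }
        }
        if (c.superclass != null) {                       // then the superclass
            return lookup(c.superclass, nameAndDescriptor);
        }
        return Optional.empty();                          // lookup fails
    }
}
```

Because superinterfaces are searched before the superclass, a field declared in both is found in the interface.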
Then, the result of field resolution is determined:
If field lookup failed, field resolution throws a NoSuchFieldError.
Otherwise, field lookup succeeded. Access control is applied for the access from D to the field which is the result of field lookup (5.4.4). Then:
If access control failed, field resolution fails for the same reason.
Otherwise, access control succeeded. Loading constraints are imposed, as follows.
Let <E, L1> be the class or interface in which the referenced field is actually declared. Let L2 be the defining loader of D. Given that the type of the referenced field is Tf: if Tf is not an array type, let T be Tf; otherwise, let T be the element type of Tf.
The Java Virtual Machine imposes the loading constraint $T^{L1} = T^{L2}$.
For any class or interface name N mentioned by the descriptor of the referenced field (4.3.2), the Java Virtual Machine imposes the loading constraint $N^{L1} = N^{L2}$ (5.3.4).
If imposing this constraint results in any loading constraints being violated (5.3.4), then field resolution fails. Otherwise, field resolution succeeds.
To resolve an unresolved symbolic reference from D to a method in a class C, the symbolic reference to C given by the method reference is first resolved (5.4.3.1). Therefore, any exception that can be thrown as a result of failure of resolution of a class reference can be thrown as a result of failure of method resolution. If the reference to C can be successfully resolved, exceptions relating to the resolution of the method reference itself can be thrown.
When resolving a method reference:
If C is an interface, method resolution throws an IncompatibleClassChangeError.
Otherwise, method resolution attempts to locate the referenced method in C and its superclasses:
If C declares exactly one method with the name specified by the method reference, and the declaration is a signature polymorphic method (2.9.3), then method lookup succeeds. All the class names mentioned in the descriptor (4.3.3) are resolved (5.4.3.1).
The resolved method is the signature polymorphic method declaration. It is not necessary for C to declare a method with the descriptor specified by the method reference.
Otherwise, if C declares a method with the name and descriptor specified by the method reference, method lookup succeeds.
Otherwise, if C has a superclass, step 2 of method resolution is recursively invoked on the direct superclass of C.
Otherwise, method resolution attempts to locate the referenced method in the superinterfaces of the specified class C:
If the maximally-specific superinterface methods of C for the name and descriptor specified by the method reference include exactly one method that does not have its ACC_ABSTRACT flag set, then this method is chosen and method lookup succeeds.
Otherwise, if any superinterface of C declares a method with the name and descriptor specified by the method reference that has neither its ACC_PRIVATE flag nor its ACC_STATIC flag set, one of these is arbitrarily chosen and method lookup succeeds.
Otherwise, method lookup fails.
A maximally-specific superinterface method of a class or interface C for a particular method name and descriptor is any method for which all of the following are true:
The method is declared in a superinterface (direct or indirect) of C.
The method is declared with the specified name and descriptor.
The method has neither its ACC_PRIVATE flag nor its ACC_STATIC flag set.
Where the method is declared in interface I, there exists no other maximally-specific superinterface method of C with the specified name and descriptor that is declared in a subinterface of I.
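The definition above amounts to collecting every superinterface that declares a suitable method and discarding any whose declaration is shadowed by one in a subinterface. A minimal sketch follows; `Iface` and `declaresM` are hypothetical, with `declaresM` meaning "declares a method with the requested name and descriptor that is neither private nor static". Checking against all candidates rather than only maximally-specific ones gives the same answer, since a shadowing candidate is either maximally specific itself or shadowed by one further down.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MaxSpecificSketch {
    /** Hypothetical stand-in for a loaded interface. */
    static final class Iface {
        final String name;
        final List<Iface> superinterfaces; // direct superinterfaces
        final boolean declaresM;           // declares the method, non-private and non-static

        Iface(String name, List<Iface> superinterfaces, boolean declaresM) {
            this.name = name;
            this.superinterfaces = superinterfaces;
            this.declaresM = declaresM;
        }
    }

    /** Adds the given interfaces and all of their superinterfaces to out. */
    static void collect(List<Iface> direct, Set<Iface> out) {
        for (Iface i : direct) {
            if (out.add(i)) {
                collect(i.superinterfaces, out);
            }
        }
    }

    /** True if sub is a proper (direct or indirect) subinterface of sup. */
    static boolean isSubinterfaceOf(Iface sub, Iface sup) {
        Set<Iface> supers = new LinkedHashSet<>();
        collect(sub.superinterfaces, supers);
        return supers.contains(sup);
    }

    /** Interfaces whose declaration of the method is maximally specific for C. */
    static List<Iface> maximallySpecific(List<Iface> directSuperinterfacesOfC) {
        Set<Iface> all = new LinkedHashSet<>();
        collect(directSuperinterfacesOfC, all);
        List<Iface> candidates = new ArrayList<>();
        for (Iface i : all) {
            if (i.declaresM) {
                candidates.add(i);
            }
        }
        List<Iface> result = new ArrayList<>();
        for (Iface i : candidates) {
            boolean shadowed = false;
            for (Iface j : candidates) {
                if (j != i && isSubinterfaceOf(j, i)) {
                    shadowed = true; // a subinterface of i also declares the method
                }
            }
            if (!shadowed) {
                result.add(i);
            }
        }
        return result;
    }
}
```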
The result of method resolution is determined as follows:
If method lookup failed, method resolution throws a NoSuchMethodError.
Otherwise, method lookup succeeded. Access control is applied for the access from D to the method which is the result of method lookup (5.4.4). Then:
If access control failed, method resolution fails for the same reason.
Otherwise, access control succeeded. Loading constraints are imposed, as follows.
Let <E, L1> be the class or interface in which the referenced method m is actually declared. Let L2 be the defining loader of D. Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
The Java Virtual Machine imposes the loading constraints $T_i^{L1} = T_i^{L2}$ for i = 0 to n.
For each class or interface name N mentioned by the descriptor of the referenced method (4.3.3), the Java Virtual Machine imposes the loading constraint $N^{L1} = N^{L2}$ (5.3.4).
If imposing these constraints results in any loading constraints being violated (5.3.4), then method resolution fails. Otherwise, method resolution succeeds.
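The phrase "each class or interface name N mentioned by the descriptor" can be made concrete with a small helper that scans a well-formed field or method descriptor for `L...;` components. This is a sketch, not a descriptor validator: `DescriptorNamesSketch` and `mentionedNames` are hypothetical names, and malformed input is not rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DescriptorNamesSketch {
    // In a well-formed descriptor, every class or interface name appears as L<name>;
    private static final Pattern CLASS_NAME = Pattern.compile("L([^;]+);");

    /** Returns the names mentioned by the descriptor, in order, duplicates included. */
    static List<String> mentionedNames(String descriptor) {
        List<String> names = new ArrayList<>();
        Matcher m = CLASS_NAME.matcher(descriptor);
        while (m.find()) {
            names.add(m.group(1)); // internal form, e.g. java/lang/String
        }
        return names;
    }
}
```

Array dimensions and primitive types contribute nothing, so `(Ljava/lang/String;[[Ljava/util/List;I)V` mentions only `java/lang/String` and `java/util/List`; one loading constraint would be imposed per name.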
When resolution searches for a method in the class's superinterfaces, the best outcome is to identify a maximally-specific non-abstract method. It is possible that this method will be chosen by method selection, so it is desirable to add class loader constraints for it.
Otherwise, the result is nondeterministic. This is not new: The Java® Virtual Machine Specification has never identified exactly which method is chosen, and how "ties" should be broken. Prior to Java SE 8, this was mostly an unobservable distinction. However, beginning with Java SE 8, the set of interface methods is more heterogeneous, so care must be taken to avoid problems with nondeterministic behavior. Thus:
Superinterface methods that are private and static are ignored by resolution. This is consistent with the Java programming language, where such interface methods are not inherited.
Any behavior controlled by the resolved method should not depend on whether the method is abstract or not.
Note that if the result of resolution is an abstract method, the referenced class C may be non-abstract. Requiring C to be abstract would conflict with the nondeterministic choice of superinterface methods. Instead, resolution assumes that the run time class of the invoked object has a concrete implementation of the method.
To resolve an unresolved symbolic reference from D to an interface method in an interface C, the symbolic reference to C given by the interface method reference is first resolved (5.4.3.1). Therefore, any exception that can be thrown as a result of failure of resolution of an interface reference can be thrown as a result of failure of interface method resolution. If the reference to C can be successfully resolved, exceptions relating to the resolution of the interface method reference itself can be thrown.
When resolving an interface method reference:
If C is not an interface, interface method resolution throws an IncompatibleClassChangeError.
Otherwise, if C declares a method with the name and descriptor specified by the interface method reference, method lookup succeeds.
Otherwise, if the class Object declares a method with the name and descriptor specified by the interface method reference, which has its ACC_PUBLIC flag set and does not have its ACC_STATIC flag set, method lookup succeeds.
Otherwise, if the maximally-specific superinterface methods (5.4.3.3) of C for the name and descriptor specified by the method reference include exactly one method that does not have its ACC_ABSTRACT flag set, then this method is chosen and method lookup succeeds.
Otherwise, if any superinterface of C declares a method with the name and descriptor specified by the method reference that has neither its ACC_PRIVATE flag nor its ACC_STATIC flag set, one of these is arbitrarily chosen and method lookup succeeds.
Otherwise, method lookup fails.
The result of interface method resolution is determined as follows:
If method lookup failed, interface method resolution throws a NoSuchMethodError.
Otherwise, method lookup succeeded. Access control is applied for the access from D to the method which is the result of method lookup (5.4.4). Then:
If access control failed, interface method resolution fails for the same reason.
Otherwise, access control succeeded. Loading constraints are imposed, as follows.
Let <E, L1> be the class or interface in which the referenced interface method m is actually declared. Let L2 be the defining loader of D. Given that the return type of m is Tr, and that the formal parameter types of m are Tf1, ..., Tfn:
If Tr is not an array type, let T0 be Tr; otherwise, let T0 be the element type of Tr.
For i = 1 to n: If Tfi is not an array type, let Ti be Tfi; otherwise, let Ti be the element type of Tfi.
The Java Virtual Machine imposes the loading constraints $T_i^{L1} = T_i^{L2}$ for i = 0 to n.
For each class or interface name N mentioned by the descriptor of the referenced method (4.3.3), the Java Virtual Machine imposes the loading constraint $N^{L1} = N^{L2}$ (5.3.4).
If imposing these constraints results in any loading constraints being violated (5.3.4), then interface method resolution fails. Otherwise, interface method resolution succeeds.
Access control is necessary because interface method resolution may pick a private method of interface C. (Prior to Java SE 8, the result of interface method resolution could be a non-public method of class Object or a static method of class Object; such results were not consistent with the inheritance model of the Java programming language, and are disallowed in Java SE 8 and above.)
To resolve an unresolved symbolic reference to a method type, it is as if resolution occurs of unresolved symbolic references to classes and interfaces (5.4.3.1) whose names are mentioned by the method descriptor (4.3.3), in the order in which they are mentioned.
Any exception that can be thrown as a result of failure of resolution of a reference to a class or interface can thus be thrown as a result of failure of method type resolution.
The result of successful method type resolution is a reference to an instance of java.lang.invoke.MethodType which represents the method descriptor.
Method type resolution occurs regardless of whether the run-time constant pool actually contains symbolic references to classes and interfaces indicated in the method descriptor. Also, the resolution is deemed to occur on unresolved symbolic references, so a failure to resolve one method type will not necessarily lead to a later failure to resolve another method type with the same textual method descriptor, if suitable classes and interfaces can be loaded by the later time.
Resolution of an unresolved symbolic reference to a method handle is more complicated. Each method handle resolved by the Java Virtual Machine has an equivalent instruction sequence called its bytecode behavior, indicated by the method handle's kind. The integer values and descriptions of the nine kinds of method handle are given in Table 5.4.3.5-A.
Symbolic references by an instruction sequence to fields or methods are indicated by C.x:T, where x and T are the name and descriptor (4.3.2, 4.3.3) of the field or method, and C is the class or interface in which the field or method is to be found.
Table 5.4.3.5-A. Bytecode Behaviors for Method Handles
Kind | Description | Interpretation |
---|---|---|
1 | REF_getField | getfield C.f:T |
2 | REF_getStatic | getstatic C.f:T |
3 | REF_putField | putfield C.f:T |
4 | REF_putStatic | putstatic C.f:T |
5 | REF_invokeVirtual | invokevirtual C.m:(A*)T |
6 | REF_invokeStatic | invokestatic C.m:(A*)T |
7 | REF_invokeSpecial | invokespecial C.m:(A*)T |
8 | REF_newInvokeSpecial | new C; dup; invokespecial C.<init>:(A*)V |
9 | REF_invokeInterface | invokeinterface C.m:(A*)T |
Let MH be the symbolic reference to a method handle (5.1) being resolved. Also:
Let R be the symbolic reference to a field or method given by MH.
R is derived from the CONSTANT_Fieldref, CONSTANT_Methodref, or CONSTANT_InterfaceMethodref structure referred to by the reference_index item of the CONSTANT_MethodHandle from which MH is derived.
It was already established that a symbolic reference to a method handle gives a symbolic reference to a field or method (5.1). No need to be talking about class file constructs here.
For example, R is a symbolic reference to C.f for bytecode behavior of kind 1, and a symbolic reference to C.<init> for bytecode behavior of kind 8.
If MH's bytecode behavior is kind 7 (REF_invokeSpecial), then C must be the current class or interface, a superclass of the current class, a direct superinterface of the current class or interface, or Object.
Let C be the class or interface referenced by R.
Let T be the type given by the field descriptor of R, or the return type given by the method descriptor of R. Let A* be the sequence (perhaps empty) of parameter types given by the method descriptor of R.
T and A* are derived from the CONSTANT_NameAndType structure referred to by the name_and_type_index item in the CONSTANT_Fieldref, CONSTANT_Methodref, or CONSTANT_InterfaceMethodref structure from which R is derived.
To resolve MH, all symbolic references to classes, interfaces, fields, and methods in MH's bytecode behavior are resolved, using the following four steps:
R is resolved. This occurs as if by field resolution (5.4.3.2) when MH's bytecode behavior is kind 1, 2, 3, or 4, and as if by method resolution (5.4.3.3) when MH's bytecode behavior is kind 5, 6, 7, or 8, and as if by interface method resolution (5.4.3.4) when MH's bytecode behavior is kind 9.
The following constraints apply to the result of resolving R. These constraints correspond to those that would be enforced during verification or execution of the instruction sequence for the relevant bytecode behavior.
If MH's bytecode behavior is kind 7 (REF_invokeSpecial), then C must be the current class or interface, a superclass of the current class, a direct superinterface of the current class or interface, or Object.
Moved this rule here to be clear about the timing and effects of this error check.
If MH's bytecode behavior is kind 8 (REF_newInvokeSpecial), then R must resolve to an instance initialization method declared in class C.
If R resolves to a protected member, then the following rules apply depending on the kind of MH's bytecode behavior:
For kinds 1, 3, and 5 (REF_getField, REF_putField, and REF_invokeVirtual): If C.f or C.m resolved to a protected field or method, and C is in a different run-time package than the current class, then C must be assignable to the current class.
For kind 8 (REF_newInvokeSpecial): If C.<init> resolved to a protected method, then C must be declared in the same run-time package as the current class.
R must resolve to a static or non-static member depending on the kind of MH's bytecode behavior:
For kinds 1, 3, 5, 7, and 9 (REF_getField, REF_putField, REF_invokeVirtual, REF_invokeSpecial, and REF_invokeInterface): C.f or C.m must resolve to a non-static field or method.
For kinds 2, 4, and 6 (REF_getStatic, REF_putStatic, and REF_invokeStatic): C.f or C.m must resolve to a static field or method.
Resolution occurs as if of unresolved symbolic references to classes and interfaces whose names correspond to each type in A* , and to the type T, in that order.
This is phrased incorrectly: not all types correspond to class and interface names. It's also unnecessary: the next step will perform MethodType resolution which, as described above, resolves all the mentioned classes and interfaces.
A reference to an instance of java.lang.invoke.MethodType is obtained as if by resolution of an unresolved symbolic reference to a method type that contains the method descriptor specified in Table 5.4.3.5-B for the kind of MH.
It is as if the symbolic reference to a method handle contains a symbolic reference to the method type that the resolved method handle will eventually have. The detailed structure of the method type is obtained by inspecting Table 5.4.3.5-B.
Table 5.4.3.5-B. Method Descriptors for Method Handles
Kind | Description | Method descriptor |
---|---|---|
1 | REF_getField | (C)T |
2 | REF_getStatic | ()T |
3 | REF_putField | (C,T)V |
4 | REF_putStatic | (T)V |
5 | REF_invokeVirtual | (C,A*)T |
6 | REF_invokeStatic | (A*)T |
7 | REF_invokeSpecial | (C,A*)T |
8 | REF_newInvokeSpecial | (A*)C |
9 | REF_invokeInterface | (C,A*)T |
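The same descriptor shapes can be observed from Java code through the `java.lang.invoke.MethodHandles.Lookup` API, whose `find*` methods create handles with corresponding behaviors. The sketch below is an analogy rather than constant pool resolution itself; `HandleTypeSketch` and its `value` field are hypothetical, and the JDK members used (`String.length`, `Integer.parseInt`, the `StringBuilder(String)` constructor) are chosen only for illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class HandleTypeSketch {
    int value; // hypothetical instance field, used for the REF_getField row

    /** Kind 1, (C)T: a getter takes the receiver and returns the field type. */
    static MethodType getterType() {
        try {
            return MethodHandles.lookup()
                    .findGetter(HandleTypeSketch.class, "value", int.class).type();
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    /** Kind 5, (C,A*)T: the receiver is prepended to the parameter types. */
    static MethodType virtualType() {
        try {
            return MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class)).type();
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    /** Kind 6, (A*)T: a static method keeps its own descriptor. */
    static MethodType staticType() {
        try {
            return MethodHandles.lookup()
                    .findStatic(Integer.class, "parseInt",
                            MethodType.methodType(int.class, String.class)).type();
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    /** Kind 8, (A*)C: a constructor handle returns the new instance. */
    static MethodType constructorType() {
        try {
            return MethodHandles.lookup()
                    .findConstructor(StringBuilder.class,
                            MethodType.methodType(void.class, String.class)).type();
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }
}
```

Note how the constructor is looked up with a `void` return type, matching the `(A*)V` descriptor of `<init>`, while the resulting handle has type `(A*)C`.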
In steps 1 and 3, any exception that can be thrown as a result of failure of resolution of a symbolic reference to a class, interface, field, or method can be thrown as a result of failure of method handle resolution. In step 2, any failure due to the specified constraints causes a failure of method handle resolution due to an IllegalAccessError.
The intent is that resolving a method handle can be done in exactly the same circumstances that the Java Virtual Machine would successfully verify and resolve the symbolic references in the bytecode behavior. In particular, method handles to private, protected, and static members can be created in exactly those classes for which the corresponding normal accesses are legal.
The result of successful method handle resolution is a reference to an instance of java.lang.invoke.MethodHandle which represents the method handle MH.
The type descriptor of this java.lang.invoke.MethodHandle instance is the java.lang.invoke.MethodType instance produced in the third step of method handle resolution above.
The type descriptor of a method handle is such that a valid call to invokeExact in java.lang.invoke.MethodHandle on the method handle has exactly the same stack effects as the bytecode behavior. Calling this method handle on a valid set of arguments has exactly the same effect and returns the same result (if any) as the corresponding bytecode behavior.
If the method referenced by R has the ACC_VARARGS flag set (4.6), then the java.lang.invoke.MethodHandle instance is a variable arity method handle; otherwise, it is a fixed arity method handle.
A variable arity method handle performs argument list boxing (JLS §15.12.4.2) when invoked via invoke, while its behavior with respect to invokeExact is as if the ACC_VARARGS flag were not set.
Method handle resolution throws an IncompatibleClassChangeError if the method referenced by R has the ACC_VARARGS flag set and either A* is an empty sequence or the last parameter type in A* is not an array type. That is, creation of a variable arity method handle fails.
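The difference between invoke and invokeExact on a variable arity method handle can be seen with a handle to `java.util.Arrays.asList`, a JDK method whose ACC_VARARGS flag is set. This sketch obtains the handle through the `Lookup` API rather than through constant pool resolution; `VarargsHandleSketch` and its method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Arrays;
import java.util.List;

final class VarargsHandleSketch {
    /** A handle to the variable arity method Arrays.asList(Object...). */
    static MethodHandle asListHandle() {
        try {
            return MethodHandles.lookup().findStatic(
                    Arrays.class, "asList", MethodType.methodType(List.class, Object[].class));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    /** invoke: trailing arguments are collected into the array parameter. */
    static Object viaInvoke() {
        try {
            return (Object) asListHandle().invoke("a", "b");
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    /** invokeExact: the call site must match (Object[])List exactly, as for fixed arity. */
    static List<?> viaInvokeExact() {
        try {
            Object[] packed = {"a", "b"};
            return (List<?>) asListHandle().invokeExact(packed);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

Both calls yield a two-element list; only the first relies on the handle being variable arity, which `MethodHandle.isVarargsCollector()` reports.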
An implementation of the Java Virtual Machine is not required to intern method types or method handles. That is, two distinct symbolic references to method types or method handles which are structurally identical might not resolve to the same instance of java.lang.invoke.MethodType or java.lang.invoke.MethodHandle respectively.
The java.lang.invoke.MethodHandles class in the Java SE Platform API allows creation of method handles with no bytecode behavior. Their behavior is defined by the method of java.lang.invoke.MethodHandles that creates them. For example, a method handle may, when invoked, first apply transformations to its argument values, then supply the transformed values to the invocation of another method handle, then apply a transformation to the value returned from that invocation, then return the transformed value as its own result.
To resolve an unresolved symbolic reference R to a dynamically-computed constant or call site, there are three tasks. First, R is examined to determine which code will serve as its bootstrap method, and which arguments will be passed to that code. Second, the arguments are packaged into an array and the bootstrap method is invoked. Third, the result of the bootstrap method is validated, and used as the result of resolution.
The first task involves the following steps:
R gives a symbolic reference to a bootstrap method handle. The bootstrap method handle is resolved (5.4.3.5) to obtain a reference to an instance of java.lang.invoke.MethodHandle.
Any exception that can be thrown as a result of failure of resolution of a symbolic reference to a method handle can be thrown in this step.
If R is a symbolic reference to a dynamically-computed constant, then let D be the type descriptor of the bootstrap method handle. (That is, D is a reference to an instance of java.lang.invoke.MethodType.) The first parameter type indicated by D must be java.lang.invoke.MethodHandles.Lookup, or else resolution fails with a BootstrapMethodError. For historical reasons, the bootstrap method handle for a dynamically-computed call site is not similarly constrained.
If R is a symbolic reference to a dynamically-computed constant, then it gives a field descriptor.
If the field descriptor indicates a primitive type, then a reference to the pre-defined Class object representing that type is obtained (see the method isPrimitive in class Class).
Otherwise, the field descriptor indicates a class or interface type, or an array type. A reference to the Class object representing the type indicated by the field descriptor is obtained, as if by resolution of an unresolved symbolic reference to a class, interface, or array type (5.4.3.1) whose name or descriptor corresponds to the type indicated by the field descriptor.
Any exception that can be thrown as a result of failure of resolution of a symbolic reference to a class, interface, or array type can be thrown in this step.
If R is a symbolic reference to a dynamically-computed call site, then it gives a method descriptor.
A reference to an instance of java.lang.invoke.MethodType is obtained, as if by resolution of an unresolved symbolic reference to a method type (5.4.3.5) with the same parameter and return types as the method descriptor.
Any exception that can be thrown as a result of failure of resolution of a symbolic reference to a method type can be thrown in this step.
R gives zero or more static arguments, which communicate application-specific metadata to the bootstrap method. Each static argument A is resolved, in the order given by R, as follows:
If A is a string constant, then a reference to its instance of class String is obtained.
If A is a numeric constant, then a reference to an object representing the number is obtained by the following procedure:
Let v be the value of the numeric constant, and let T be a field descriptor which corresponds to the type of the numeric constant.
Let MH be a method handle produced as if by invocation of the identity method of java.lang.invoke.MethodHandles with an argument representing the class Object.
A reference to an object is obtained as if by the invocation MH.invoke(v) with method descriptor (T)Ljava/lang/Object;.
This is a bug fix: the value of a static argument representing a number is a boxed number, not a MethodHandle!
If A is a symbolic reference to a dynamically-computed constant with a field descriptor indicating a primitive type T, then A is resolved, producing a primitive value v. Given v and T, a reference is obtained to an object encoding v according to the procedure specified above for numeric constants.
If A is any other kind of symbolic reference, then the result is the result of resolving A.
Among the symbolic references in the run-time constant pool, symbolic references to dynamically-computed constants are special because they are derived from constant_pool entries that can syntactically refer to themselves via the BootstrapMethods attribute (4.7.23). However, the Java Virtual Machine does not support resolving a symbolic reference to a dynamically-computed constant that depends on itself (that is, as a static argument to its own bootstrap method). Accordingly, when both R and A are symbolic references to dynamically-computed constants, if A is the same as R or A gives a static argument that (directly or indirectly) references R, then resolution fails with a StackOverflowError at the point where re-resolution of R would be required.
Unlike class initialization (5.5), where cycles are allowed between uninitialized classes, resolution does not allow cycles in symbolic references to dynamically-computed constants. If an implementation of resolution makes recursive use of a stack, then a StackOverflowError will occur naturally. If not, the implementation is required to detect the cycle rather than, say, looping infinitely or returning a default value for the dynamically-computed constant.
A similar cycle may arise if the body of a bootstrap method makes reference to a dynamically-computed constant currently being resolved. This has always been possible for invokedynamic bootstraps, and does not require special treatment in resolution; the recursive invokeWithArguments calls will naturally lead to a StackOverflowError.
Any exception that can be thrown as a result of failure of resolution of a symbolic reference can be thrown in this step.
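The numeric-constant procedure in the steps above can be reproduced directly with the Java SE API: an identity handle of type `(Object)Object`, invoked inexactly with a primitive argument, boxes that argument. A minimal sketch, in which `BoxNumericSketch` and its `box` overloads are hypothetical names:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

final class BoxNumericSketch {
    // MH from the procedure: identity on Object, type (Object)Object
    private static final MethodHandle MH = MethodHandles.identity(Object.class);

    /** MH.invoke(v) with call-site descriptor (I)Ljava/lang/Object; */
    static Object box(int v) {
        try {
            return (Object) MH.invoke(v);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    /** MH.invoke(v) with call-site descriptor (J)Ljava/lang/Object; */
    static Object box(long v) {
        try {
            return (Object) MH.invoke(v);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    /** MH.invoke(v) with call-site descriptor (D)Ljava/lang/Object; */
    static Object box(double v) {
        try {
            return (Object) MH.invoke(v);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```

The result is an `Integer`, `Long`, or `Double` holding the value, which is what the bootstrap method receives as a static argument.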
The second task, to invoke the bootstrap method handle, involves the following steps:
An array is allocated with component type Object and length n+3, where n is the number of static arguments given by R (n ≥ 0).
The zeroth component of the array is set to a reference to an instance of java.lang.invoke.MethodHandles.Lookup for the class in which R occurs, produced as if by invocation of the lookup method of java.lang.invoke.MethodHandles.
The first component of the array is set to a reference to an instance of String that denotes N, the unqualified name given by R.
The second component of the array is set to the reference to an instance of Class or java.lang.invoke.MethodType that was obtained earlier for the field descriptor or method descriptor given by R.
Subsequent components of the array are set to the references that were obtained earlier from resolving R's static arguments, if any. The references appear in the array in the same order as the corresponding static arguments are given by R.
A Java Virtual Machine implementation may be able to skip allocation of the array and, without any change in observable behavior, pass the arguments directly to the bootstrap method.
The bootstrap method handle is invoked, as if by the invocation BMH.invokeWithArguments(args), where BMH is the bootstrap method handle and args is the array allocated above.
Due to the behavior of the invokeWithArguments method of java.lang.invoke.MethodHandle, the type descriptor of the bootstrap method handle need not exactly match the run-time types of the arguments. For example, the second parameter type of the bootstrap method handle (corresponding to the unqualified name given in the first component of the array above) could be Object instead of String. If the bootstrap method handle is variable arity, then some or all of the arguments may be collected into a trailing array parameter.
The invocation occurs within a thread that is attempting resolution of this symbolic reference. If there are several such threads, the bootstrap method handle may be invoked concurrently. Bootstrap methods which access global application data should take the usual precautions against race conditions.
If the invocation fails by throwing an instance of Error
or a subclass of Error
, resolution fails with that exception.
If the invocation fails by throwing an exception that is not an instance of Error
or a subclass of Error
, resolution fails with a BootstrapMethodError
whose cause is the thrown exception.
If several threads concurrently invoke the bootstrap method handle for this symbolic reference, the Java Virtual Machine chooses the result of one invocation and installs it visibly to all threads. Any other bootstrap methods executing for this symbolic reference are allowed to complete, but their results are ignored.
The third task, to validate the reference
, o, produced by invocation of the bootstrap method handle, is as follows:
If R is a symbolic reference to a dynamically-computed constant, then o is converted to type T, the type indicated by the field descriptor given by R.
o's conversion occurs as if by the invocation MH.invoke(o)
with method descriptor (Ljava/lang/Object;)T
, where MH is a method handle produced as if by invocation of the identity
method of java.lang.invoke.MethodHandles
with an argument representing the class Object
.
The result of o's conversion is the result of resolution.
If the conversion fails by throwing a NullPointerException
or a ClassCastException
, resolution fails with a BootstrapMethodError
.
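The conversion of o to type T can be approximated in Java. The sketch below assumes that invoking the identity handle with descriptor (Ljava/lang/Object;)T behaves like adapting it with asType and then invoking it; the class and method names are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ConstantConversionSketch {
    // Converts o to type t via MethodHandles.identity(Object.class),
    // adapted to the method type (Object)T.
    static Object convert(Object o, Class<?> t) {
        MethodHandle mh = MethodHandles.identity(Object.class)
                .asType(MethodType.methodType(t, Object.class));
        try {
            return mh.invoke(o);
        } catch (NullPointerException | ClassCastException e) {
            // Conversion failure is reported as a BootstrapMethodError.
            throw new BootstrapMethodError(e);
        } catch (Throwable e) {
            throw new IllegalStateException(e);   // not expected in this sketch
        }
    }
}
```

For example, convert(42, int.class) unboxes the Integer, while convert("x", Integer.class) fails with a BootstrapMethodError whose cause is a ClassCastException.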
If R is a symbolic reference to a dynamically-computed call site, then o is the result of resolution if it has all of the following properties:
o is not null
.
o is an instance of java.lang.invoke.CallSite
or a subclass of java.lang.invoke.CallSite
.
The type of the java.lang.invoke.CallSite
is semantically equal to the method descriptor given by R.
If o does not have these properties, resolution fails with a BootstrapMethodError
.
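The three checks on a call site result can be written out as a small Java sketch. The names are hypothetical, and MethodType.equals is used here only as a stand-in for "semantically equal", which is an assumption rather than something the specification states.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class CallSiteValidationSketch {
    // Returns o as a CallSite if it has all three properties, else fails.
    static CallSite validate(Object o, MethodType expected) {
        if (o == null) {
            throw new BootstrapMethodError("bootstrap method returned null");
        }
        if (!(o instanceof CallSite)) {
            throw new BootstrapMethodError("not a CallSite: " + o.getClass().getName());
        }
        CallSite site = (CallSite) o;
        if (!site.type().equals(expected)) {
            throw new BootstrapMethodError("call site type " + site.type()
                    + " does not match " + expected);
        }
        return site;
    }

    // Helper for experimentation: a call site of type ()String.
    static CallSite constantSite() {
        return new ConstantCallSite(MethodHandles.constant(String.class, "x"));
    }
}
```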
Many of the steps above perform computations "as if by invocation" of certain methods. In each case, the invocation behavior is given in detail by the specifications for invokestatic and invokevirtual. The invocation occurs in the thread and from the class that is attempting resolution of the symbolic reference R. However, no corresponding method references are required to appear in the run-time constant pool, no particular method's operand stack is necessarily used, and the value of the max_stack
item of any method's Code
attribute is not enforced for the invocation.
If several threads attempt resolution of R at the same time, the bootstrap method may be invoked concurrently. Therefore, bootstrap methods which access global application data must take precautions against race conditions.
This was stated already, at the end of step 2.
Initialization of a class or interface consists of executing its class or interface initialization method (2.9.2).
A class or interface C may be initialized only as a result of:
The execution of any one of the Java Virtual Machine instructions new, getstatic, putstatic, or invokestatic that references C (6.5.new, 6.5.getstatic, 6.5.putstatic, 6.5.invokestatic).
Upon execution of a new instruction, the class to be initialized is the class referenced by the instruction.
Upon execution of a getstatic, putstatic, or invokestatic instruction, the class or interface to be initialized is the class or interface that declares the resolved field or method.
The first invocation of a java.lang.invoke.MethodHandle
instance which was the result of method handle resolution (5.4.3.5) for a method handle of kind 2 (REF_getStatic
), 4 (REF_putStatic
), 6 (REF_invokeStatic
), or 8 (REF_newInvokeSpecial
).
This implies that the class of a bootstrap method is initialized when the bootstrap method is invoked for an invokedynamic instruction (6.5.invokedynamic), as part of the continuing resolution of the call site specifier.
Invocation of certain reflective methods in the class library (2.12), for example, in class Class
or in package java.lang.reflect
.
If C is a class, the initialization of one of its subclasses.
If C is an interface that declares a non-abstract
, non-static
method, the initialization of a class that implements C directly or indirectly.
Its designation as the initial class or interface at Java Virtual Machine startup (5.2).
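Several of these triggers can be observed from Java source. The following sketch uses hypothetical classes and records the order in which initializers run when a subclass instance is created. It assumes the compiler emits a new instruction for the instance creation expression.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderSketch {
    static final List<String> LOG = new ArrayList<>();

    static int note(String s) { LOG.add(s); return 0; }

    interface WithDefault {
        int MARK = note("WithDefault");   // runs when the interface is initialized
        default void hello() {}           // a non-abstract, non-static method
    }

    interface NoDefault {
        int MARK = note("NoDefault");     // never runs in this sketch
    }

    static class Parent {
        static { note("Parent"); }
    }

    static class Child extends Parent implements WithDefault, NoDefault {
        static { note("Child"); }
    }

    // Creating a Child initializes Parent, then WithDefault, then Child.
    static List<String> run() {
        new Child();
        return new ArrayList<>(LOG);
    }
}
```

NoDefault declares no non-abstract, non-static method, so initializing Child does not initialize it.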
Prior to initialization, a class or interface must be linked, that is, verified, prepared, and optionally resolved.
Because the Java Virtual Machine is multithreaded, initialization of a class or interface requires careful synchronization, since some other thread may be trying to initialize the same class or interface at the same time. There is also the possibility that initialization of a class or interface may be requested recursively as part of the initialization of that class or interface. The implementation of the Java Virtual Machine is responsible for taking care of synchronization and recursive initialization by using the following procedure. It assumes that the class or interface has already been verified and prepared, and that the class or interface contains state that indicates one of four situations:
This class or interface is verified and prepared but not initialized.
This class or interface is being initialized by some particular thread.
This class or interface is fully initialized and ready for use.
This class or interface is in an erroneous state, perhaps because initialization was attempted and failed.
Here and below, we eliminate the unnecessary assertion that the initialization state of the class is stored by an instance of java.lang.Class
. The specification need not concern itself with how classes are internally represented and how this representation relates to instances of java.lang.Class
.
For each class or interface C, there is a unique initialization lock LC. The mapping from C to LC is left to the discretion of the Java Virtual Machine implementation. For example, LC could be the Class
object for C, or the monitor associated with that Class
object. The procedure for initializing C is then as follows:
Synchronize on the initialization lock, LC, for C. This involves waiting until the current thread can acquire LC.
If C indicates that initialization is in progress for C by some other thread, then release LC and block the current thread until informed that the in-progress initialization has completed, at which time repeat this procedure.
Thread interrupt status is unaffected by execution of the initialization procedure.
If C indicates that initialization is in progress for C by the current thread, then this must be a recursive request for initialization. Release LC and complete normally.
If C indicates that it has already been initialized, then no further action is required. Release LC and complete normally.
If C is in an erroneous state, then initialization is not possible. Release LC and throw a NoClassDefFoundError.
Otherwise, record the fact that initialization of C is in progress by the current thread, and release LC.
Then, initialize each final
static
field of C with the constant value in its ConstantValue
attribute (4.7.2), in the order the fields appear in the ClassFile
structure.
Next, if C is a class rather than an interface, then let SC be its superclass and let SI1, ..., SIn be all superinterfaces of C (whether direct or indirect) that declare at least one non-abstract
, non-static
method. The order of superinterfaces is given by a recursive enumeration over the superinterface hierarchy of each interface directly implemented by C. For each interface I directly implemented by C (in the order of the interfaces
array of C), the enumeration recurs on I's superinterfaces (in the order of the interfaces
array of I) before returning I.
For each S in the list [ SC, SI1, ..., SIn ], if S has not yet been initialized, then recursively perform this entire procedure for S. If necessary, verify and prepare S first.
If the initialization of S completes abruptly because of a thrown exception, then acquire LC, label C as erroneous, notify all waiting threads, release LC, and complete abruptly, throwing the same exception that resulted from initializing S.
Next, determine whether assertions are enabled for C by querying its defining class loader.
Next, execute the class or interface initialization method of C.
If the execution of the class or interface initialization method completes normally, then acquire LC, label C as fully initialized, notify all waiting threads, release LC, and complete this procedure normally.
Otherwise, the class or interface initialization method must have completed abruptly by throwing some exception E. If the class of E is not Error
or one of its subclasses, then create a new instance of the class ExceptionInInitializerError
with E as the argument, and use this object in place of E in the following step. If a new instance of ExceptionInInitializerError
cannot be created because an OutOfMemoryError
occurs, then use an OutOfMemoryError
object in place of E in the following step.
Acquire LC, label C as erroneous, notify all waiting threads, release LC, and complete this procedure abruptly with reason E or its replacement as determined in the previous step.
A Java Virtual Machine implementation may optimize this procedure by eliding the lock acquisition in step 1 (and release in step 4/5) when it can determine that the initialization of the class has already completed, provided that, in terms of the Java memory model, all happens-before orderings (JLS §17.4.5) that would exist if the lock were acquired, still exist when the optimization is performed.
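The locking and state transitions of the procedure can be modeled with a small Java class. This is a simplified sketch, not how any JVM implements it: it omits ConstantValue fields, superclass and superinterface initialization, and the assertion query, and it models the initialization method as a Runnable. All names are hypothetical.

```java
class InitProcedureSketch {
    enum State { PREPARED, IN_PROGRESS, INITIALIZED, ERRONEOUS }

    private final Object lock = new Object();  // stands in for the initialization lock LC
    private final Runnable initializer;        // stands in for the initialization method
    private State state = State.PREPARED;
    private Thread owner;                      // thread performing the initialization

    InitProcedureSketch(Runnable initializer) { this.initializer = initializer; }

    void initialize() {
        Thread current = Thread.currentThread();
        boolean interrupted = false;
        synchronized (lock) {
            // In progress by another thread: wait (which releases the lock), then re-check.
            while (state == State.IN_PROGRESS && owner != current) {
                try { lock.wait(); } catch (InterruptedException e) { interrupted = true; }
            }
            if (interrupted) current.interrupt();    // keep interrupt status unaffected
            if (state == State.IN_PROGRESS) return;  // recursive request by this thread
            if (state == State.INITIALIZED) return;  // nothing further to do
            if (state == State.ERRONEOUS) {
                throw new NoClassDefFoundError("initialization failed earlier");
            }
            state = State.IN_PROGRESS;
            owner = current;
        }
        Error failure = null;
        try {
            initializer.run();                       // executed without holding the lock
        } catch (Error e) {
            failure = e;
        } catch (RuntimeException e) {
            failure = new ExceptionInInitializerError(e);
        }
        synchronized (lock) {
            state = (failure == null) ? State.INITIALIZED : State.ERRONEOUS;
            owner = null;
            lock.notifyAll();                        // notify all waiting threads
        }
        if (failure != null) throw failure;
    }

    State state() { synchronized (lock) { return state; } }
}
```

A second call to initialize() after a failure throws NoClassDefFoundError, matching the erroneous-state step, while a recursive call from within the initializer returns immediately.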
Store into reference
array
aastore
aastore = 83 (0x53)
..., arrayref, index, value →
...
The arrayref must be of type reference
and must refer to an array whose components are of type reference
. The index must be of type int
, and value must be of type reference
. The arrayref, index, and value are popped from the operand stack.
If value is null
, then value is stored as the component of the array at index.
Otherwise, value is non-null
. If the type of value is assignment compatible with the type of the components of the array referenced by arrayref, then value is stored as the component of the array at index. If value is a value of the component type of the array referenced by arrayref, then value is stored as the component of the array at index.
The following rules are used to determine whether a value that is not null
is assignment compatible with the array component type. If S is the type of the object referred to by value, and T is the reference type of the array components, then aastore determines whether assignment is compatible as follows:
If S is a class type, then:
If T is a class type, then S must be the same class as T, or S must be a subclass of T;
If T is an interface type, then S must implement interface T.
If S is an array type SC[]
, that is, an array of components of type SC, then:
If T is a class type, then T must be Object
.
If T is an interface type, then T must be one of the interfaces implemented by arrays (JLS §4.10.3).
If T is an array type TC[]
, that is, an array of components of type TC, then one of the following must be true:
TC and SC are the same primitive type.
TC and SC are reference types, and type SC is assignable to TC by these run-time rules.
Whether value is a value of the array component type is determined according to the rules given for [checkcast][6.5.checkcast].
Appealing to "assignment compatible" is a roundabout way to say what we really mean—value must be a value of the array's component type.
aastore, checkcast, and instanceof use the same rules to interpret types. It's helpful to consolidate those rules in one place, so that readers can clearly see that the rules are the same, and so that future enhancements to the type system have fewer rules to maintain.
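The store check, together with the run-time exceptions listed for this instruction, can be observed from Java source, assuming the compiler emits aastore for an assignment to an Object[] component. The helper below is hypothetical and simply reports what happened.

```java
class AastoreSketch {
    // Returns "stored", or the simple name of the exception thrown by the store.
    static String store(Object[] array, int index, Object value) {
        try {
            array[index] = value;   // typically compiled to aastore
            return "stored";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

With a String[] passed as the array, storing a String or null succeeds, while storing an Integer reports ArrayStoreException because an Integer is not a value of the component type String.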
If arrayref is null
, aastore throws a NullPointerException
.
Otherwise, if index is not within the bounds of the array referenced by arrayref, the aastore instruction throws an ArrayIndexOutOfBoundsException
.
Otherwise, if arrayref is not null and the actual type of the non-null value is not assignment compatible with the actual type of the components of the array a value of the array component type, aastore throws an ArrayStoreException.
"Otherwise" here implies "arrayref is not null
".
Create new array of reference
anewarray
indexbyte1
indexbyte2
anewarray = 189 (0xbd)
..., count →
..., arrayref
The count must be of type int
. It is popped off the operand stack. The count represents the number of components of the array to be created. The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 <<
8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class, array, or interface type interface, or array type. The named class, array, or interface type interface, or array type is resolved (5.4.3.1). A new array with components of that type component type given by the resolved class, interface, or array type, of length count, is allocated from the garbage-collected heap, and a reference
arrayref to this new array object is pushed onto the operand stack. All components of the new array are initialized to null
, the default value for reference
types (2.4).
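The index construction used by this and the following instructions can be written as a short Java sketch. Because Java bytes are signed, each operand is masked to its unsigned value first; the class and method names are hypothetical.

```java
class ConstantPoolIndexSketch {
    // Combines two unsigned operand bytes into a 16-bit constant pool index.
    static int index(byte indexbyte1, byte indexbyte2) {
        return ((indexbyte1 & 0xFF) << 8) | (indexbyte2 & 0xFF);
    }
}
```

For example, the operand bytes 0x01 and 0x02 give index 258, and 0xFF and 0xFF give 65535.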
During resolution of the symbolic reference to the class, array, or interface type interface, or array type, any of the exceptions documented in 5.4.3.1 can be thrown.
Otherwise, if count is less than zero, the anewarray instruction throws a NegativeArraySizeException
.
The anewarray instruction is used to create a single dimension of an array of object references or part of a multidimensional array.
Check whether object is of given type
checkcast
indexbyte1
indexbyte2
checkcast = 192 (0xc0)
..., objectref →
..., objectref
The objectref must be of type reference
. The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 <<
8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class, array, or interface type interface, or array type.
If objectref is null
, then the operand stack is unchanged.
Otherwise, the named class, array, or interface type interface, or array type is resolved (5.4.3.1). If objectref can be cast to the resolved class, array, or interface type is a value of the type given by the resolved class, interface, or array type, the operand stack is unchanged; otherwise, the checkcast instruction throws a ClassCastException.
The following rules are used to determine whether an objectref that is not null can be cast to the resolved type a reference to an object is a value of a reference type, T. If S is the type of the object referred to by objectref, and T is the resolved class, array, or interface type, then checkcast determines whether objectref can be cast to type T as follows:
If S is a class type the reference is to an instance of a class C, then:
If T is a class type, then S must be the same class as T, or S must be a subclass of T;
If T is the type of a class D, then the reference is a value of T if C is D or a subclass of D.
If T is an interface type, then S must implement interface T.
If T is the type of an interface I, then the reference is a value of T if C implements I.
If S is an array type SC[], that is, an array of components of type SC the reference is to an array with component type SC, then:
If T is a class type, then T must be Object the reference is a value of T if T is Object.
If T is an interface type, then T must be one of the interfaces implemented by arrays (JLS §4.10.3) the reference is a value of T if T is Cloneable or java.io.Serializable (as loaded by the bootstrap class loader).
It's unnecessary and especially risky to tie JVMS to the Java Language Specification here—we certainly don't want language changes to accidentally impact the routine behavior of JVM instructions.
If T is an array type TC[]
, that is, an array of components of type TC, then one of the following must be true the reference is a value of T if one of the following are true:
TC and SC are the same primitive type.
TC and SC are reference types, and type SC can be cast to TC by recursive application of these rules.
Bug fix: an earlier cleanup of these rules (JDK-8069130) removed cases to handle an interface type S. These cases appeared vacuous at the top level, but were necessary to support a recursive analysis for array types. Rather than restoring the old rules, it's probably easier to follow if the recursion is contained within the array type discussion.
Further, recursion to the top level is no longer a good fit, because the rules are expressed in terms of a specific reference, not types.
TC is the class type Object
.
TC is a class type, SC is a class type, and the class of SC is a subclass of the class of TC.
TC is an interface type, SC is a class type, and the class of SC implements the interface of TC.
TC is an interface type, SC is an interface type, and the interface of SC extends the interface of TC.
TC is the interface type Cloneable
or java.io.Serializable
(as loaded by the bootstrap class loader), and SC is an array type.
TC is an array type TCC[]
, SC is an array type SCC[]
, and one of these tests of array component types apply recursively to TCC and SCC.
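The revised rules, with the recursion confined to array component types, can be approximated using java.lang.Class reflection. This sketch is an assumption-laden model rather than the specification's algorithm: Class objects stand in for types, and Class.isAssignableFrom stands in for the subclass, implements, and extends tests (it also accepts identical types). The method names are hypothetical.

```java
import java.io.Serializable;

class ValueOfTypeSketch {
    // Is the non-null reference ref a value of the reference type t?
    static boolean isValueOf(Object ref, Class<?> t) {
        Class<?> c = ref.getClass();
        if (!c.isArray()) {
            // Instance of a class C: T must be C, a superclass of C,
            // or an interface that C implements.
            return !t.isArray() && t.isAssignableFrom(c);
        }
        if (!t.isArray()) {
            // Array reference, non-array T: only Object, Cloneable, Serializable.
            return t == Object.class || t == Cloneable.class || t == Serializable.class;
        }
        return componentTest(t.getComponentType(), c.getComponentType());
    }

    // Tests of array component types TC and SC, applied recursively.
    static boolean componentTest(Class<?> tc, Class<?> sc) {
        if (tc.isPrimitive() || sc.isPrimitive()) return tc == sc;
        if (tc == Object.class) return true;
        if (tc.isArray()) {
            return sc.isArray()
                    && componentTest(tc.getComponentType(), sc.getComponentType());
        }
        if (sc.isArray()) return tc == Cloneable.class || tc == Serializable.class;
        return tc.isAssignableFrom(sc);   // class/interface subtyping
    }
}
```

For the cases exercised here, the result agrees with Class.isInstance, which offers a convenient cross-check when experimenting.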
During resolution of the symbolic reference to the class, array, or interface type interface, or array type, any of the exceptions documented in 5.4.3.1 can be thrown.
Otherwise, if objectref cannot be cast to the resolved class, array, or interface type is not null and is not a value of the type given by the resolved class, interface, or array type, the checkcast instruction throws a ClassCastException
.
The checkcast instruction is very similar to the instanceof instruction ([6.5.instanceof]). It differs in its treatment of null
, its behavior when its test fails (checkcast throws an exception, instanceof pushes a result code), and its effect on the operand stack.
Determine if object is of given type
instanceof
indexbyte1
indexbyte2
instanceof = 193 (0xc1)
..., objectref →
..., result
The objectref, which must be of type reference
, is popped from the operand stack. The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 <<
8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class, array, or interface type interface, or array type.
If objectref is null, the instanceof instruction pushes an int result of 0 as an int onto the operand stack.
Otherwise, the named class, array, or interface type interface, or array type is resolved (5.4.3.1). If objectref is an instance of a value of the type given by the resolved class, interface, or array type, or implements the resolved interface, the instanceof instruction pushes an int result of 1 as an int onto the operand stack; otherwise, it pushes an int result of 0.
The following rules are used to determine whether an objectref that is not null
is an instance of the resolved type. If S is the type of the object referred to by objectref, and T is the resolved class, array, or interface type, then instanceof determines whether objectref is an instance of T as follows:
If S is a class type, then:
If T is a class type, then S must be the same class as T, or S must be a subclass of T;
If T is an interface type, then S must implement interface T.
If S is an array type SC[]
, that is, an array of components of type SC, then:
If T is a class type, then T must be Object
.
If T is an interface type, then T must be one of the interfaces implemented by arrays (JLS §4.10.3).
If T is an array type TC[]
, that is, an array of components of type TC, then one of the following must be true:
TC and SC are the same primitive type.
TC and SC are reference types, and type SC can be cast to TC by these run-time rules.
Whether objectref is a value of the type given by the resolved class, interface, or array type is determined according to the rules given for [checkcast][6.5.checkcast].
aastore, checkcast, and instanceof use the same rules to interpret types. It's helpful to consolidate those rules in one place, so that readers can clearly see that the rules are the same, and so that future enhancements to the type system have fewer rules to maintain.
During resolution of the symbolic reference to the class, array, or interface type interface, or array type, any of the exceptions documented in 5.4.3.1 can be thrown.
The instanceof instruction is very similar to the checkcast instruction ([6.5.checkcast]). It differs in its treatment of null
, its behavior when its test fails (checkcast throws an exception, instanceof pushes a result code), and its effect on the operand stack.
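The difference in the treatment of null and of a failed test can be seen from Java source, assuming the compiler emits instanceof and checkcast for the two expressions below. The class and method names are hypothetical.

```java
class NullTypeTestSketch {
    // Mirrors instanceof: yields 1 or 0, and 0 for null.
    static int instanceOfResult(Object objectref) {
        return (objectref instanceof String) ? 1 : 0;
    }

    // Mirrors checkcast: null passes through, a failed test throws.
    static Object checkCast(Object objectref) {
        return (String) objectref;
    }
}
```

Here instanceOfResult(null) yields 0 while checkCast(null) returns null without an exception; passing an Integer yields 0 from the first method and a ClassCastException from the second.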
Push item from run-time constant pool
ldc
index
ldc = 18 (0x12)
... →
..., value
The index is an unsigned byte that must be a valid index into the run-time constant pool of the current class (2.5.5). The run-time constant pool entry at index must be loadable (5.1), and not any of the following:
A numeric constant of type long
or double
.
A symbolic reference to a dynamically-computed constant whose field descriptor is J
(denoting long
) or D
(denoting double
).
If the run-time constant pool entry is a numeric constant of type int
or float
, then the value of that numeric constant is pushed onto the operand stack as an int
or float
, respectively.
Otherwise, if the run-time constant pool entry is a string constant, that is, a reference
to an instance of class String
, then value, a reference
to that instance, is pushed onto the operand stack.
Otherwise, if the run-time constant pool entry is a symbolic reference to a class, or interface, or array type, then the named class or interface symbolic reference is resolved (5.4.3.1) and value, a reference
to the Class
object representing that class, or interface, or array type, is pushed onto the operand stack.
Otherwise, the run-time constant pool entry is a symbolic reference to a method type, a method handle, or a dynamically-computed constant. The symbolic reference is resolved (5.4.3.5, 5.4.3.6) and value, the result of resolution, is pushed onto the operand stack.
During resolution of a symbolic reference, any of the exceptions pertaining to resolution of that kind of symbolic reference can be thrown.
The ldc instruction can only be used to push a value of type float
taken from the float value set (2.3.2) because a constant of type float
in the constant pool (4.4.4) must be taken from the float value set.
Push item from run-time constant pool (wide index)
ldc_w
indexbyte1
indexbyte2
ldc_w = 19 (0x13)
... →
..., value
The unsigned indexbyte1 and indexbyte2 are assembled into an unsigned 16-bit index into the run-time constant pool of the current class (2.5.5), where the value of the index is calculated as (indexbyte1 <<
8) | indexbyte2. The index must be a valid index into the run-time constant pool of the current class. The run-time constant pool entry at the index must be loadable (5.1), and not any of the following:
A numeric constant of type long
or double
.
A symbolic reference to a dynamically-computed constant whose field descriptor is J
(denoting long
) or D
(denoting double
).
If the run-time constant pool entry is a numeric constant of type int
or float
, or a string constant, then value is determined and pushed onto the operand stack according to the rules given for the ldc instruction.
Otherwise, the run-time constant pool entry is a symbolic reference to a class, interface, array type, method type, method handle, or dynamically-computed constant. It is resolved and value is determined and pushed onto the operand stack according to the rules given for the ldc instruction.
During resolution of a symbolic reference, any of the exceptions pertaining to resolution of that kind of symbolic reference can be thrown.
The ldc_w instruction is identical to the ldc instruction (6.5.ldc) except for its wider run-time constant pool index.
The ldc_w instruction can only be used to push a value of type float
taken from the float value set (2.3.2) because a constant of type float
in the constant pool (4.4.4) must be taken from the float value set.
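The operand-width difference between ldc and ldc_w suggests how a bytecode generator might choose between them. The helper below is a hypothetical sketch; it also assumes the ldc2_w instruction, which is not described in this section, for the long and double constants that ldc and ldc_w exclude.

```java
class LdcSelectionSketch {
    // Picks an instruction name for loading the constant at the given index.
    // longOrDouble is true for long/double constants (including dynamically
    // computed constants with descriptor J or D).
    static String select(int index, boolean longOrDouble) {
        if (index < 1 || index > 0xFFFF) {
            throw new IllegalArgumentException("not a 16-bit constant pool index: " + index);
        }
        if (longOrDouble) return "ldc2_w";      // assumption: outside ldc and ldc_w
        return index <= 0xFF ? "ldc" : "ldc_w"; // ldc takes a single unsigned byte
    }
}
```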
Create new multidimensional array
multianewarray
indexbyte1
indexbyte2
dimensions
multianewarray = 197 (0xc5)
..., count1, [count2, ...] →
..., arrayref
The dimensions operand is an unsigned byte that must be greater than or equal to 1. It represents the number of dimensions of the array to be created. The operand stack must contain dimensions values. Each such value represents the number of components in a dimension of the array to be created, must be of type int
, and must be non-negative. The count1 is the desired length in the first dimension, count2 in the second, etc.
All of the count values are popped off the operand stack. The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 <<
8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class, array, or interface type an array type. The named class, array, or interface type referenced array type is resolved (5.4.3.1). The resulting entry must be an array class type of dimensionality greater than or equal to dimensions.
Verification has ensured that the referenced CONSTANT_Class_info
denotes an array type.
A new multidimensional array of the array type is allocated from the garbage-collected heap. If any count value is zero, no subsequent dimensions are allocated. The components of the array in the first dimension are initialized to subarrays of the type of the second dimension, and so on. The components of the last allocated dimension of the array are initialized to the default initial value (2.3, 2.4) for the element type of the array type component type of that dimension. A reference
arrayref to the new array is pushed onto the operand stack.
During resolution of the symbolic reference to the class, array, or interface type array type, any of the exceptions documented in 5.4.3.1 can be thrown.
Otherwise, if the current class does not have permission to access the element type of the resolved array class, multianewarray throws an IllegalAccessError
.
If the element type is inaccessible, resolution will fail.
Otherwise, if any of the dimensions values on the operand stack are less than zero, the multianewarray instruction throws a NegativeArraySizeException
.
It may be more efficient to use newarray or anewarray (6.5.newarray, 6.5.anewarray) when creating an array of a single dimension.
The array class type referenced via the run-time constant pool may have more dimensions than the dimensions operand of the multianewarray instruction. In that case, only the first dimensions of the dimensions of the array are created.
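Both notes can be illustrated from Java source, assuming the compiler emits multianewarray for array creation expressions with two or more dimension expressions. The class and method names are hypothetical.

```java
class MultiArraySketch {
    // Array type int[][][] with only two dimension expressions: the
    // components of the last allocated dimension keep their default value.
    static int[][][] partial() {
        return new int[2][3][];
    }

    // A zero count in the first dimension: no subsequent dimensions exist.
    static int[][] zeroFirst() {
        return new int[0][5];
    }
}
```

In partial(), the result has length 2, each component has length 3, and each of those components is null because the third dimension is never allocated.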
Create new object
new
indexbyte1
indexbyte2
new = 187 (0xbb)
... →
..., objectref
The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (2.6), where the value of the index is (indexbyte1 <<
8) | indexbyte2. The run-time constant pool entry at the index must be a symbolic reference to a class or interface type. The named class or interface type is resolved (5.4.3.1) and should result in a non-abstract
class type. Memory for a new instance of that class is allocated from the garbage-collected heap, and the instance variables of the new object are initialized to their the default initial values of their types (2.3, 2.4). The objectref, a reference
to the instance, is pushed onto the operand stack.
On successful resolution of the class, it is initialized if it has not already been initialized (5.5).
During resolution of the symbolic reference to the class or interface type, any of the exceptions documented in 5.4.3.1 can be thrown.
Otherwise, if the symbolic reference to the class or interface type resolves to an interface or an abstract
class, new throws an InstantiationError
.
Otherwise, if execution of this new instruction causes initialization of the referenced class, new may throw an Error
as detailed in JLS §15.9.4 5.5.
This is an unnecessary JLS reference, and also appears to be out of date: JLS 15.9.4 doesn't describe class initialization at all.
The new instruction does not completely create a new instance; instance creation is not completed until an instance initialization method (2.9.1) has been invoked on the uninitialized instance.