Difference between revisions of "Java Bytecode"

From Termination-Portal.org
Jump to navigationJump to search
m (intro)
 
(68 intermediate revisions by the same user not shown)
Line 1: Line 1:
This page is to record the current status of the Java Bytecode Category of the Termination Competition.
+
This page is to record the current status of the Java Bytecode (JBC) Category of the Termination Competition.
 
(Discussion should take place on the termtools mailing list.)  
 
(Discussion should take place on the termtools mailing list.)  
 
== Java vs. Java Bytecode ==
 
== Java vs. Java Bytecode ==
TODO.
+
Every Java problem can be compiled into an equivalent Java Bytecode problem.
 +
For technical reasons, the current tools only work on this compiled version of the original program.
 +
However, the result of the analysis also gives information about the termination behaviour of the original Java program.
 +
 
 +
Every JBC problem that resulted out of Java source code should include this source in addition to the compiled class files.
 +
This way we can communicate that we are interested in termination analysis of a real imperative programming language.
 +
 
 +
As discussed in the steering committee, the compilation of a JBC problem based on given Java code is done manually, so the existing tools do not have to implement or use a Java compiler.
 +
 
 +
There are languages different from Java that compile to JBC (e.g. Python/Jython, Ruby/JRuby, Scala, ...).
 +
The JBC category can be filled with corresponding problems.
 +
Here, too, the source files used to generate the JBC code should be provided.
 
== Problem definition ==
 
== Problem definition ==
A JBC problem is defined by three parts.
+
A JBC problem is defined by up to three parts.
  
 
* byte code
 
* byte code
 +
* source
 
* start method
 
* start method
* annotations
 
  
 
=== Code ===
 
=== Code ===
Providing the code is straight forward and can be accomplished by using a [http://en.wikipedia.org/wiki/JAR_(file_format) .jar file] that contains all .class files needed for the analysis.
+
Providing the code is straightforward and is accomplished using a [http://en.wikipedia.org/wiki/JAR_(file_format) .jar file] that contains all .class files needed for the analysis.
 +
=== Java Source ===
 +
This part is optional. If the source for the compiled program is available, the corresponding source files are packed into a file '''source.zip''' which is then stored in the .jar file.
 +
 
 
=== Start Method ===
 
=== Start Method ===
Instead of only proving termination of the '''main''' method one wants to analyze specific methods (including non-static ones and constructors). The main advantage here is that one can prove termination of these methods with respect to all possible inputs (e.g. '''f(x, y)''' for all '''int''' values x, y).
+
Analysis starts with '''public static void main(String[] args)'''. This method is specified in the .jar file by adding a file '''META-INF/MANIFEST.MF''' including a line '''Main-Class: some/package/SomeClass'''. Note that it is unclear whether the standard is slashed or dotted notation, so it may be a good idea to accept both.
 +
 
 +
Using constructs like '''args.length''' or '''args[i].length()''' one gets unknown non-negative '''int''' values and can use these to create a large number of problem instances. As an example, the following code can be used to construct problems where termination of some algorithm working on lists of arbitrary length should be proved.
 +
 
 +
public static void main(String[] args) {
 +
  // create list of given length
 +
  List list = new List(args.length);
 +
                   
 +
  // work on the created list
 +
  doSomething(list);
 +
}
 +
 
 +
== Participants ==
 +
In the 2009 competition the tools
 +
[[Tools:Julia|Julia]],
 +
[http://costa.ls.fi.upm.es/costa/costa.php COSTA] and
 +
[[Tools:AProVE|AProVE]] took part in the JBC categories.
 +
 
 +
== Details ==
 +
=== Version ===
 +
The bytecode major version must be 50 (Java 6) or lower. Version 51 (Java 7) is not allowed.
 +
 
 +
=== Errors and Exceptions ===
 +
In most cases, an instance of '''java.lang.Error''' thrown by the JVM is undesired behaviour.
 +
If an example problem triggers this, we should consider this example as broken and not use it in the competition.
 +
 
 +
=== Considered Types ===
 +
Exactly the types provided in the .jar file are considered, with a few additions needed for basic examples.
 +
 
 +
The list of considered classes can be extended in the future to allow for more advanced examples, e.g. involving '''System.out.println(...)''' and other "library" code.
 +
 
 +
All additional class files are taken from a Debian version of
 +
OpenJDK [1].
 +
 
 +
The class '''java.lang.String''' needs to be considered. Especially the method
 +
'''String.length()I''' is useful to construct problem instances out of the
 +
argument array of the initial method ('''String[] args''').
 +
 
 +
To be able to initialize '''java.lang.String''', we also need the classes
 +
'''java.lang.String$CaseInsensitiveComparator''',
 +
'''java.util.Comparator''', and '''java.io.ObjectStreamField'''.
 +
 
 +
Furthermore, usage of all the exceptions/errors mentioned in the JVMS is allowed (all in '''java.lang.*'''):
 +
* AbstractMethodError
 +
* ArithmeticException
 +
* ArrayIndexOutOfBoundsException
 +
* ArrayStoreException
 +
* ClassCastException
 +
* IllegalAccessError
 +
* IncompatibleClassChangeError
 +
* NegativeArraySizeException
 +
* NoClassDefFoundError
 +
* NoSuchFieldError
 +
* NoSuchMethodError
 +
* NullPointerException
 +
 
 +
Because all array types implement '''java.lang.Cloneable''' and
 +
'''java.io.Serializable''', these interfaces must also be allowed.
 +
 
 +
The following types are part of the hierarchy of a type
 +
mentioned before, so they must be included (all in '''java.lang.*'''):
 +
* CharSequence
 +
* Comparable
 +
* Error
 +
* Exception
 +
* IndexOutOfBoundsException
 +
* LinkageError
 +
* Object
 +
* RuntimeException
 +
* Throwable
 +
 
 +
==== Native Methods ====
 +
To initialize '''Object''', we need to implement the native method '''Object.registerNatives()V'''.
 +
This method is implemented "no-op".
 +
To actually throw instances of '''Exception''' (or '''Throwable''' in general), the native method
 +
'''Throwable.fillInStackTrace()V''' needs to be implemented. This method, too, is implemented as "no-op".
 +
 
 +
The remaining native methods are not considered (all in '''java.lang.*'''):
 +
* Object.clone()Ljava/lang/Object;
 +
* Object.getClass()Ljava/lang/Class;
 +
* Object.hashCode()I
 +
* Object.notifyAll()V
 +
* Object.notify()V
 +
* Object.wait(J)
 +
* String.intern()Ljava/lang/String;
 +
 
 +
==== Example ====
 +
 
 +
As an example, examples calling
 +
'''java.lang.Object.equals(Ljava/lang/Object;)Z''' are allowed, but the method
 +
'''java.lang.Object.toString()Ljava/lang/String;''' may not be called because the class
 +
'''java.lang.StringBuilder''' is unknown.
 +
 
 +
==== Included Files ====
 +
* java/io/ObjectStreamField.class
 +
* java/io/Serializable.class
 +
* java/lang/AbstractMethodError.class
 +
* java/lang/ArithmeticException.class
 +
* java/lang/ArrayIndexOutOfBoundsException.class
 +
* java/lang/ArrayStoreException.class
 +
* java/lang/CharSequence.class
 +
* java/lang/ClassCastException.class
 +
* java/lang/Cloneable.class
 +
* java/lang/Comparable.class
 +
* java/lang/Error.class
 +
* java/lang/Exception.class
 +
* java/lang/IllegalAccessError.class
 +
* java/lang/IncompatibleClassChangeError.class
 +
* java/lang/IndexOutOfBoundsException.class
 +
* java/lang/LinkageError.class
 +
* java/lang/NegativeArraySizeException.class
 +
* java/lang/NoClassDefFoundError.class
 +
* java/lang/NoSuchFieldError.class
 +
* java/lang/NoSuchMethodError.class
 +
* java/lang/NullPointerException.class
 +
* java/lang/Object.class
 +
* java/lang/RuntimeException.class
 +
* java/lang/String$CaseInsensitiveComparator.class
 +
* java/lang/String.class
 +
* java/lang/Throwable.class
 +
* java/util/Comparator.class
 +
 
 +
[1]: rt.jar in:
 +
    openjdk-6-jre-headless_6b11-9.1+lenny2_amd64.deb:
 +
    http://packages.debian.org/lenny/amd64/openjdk-6-jre-headless/download
 +
    Exact Size  22423776 Byte (21.4 MByte)
 +
    MD5 checksum  645caac427ee007eed470895fc12ab9e
 +
    SHA1 checksum  f2cbf934681503b648f7dceb662e086a3bda5767
 +
    SHA256 checksum 349a2a500d01574906d96ca7a8ee2486415d1aba5bf2c3e4c0cc2fe8a5c00efd
 +
 
 +
=== Opcodes ===
 +
All opcodes are allowed, with the following exceptions.
 +
 
 +
* The opcodes '''LDC (0x12)''' and '''LDC_W (0x13)''' are not allowed in the variants that return a '''String''' or '''Class''' instance. These variants are problematic, because they may return some already known instance.
 +
* The opcode '''INVOKEDYNAMIC (0xBA)''' is not part of the instruction set of Java 6, so it is not allowed.
 +
* The opcodes '''MONITORENTER (0xC2)''' and '''MONITOREXIT (0xC3)''' only make sense when using multithreading. Furthermore, correct handling of '''MONITOREXIT''' is more involved. As a result, these two opcodes are not allowed.
 +
* The reserved opcodes '''BREAKPOINT (0xCA)''', '''IMPDEP1 (0xFE)''', '''IMPDEP2 (0xFF)''' are not allowed.
 +
 
 +
=== Overflows ===
 +
Integer overflows and related problems are not handled. This means proving
 +
termination of the following Java program snippet is not considered to be wrong.
 +
 
 +
int i = 0;
 +
while (i <= 2147483647) {
 +
  i++;
 +
}
 +
 
 +
=== Categories ===
 +
 
 +
The JBC problems are divided into two categories.
 +
One category contains all examples that make use of recursion, whereas all examples in the other category do not use recursion at all.
 +
 
 +
The reason is that recursion poses a special problem for the
 +
analysis since one has to handle the call stack properly. We think that
 +
it is interesting to evaluate the performance of termination analyzers
 +
separately for programs with and without recursion.
  
To define this start method the signature of the method can be used, e.g. '''ClassName.methodName(II)V''' or '''ClassName.<init>()V'''. This identification string can be added to the file '''META-INF/MANIFEST.MF''' which is the standard place to define the '''main''' method. Extending this file by additional information relevant to termination analysis is easily possible. Adding a line '''Start-Function: ClassName.methodName(II)V''' is enough to provide information about the start method.
+
=== Verification ===
=== Annotations ===
+
All input problems must pass the bytecode verification process. Therefore,
For algorithms working on non-primitive data types (this includes all user-defined data types) it may be interesting to prove termination only for a subset of theoretically possible input values. A simple example is a list traversal algorithm which only terminates for acyclic lists. Here, the information "the argument x is acyclic" can be provided.
+
the analyzers do not need to verify the given bytecode.
This idea can be extended to arbitrary annotations, where the following come to mind:
 
* argument x is acyclic
 
* argument x has a tree structure (acyclic and not a (real) DAG)
 
* arguments x and y are "disjoint" in a sense
 
* argument x is non-null
 
* (for primitives) argument x is non-negative
 
* ...
 
The discussion about relevant annotations should start as soon as examples are generated.
 
  
Adding this kind of information can also easily be done using the '''META-INF/MANIFEST.INF''' file.
+
[[Category:Categories]]
== (Possible) Participants ==
 
[[http://julia.scienze.univr.it:8080/julia/ Julia]] and [[Tools:AProVE|AProVE]] are able to work on Java Bytecode problems.
 
== Questions ==
 
=== java.lang.* ===
 
Many classes provided in the package '''java.lang''', most prominently '''java.lang.Object''', are used by all JBC programs (even those not resulting out of Java). All Java Virtual Machines executing real JBC provide these classes and also provide machine code for all native methods defined in these classes. As a consequence, it should be the responsibility of the tool to provide these classes (and deal with their native methods), so that a JBC problem can be defined without providing these additional classes. The '''classpath exception''' allows using class files from e.g. openjdk without having to provide the tool under GPL.
 
=== Native Methods ===
 
There is no precise description about the expected result from certain native methods defined in '''java.lang'''. As an example, the function '''java.lang.Object.hashCode()I''' returns an integer value that is strongly related to the memory position of the object. Do we need to take care of all possible values? Intuitive answer: Yes, everything that is possible should be handled! Would it be OK to only handle the case where 0 is returned? Intuitive answer: Yes, because the specification does not enforce specific values. This problem can be extended to native methods dealing with objects (always non-null? acyclic? ...).
 
=== Initialization / Static Fields ===
 
When starting in a main method (which is static) it is clear how the classes will be initialized and what values will be stored in static fields (after all, every real JVM just does this). However, when starting in an arbitrary method, this is different. One could argue that certain classes must already have been initialized before the method is started (this includes the class where the method is defined). One could also say that analysis must start in a state where everything is not yet analyzed. Another alternative would be to leave every information is not known in some unknown state and handle all resulting cases. In the latter case one would like to know if certain (unknown) values of static fields are object references that share with some other object stored in another static field or - perhaps worse - with an object provided as an argument to the method that should be analyzed.
 

Latest revision as of 11:21, 31 January 2013

This page is to record the current status of the Java Bytecode (JBC) Category of the Termination Competition. (Discussion should take place on the termtools mailing list.)

Java vs. Java Bytecode

Every Java problem can be compiled into an equivalent Java Bytecode problem. For technical reasons, the current tools only work on this compiled version of the original program. However, the result of the analysis also gives information about the termination behaviour of the original Java program.

Every JBC problem that resulted out of Java source code should include this source in addition to the compiled class files. This way we can communicate that we are interested in termination analysis of a real imperative programming language.

As discussed in the steering committee, the compilation of a JBC problem based on given Java code is done manually, so the existing tools do not have to implement or use a Java compiler.

There are languages different from Java that compile to JBC (e.g. Python/Jython, Ruby/JRuby, Scala, ...). The JBC category can be filled with corresponding problems. Here, too, the source files used to generate the JBC code should be provided.

Problem definition

A JBC problem is defined by up to three parts.

  • byte code
  • source
  • start method

Code

Providing the code is straightforward and is accomplished using a .jar file that contains all .class files needed for the analysis.

Java Source

This part is optional. If the source for the compiled program is available, the corresponding source files are packed into a file source.zip which is then stored in the .jar file.

Start Method

Analysis starts with public static void main(String[] args). This method is specified in the .jar file by adding a file META-INF/MANIFEST.MF including a line Main-Class: some/package/SomeClass. Note that it is unclear whether the standard is slashed or dotted notation, so it may be a good idea to accept both.

Using constructs like args.length or args[i].length() one gets unknown non-negative int values and can use these to create a large number of problem instances. As an example, the following code can be used to construct problems where termination of some algorithm working on lists of arbitrary length should be proved.

public static void main(String[] args) {
  // create list of given length
  List list = new List(args.length);
                   
  // work on the created list
  doSomething(list);
}

Participants

In the 2009 competition the tools Julia, COSTA and AProVE took part in the JBC categories.

Details

Version

The bytecode major version must be 50 (Java 6) or lower. Version 51 (Java 7) is not allowed.

Errors and Exceptions

In most cases, an instance of java.lang.Error thrown by the JVM is undesired behaviour. If an example problem triggers this, we should consider this example as broken and not use it in the competition.

Considered Types

Exactly the types provided in the .jar file are considered, with a few additions needed for basic examples.

The list of considered classes can be extended in the future to allow for more advanced examples, e.g. involving System.out.println(...) and other "library" code.

All additional class files are taken from a Debian version of OpenJDK [1].

The class java.lang.String needs to be considered. Especially the method String.length()I is useful to construct problem instances out of the argument array of the initial method (String[] args).

To be able to initialize java.lang.String, we also need the classes java.lang.String$CaseInsensitiveComparator, java.util.Comparator, and java.io.ObjectStreamField.

Furthermore, usage of all the exceptions/errors mentioned in the JVMS is allowed (all in java.lang.*):

  • AbstractMethodError
  • ArithmeticException
  • ArrayIndexOutOfBoundsException
  • ArrayStoreException
  • ClassCastException
  • IllegalAccessError
  • IncompatibleClassChangeError
  • NegativeArraySizeException
  • NoClassDefFoundError
  • NoSuchFieldError
  • NoSuchMethodError
  • NullPointerException

Because all array types implement java.lang.Cloneable and java.io.Serializable, these interfaces must also be allowed.

The following types are part of the hierarchy of a type mentioned before, so they must be included (all in java.lang.*):

  • CharSequence
  • Comparable
  • Error
  • Exception
  • IndexOutOfBoundsException
  • LinkageError
  • Object
  • RuntimeException
  • Throwable

Native Methods

To initialize Object, we need to implement the native method Object.registerNatives()V. This method is implemented "no-op". To actually throw instances of Exception (or Throwable in general), the native method Throwable.fillInStackTrace()V needs to be implemented. This method, too, is implemented as "no-op".

The remaining native methods are not considered (all in java.lang.*):

  • Object.clone()Ljava/lang/Object;
  • Object.getClass()Ljava/lang/Class;
  • Object.hashCode()I
  • Object.notifyAll()V
  • Object.notify()V
  • Object.wait(J)
  • String.intern()Ljava/lang/String;

Example

As an example, examples calling java.lang.Object.equals(Ljava/lang/Object;)Z are allowed, but the method java.lang.Object.toString()Ljava/lang/String; may not be called because the class java.lang.StringBuilder is unknown.

Included Files

  • java/io/ObjectStreamField.class
  • java/io/Serializable.class
  • java/lang/AbstractMethodError.class
  • java/lang/ArithmeticException.class
  • java/lang/ArrayIndexOutOfBoundsException.class
  • java/lang/ArrayStoreException.class
  • java/lang/CharSequence.class
  • java/lang/ClassCastException.class
  • java/lang/Cloneable.class
  • java/lang/Comparable.class
  • java/lang/Error.class
  • java/lang/Exception.class
  • java/lang/IllegalAccessError.class
  • java/lang/IncompatibleClassChangeError.class
  • java/lang/IndexOutOfBoundsException.class
  • java/lang/LinkageError.class
  • java/lang/NegativeArraySizeException.class
  • java/lang/NoClassDefFoundError.class
  • java/lang/NoSuchFieldError.class
  • java/lang/NoSuchMethodError.class
  • java/lang/NullPointerException.class
  • java/lang/Object.class
  • java/lang/RuntimeException.class
  • java/lang/String$CaseInsensitiveComparator.class
  • java/lang/String.class
  • java/lang/Throwable.class
  • java/util/Comparator.class

[1]: rt.jar in:

    openjdk-6-jre-headless_6b11-9.1+lenny2_amd64.deb:
    http://packages.debian.org/lenny/amd64/openjdk-6-jre-headless/download
    Exact Size  22423776 Byte (21.4 MByte)
    MD5 checksum  645caac427ee007eed470895fc12ab9e
    SHA1 checksum   f2cbf934681503b648f7dceb662e086a3bda5767
    SHA256 checksum 349a2a500d01574906d96ca7a8ee2486415d1aba5bf2c3e4c0cc2fe8a5c00efd

Opcodes

All opcodes are allowed, with the following exceptions.

  • The opcodes LDC (0x12) and LDC_W (0x13) are not allowed in the variants that return a String or Class instance. These variants are problematic, because they may return some already known instance.
  • The opcode INVOKEDYNAMIC (0xBA) is not part of the instruction set of Java 6, so it is not allowed.
  • The opcodes MONITORENTER (0xC2) and MONITOREXIT (0xC3) only make sense when using multithreading. Furthermore, correct handling of MONITOREXIT is more involved. As a result, these two opcodes are not allowed.
  • The reserved opcodes BREAKPOINT (0xCA), IMPDEP1 (0xFE), IMPDEP2 (0xFF) are not allowed.

Overflows

Integer overflows and related problems are not handled. This means proving termination of the following Java program snippet is not considered to be wrong.

int i = 0;
while (i <= 2147483647) {
  i++;
}

Categories

The JBC problems are divided into two categories. One category contains all examples that make use of recursion, whereas all examples in the other category do not use recursion at all.

The reason is that recursion poses a special problem for the analysis since one has to handle the call stack properly. We think that it is interesting to evaluate the performance of termination analyzers separately for programs with and without recursion.

Verification

All input problems must pass the bytecode verification process. Therefore, the analyzers do not need to verify the given bytecode.