Difference between revisions of "Java Bytecode"

From Termination-Portal.org
Jump to navigationJump to search
(update)
(source)
Line 13: Line 13:
 
There are languages different from Java that compile to JBC (e.g. Python/Jython, Ruby/JRuby, Scala, ...). The JBC category can be filled with corresponding problems.
 
There are languages different from Java that compile to JBC (e.g. Python/Jython, Ruby/JRuby, Scala, ...). The JBC category can be filled with corresponding problems.
 
== Problem definition ==
 
== Problem definition ==
A JBC problem is defined by three parts.
+
A JBC problem is defined by up to three parts.
  
 
* byte code
 
* byte code
 +
* Java source
 
* start method
 
* start method
* annotations
 
  
 
=== Code ===
 
=== Code ===
Providing the code is straightforward and can be accomplished by using a [http://en.wikipedia.org/wiki/JAR_(file_format) .jar file] that contains all .class files needed for the analysis.
+
Providing the code is straightforward and is accomplished using a [http://en.wikipedia.org/wiki/JAR_(file_format) .jar file] that contains all .class files needed for the analysis.
 +
=== Java Source ===
 +
This part is optional. If the Java source for the compiled program is available, the corresponding source files are packed into a file source.zip which is then stored in the .jar file.
 
=== Start Method ===
 
=== Start Method ===
 
Instead of only proving termination of the '''main''' method one wants to analyze specific methods (including non-static ones and constructors). The main advantage here is that one can prove termination of these methods with respect to all possible inputs (e.g. '''f(x, y)''' for all '''int''' values x, y).
 
Instead of only proving termination of the '''main''' method one wants to analyze specific methods (including non-static ones and constructors). The main advantage here is that one can prove termination of these methods with respect to all possible inputs (e.g. '''f(x, y)''' for all '''int''' values x, y).

Revision as of 14:16, 21 October 2009

This page is to record the current status of the Java Bytecode (JBC) Category of the Termination Competition. (Discussion should take place on the termtools mailing list.)

Java vs. Java Bytecode

Every Java problem can be compiled into an equivalent Java Bytecode problem. For technical reasons, the current tools only work on this compiled version of the original program. However, the result of the analysis also gives information about the termination behaviour of the original Java program.

Every JBC problem that resulted out of Java source code should include this source in addition to the compiled class files. This way we can communicate that we are interested in termination analysis of a real imperative programming language.

As discussed in the steering committee, the compilation of a JBC problem based on given Java code is done manually, so the existing tools do not have to implement or use a Java compiler.

There are languages different from Java that compile to JBC (e.g. Python/Jython, Ruby/JRuby, Scala, ...). The JBC category can be filled with corresponding problems.

Problem definition

A JBC problem is defined by up to three parts.

  • byte code
  • Java source
  • start method

Code

Providing the code is straightforward and is accomplished using a .jar file that contains all .class files needed for the analysis.

Java Source

This part is optional. If the Java source for the compiled program is available, the corresponding source files are packed into a file source.zip which is then stored in the .jar file.

Start Method

Instead of only proving termination of the main method one wants to analyze specific methods (including non-static ones and constructors). The main advantage here is that one can prove termination of these methods with respect to all possible inputs (e.g. f(x, y) for all int values x, y).

To define this start method the signature of the method can be used, e.g. ClassName.methodName(II)V or ClassName.<init>()V. This identification string can be added to the file META-INF/MANIFEST.MF which is the standard place to define the main method. Extending this file by additional information relevant to termination analysis is easily possible. Adding a line Start-Method: ClassName.methodName(II)V is enough to provide information about the start method.

Annotations

For algorithms working on non-primitive data types (this includes all user-defined data types) it may be interesting to prove termination only for a subset of theoretically possible input values. A simple example is a list traversal algorithm which only terminates for acyclic lists. Here, the information "the argument x is acyclic" can be provided. This idea can be extended to arbitrary annotations, where the following come to mind:

  • argument x is acyclic
  • argument x has a tree structure (acyclic and not a (real) DAG)
  • arguments x and y are "disjoint" in a sense
  • argument x is non-null
  • (for primitives) argument x is non-negative
  • ...

The discussion about relevant annotations should start as soon as examples are generated.

Adding this kind of information can also easily be done using the META-INF/MANIFEST.INF file.

(Possible) Participants

Julia, COSTA and AProVE are able to work on Java Bytecode problems.

Questions

java.lang.*

Many classes provided in the package java.lang, most prominently java.lang.Object, are used by all JBC programs (even those not resulting out of Java). All Java Virtual Machines executing real JBC provide these classes and also provide machine code for all native methods defined in these classes. As a consequence, it should be the responsibility of the tool to provide these classes (and deal with their native methods), so that a JBC problem can be defined without providing these additional classes. The classpath exception allows using class files from e.g. openjdk without having to provide the tool under GPL.

Native Methods

There is no precise description about the expected result from certain native methods defined in java.lang. As an example, the function java.lang.Object.hashCode()I returns an integer value that is strongly related to the memory position of the object. Do we need to take care of all possible values? Intuitive answer: Yes, everything that is possible should be handled! Would it be OK to only handle the case where 0 is returned? Intuitive answer: Yes, because the specification does not enforce specific values. This problem can be extended to native methods dealing with objects (always non-null? acyclic? ...).

Initialization / Static Fields

When starting in a main method (which is static) it is clear how the classes will be initialized and what values will be stored in static fields (after all, every real JVM just does this). However, when starting in an arbitrary method, this is different. One could argue that certain classes must already have been initialized before the method is started (this includes the class where the method is defined). One could also say that analysis must start in a state where everything is not yet analyzed. Another alternative would be to leave every information that is not known in some unknown state and handle all resulting cases. In the latter case one would like to know if certain (unknown) values of static fields are object references that share with some other object stored in another static field or - perhaps worse - with an object provided as an argument to the method that should be analyzed.