Difference between revisions of "Java Bytecode"
Line 45: | Line 45: | ||
The restriction forbids usage of complicated code like | The restriction forbids usage of complicated code like | ||
'''System.out.println(...)''' and other "library" code. | '''System.out.println(...)''' and other "library" code. | ||
+ | |||
+ | Following this definition, analysis of a method that involves a type not | ||
+ | provided with the problem .jar file and not included in the list above, must | ||
+ | throw a '''NoClassDefFoundError''' and, because some instance of '''Error''' is thrown by the JVM, | ||
+ | abort analysis. | ||
All additional class files are taken from the current Debian version of | All additional class files are taken from the current Debian version of | ||
Line 102: | Line 107: | ||
String.intern()Ljava/lang/String; | String.intern()Ljava/lang/String; | ||
− | + | ==== Example ==== | |
− | |||
− | |||
− | |||
As an example, analysis of | As an example, analysis of |
Revision as of 14:30, 21 October 2009
This page is to record the current status of the Java Bytecode (JBC) Category of the Termination Competition. (Discussion should take place on the termtools mailing list.)
Contents
Java vs. Java Bytecode
Every Java problem can be compiled into an equivalent Java Bytecode problem. For technical reasons, the current tools only work on this compiled version of the original program. However, the result of the analysis also gives information about the termination behaviour of the original Java program.
Every JBC problem that resulted out of Java source code should include this source in addition to the compiled class files. This way we can communicate that we are interested in termination analysis of a real imperative programming language.
As discussed in the steering committee, the compilation of a JBC problem based on given Java code is done manually, so the existing tools do not have to implement or use a Java compiler.
There are languages different from Java that compile to JBC (e.g. Python/Jython, Ruby/JRuby, Scala, ...). The JBC category can be filled with corresponding problems.
Problem definition
A JBC problem is defined by up to three parts.
- byte code
- Java source
- start method
Code
Providing the code is straightforward and is accomplished using a .jar file that contains all .class files needed for the analysis.
Java Source
This part is optional. If the Java source for the compiled program is available, the corresponding source files are packed into a file source.zip which is then stored in the .jar file.
Start Method
Analysis starts with public static void main(String[] args). This method is specified in the .jar file by adding a file META-INF/MANIFEST.MF including a line Main-Class: some.package.SomeClass.
(Possible) Participants
Julia, COSTA and AProVE are able to work on Java Bytecode problems.
Details
Errors and Exceptions
Every instance of java.lang.Error thrown by the JVM must be seen as an error during analysis. As a result, it is not possible to prove termination because an instance of Error thrown by the JVM is not caught. For Error instances thrown by the program (using ATHROW) and all thrown Exception instances (even thrown by the JVM) normal analysis is allowed.
Considered Types
Exactly the types provided in the .jar file are considered, with a few additions needed for basic analysis.
This construction can be extended by allowing more types in the future. The restriction forbids usage of complicated code like System.out.println(...) and other "library" code.
Following this definition, analysis of a method that involves a type not provided with the problem .jar file and not included in the list above, must throw a NoClassDefFoundError and, because some instance of Error is thrown by the JVM, abort analysis.
All additional class files are taken from the current Debian version of OpenJDK [1].
The class java.lang.String needs to be considered. Especially the method "String.length()I" is useful to construct problem instances out of the argument array of the initial method (String[] args).
To be able to initialize java.lang.String, we also need the classes java.lang.String$CaseInsensitiveComparator, java.util.Comparator, and java.io.ObjectStreamField.
Furthermore, all the exceptions/errors mentioned in the JVMS are part of the analysis (all in java.lang.*):
- NullPointerException
- ArithmeticException
- ArrayStoreException
- ArrayIndexOutOfBoundsException
- ClassCastException
- NegativeArraySizeException
- NoClassDefFoundError
- IllegalAccessError
- NoSuchFieldError
- IncompatibleClassChangeError
- NoSuchMethodError
- AbstractMethodError
Because every array implements java.lang.Cloneable and java.io.Serializable, these interfaces must also be part of the analysis.
Furthermore, the following types are part of the hierarchy of a type mentioned before, so they must be included (all in java.lang.*):
- Object
- Throwable
- Exception
- Error
- RuntimeException
- IndexOutOfBoundsException
- LinkageError
- CharSequence
- Comparable
Native Methods
To initialize Object, we need to implement the native method Object.registerNatives()V. This method is implemented "no-op". To actually throw instances of Exception (or Throwable in general), the native method Throwable.fillInStackTrace()V needs to be implemented. This method, too, is implemented as "no-op".
The remaining native methods are not considered (all in java.lang.*): Object.hashCode()I Object.getClass()Ljava/lang/Class; Object.clone()Ljava/lang/Object; Object.notify()V Object.notifyAll()V Object.wait(J) String.intern()Ljava/lang/String;
Example
As an example, analysis of java.lang.Object.equals(Ljava/lang/Object;)Z is possible, but analysis of java.lang.Object.toString()Ljava/lang/String; fails because the class java.lang.StringBuilder is unknown.
[1]: rt.jar in:
openjdk-6-jre-headless_6b11-9.1+lenny2_amd64.deb: http://packages.debian.org/lenny/amd64/openjdk-6-jre-headless/download Exact Size 22423776 Byte (21.4 MByte) MD5 checksum 645caac427ee007eed470895fc12ab9e SHA1 checksum f2cbf934681503b648f7dceb662e086a3bda5767 SHA256 checksum 349a2a500d01574906d96ca7a8ee2486415d1aba5bf2c3e4c0cc2fe8a5c00efd
Questions
java.lang.*
Many classes provided in the package java.lang, most prominently java.lang.Object, are used by all JBC programs (even those not resulting out of Java). All Java Virtual Machines executing real JBC provide these classes and also provide machine code for all native methods defined in these classes. As a consequence, it should be the responsibility of the tool to provide these classes (and deal with their native methods), so that a JBC problem can be defined without providing these additional classes. The classpath exception allows using class files from e.g. openjdk without having to provide the tool under GPL.
Native Methods
There is no precise description about the expected result from certain native methods defined in java.lang. As an example, the function java.lang.Object.hashCode()I returns an integer value that is strongly related to the memory position of the object. Do we need to take care of all possible values? Intuitive answer: Yes, everything that is possible should be handled! Would it be OK to only handle the case where 0 is returned? Intuitive answer: Yes, because the specification does not enforce specific values. This problem can be extended to native methods dealing with objects (always non-null? acyclic? ...).
Initialization / Static Fields
When starting in a main method (which is static) it is clear how the classes will be initialized and what values will be stored in static fields (after all, every real JVM just does this). However, when starting in an arbitrary method, this is different. One could argue that certain classes must already have been initialized before the method is started (this includes the class where the method is defined). One could also say that analysis must start in a state where everything is not yet analyzed. Another alternative would be to leave every information that is not known in some unknown state and handle all resulting cases. In the latter case one would like to know if certain (unknown) values of static fields are object references that share with some other object stored in another static field or - perhaps worse - with an object provided as an argument to the method that should be analyzed.