Complexity:Rules
This page is outdated and no longer maintained.

An up-to-date version of the content of this page can now be found here: complexity competition (http://cl-informatik.uibk.ac.at/users/georg/cbr/competition/).

The purpose of this page is to provide a place for ongoing discussions on the rules, and to describe the current rules themselves.
Discussion
Lower Bounds
In the future the tools should also be able to provide certificates for the lower bound. This would require extending the grammar as follows
F -> O(1) | O(n^Nat) | POLY | EXP | INF
such that e.g. "YES(EXP,?)" indicates an exponential lower bound, or "YES(INF,INF)" indicates non-termination.
(JW: I don't like the looks of an answer starting "YES" and indicating non-termination. See "BOUNDS" proposal below.)
(GM: I don't see a problem with that: "YES" indicates that the prover has found a proof. In the case you mention, a proof for non-termination.)
Scoring (proposals)
- as for the upper bound, the lower-bound certificate should be ranked, and both ranks could be compared lexicographically (with the upper bound as the primary criterion)
- JW prefers: don't define some artificial total order on the bounds. The natural partial ordering is given by the inclusion relation on the sets of functions that are described by the bounds. This inclusion can be computed from
(low1, up1) "is better than" (low2, up2) iff low1 >= low2 and up1 <= up2
Then for each problem, answer A gets awarded k points if A is strictly better than k of the answers, where "no answer" counts as BOUNDS(LIN,INF) and "strictly better" means better and not equal.
This would imply that if all answers are identical, then no-one gets a point. Perhaps we want to add one virtual prover that always says "BOUNDS(LIN,INF)" - so anyone who gives a better answer gets at least one point (see the sketch below).
- GM: For clarification: I suppose you want to use the following order: POLY >= O(n^7) >= O(n^6) etc., to name an example. And I would insist on the use of a virtual prover: getting 0 points even though an answer has been given is something I don't like.
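To make the proposal concrete, here is a minimal Python sketch of this partial-order scoring, including the virtual prover. The numeric encoding of the bounds is ours and merely fixes the order O(1) < O(n^k) < POLY < EXP < INF; it is not part of any proposed rule.

# Minimal sketch of the partial-order scoring proposed above. A smaller
# value means a smaller complexity class; O(n^k) is encoded as the exponent k.
O1, LIN, POLY, EXP, INF = 0, 1, 1000, 2000, 3000

def better(a, b):
    # (low1, up1) "is better than" (low2, up2) iff low1 >= low2 and up1 <= up2
    return a[0] >= b[0] and a[1] <= b[1]

def score(answers):
    # Award k points to an answer that is strictly better than k answers.
    # "No answer" (None) counts as BOUNDS(LIN, INF); a virtual prover always
    # answering BOUNDS(LIN, INF) ensures any improvement earns a point.
    given = [a if a is not None else (LIN, INF) for a in answers]
    field = given + [(LIN, INF)]                       # the virtual prover
    return [sum(1 for b in field if better(a, b) and a != b) for a in given]

# Prover 1 shows BOUNDS(LIN, O(n^2)), prover 2 BOUNDS(LIN, POLY), prover 3 nothing:
print(score([(LIN, 2), (LIN, POLY), None]))            # -> [3, 2, 0]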
Concrete syntax
- JW would prefer the following output format as it is easier to parse:
F -> POLY(Nat) | POLY(?)
Here "POLY(k)" abbreviates "O(n^k)" and "POLY(?)" denotes an unspecified
polynomial.
- JW: I'm not giving up ... one more reason against the O(n^k) syntax: 3. it cannot be used for lower bounds, as we would need Omega instead of Oh. (The other two reasons are: 2. needlessly complicated, and 1. n is an undefined variable)
- proposal to replace YES/NO/MAYBE by BOUNDS: http://dev.aspsimon.org/bugzilla/show_bug.cgi?id=85#c4
LN: I'd like to support this notation. But I think "?" for an unknown bound is unnecessary. It can always be replaced by POLY(0) for the lower bound and INF for the upper bound. Noschinski 13:26, 13 February 2010 (UTC)
- GM: I don't like the format "POLY(k)", mainly for presentation reasons: our first format used exactly this, and in presentations I immediately got the question what "POLY(k)" is supposed to mean. Since "O(n^k)" is used, I don't get this question. With respect to the lower bound: I agree that "O(n^k)" should be replaced by "Omega(n^k)". So let's do that. But I don't see a chance to change this for the upcoming competition ...
Rules of the Competition
Input Format
Problems are given in the new TPDB format, cf. [1], where the XML element problem has the type complexity. Further, depending on the category DC, iDC, RC or iRC, the attributes strategy and startterm will be set to FULL/INNERMOST and full/constructor-based, respectively.
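As an illustration only, here is a hedged Python sketch of how the subcategory could be read off a problem file. The element and attribute layout is assumed from the prose above; the authoritative structure is defined by the TPDB XML schema.

# Hedged sketch: derive the subcategory (DC, iDC, RC, iRC) from a TPDB
# problem file. Attribute names are assumed from the description above.
import xml.etree.ElementTree as ET

def subcategory(path):
    root = ET.parse(path).getroot()     # assumed to be the problem element
    assert root.get("type") == "complexity"
    innermost = root.get("strategy") == "INNERMOST"          # else FULL
    runtime = root.get("startterm") == "constructor-based"   # else full
    return ("i" if innermost else "") + ("RC" if runtime else "DC")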
Output Format
The output format is adapted so that additional information on the asymptotic complexity is given for lower as well as upper bounds. Hence the output written to the first line of STDOUT shall be a complexity statement according to the following grammar:
S -> NO | MAYBE | YES( F, F) | YES( ?, F) | YES( F, ?)
F -> O(1) | O(n^Nat) | POLY
where "Nat" is a non-zero natural number and YES(F1, F2) means F2 is upper bound and that F1 is a lower-bound. "O(n^k)" is the usual big-Oh notation and "POLY" indicates an unspecified polynomial. Either of the functions F1, F2 (but not both) may be replaced by ``don't know, indicated by ?. Any remaining output on STDOUT will be considered as proof output and has to follow the normal rules for the competition.
Scoring
Currently we focus on (polynomial) upper bounds. As the output format indicates, this restriction should be lifted later; see below. In order to take into account the quality of the upper bounds provided by the different tools, we propose the following scoring algorithm, where we suppose the number of competitors is x.
Firstly, for each TRS the competing tools are ranked, where constant complexity, i.e., the output "YES(?,O(1))", is best and "MAYBE", "NO" or a timeout is worst. As long as the output is of the form "YES(?,O(n^k))" or "YES(?,POLY)", the rank of a tool determines its number of points: the best tool gets x+1 points, the second gets x points, and so on. A negative output ("MAYBE", "NO" or a timeout) gets 0 points. If two or more tools get the same rank, the ranks of the remaining tools are adapted in the usual way.
Secondly, the resulting points for all considered systems are summed up and the contestant with the highest number of points wins. If this does not establish a winner, the total number of wins is counted. If this still does not produce a winner, we give up and declare two (or more) winners.
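As an illustration, a minimal Python sketch of the per-TRS ranking; the numeric encoding of upper bounds and all names are ours, and we read "adapted in the usual way" as standard competition ranking (tied tools share a rank, the following rank is skipped).

# Sketch of the ranking-based scoring for upper bounds. Each answer is the
# exponent of the proven upper bound: 0 for O(1), k for O(n^k), a sentinel
# for POLY, and None for MAYBE/NO/timeout.

POLY = float("inf")  # unspecified polynomial: worse than any O(n^k)

def points_per_trs(answers):
    """answers[i] is tool i's upper bound for one TRS; returns its points."""
    x = len(answers)
    pts = [0] * x
    for i, a in enumerate(answers):
        if a is None:                      # MAYBE, NO or timeout: 0 points
            continue
        # competition ranking: 1 + number of strictly better answers
        rank = 1 + sum(1 for b in answers if b is not None and b < a)
        pts[i] = x + 2 - rank              # the best rank 1 gets x+1 points
    return pts

# Three tools proving O(n^2), POLY, and nothing on one TRS:
print(points_per_trs([2, POLY, None]))     # -> [4, 3, 0]

Summing these per-TRS points over the whole testbed yields the total score used in the second step.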
The maximal allowed CPU time is 60 seconds.
Problem Sets and Problem Selection
Testbeds
All authors of complexity tools that participated in the last complexity competition agreed upon the following selection of examples from the TPDB. For simplicity, we decided to use the same testbed for full and innermost rewriting. However, we decided on two separate testbeds for derivational and runtime complexity analysis. The testbeds are described below in detail; additionally, lists of problems for TPDB version 8.0 are provided for RC and DC. For the individual subcategories RC, RCi, DC and DCi, all strategy and start-term annotations should be overwritten appropriately. Thus we simply ignore, for instance, context-sensitive strategies.
Derivational Complexity
For derivational complexity analysis we restrict the TPDB as follows:
- keep only one instance of duplicated problems (modulo strategy annotations). A list of the TRSs that appear multiple times in the current TPDB version 8.0 can be found here.
- remove all problems with relative, theory or conditional annotations. The reason for this is that none of the tools can handle those problems currently.
Runtime Complexity
For runtime complexity analysis, we further restrict the testbed for derivational complexity analysis as follows:
- remove all examples identified by the tool Oops, which participated in the competition of 2010. A list of these examples for TPDB version 8.0 can be found here. This step aims at eliminating trivial problems that
- admit only 0-ary constructors, or
- admit only 0-ary defined function symbols, or
- admit only rules with nested defined function symbols on left-hand sides, so that all basic terms are normal forms (a sketch of these three checks follows this list).
- remove all higher-order examples from the following folders, as there the notion of runtime complexity is inappropriate:
- AotoYamada_05,
- Applicative_05,
- Applicative_AG01_innermost,
- Applicative_first_order_05, and
- Mixed_HO_10.
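The three triviality criteria above are purely syntactic, so they can be checked directly on a TRS. The following Python sketch is our own reconstruction of such a check; the term representation and all helper names are ours, not taken from Oops. A term is a pair (symbol, list of arguments); variables are plain strings.

def symbols(t, acc):
    # collect every function symbol of term t together with its arity
    if isinstance(t, tuple):
        f, args = t
        acc.setdefault(f, len(args))
        for a in args:
            symbols(a, acc)
    return acc

def has_defined(t, defined):
    # does term t contain a defined function symbol?
    return isinstance(t, tuple) and (
        t[0] in defined or any(has_defined(a, defined) for a in t[1]))

def trivial(rules):
    defined = {lhs[0] for lhs, _ in rules}      # roots of left-hand sides
    arity = {}
    for lhs, rhs in rules:
        symbols(lhs, arity)
        symbols(rhs, arity)
    constructors = set(arity) - defined
    return (all(arity[c] == 0 for c in constructors)           # criterion 1
            or all(arity[d] == 0 for d in defined)             # criterion 2
            or all(any(has_defined(a, defined) for a in lhs[1])
                   for lhs, _ in rules))                       # criterion 3

# plus(0,y) -> y, plus(s(x),y) -> s(plus(x,y)): 's' is a unary constructor
# and the lhs arguments are basic, so the system is kept in the testbed.
plus = [(('plus', [('0', []), 'y']), 'y'),
        (('plus', [('s', ['x']), 'y']), ('s', [('plus', ['x', 'y'])]))]
print(trivial(plus))   # -> False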
Selection function
On the testbeds described above, the randomly chosen subset that is actually used in the competition is determined as follows. Here we denote by select the function that maps each family from the TPDB to the number of randomly chosen examples within this family, as defined for the termination competition. The idea is to make select aware of the different difficulties of proving complexity bounds. We do so by
- partitioning each family F into n different sets F = F_1 \cup ... \cup F_n, where the sets F_i may be seen as collections of TRSs of similar difficulty. For this year's competition we propose the following partitioning of a family F:
- we propose to partition each family into
- (i) those upon which a polynomial bound could be shown automatically in last years competition (denoted by F_auto below) and
- (ii) those where a polynomial bound could not be shown (F_nonauto).
- In accordance with the partitioning described above, we define a probability distribution p on F. For this year's competition we propose the following distribution:
- for all subcategories and families F, we propose p(F_auto) = 0.4 and p(F_nonauto) = 0.6. That is, we want to consider 40% of examples that could be solved automatically in last year's competition and 60% of examples that could not be solved automatically. Based on the probability distribution p we define the extended selection function select_comp(F,i) = min(|F_i|, p(i) * select(F)), where |F_i| denotes the size of F_i (a sketch of this selection follows the list).
- From each partition F_i of a family F, we randomly select select_comp(F,i) examples.
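To make the clipping and rounding explicit, here is a minimal Python sketch of the extended selection. select(F) is the termination-competition function and is assumed given; we take the floor of p(i) * select(F), which the definition above leaves unspecified.

# Sketch of the extended selection function select_comp.
import random

P = {"auto": 0.4, "nonauto": 0.6}

def select_comp(family, select_f, parts=P):
    """family: dict mapping part name ('auto'/'nonauto') to its list of TRSs;
    select_f: the number of examples select() assigns to this family."""
    chosen = []
    for name, trss in family.items():
        n = min(len(trss), int(parts[name] * select_f))   # min(|F_i|, p(i)*select(F))
        chosen += random.sample(trss, n)
    return chosen

# A family with 12 solved and 8 unsolved problems, select(F) = 10:
family = {"auto": ["a%d" % i for i in range(12)],
          "nonauto": ["n%d" % i for i in range(8)]}
print(select_comp(family, 10))   # 4 examples from 'auto' and 6 from 'nonauto'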
Test Cases
In the following test cases we restrict to full rewriting.
test cases - derivational complexity
R = {a(b(x)) -> b(a(x))}, expected output "YES(?,O(n^2))" or "YES(O(n^1),O(n^2))" or "YES(O(n^2),O(n^2))"
R = {a(a(x)) -> b(c(x)), b(b(x)) -> a(c(x)), c(c(x)) -> a(b(x))}, expected output "YES(O(n^2),?)" or "YES(?,?)"
R = {+(s(x),+(y,z)) -> +(x,+(s(s(y)),z)), +(s(x),+(y,+(z,w))) -> +(x,+(z,+(y,w)))}, expected output "YES(?,?)"
test cases - runtime complexity
R = {a(b(x)) -> b(b(a(x)))}, expected output "YES(?,O(n^1))" or "YES(O(n^1),O(n^1))"
R = {plus(0,y) -> y, plus(s(x),y) -> s(plus(x,y)), mul(0,y) -> 0, mul(s(x),y) -> plus(mul(x,y),y)}, expected output "YES(?,O(n^2))" or "YES(O(n^1),O(n^2))" or "YES(O(n^2),O(n^2))"
R = {f(x,0) -> s(0), f(s(x),s(y)) -> s(f(x,y)), g(0,x) -> g(f(x,x),x)}, expected output "YES(?,O(n^1))" or "YES(O(n^1),O(n^1))"
R = {f(0) -> c, f(s(x)) -> c(f(x),f(x))}, expected output "YES(?,?)"
In the following test cases we restrict to innermost rewriting.
test cases - derivational complexity
R = {f(x) -> c(x,x)}, expected output "YES(O(n^1),O(n^1))" or "YES(?,O(n^1))"
test cases - runtime complexity
R= {f(x) -> c(x,x), g(0) -> 0, g(s(x)) -> f(g(x))}, expected output "YES(O(n^1),O(n^1))" or "YES(?,O(n^1))"