- CSE 1.0
- CSE 1.1
- CSE_E 1.0
- CVC4 1.6pre
- E 2.2pre
- Geo-III 2018C
- Grackle 0.1
- iProver 2.6
- iProver 2.8
- leanCoP 2.2
- LEO-II 1.7.0
- Leo-III 1.3
- MaLARea 0.6
- nanoCoP 1.1
- Princess 170717
- Prover9 1109a
- Satallax 3.2
- Satallax 3.3
- Twee 2.2
- Vampire 4.0
- Vampire 4.1
- Vampire 4.2
- Vampire 4.3

Southwest Jiaotong University, China

Ulster University, United Kingdom (Jun Liu)

CSE 1.0 adopts conventional factoring, equality resolution, and variable renaming. It also applies some pre-processing techniques, including pure literal deletion and simplification based on the distance to the goal clause, as well as a number of standard redundancy criteria for pruning the search space: tautology deletion and forward and backward subsumption.

Internally, CSE 1.0 works only with clausal normal form. The E prover [Sch13] is adopted, with thanks, for clausification of full first-order logic problems during preprocessing.

- Clause selection. This strategy category mainly considers the number of times a clause has taken part in inferences, clause weight (taking redundancy into account), term weight, clause complexity, the number of literals, the number of complementary predicates, whether the clause is a source clause or an inferred clause, and so on.
- Literal selection. This strategy category mainly considers the number of times a literal has taken part in inferences, literal stability, literal complexity, the number of predicates in the literal, etc.
- Weight strategy. Clause weights are calculated mainly from the number of times a clause has taken part in the deduction process and from clause redundancy. The weights are updated dynamically during the deduction process.
- Contradiction separation clause (CSC) strategy. This strategy category mainly considers the number of literals in the CSC, the percentage of ground literals in the CSC, the number of function symbols acting on the literals in a clause, and an effectiveness evaluation of the CSC.
- Control strategy for the contradiction separation deduction process. The number of literals in the CSC and the number of clauses involved in each contradiction are changed dynamically during the deduction process.

Southwest Jiaotong University, China

Ulster University, United Kingdom (Jun Liu)

CSE 1.1 adopts conventional factoring, equality resolution, and variable renaming. It also applies some pre-processing techniques, including pure literal deletion and simplification based on the distance to the goal clause, as well as a number of standard redundancy criteria for pruning the search space: tautology deletion and forward and backward subsumption.

Internally, CSE 1.1 works only with clausal normal form. The E prover [Sch13] is adopted, with thanks, for clausification of full first-order logic problems during preprocessing.

- Deduction framework. This provides two overall options for S-CS deduction: integrity deduction mode, which takes all the clauses into consideration during the deduction process, and contradiction separation clause deduction mode, which considers only a subset of clauses.
- Repeat usage of clauses. This category provides two options: repeat usage of axioms and repeat usage of clauses.
- Contradiction separation clause strategy. Besides the CSC strategies in CSE 1.0, CSE 1.1 allows the use of intermediate CSCs generated during the contradiction construction process.

Southwest Jiaotong University, China

DHBW Stuttgart, Germany (Stephan Schulz)

Ulster University, United Kingdom (Jun Liu)

This combination is expected to take advantage of both CSE and E, and to produce better performance. Concretely, CSE is able to generate a good number of unit clauses, which are helpful for proof search and equality handling; E, in turn, has strong equality handling capabilities.

- Evaluation of contradiction separation clause. The evaluation is based on the number of clauses, the distance to the goal clause, and other factors.
- Deletion of contradiction separation clause. This is realized based on weights. Three weight calculation methods are provided, considering variables, functions, terms, and the number of times a clause has been involved in the deduction.

University of Iowa, USA

https://github.com/CVC4

DHBW Stuttgart, Germany

For the LTB division, a control program uses a SInE-like analysis to extract reduced axiomatizations that are handed to several instances of E. E will probably not use on-the-fly learning this year.

For CASC-J9, E implements a strategy-scheduling automatic mode. The total CPU time available is broken into several (unequal) time slices. For each time slice, the problem is classified into one of several classes, based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses, ...). For each class, a schedule of strategies is greedily constructed from experimental data as follows: the first strategy assigned to a schedule is the one that solves the most problems from this class in the first time slice. Each subsequent strategy is selected based on the number of solutions on problems not already solved by a preceding strategy. About 220 different strategies have been evaluated on all untyped first-order problems from TPTP 6.4.0. About 90 of these strategies are used in the automatic mode, and about 210 are used in at least one schedule.
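The greedy schedule construction described above can be sketched as follows; the strategy names and per-strategy solved-problem sets are invented for illustration, not E's actual evaluation data:

```python
def build_schedule(solved_by, num_slices):
    """solved_by maps a strategy name to the set of problems it solves
    within one time slice; greedily pick, for each slice, the strategy
    that adds the most problems not solved by earlier picks."""
    schedule, covered = [], set()
    for _ in range(num_slices):
        best = max(solved_by, key=lambda s: len(solved_by[s] - covered))
        if not solved_by[best] - covered:
            break  # no remaining strategy solves anything new
        schedule.append(best)
        covered |= solved_by[best]
    return schedule

solved_by = {
    "auto1": {"p1", "p2", "p3"},
    "auto2": {"p3", "p4"},
    "auto3": {"p2", "p5"},
}
print(build_schedule(solved_by, 3))  # → ['auto1', 'auto2', 'auto3']
```

This is the classic greedy set-cover heuristic; ties go to whichever strategy is considered first.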

http://www.eprover.org

Nazarbayev University, Kazakhstan

- The Geo family of provers uses exhaustive backtracking, in combination with learning after failure. Earlier versions (before 2016) learned only conflict formulas. Geo III learns disjunctions of arbitrary width. Experiments show that this often results in shorter proofs.
- If Geo is ever embedded in proof assistants, these assistants will require proofs. In order to be able to provide these at the required level of detail, Geo III contains a hierarchy of proof rules that is independent of the rest of the system, and that can be modified independently.
- In order to be more flexible in the main algorithm, recursive backtracking has been replaced by use of a stack. By using a stack, it has become possible to implement non-chronological backtracking, remove unused assumptions, or to rearrange the order of assumptions. Also, restarts are easier to implement with a stack.
- Matching a geometric formula into a candidate model is a critical operation in Geo. Compared to previous versions, the matching algorithm has been improved theoretically, reimplemented, and is no longer a bottleneck.

https://cs-sst.github.io/faculty/nivelle/implementation/index

Czech Technical University in Prague, Czech Republic

https://github.com/ai4reason/atpy

The Grackle system is a part of this library. Grackle itself uses ParamILS to improve an existing strategy. Grackle requires a set of reasonably performing E prover strategies to start with. These are extracted from E's auto mode and from previous Grackle/BliStrTune runs on a subset of the TPTP library. The code of ATPy and Grackle is released under GPL2.

University of Manchester, United Kingdom

In the LTB and SLH divisions, iProver combines an abstraction-refinement framework [HK17] with axiom selection based on the SinE algorithm [HV11] as implemented in Vampire [KV13], i.e., axiom selection is done by Vampire and proof attempts are done by iProver.
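The SInE idea referenced above can be sketched roughly as follows: an axiom is "triggered" by its rarest symbols, and selection starts from the conjecture's symbols and closes under triggering. The tolerance value and the data are illustrative; this is not iProver's or Vampire's actual implementation:

```python
from collections import Counter

def sine_select(axioms, conjecture_symbols, tolerance=1.5):
    """axioms maps an axiom name to the set of symbols it uses.
    A symbol triggers an axiom if its global occurrence count is
    within `tolerance` of the rarest symbol in that axiom."""
    occ = Counter()
    for syms in axioms.values():
        occ.update(syms)
    trigger = {name: {s for s in syms
                      if occ[s] <= tolerance * min(occ[t] for t in syms)}
               for name, syms in axioms.items()}
    selected = set()
    seen = set(conjecture_symbols)
    frontier = set(conjecture_symbols)
    while frontier:
        sym = frontier.pop()
        for name, trig in trigger.items():
            if name not in selected and sym in trig:
                selected.add(name)
                new = axioms[name] - seen
                seen |= new
                frontier |= new
    return selected

print(sine_select({'a1': {'p', 'q'}, 'a2': {'q', 'r'}, 'a3': {'s'}}, {'p'}))
# → {'a1'}
```

Here 'a1' is selected because the rare symbol 'p' triggers it, while 'a2' is not: its only trigger symbol 'r' is never reached, and the common symbol 'q' does not trigger it.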

Some of iProver's features are summarised below.

- proof extraction for both instantiation and resolution [KS12],
- model representation, using first-order definitions in term algebra [KS12],
- answer substitutions,
- semantic filtering,
- incremental finite model finding,
- sort inference, monotonic [CLS11] and non-cyclic [Kor13] sorts,
- support for the TFF format restricted to clauses,
- predicate elimination [KK16].

http://www.cs.man.ac.uk/~korovink/iprover/

University of Manchester, United Kingdom

University of Oslo, Norway

leanCoP can read formulae in leanCoP syntax and in TPTP first-order syntax. Equality axioms and axioms to support distinct objects are automatically added if required. The leanCoP core prover returns a very compact connection proof, which is then translated into a more comprehensive output format, e.g., into a lean (TPTP-style) connection proof or into a readable text proof.

The source code of leanCoP 2.2 is available under the GNU general public license. It can be downloaded from the leanCoP website at:

http://www.leancop.de

The website also contains information about ileanCoP [Ott08] and MleanCoP [Ott12, Ott14], two versions of leanCoP for first-order intuitionistic logic and first-order modal logic, respectively.

Freie Universität Berlin, Germany

Unfortunately, the LEO-II system still uses only a very simple sequential collaboration model with first-order ATPs, instead of the more advanced, concurrent and resource-adaptive OANTS architecture [BS+08] exploited by its predecessor LEO.

The LEO-II system is distributed under a BSD style license, and it is available from

http://www.leoprover.org

Freie Universität Berlin, Germany

Leo-III heavily relies on cooperation with external (first-order) ATPs that are called asynchronously during proof search. At the moment, first-order cooperation focuses on typed first-order (TFF) systems, where CVC4 [BC+11] and E [Sch02,Sch13] are used as default external systems. Nevertheless, cooperation is not limited to first-order systems. Further TPTP/TSTP-compliant external systems (such as higher-order ATPs or counter model generators) may be included using simple command-line arguments. If the saturation procedure loop (or one of the external provers) finds a proof, the system stops, generates the proof certificate and returns the result.

The term data structure of Leo-III uses a polymorphically typed spine term representation augmented with explicit substitutions and De Bruijn-indices. Furthermore, terms are perfectly shared during proof search, permitting constant-time equality checks between alpha-equivalent terms.

Leo-III's saturation procedure may at any point invoke external reasoning tools. To that end, Leo-III includes an encoding module which translates (polymorphic) higher-order clauses to polymorphic or monomorphic typed first-order clauses, whichever is supported by the external system. While LEO-II relied on cooperation with untyped first-order provers, Leo-III exploits the native type support in first-order provers (TFF logic), removing clutter during translation and, in turn, making external cooperation more effective.

Leo-III is available on GitHub:

https://github.com/leoprover/Leo-III

Czech Technical University in Prague, Czech Republic

https://github.com/JUrban/MPTP2/tree/master/MaLARea

The metasystem's Perl code is released under GPL2.

University of Oslo, Norway

nanoCoP can read formulae in leanCoP/nanoCoP syntax and in TPTP first-order syntax. Equality axioms are automatically added if required. The nanoCoP core prover returns a compact non-clausal connection proof.

The source code of nanoCoP 1.1 is available under the GNU general public license. It can be downloaded from the nanoCoP website at:

http://www.leancop.de/nanocop

The provers nanoCoP-i and nanoCoP-M are versions of nanoCoP for first-order intuitionistic logic and first-order modal logic, respectively. They are based on an adapted non-clausal connection calculus for non-classical logics [Ott17].

Uppsala University, Sweden

The internal calculus of Princess only supports uninterpreted predicates; uninterpreted functions are encoded as predicates, together with the usual axioms. Through appropriate translation of quantified formulae with functions, the e-matching technique common in SMT solvers can be simulated; triggers in quantified formulae are chosen based on heuristics similar to those in the Simplify prover.

Princess is available from:

http://www.philipp.ruemmer.org/princess.shtml

University of New Mexico, USA

Prover9 provides positive ordered (and nonordered) resolution and paramodulation, negative ordered (and nonordered) resolution, factoring, positive and negative hyperresolution, UR-resolution, and demodulation (term rewriting). Terms can be ordered with LPO, RPO, or KBO. Selection of the "given clause" is by an age-weight ratio.
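Given-clause selection by an age-weight ratio can be sketched as follows; the two-heap layout and the symbol-count weights are illustrative, not Prover9's actual data structures:

```python
import heapq
from itertools import count

class GivenClauseQueue:
    """Out of every (age + weight) picks, `age` picks take the oldest
    unprocessed clause and `weight` picks take the lightest one."""
    def __init__(self, age=1, weight=4):
        self.age, self.weight = age, weight
        self.by_age, self.by_weight = [], []
        self.tick = count()   # insertion order stamp
        self.picks = 0
        self.done = set()     # clauses already selected

    def add(self, clause, w):
        t = next(self.tick)
        heapq.heappush(self.by_age, (t, clause))
        heapq.heappush(self.by_weight, (w, t, clause))

    def pop(self):
        use_age = (self.picks % (self.age + self.weight)) < self.age
        self.picks += 1
        heap = self.by_age if use_age else self.by_weight
        while heap:
            clause = heapq.heappop(heap)[-1]
            if clause not in self.done:   # skip clauses taken via the other heap
                self.done.add(clause)
                return clause
        return None  # chosen heap exhausted (sketch keeps this simple)

q = GivenClauseQueue(age=1, weight=2)
q.add('c1', 10); q.add('c2', 1); q.add('c3', 5)
print([q.pop(), q.pop(), q.pop()])  # → ['c1', 'c2', 'c3']
```

With ratio 1:2, the first pick is by age ('c1', the oldest), then two picks by weight ('c2' and 'c3', lightest first).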

Proofs can be given at two levels of detail: (1) standard, in which each line of the proof is a stored clause with a detailed justification, and (2) expanded, with a separate line for each operation. When FOF problems are input, a proof of the transformation to clauses is not given.

Completeness is not guaranteed, so termination does not indicate satisfiability.

Given a problem, Prover9 adjusts its inference rules and strategy according to syntactic properties of the input clauses such as the presence of equality and non-Horn clauses. Prover9 also does some preprocessing, for example, to eliminate predicates.

For CASC Prover9 uses KBO to order terms for demodulation and for the inference rules, with a simple rule for determining symbol precedence.

For the FOF problems, a preprocessing step attempts to reduce the problem to independent subproblems by a miniscope transformation; if the problem reduction succeeds, each subproblem is clausified and given to the ordinary search procedure; if the problem reduction fails, the original problem is clausified and given to the search procedure.
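The miniscope-and-split step can be illustrated with a toy transformation that pushes universal quantifiers over conjunctions and then splits the result into independent subproblems; Prover9's actual miniscoping is more general (formulas here are nested tuples, and only the ∀-over-∧ rule is shown):

```python
def miniscope(f):
    """Push 'forall' inward over 'and'; atoms pass through unchanged."""
    if f[0] == 'forall':
        _, x, body = f
        body = miniscope(body)
        if body[0] == 'and':
            return ('and', miniscope(('forall', x, body[1])),
                           miniscope(('forall', x, body[2])))
        return ('forall', x, body)
    if f[0] == 'and':
        return ('and', miniscope(f[1]), miniscope(f[2]))
    return f

def split(f):
    """Split a top-level conjunction into independent subproblems."""
    if f[0] == 'and':
        return split(f[1]) + split(f[2])
    return [f]

f = ('forall', 'x', ('and', ('p', 'x'), ('q', 'x')))
print(split(miniscope(f)))
# → [('forall', 'x', ('p', 'x')), ('forall', 'x', ('q', 'x'))]
```

Proving ∀x.(P(x) ∧ Q(x)) reduces to proving ∀x.P(x) and ∀x.Q(x) separately, so each subproblem can be clausified and searched independently.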

http://www.cs.unm.edu/~mccune/prover9/

Universität Innsbruck, Austria

Proof search: A branch is formed from the axioms of the problem and the negation of the conjecture (if any is given). From this point on, Satallax tries to determine unsatisfiability or satisfiability of this branch. Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro13]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable. Optionally, Satallax generates first-order formulae in addition to the propositional clauses. If this option is used, then Satallax periodically calls the first-order theorem prover E [Sch13] to test for first-order unsatisfiability. If the set of first-order formulae is unsatisfiable, then the original branch is unsatisfiable. Upon request, Satallax attempts to reconstruct a proof which can be output in the TSTP format. The proof reconstruction has been significantly changed since Satallax 3.0 in order to make proof reconstruction more efficient and thus less likely to fail within the time constraints.
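The interplay between clause generation and periodic propositional unsatisfiability testing can be sketched as follows; a brute-force test stands in for MiniSat, and the integer-literal clause encoding is illustrative:

```python
from itertools import product

def unsatisfiable(clauses):
    """Brute-force propositional UNSAT test (stand-in for MiniSat).
    Clauses are sets of signed atoms: +n means atom n, -n its negation."""
    atoms = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(atoms)):
        val = dict(zip(atoms, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in clauses):
            return False  # satisfying assignment found
    return True

def prove(clause_stream):
    """Keep adding propositional clauses (abstractions of tableau-rule
    instances) until the clause set becomes unsatisfiable."""
    clauses = []
    for clause in clause_stream:
        clauses.append(clause)
        if unsatisfiable(clauses):
            return True   # the original branch is unsatisfiable
    return False          # abstraction stayed satisfiable; give up

print(prove([{1}, {-1, 2}, {-2}]))  # → True
```

The key property is one-directional: if the propositional abstraction is unsatisfiable, so is the higher-order branch, but a satisfiable abstraction proves nothing until more instances are generated.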

http://satallaxprover.com

Universität Innsbruck, Austria

Proof search: A branch is formed from the axioms of the problem and the negation of the conjecture (if any is given). From this point on, Satallax tries to determine unsatisfiability or satisfiability of this branch. Satallax progressively generates higher-order formulae and corresponding propositional clauses [Bro13]. These formulae and propositional clauses correspond to instances of the tableau rules. Satallax uses the SAT solver MiniSat to test the current set of propositional clauses for unsatisfiability. If the clauses are unsatisfiable, then the original branch is unsatisfiable. Optionally, Satallax generates first-order formulae in addition to the propositional clauses. If this option is used, then Satallax periodically calls the first-order theorem prover E [Sch13] to test for first-order unsatisfiability. If the set of first-order formulae is unsatisfiable, then the original branch is unsatisfiable. Upon request, Satallax attempts to reconstruct a proof which can be output in the TSTP format.

http://satallaxprover.com

Chalmers University of Technology, Sweden

Twee's implementation of ground joinability testing performs case splits on the order of variables, in the style of [MN90], and discharges individual cases by rewriting modulo a variable ordering. It is able to pick only useful case splits and to case split on a subset of the variables, which makes it efficient enough to be switched on unconditionally.

Horn clauses are encoded as equations as described in [CS18]. The CASC version of Twee "handles" non-Horn clauses by discarding them.

The main loop is a DISCOUNT loop. The active set contains rewrite rules and unorientable equations, which are used for rewriting, and the passive set contains unprocessed critical pairs. Twee often interreduces the active set, and occasionally simplifies the passive set with respect to the active set. Each critical pair is scored using a weighted sum of the weight of both of its terms. Terms are treated as DAGs when computing weights, i.e., duplicate subterms are only counted once per term. The weights of critical pairs that correspond to Horn clauses are adjusted by the heuristic described in [CS18], section 5.
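Counting duplicate subterms only once, as when terms are treated as DAGs, can be sketched as follows (every symbol weighs 1 here; Twee's real weighting is richer, and critical pairs are scored by a weighted sum over both terms):

```python
def dag_weight(term):
    """Weight of a term with each distinct subterm counted once.
    Terms are nested tuples: ('f', t1, t2, ...) with 0-ary leaves."""
    seen = set()
    def walk(t):
        if t in seen:
            return 0      # shared subterm: already counted
        seen.add(t)
        return 1 + sum(walk(s) for s in t[1:])
    return walk(term)

# f(g(a), g(a)) has tree weight 5 but DAG weight 3:
t = ('f', ('g', ('a',)), ('g', ('a',)))
print(dag_weight(t))  # → 3
```

The second occurrence of g(a) contributes nothing, so heavily shared terms score much lower than their tree size suggests.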

The passive set is represented as a heap. It achieves high space efficiency (12 bytes per critical pair) by storing the parent rule numbers and overlap position instead of the full critical pair and by grouping all critical pairs of each rule into one heap entry.
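The compact passive-set idea can be sketched as follows; the `recompute` callback stands in for the real critical-pair computation from parent rules, and the 12-byte packing (and per-rule grouping) is not modelled:

```python
import heapq

class PassiveSet:
    """Each heap entry stores only a score plus the parent rule numbers
    and overlap position; the full critical pair is recomputed from the
    parent rules only when the entry is selected."""
    def __init__(self, recompute):
        self.heap = []
        self.recompute = recompute

    def add(self, score, rule1, rule2, pos):
        heapq.heappush(self.heap, (score, rule1, rule2, pos))

    def pop_best(self, rules):
        score, r1, r2, pos = heapq.heappop(self.heap)
        # Rebuild the critical pair on demand instead of storing it.
        return self.recompute(rules[r1], rules[r2], pos)

# Toy usage: "rules" are strings and recomputation just pairs them up.
rules = ["f(x)=g(x)", "g(x)=x"]
ps = PassiveSet(lambda a, b, pos: (a, b, pos))
ps.add(7, 0, 1, 0)
ps.add(3, 1, 1, 1)
print(ps.pop_best(rules))  # → ('g(x)=x', 'g(x)=x', 1)
```

Storing indices instead of terms trades a little recomputation on selection for a large reduction in memory per passive entry.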

Twee uses an LCF-style kernel: all rules in the active set come with a certified proof object which traces back to the input axioms. When a conjecture is proved, the proof object is transformed into a human-readable proof. Proof construction does not harm efficiency because the proof kernel is invoked only when a new rule is accepted. In particular, reasoning about the passive set does not invoke the kernel. The translation from Horn clauses to equations is not yet certified.

Twee can be downloaded from:

http://nick8325.github.io/twee

University of Manchester, United Kingdom

A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The reduction ordering is the Knuth-Bendix Ordering. Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Internally, Vampire works only with clausal normal form. Problems in the full first-order logic syntax are clausified during preprocessing. Vampire implements many useful preprocessing transformations including the SinE axiom selection algorithm.

When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

- Choices of saturation algorithm:
  - Limited Resource Strategy
  - DISCOUNT loop
  - Otter loop
  - Instantiation using the Inst-Gen calculus
  - MACE-style finite model building with sort inference
- Splitting via AVATAR.
- A variety of optional simplifications.
- Parameterized reduction orderings.
- A number of built-in literal selection functions and different modes of comparing literals.
- Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection.
- Set-of-support strategy.
- Ground equational reasoning via congruence closure.
- Evaluation of interpreted functions.
- Extensionality resolution with detection of extensionality axioms.

University of Manchester, United Kingdom

A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The reduction ordering is the Knuth-Bendix Ordering. Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Internally, Vampire works only with clausal normal form. Problems in the full first-order logic syntax are clausified during preprocessing. Vampire implements many useful preprocessing transformations including the SinE axiom selection algorithm.

When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

- Choices of saturation algorithm:
  - Limited Resource Strategy [RV03]
  - DISCOUNT loop
  - Otter loop
  - Instantiation using the Inst-Gen calculus
  - MACE-style finite model building with sort inference
- Splitting via AVATAR.
- A variety of optional simplifications.
- Parameterized reduction orderings.
- A number of built-in literal selection functions and different modes of comparing literals.
- Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection.
- Set-of-support strategy.
- Ground equational reasoning via congruence closure.
- Addition of theory axioms and evaluation of interpreted functions.
- Use of Z3 [dMB08] with AVATAR to restrict search to ground-theory-consistent splitting branches.
- Extensionality resolution [GK+14] with detection of extensionality axioms.

University of Manchester, United Kingdom

A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The reduction ordering is the Knuth-Bendix Ordering. Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Internally, Vampire works only with clausal normal form. Problems in the full first-order logic syntax are clausified during preprocessing. Vampire implements many useful preprocessing transformations including the SinE axiom selection algorithm. When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

- Choices of saturation algorithm:
  - Limited Resource Strategy [RV03]
  - DISCOUNT loop
  - Otter loop
  - Instantiation using the Inst-Gen calculus
  - MACE-style finite model building with sort inference
- Splitting via AVATAR [Vor14].
- A variety of optional simplifications.
- Parameterized reduction orderings.
- A number of built-in literal selection functions and different modes of comparing literals [HR+16].
- Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection.
- Set-of-support strategy.
- Ground equational reasoning via congruence closure.
- Addition of theory axioms and evaluation of interpreted functions.
- Use of Z3 with AVATAR to restrict search to ground-theory-consistent splitting branches [RB+16].
- Specialised theory instantiation and unification.
- Extensionality resolution with detection of extensionality axioms.

University of Manchester, United Kingdom

This description is very similar to that of Vampire 4.2. The main difference is the use of theory instantiation and unification with abstraction [RSV18] for theory reasoning (this was experimental in 4.2). The set-of-support strategy for theory reasoning has also been extended. Little has changed in other areas of Vampire. As always, there have been some small improvements to heuristics, data structures, and schedules, but nothing fundamentally new.

A number of standard redundancy criteria and simplification techniques are used for pruning the search space: subsumption, tautology deletion, subsumption resolution and rewriting by ordered unit equalities. The reduction ordering is the Knuth-Bendix Ordering. Substitution tree and code tree indexes are used to implement all major operations on sets of terms, literals and clauses. Internally, Vampire works only with clausal normal form. Problems in the full first-order logic syntax are clausified during preprocessing. Vampire implements many useful preprocessing transformations including the SinE axiom selection algorithm. When a theorem is proved, the system produces a verifiable proof, which validates both the clausification phase and the refutation of the CNF.

- Choices of saturation algorithm:
  - Limited Resource Strategy [RV03]
  - DISCOUNT loop
  - Otter loop
  - Instantiation using the Inst-Gen calculus
  - MACE-style finite model building with sort inference
- Splitting via AVATAR [Vor14].
- A variety of optional simplifications.
- Parameterized reduction orderings.
- A number of built-in literal selection functions and different modes of comparing literals [HR+16].
- Age-weight ratio that specifies how strongly lighter clauses are preferred for inference selection.
- Set-of-support strategy.
- Ground equational reasoning via congruence closure.
- Addition of theory axioms and evaluation of interpreted functions.
- Use of Z3 with AVATAR to restrict search to ground-theory-consistent splitting branches [RB+16].
- Specialised theory instantiation and unification [RSV18].
- Extensionality resolution with detection of extensionality axioms.