SAT solving: Creating a solver in Java (part 2)

SAT solving: Creating a solver in Java (part 2)

This is the second blogpost in a series about SAT solving. Today we’re going to build a simple solver in Java. Before continuing, if you don’t know what a SAT solver is, read [part one][].

The algorithm we’re going to implement is called DPLL (Davis Putnam Logemann Loveland).

Input parsing

The input we’ll accept is in DIMACS format. For now we can just skip the p-line with all the settings and just parse all the numbers. Here is an example input:

p cnf 3 4
 1  2 -3  0
 2  3  0
-2  0
-1  3  0

In Java we can parse this entire file with almost a one liner (don’t do this in production code):

        // Read a DIMACS file and turn into a list of clauses:
        final List<List<Integer>> clauses = Files.lines(Paths.get(inputFile))
                // Trim the lines:
                .map(line -> line.trim().replaceAll("\\s+", " ").trim())
                // Only parse lines ending with a 0:
                .filter(line -> line.endsWith(" 0"))
                // Turn each line into a list of integers:
                .map(line -> Arrays.stream(line
                        .substring(0, line.length() - 2)
                        .trim().split("\\s+"))
                        .map(Integer::parseInt)
                        .collect(Collectors.toList())
                ).collect(Collectors.toList());

Unit propagation

The next step in creating a (slow) SAT solver is to implement the following pseudo code:

Step 1a: Find all the unit variables

Find all the unit variables (-4 0). These are clauses that have just a single (active) variable. If a clause has just a single variable, it has to be true. So in this case we mark 4 as false.

Step 1b: Process all unit variables

Now we go over all the other clauses that contain this variable -4 or the inverse (4).

If the clause has -4 (which we now know is true), we mark this entire clause as ‘satisfied’.

If the clause contains the inverse of the variable, in this case 4, we know that particular variable will never become true, so we remove it from the list.

Lets try step 1 with the following example:

 1  4  2 -3 -6 0
 2  3  0
-2  0
-1  3  5 0
-2 -4 -5 0
-1  6  0

We have a single unit variable, -2, so we mark this variable as true. Next we apply step 1b:

Known: -2

[1  4 2 -3 -6 0] becomes: [1 4 -3 -6 0]
[2  3  0] becomes: [3 0]
[-2  0] marked true (contains -2)
[-1  3  5 0] stays the same: [-1  3  5 0]
[-2 -4 -5 0] marked true (contains -2)
[-1  6  0] stays the same: [-1  6  0]

Now we have a new unit variable, 3 0. This again can be processed:

Known: -2, 3

[1 4 -3 -6 0] becomes: [1 4 -6 0]
[3  0] marked true (contains 3)
[-2  0] marked true (contains -2)
[-1  3  5 0] marked true (contains 3)
[-2 -4 -5 0] marked true (contains -2)
[-1  6  0] stays the same: [-1  6  0]

And now we are done with the unit propagation. We are now left with two clauses that both are not a unit clause.

Step 2: Gamble on a variable

First we repeat step 1 until there are no unit clauses left. Now we need to take a gamble. In this case we have unresolved: -1, 1, 4 and -6, 6. We need to pick one of these and try it.

Let’s pick 6.

Again, we process this, if a clause contains 6 we mark it as true. If a clause contains -6 we remove that variable as alive variable.

Known: -2, 3

 1  4  -6 0 becomes: 1 4 0
-1  6  0 is marked true (contains 6)

Now we go back to step 1 and repeat.

Conflict

There are two possible ways this loop will end.

1) There are no clauses left, everything it true. In this case we have reached SAT. 2) We run into a situation that we have a conflict (!)

A conflict happens when we have a situation that can’t happen. For example when we have the following situation:

-1  0
 1 -2  0
 1  2  0

In this case we have -1 as a unit clause, we assign this variable. Now we process the following lines and what happens? We are left with:

 -2  0
  2  0

This is a conflict, we can’t have -2 AND 2. It can’t be true and false at the same time.

In the case of a conflict we backtrack back to the last decision point where we gambled. Reset everything and try a different variable.

Nothing left to gamble? UNSAT

If, at a certain point, you’ve run out of options to gamble on, we are also done. We’ve proven there is no way to assign the variables to satisfy all the clauses. In this case we can return UNSAT.

Put this into code:

If you put the above algorithm into Java code you might end up with something like this:

package com.royvanrijn.boolish;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class DPLLSolver {

    public static void main(String[] args) throws Exception {
        new DPLLSolver().solve("dimacs/example.cnf");
    }

    private void solve(final String inputFile) throws Exception {

        // Read a DIMACS file and turn into a list of clauses:
        final List<Clause> clauses = Files.lines(Paths.get(inputFile))
                .map(line -> line.trim().replaceAll("\\s+", " ").trim())
                .filter(line -> line.endsWith(" 0"))
                .map(Clause::new).collect(Collectors.toList());

        List<Integer> literals = new ArrayList<>();

        processDPLL(clauses, literals);

        // If we get here, we are unable to assign SAT, so we are UNSAT:
        System.out.println("Got to the end, UNSAT");
    }

    private void processDPLL(List<Clause> clauses, List<Integer> literals) {

        // Check if we have no more clauses that are unsatisfied:
        if(!clauses.stream().filter(clause -> !clause.clauseSatisfied).findAny().isPresent()) {
            // We are done, SAT!
            exitWithSAT(literals);
        }

        List<Integer> newLiterals = new ArrayList<>();

        // Step 1a: Find unit clauses:
        List<Integer> unitPropagation = findUnitClauses(clauses);

        // Step 1b: Process the unit variables (new units can be created while processing):
        while(unitPropagation.size() > 0) {

            // Remove this unit from all clauses:
            for(Integer unit : unitPropagation) {
                // Detect conflicts: We can not have both unit and -unit in the set, if there is we've hit a conflict.
                if(unitPropagation.contains(-unit)) {
                    // Undo everything we've done and return.
                    for(Integer literal : newLiterals) {
                        System.out.println("Undo unit: " + literal);
                        undoStep(clauses, literal);
                    }
                    return;
                }
                newLiterals.add(unit);
                System.out.println("Applying unit: " + unit);
                applyStep(clauses, unit);
            }
            unitPropagation.removeAll(newLiterals);

            // Look if we've created new unit clauses:
            unitPropagation.addAll(findUnitClauses(clauses));
        }

        // Get all the unassignedLiterals from the alive clauses:
        Set<Integer> unassignedLiterals = clauses.stream()
                .filter(clause -> !clause.clauseSatisfied)
                .flatMap(clause -> clause.unassignedLiterals.stream()).collect(Collectors.toSet());

        for(Integer decisionLiteral : unassignedLiterals) {
            System.out.println("Gambling/deciding on: " + decisionLiteral);
            applyStep(clauses, decisionLiteral);

            // Go deeper with a fresh list:
            List<Integer> deeperAssignedLiterals = new ArrayList();
            deeperAssignedLiterals.addAll(literals);
            deeperAssignedLiterals.addAll(newLiterals);
            deeperAssignedLiterals.add(decisionLiteral);

            processDPLL(clauses, deeperAssignedLiterals);

            System.out.println("Undo gambling: " + decisionLiteral);
            undoStep(clauses, decisionLiteral);
        }

        // Undo all we've done this step:
        for(Integer literal : newLiterals) {
            System.out.println("Undo units: " + literal);
            undoStep(clauses, literal);
        }
    }

    /**
     * When applying a step, keep all the dead literals (variables) so we can undo.
     *
     * @param clauses
     * @param literal
     */
    private void applyStep(final List<Clause> clauses, final Integer literal) {
        for(Clause clause : clauses) {
            if(!clause.clauseSatisfied) {
                if(clause.unassignedLiterals.contains(literal)) {
                    clause.clauseSatisfied = true;
                } else if(clause.unassignedLiterals.contains(-literal)) {
                    clause.unassignedLiterals.remove((Integer) (-literal));
                    clause.deadLiterals.add(-literal);
                }
            }
        }
    }

    /**
     * Because we are backtracking we need to be able to undo all the steps in a clause, so we keep all the information
     *
     * @param clauses
     * @param literal
     */
    private void undoStep(final List<Clause> clauses, final Integer literal) {
        for(Clause clause : clauses) {
            if(clause.clauseSatisfied && clause.unassignedLiterals.contains(literal)) {
                clause.clauseSatisfied = false;
            }
            if(clause.deadLiterals.contains(-literal)) {
                clause.deadLiterals.remove((Integer) (-literal));
                clause.unassignedLiterals.add(-literal);
            }
        }
    }

    private List<Integer> findUnitClauses(final List<Clause> clauses) {
        List<Integer> unitPropagation = new ArrayList<>();
        // Check if there are unit clauses:
        for(Clause clause : clauses) {
            if(clause.isUnitClause()) {
                unitPropagation.add(clause.unassignedLiterals.get(0));
            }
        }
        return unitPropagation;
    }

    private void exitWithSAT(final List<Integer> literals) {
        // We are done:
        System.out.println("SAT");

        // Sort the output as absolute values.
        Collections.sort(literals, Comparator.comparingInt(Math::abs));

        // TODO: We might not list all the input variables here, some are not needed and can be + or -.

        // And print:
        System.out.println(literals.stream().map(n -> String.valueOf(n)).collect(Collectors.joining(" ")) + " 0");

        // Bye bye.
        System.exit(1);
    }

    public class Clause {

        private List<Integer> unassignedLiterals = new ArrayList<>();
        private List<Integer> deadLiterals = new ArrayList<>();
        private boolean clauseSatisfied = false;

        private Clause(String inputLine) {
            unassignedLiterals.addAll(
                    Arrays.stream(inputLine
                            .substring(0, inputLine.length() - 2)
                            .trim().split("\\s+"))
                            .map(Integer::parseInt)
                            .collect(Collectors.toList()));
        }

        boolean isUnitClause() {
            return !clauseSatisfied && unassignedLiterals.size() == 1;
        }
    }
}

This is all. This algorithm can solve any problem that can be turned into a SAT problem, which are a lot of mathematical problems.

DPPL isn’t the fastest algorithm, a lot of improvements can be made (part 3?). For example we can learn new clauses during analysis, and instead of backtracking to the last decision moment, sometimes you can backtrack to a much earlier decision moment, making the resolution much faster.

But the basic algorithm to solve SAT, is very simple indeed.

part one