# October 10, 2008

## SAT and the quest for the perfect study subject…

This must be the oldest post in my drafts bucket. Its creation (the title, I mean, because the body as been empty up until today) dates from May 23rd, 2005.

It was always my wish to write more about my research work but for one reason or another I end up writing about the most uninteresting topics and failing to discuss what I think to be one of the most interesting research topics out there (oh yeah, flame me…).

SAT is the short name for the SATisfiability problem (or Boolean Satisfiability Problem). The problem is: Given a propositional formula… yes, those like ((x => y) /\ ~x)… is there an assignment to the variables of the formula such that it evaluates to true?

This is the problem, simple, huh, well, simple indeed to state it but not so easy to solve it. Anyway, when there is such an assignment we say the formula is *sat* (for satisfiable), otherwise it is *unsat* (for unsatisfiable). In the example above the formula is *sat* because a value of false assigned to x and a value of false assigned to y makes the formula true. A trivially *unsat* formula would be (x /\ ~x). You see, this was the first problem to be proved NP-Complete by Cook in 1971. This is a very interesting class of problems since every NP problem can be converted into SAT and solved using a SAT solver (a software that uses the SAT problem). Sometimes it happens that these properties are interesting theoretical results with no practical application but this is not the case for SAT. Many problems are indeed reduced to SAT in order to be solved by a SAT solver. This is because current state of the art SAT solvers are so good that a problem which would take a near infinite amount of time to solve a decade ago, now takes a reasonable time to solve. We are, obviously, talking about problems where the number of variables amounts not to dozens, or even thousands but to millions!

Understand that if you apply a trial-and-error search for an assignment in a formula with 1000000 (one million) variables, in the worst case the formula is *unsat*, and you have to test 2^1000000 assignments (oh yeah, that’s a big number). However, it’s not uncommon for SAT solvers to prove the unsatisfiability of a formula that big.

It was by chance that I came to work in subjects related to SAT… it was also by chance that I came to work with my supervisor (Prof. Joao Marques-Silva), which lingers among the few that know SAT solvers inside out for a very long time.

SAT is fascinating, not only because it is a problem like the 3n+1 problem (easy to state, hard to solve), but also because of the huge application it has in both academic and industrial settings. In between a long list of applications, my favourites are:

- Model Checking, which includes Software and Hardware Verification
- Test Case Generation
- Haplotype Inference in Computational Biology
- Model Finding in Soft Formal Methods

Obviously, being such an interesting problem, several related problems exist which base themselves on SAT:

- SMT – Can we solve the SAT problem on propositional logic + decidable theories?
- #SAT – How many solutions a
*sat*formula has? - SAT Enumeration – Which solutions a
*sat*formula has?

Besides the problems related with research and commercial application, programming is certaintly a much important issue in SAT. Which data structures and algorithms to use, how to program it, which language? A yearly competition puts to test the best SAT solvers and it takes more than just reading a couple of papers to have the best solver. You need to be clever while programming, while choosing the data structures, the algorithms, and still factoring in how extensible, usable, you want it to be… How much more can you ask from a research subject?

Ricardo Correia said,

October 11, 2008 at 1:03 pm

Hi Paulo,

I have been very interested in this subject for many years.

Personally, I find the best way to do research on such SAT solving algorithms is to apply them to boolean circuits (without loops), because they have very simple semantics and have a nice visual structure, which makes them easier to visualize and do research on.

Furthermore, solving a difficult circuit, with say, around 200,000 intermediate nodes, such as the one that calculates the SHA1 hash of a set of boolean variables (though you could call it a boolean tree, because the circuit has the shape of a tree), would have huge practical implications.

On a related note, another subject that also interests me greatly is artificial intelligence. A few months ago, I ran into a video of a Google Tech Talk from Geoffrey Hinton explaining his research into artificial neural networks, and the results he has achieved.

I found that his results were *truly* fascinating, you should see them for yourself (it’s about 1 hour long, but please bear with it, I think it’s well worth it): http://www.youtube.com/watch?v=AyzOUbkUf3M

Considering how neural networks are so similarly structured to boolean circuits/trees, I wonder if ideas from his research could be directly or indirectly applied to solving SAT?

pmatos said,

October 12, 2008 at 9:28 pm

Thanks for the video URL, I will look into it, however, considering the heavy research SAT solving has been under in the last few decade I do doubt there is something in NNs that might be applied to speed-up SAT solving.