# June 20, 2006

## SMT for everyone…

**or** *SMT in 20 minutes*!

**The Beginning**

In fact, this is my attempt to create a SMT tutorial, which everyone is able to grasp. In fact, the only requirement is basic knowledge of Mathematics and basic knowledge of the English language.

This tutorial will serve as an anchor to everyone which will, in the future, ask me what is my research all about. If you asked me that recently, then that’s why I told you to come here. It will be brief and simple but at least you’ll understand what I’m currently working on, from a non-specific point of view. From a specific point of view I’d have to write a tutorial on “*Architectural Design for Lazy SMT Multi-Theory Solvers in 2 years*” or something! (Yes, I tried to find a very complex title just for the fun of it…) :-)

**Propositional Logic**

OK, so, *yesterday I went to the beach* **and** *to the cinema*. Note that although, yesterday I went to the cinema, I did not went to the beach (which was a pity), but anyway, the sentence is **false**! For it to be true, I had to go both to the cinema and to the beach, i.e. **both** conditions had to be true. This is intuitive, right?

If had used an **or** I would have said *tommorrow I will go to the beach* **or** *to the cinema*. If I happen to go to one, or both of those places the sentence is *true*, otherwise, i.e. I go neither to the beach nor the cinema tommorrow, it is *false*. Note that usually in current speech when we say *this* **or** *that*, we mean one thing or the other but not both. In mathematics, this is not the case. An **or** is true whenever one or both things happen. The kind of **or** we usually use in current speech is known as **xor** (which stands for *exclusive or*).

We can instead of using sentences, which have a *true* of *false* value, use letters for variables which can have *true* of *false* values depending on their meaning. So, briefly, A **or** B is *false* if both A is *false* and B is *false* but *true* otherwise. A **and** B is true if A is *true* and B is *true* and *false* otherwise. Note that we can define the **and** in terms of the **or** and vice-versa. We also have the negation (**not**) operator **~**. **~**A is *false* if A is *true* and *true* otherwise. We can define many other operators in terms of these, like implication (If this then that), equivalence (this if and only if that), exclusive or (this or that but not both), and so on.. In practice, we can translate a sentence with a lot of operators into a sentence which maintains it’s truthness (it’s value of true of false) with only **and**s, **or**s and **not**s.

The truth value of a sentence depends usually on its variables:

(A **or** **~**(B **and** C)) cannot be said *true* of *false*. It depends on the values for the variables. If I tell you that A is *false*, B is *false* and C is *false* then you can say that the sentence is *true*. Note that we do not always need to specify the values of all the variables in the sentence to know the sentence value. If I say that A is *true*, no matter the values of B and C, the sentence is *true*. However, there are (evil) sentences which are *false* no matter what (kind’a like some people). For example, (A **and** **~**A). There are obviously some which are *true* no matter what, (A **or** **~**A).

Now, imagine that I give you the formula:

**~**(A **or** (B **and** **~**A)) **and** **~**(**~**B **and** A)

Now, can you find values for this sentence which makes it *true*, or is it always *false*?

(… trumpets …)

Ah, you’re trying to solve the SAT(isfiability) problem for that sentence. The formula is satisfiable if there are some values to the variables in the sentence (which we usually call formula) which make it true. If the formula is false no matter what, we say that the formula is unsatisfiable.

Who would imagine that this is one of the major problems in Computer Science? And it turns out to be pretty hard (which is fortunate since it turns out to employ a lot of colleagues of mine).

So, now, if someone tells you that it is working on SAT, you should say: “Ah, that’s nice!” instead of “Huh? What’s that?” and imagine that same person day after day trying to find new ways to solve this problem as fast as possible for huge formulas. Humm, so, let me give you an idea… Imagine that I write the formula on one straight line, in toilet paper, where a character is square and takes up the whole height of the toilet paper. Then, you would need about 180Km of toilet paper to write the whole formula… and that’s what I mean by huge.

Since, a lot of development has been done in SAT recently, there are a lot of people translating their problems in to SAT so that they are able to solve them using current efficient SAT solvers.

**Adding a Mathematical Component**

Problem is… there are problems which are not easily translated into SAT since they have inherently mathematical contraints which are hard to translate to SAT under certain circunstances. So, people decided to extend the SAT are to what is currently known as the SAT Modulo Theories (SMT) problem.

Now, in place of formulas like the ones in the last section which are composed of variables which are true or false, you can also have mathematical constraints like 2x + 3y >= 4, or f(a) – f(b) = 2, or even a[2] > 5.

What happened is that we substituted the Boolean variables (variables which can only have true or false values) into more complex expression, but otherwise more expressive which still have themselves a Boolean value. These expressions can belong to any decidable theory, linear arithmetic, uninterpreted functions and theory of arrays respectively (from the examples above).

So, the SMT formula syntax depends on the background theory (or theories) we are considering. If we consider Integer Linear Arithmetic, (x + 2y >= 3 and x < 3) or B is correct if the x,y are integers but is not correct if we consider Integer Difference Logic.

Now, we can use mix the theories and the expressions, generate arbitrarily large formulas to solve specific problems and check for their satisfiability like we did for SAT. In SAT we would search for a solution and if one existed we could output a value find for each of the variables which makes the formula true (satisfies the formula), and these values would be Boolean. Now, the values can be integer, real, Boolean, whatever.

The requirement is that the main underlying logic of these formulas is a many-sorted version of first order logic, which still leaves us the possibility of adding quantifiers but I think there are still few those who are working on Quantified SMT formulas.

Guess what? You can easily translate a Sudoku problem to SMT (which can be done to SAT, but not easily and not so intuitively)… you can also translate thousands of other (more practicaly interesting) problems to SMT. The translation, technically called *encoding* is also studied by some since the way you encode the problem strongly influences how easy it’ll be to solve.

Well, I think this is it for now… Hope you are SATisfied with this brief tutorial!

**Pointers:**

- SAT on MathWorld
- SAT on Wikipedia
- SAT Live! : Somewhere where you can find the latest news on SAT.
- SATLIB : Library of SAT benchmarks.
- Journal on Satisfiability
- SMTLIB : Library of SMT benchmarks.

A said,

June 20, 2006 at 1:10 pm

Amazingly the term SMT is never expanded. (!)

SMT = Satisfiability Modulo Theories

pmatos said,

June 20, 2006 at 1:24 pm

You get there by recursive expansion since SAT = SATisfibility and SMT = SAT Modulo Theories. The recursive expansion is an exercise left to the reader.

Masud said,

October 24, 2008 at 1:41 pm

A very nice introduction to SAT and SMT. I really liked it. Would like a bit of more technical details, especially about SMT, either here or in another tutorial.

Cheers,

M

pmatos said,

October 24, 2008 at 2:13 pm

Thanks, I will try to accommodate your request in a future tutorial. In the meantime you may be interested in having a look at my MSc thesis “Algorithms for SMT” for a more technical approach.

Send me a message if you need something else.