Building an Interpreter for Propositional Logic: Handling variables

When we left our interpreter for propositional logic, we had an executable test script with minitest/autorun that executed a full test suite of our functionality, which is the full set of valid expressions in classical propositional logic:

You can find the script we have built to this point in this revision of this Gist.

In this post, we are going to extend that interpreter to handle variables, which means allowing us to produce truth tables for all of the possible outcomes as well as providing a boolean for a specific set of values for the variables. With this functionality in place, we will have what I consider a feature-complete interpreter for propositional logic.

So, let’s dig into it…


The first thing we need to do is update our Lexer to parse variables. Since we are using uppercase T and F for our boolean tokens, I think it makes sense to use lowercase (single) letters for variables. Moreover, to avoid possible confusion, let’s disallow t, f, and v as variables since t and f correspond to our uppercase boolean tokens and v is our disjunction operator. This is minimalistic, allowing only single letter variables, but it helps us to focus on the essential details of the implementation, and it also aligns with the standard syntax used by philosophers.

We can extend our Lexer by adding a new condition to our case-statement:

case char
# ...
when *(('a'..'z').to_a - ['t', 'f', 'v'])
Token.new(:VAR, char)
end

As you can see, all we do is splat an array of lowercase letters (minus t, f, and v). When we get a match, we create a new :VAR type Token.

With a new Token type, we now need to create a new AST class as well so that our Parser can correctly handle these new tokens.

module AST
class Variable
attr_reader :value
 
def initialize(value)
@value = value
end
end
end

This is a simple class that actually mirrors our AST::Atom class. We simply have an object with a value. We want to ensure that the syntax tree classes are distinct, though, as we will need to interpret them different. An Atom is a fixed boolean value, while a Variable is, well, a variable boolean value. Ensuring that our abstract syntax tree generated by the Parser correctly encodes the difference between these two objects is necessary to allow us to then extend the interpreter.

So, now that we have a new Token type and a new AST class, let’s consider what we need to update in our Parser to correctly build our syntax tree.


As a quick refresher, a parser encodes a grammar and outputs an abstract syntax tree. Our current grammar rules are:

expression :: disjunction (IFSO expression)*
disjunction :: conjunction (OR disjunction)*
conjunction :: formula (AND conjunction)*
formula :: (NOT)* formula | LPAREN expression RPAREN | term
term :: TRUE | FALSE

Each grammar rule corresponds to a method in our Parser class. In order to support variables, we need to update our grammar and the corresponding parsing method.

A variable is akin to an atom, and thus can simply be the third option for our term rule:

term :: TRUE | FALSE | VAR

In order to allow for our Parser#term method to handle :VAR Token types, we simply need to add another elsif block:

# ...
elsif token.type == :VAR
eat(:VAR)
return AST::Variable.new(token.value)
# ...

To this point, things have been simple. Our Lexer and Parser know how to handle variables. When we parse a string like ~p & q, we will get a correct abstract syntax tree. The real complexity comes in the Interpreter. Up to this point, our interpreter was rather simple as it only ever worked with fixed values. Now, we need the interpreter to handle variable values. This means that we can’t simply compute the boolean output of the logical expression; instead, we need to allow context to be passed in to resolve the variables.


Our Interpreter works using the “visitor pattern”. We define methods to visit each kind of AST node that the Parser can produce. So, we need to add a new visit_variable method. This method can be similar to the visit_atom method, except that the node.value won’t be a boolean value but rather the variable name. In order to map from a variable name to a boolean value, we will need the Interpreter to allow for a hash of variable names to boolean values to be passed in as well.

This means we can extend the initializer to accept an optional context hash, which we can store in an instance variable so that it is available in our new visit_variable method:

class Interpreter
def initialize(ast, context = {})
@ast = ast
@context = context
end
 
# ...
 
def visit_variable(node)
@context[node.value]
end
end

Let’s put this into action. Our interpret method was defined as so:

def interpret(string)
tokens = Lexer.new(string).tokens
ast = Parser.new(tokens).parse
Interpreter.new(ast).interpret
end

Let’s extend this as well to accept an optional context argument and pass that to Interpreter.new:

def interpret(string, context = {})
tokens = Lexer.new(string).tokens
ast = Parser.new(tokens).parse
Interpreter.new(ast, context).interpret
end

This will allow us to now perform:

interpret('~p & q', 'p' => true, 'q' => false)
# => false

Unfortunately, this won’t work with symbol keys in our hash, which is the terser and simpler hash format:

interpret('~p & q', p: true, q: false)
# => nil

Fortunately, though, this is a quick and simple fix. In our Interpreter#initialize method, we can simply stringify the context keys:

class Interpreter
def initialize(ast, context = {})
@ast = ast
@context = context.transform_keys(&:to_s)
end
end

Now, the terse hash syntax works:

interpret('~p & q', p: true, q: false)
# => false

I’m happy with this. We can successfully and correctly interpret logical expressions that use variables by passing in a context hash. The only other feature that I want to implement is to build a truth table that outputs the full set of possibilities.


Let’s consider what the truth table looks like for ~p & q:

¯
p q ~p ~p & q
T T F F
T F F F
F T T T
F F T F

The key details here are:

  1. the enumeration of all possible combinations of states for the variables, and
  2. a breakdown of the boolean output at each stage of the computation.

We need to find a way to encode this in Ruby.

The simplest way to encode a set of possibilities is with a Hash. We can use the context hashes as the top-level keys for a hash of possibilities and then provide values as a hash of stage outputs. To take our ~p & q example, we could construct a hash like so:

{{"p"=>true, "q"=>true}=>{"~p"=>false, "~p & q"=>false},
{"p"=>true, "q"=>false}=>{"~p"=>false, "~p & q"=>false},
{"p"=>false, "q"=>true}=>{"~p"=>true, "~p & q"=>true},
{"p"=>false, "q"=>false}=>{"~p"=>true, "~p & q"=>false}}

This provides us the same information as the HTML table; we have all possible combinations of variable states as well as the breakdowns of the boolean outputs at each stage of the computation.

The task now is simply to make it possible to generate such a hash.

Let’s start by building a hash of all of the possible combinations of variable states. What we need is Ruby code that will generate every permutation of true and false for the total number of variables. Luckily for us, Ruby provides convenience methods for working with combinations and permutations. We get 4 methods:

Combinations are for situations where the order of the output doesn’t matter, while permutations are for when order does matter. The repeated_* variants are for situations when we want duplicates. The differences might be clearer with some examples. Let’s consider the output of each method when working with an array of boolean values ([true, false]) and wanting an array of tuples:

> [true, false].combination(2).to_a
=> [[true, false]]
> [true, false].permutation(2).to_a
=> [[true, false], [false, true]]
> [true, false].repeated_combination(2).to_a
=> [[true, true], [true, false], [false, false]]
> [true, false].repeated_permutation(2).to_a
=> [[true, true], [true, false], [false, true], [false, false]]

When order doesn’t matter and duplicates aren’t allowed (Array#combination), our array of two values will only output an array of one tuple. This is because [true, false] and [false, true] are only distinct in that their order. If order doesn’t matter, the output won’t distinguish between these two tuples. When order does matter but duplicates still aren’t allowed (Array#permutation), we will get these two distinct tuples. When order doesn’t matter but duplicates are allowed (Array#repeated_combination), we get our original tuple as well as the two value duplicates. Finally, when order does matter but duplicates are allowed (Array#repeated_permutation), we get the largest array of tuples.

In our case, we can see that we need Array#repeated_permutation. We simply take the array of boolean values and produce an array of arrays for however many variables we have in our expression.

So, let’s start writing some code to build this out. I think that it makes the most sense to build an Expression class to hold this functionality. This will isolate this functionality from the specific functionality of the Interpreter. It also provides a foundation for additional features related to dealing with a logic expression string. We start with a simple initializer:

class Expression
def initialize(string)
@string = string
end
end

This allows us to create expression instances for valid logic expression strings, like Expression.new('~p & q').

Next, let’s run the Lexer and Parser to generate an array of Tokens and an AST, respectively:

class Expression
# ...
 
def tokens = Lexer.new(@string).tokens
def ast = Parser.new(tokens).parse
end

For convenience with fixed value expressions, let’s add a resolve method that runs the Interpreter:

class Expression
# ...
 
def resolve
interpreter = Interpreter.new(ast)
interpreter.interpret
end
end

This can replace our interpret method from previous iterations. So, for example, we can update our tests:

def test_tokens
assert_equal true, Expression.new('T').resolve
assert_equal false, Expression.new('F').resolve
end

For expressions with variables, we can add a given method that takes a context hash:

class Expression
# ...
 
def given(context)
interpreter = Interpreter.new(ast, context)
interpreter.interpret
end
end

This would allow us to add tests for variable expressions:

def test_simple_variable_expressions
assert_equal false, Expression.new('~p & q').given(p: false, q: false)
assert_equal false, Expression.new('p v ~q').given(p: false, q: true)
end

With all of this setup, we can finally turn back to creating a truth_table method. As discussed earlier, we want to use Array#repeated_permutation to get a full set of combinations of boolean values for the number of variables in the expression, and then we will build a hash where each combination is one key/value pair:

class Expression
# ...
 
def truth_table
table = {}
[true, false].repeated_permutation(variables.count).each do |booleans|
context = Hash[variables.zip(booleans)]
table[context] = given(context)
end
table
end
 
def variables = tokens.select { |t| t.type == :VAR }.map(&:value)
end

Here we get the number of variables simply by selecting the :VAR type tokens and counting them. This we can pass into Array#repeated_permutation to get our array of arrays. We iterate over this array to work with each array of boolean values. We can go from an array of boolean values (e.g. [true, false, true]) to a hash of variables and booleans (e.g. {p: true, q: false, r: true}) using Hash#zip. With a context hash prepared, we can simply call our when method to get the result and provide that as the value to our key/value pair.

In full, this is what our Expression class looks like now:

class Expression
def initialize(string)
@string = string
end
 
def resolve
interpreter = Interpreter.new(ast)
interpreter.interpret
end
 
def given(context)
interpreter = Interpreter.new(ast, context)
interpreter.interpret
end
 
def truth_table
table = {}
[true, false].repeated_permutation(variables.count).each do |booleans|
context = Hash[variables.zip(booleans)]
table[context] = given(context)
end
table
end
 
def variables = tokens.select { |t| t.type == :VAR }.map(&:value)
def tokens = Lexer.new(@string).tokens
def ast = Parser.new(tokens).parse
end

Our new Expression#truth_table method implements the first of the two features we want—it enumerates all possible combinations of states for the variables. So, if we call:

Expression.new('~p & q').truth_table

we will get back:

{{"p"=>true, "q"=>true}=>false,
{"p"=>true, "q"=>false}=>false,
{"p"=>false, "q"=>true}=>true,
{"p"=>false, "q"=>false}=>false}

This is lovely. The only thing left is to ensure that our Expression#truth_table function provides a value for each state that is a breakdown of the boolean output at each stage of the computation; that is, instead of the hash above with final boolean outputs, we want a hash like so:

{{"p"=>true, "q"=>true}=>{"~p"=>false, "~p & q"=>false},
{"p"=>true, "q"=>false}=>{"~p"=>false, "~p & q"=>false},
{"p"=>false, "q"=>true}=>{"~p"=>true, "~p & q"=>true},
{"p"=>false, "q"=>false}=>{"~p"=>true, "~p & q"=>false}}

However, I am noticing the size of this post, and I think we should make this a separate post in this series, as the code changes for this one feature are somewhat extensive. So, let’s leave things here with that teaser.

You can find the script we have built to this point in this revision of this Gist. Check out the full script to see the full set of test cases that we added for testing our new Expression#truth_table function.


All posts in this series