![]() |
|
| | |
| Craig Holman Research Jabberwocky Quotations Contact | |
| Home >
Jabberwocky      |
|
Jabberwocky for June 7, 2008 |
|
Boolean Expressions - An Overview    
This work is derived from my 1990 doctoral dissertation and was written in 2000.
Truth is the daughter of time. - Aulus Gellius
This work introduces Boolean expressions, focusing on the concepts and terminology that will be used throughout the other parts of this series, and providing an overview of the topics that will be discussed in this series. I had intended to post this overview before getting into details, as I did two days ago with the Constraint Calculus. So it goes.
Boolean ExpressionsCraft must have clothes, but truth loves to go naked. - Thomas Fuller
A Boolean value is either True or False. A Boolean variable may only take on a Boolean value. The negation of a Boolean value is the other Boolean value (e.g. the negation of True is False). The symbol that is often used to denote the negator, Not, is '¬' (e.g. ¬ True = False). There are sixteen Boolean junctors, binary operators for joining two Boolean variables or Boolean values, but only two Boolean junctors, And and Or, will be treated here. The And junctor is often denoted by the symbol 'Ù' and is known as the conjunctor. The value that is the result of conjoining two Boolean values is True only if both of the values are True. The Or junctor is often denoted by the symbol 'Ú' and is known as the disjunctor. The value that is the result of disjoining two Boolean values is True if either or both of the values are True. The symbols '~' and '-' are also used to denote negation, '*' and '&' are also used to denote conjunction, and '+' and '|' are also used to denote disjunction. Not has higher precedence than And, and And has higher precedence than Or, so Not X And Y Or Z is equivalent to ((Not X) And Y) Or Z. Parentheses may be used to clarify or override precedence, so parenthesization in expressions may range from unparenthesized to fully parenthesized, where there is one pair of parentheses for each junctor. Variable names may be any sequence of letters, numbers, and the underscore character. A Boolean expression is defined recursively as follows: Each of the following is a Boolean expression: a Boolean value, a Boolean variable, the negation of a Boolean expression, the conjunction of two Boolean expressions, the disjunction of two Boolean expressions. Each occurrence of a variable in an expression is known as a literal. The number of variables in an expression is the number of distinct variables, excluding duplicates, while the number of literals in an expression is the number of occurrences of variables, including duplicates. The numbers 0 through the number of variables - 1 are often used to denote variables. The vocabulary of an expression is the set of variables that occur in the expression. For example, the expression ( A Ù ¬ B ) Ú ( ( ¬ A Ù C ) Ù ( ¬ D Ú ( B Ù ¬ C ) ) ) ) has 7 literals, 4 of which are negated, which are drawn from a vocabulary { A, B, C, D }, which contains 4 variables. The representation of a Boolean expression is the actual text form of the expression. The fixed-symbol mapping for an expression specifies the symbols that are used for negators, junctors, parentheses, and expression termination, if necessary. The format of a Boolean expression describes the fixed-symbol mapping, any additional fields (e.g. identifier, description, logical class, source), and the layout of the fixed-symbol mapping, fields, and representation. From a Boolean expression, we can construct a binary expression tree, which actually looks more like an upside-down tree of nodes, as follows: each negator, junctor, and literal is converted to a node, nodes which become parents of other nodes are placed higher than their children and connected by a line to each child, the operand of a negator is made the right child of that negator, the left and right operands of a junctor become the left and right children of that junctor, and parentheses, after being used to specify the structure of the expression, are redundant and are implied by the structure of the tree rather than being represented explicitly. From a binary expression tree, we can construct an expression tree in which all adjacent And nodes have been collapsed into a single And node and all adjacent Or nodes have been collapsed into a single Or node. The root of a binary expression tree or expression tree is the top-most junctor node. An expression is said to be in a normal form if satisfies the properties which make up the definition of that normal form. Several normal forms have been defined, three of which will be discussed in this work and others of which, such as Elegant Normal Form (ENF), will be discussed in subsequent presentations. A Boolean expression is in Negative Normal Form (NNF) if, in the binary expression tree for the expression, negators only have literals as children. A Boolean expression is in Conjunctive Normal Form (CNF) if it is in Negative Normal Form and if, in the expression tree for the expression, there is only one And node and it is the root of the tree. The expression consists of two or more conjoined clauses, each of which consists of one or more (possibly negated) literals that are disjoined. A Boolean expression is in Disjunctive Normal Form (DNF) if it is in Negative Normal Form and if, in the expression tree for the expression, there is only one Or node and it is the root of the tree. The expression consists of two or more disjoined clauses, each of which consists of one or more (possibly negated) literals that are conjoined. The structure of a Boolean expression identifies the normal forms that the expression is in. If the expression is not in CNF or DNF, its structure is compact. The description of the structure of an expression may specify more than one compatible descriptors (e.g. compact and NNF, CNF and NNF). A general Boolean expression may have any structure. A constraint in the context of Boolean expressions is a binding of a Boolean variable to a Boolean constant. The constraint of a Boolean variable X to True will be denoted by +X and the constraint of X to False will be denoted by -X. A consistent set of constraints does not contain a constraint to True and a constraint to False for the same variable. An inconsistent set of constraints does contain such a pair of conflicting constraints on at least one variable. For example, { +a, -b, +c, +d, +g } is a consistent set of constraints and { +a, +b, -b } is an inconsistent set of constraints. An interpretation of a Boolean expression is a consistent constraint set that covers the expression's vocabulary - it contains exactly one constraint for each variable in the expression's vocabulary. The evaluation of a Boolean expression under an interpretation is a process in which each variable instance in the expression is replaced by the Boolean value to which the variable is constrained in the interpretation, after which the resulting expression is simplified to either True or False. An interpretation under which an expression evaluates to True is a model for that expression; otherwise, the interpretation is a non-model for that expression. For example, consider the expression ( ( a Ù ¬ b ) Ú c ) Ù b . { +a, +b, -c } is an interpretation under which the expression evaluates to False and is, therefore, a non-model of the expression. { +a, +b, +c } is an interpretation under which the expression evaluates to True and is, therefore, a model of the expression. A Boolean expression represents a partition of interpretations into models and non-models. Partitions are equivalence classes for expressions. Two expressions are equivalent if and only if they represent the same partition. For example, consider ( ( ( a Ù ¬ b ) Ú c ) Ù b ) and ( c Ù b ) . The models for both expressions all have { +b, +c } as a subset and the nonmodels for both expressions all have { -b, -c } as a subset. Both expressions represent the same partition and are equivalent. An expression is a tautology if all its interpretations are models - if it always evaluates to True. An expression is a contradiction if none of its interpretations are models - if it always evaluates to False. An expression that has at least one model is satisfiable. An expression that has at least one non-model is falsifiable. The logical class of an expression is one of {contradiction, satisfiable and falsifiable, tautology}. The process of determining the satisfiability of a Boolean expression is known to be in the NP-hard and NP-complete complexity classes, as is the problem of determining the falsifiability of a Boolean expression. For example, ( a Ú ¬ a ) is a simple tautology and ( a Ú ¬ a ) is a simple contradiction. The expression ( a Ù b ) is both satisfiable and falsifiable. The Random Generation of General Boolean ExpressionsHe therefore who wishes to rejoice without doubt in regard to the truths underlying phenemonon must know how to devote himself to experiment. - Roger Bacon
Most methods for randomly generating Boolean expressions are limited to generating expressions that are in a restrictive normal form, such as CNF or DNF. Nearly all Boolean expressions are in neither CNF nor DNF, however. All expressions can, in theory, be transformed to CNF or DNF, but a naive transformation to CNF or DNF will usually result in an expression having significantly more literals than the original, and even a careful transformation to CNF or DNF can result in huge expressions. Here are some experimental measurements of the number of clauses in the CNF equivalent (naive transformation) of general expressions: 10% having 100 literals have more than 104 clauses in their CNF equivalents 10% having 200 literals have more than 108 clauses in their CNF equivalents 10% having 300 literals have more than 1011 clauses in their CNF equivalents 10% having 400 literals have more than 1014 clauses in their CNF equivalents 10% having 500 literals have more than 1017 clauses in their CNF equivalents
2% having 100 literals have more than 105 clauses in their CNF equivalents 2% having 200 literals have more than 109 clauses in their CNF equivalents 2% having 300 literals have more than 1013 clauses in their CNF equivalents 2% having 400 literals have more than 1017 clauses in their CNF equivalents 2% having 500 literals have more than 1020 clauses in their CNF equivalents
At least one having 100 literals has more than 108 clauses in its CNF equivalent At least one having 200 literals has more than 1013 clauses in its CNF equivalent At least one having 300 literals has more than 1017 clauses in its CNF equivalent At least one having 400 literals has more than 1022 clauses in its CNF equivalent At least one having 500 literals has more than 1026 clauses in its CNF equivalent
The results for DNF equivalents are similar. It is apparent that the naive transformation of expressions to CNF or DNF can significantly inflate the number of literals and that methods which only generate expressions that are in CNF or DNF are inadequate to generate expressions for studies of general Boolean expressions. I've developed a method of generating general Boolean expressions which has the following parameters: the number of literals N, the number of variables, the probability of a junctor being an And, and the probability of literal negation. An expression is generated by randomly selecting an unlabeled binary tree having N-1 nodes, associating a junctor with each node of that tree, creating two literal nodes as children of each leaf node of the junctor tree, and inserting a negator node as the parent of literals according to the probability of literal negation. The expression can be read from this tree or the tree can be processed directly. The challenge lies in efficiently generating the binary tree so that all binary trees of the same size have the same probability of being generated. The Constraint CalculusWhen you have eliminated the impossible, whatever remains, however improbable, must be the truth. - Sir Arthur Conan Doyle, as Sherlock Holmes
When searching for a model for a Boolean expression, it seems reasonable to assume that the expression evaluates to True and then study the consequences that proceed from this assumption. These consequences will take the form of a system of constraints upon variables. If, within this system, we are able to construct a consistent set of constraints, then any interpretation that conforms to this set is a model of the expression. The constraint calculus is an algorithm for generating all of the models of a Boolean expression. The constraint calculus is proven both sound and complete. An algorithm for determining the satisfiability of Boolean expressions is derived from the constraint calculus, as is a complementary algorithm for determining the falsifiability of Boolean expressions. The Reduction of Constraint TreesTruth is the most valuable thing we have. Let us economize it. - Mark Twain
The constraint tree is an intermediate data structure of the constraint calculus. It is derived from the expression tree by a process in which negators are consumed, literals are converted into appropriate constraints upon variables, and these constraints are associated with parent And nodes, if they exist, or with new And nodes which replace the literals. The constraint tree derived from an expression is a system of constraints on possible models of the expression. This system often contains redundant information and patterns of constraint that cannot possibly lead to models. Due to its structure and meaning, the constraint tree is an excellent locus for the application of simplifying transformations, resulting in a constraint tree that has less redundancy and fewer patterns of unproductive constraint. As long as the transformations do not change the model-generability of the constraint tree, the simplified expressions will be equivalent to the original expressions. Transformations that reduce some specified measure of a structure are known as reductions. If a two or more reductions are available for the reduction of a structure, two questions arise: do some sequences of reduction applications result in divergent structures, and, if so, which sequence of reduction applications will result in the simplest structure. It would be very convenient to be able to ignore the question of divergence and focus only on efficiency. Some sets of reductions, known as complete sets of reductions, have the property that all exhaustive sequences of reduction applications will result in the same structure, regardless of the order in which the reductions were applied. This is a remarkable and highly desirable property, for it allows the algorithm developer to focus only on efficiency. The search is on for complete sets of reductions for the simplification of constraint trees. Each complete set of reductions for constraint trees corresponds to a new normal form for Boolean expressions. The first set of constraint tree reductions proven to be complete reduces constraint trees and their expressions to Elegant Normal Form, and their definition and the proof of their completeness is the most important result in my dissertation. Other complete sets of reductions have been discovered and algorithms for effecting the reduction are being developed. The original goal in developing methods for reducing constraint trees was to have an additional tool available to aid in the determination of the satisfiability of challenging expressions. It is now apparent that the reduction techniques are powerful enough to be interesting in their own right, for they offer methods for simplifying multilevel Boolean expressions. Elegant Normal FormIf you are out to describe the truth, leave elegance to the tailor. - Albert Einstein
Elegant Normal Form (ENF), a new reductive normal form for Boolean expressions, is defined and motivated. Eight graph transformations are introduced and are proven to constitute a complete set of reductions that can be exhaustively applied to a constraint tree to reduce it to an elegant constraint tree - a constraint tree that corresponds to an elegant expression, which is a Boolean expression that is in ENF. A Reduce-To-Elegance algorithm is introduced which exhaustively applies the eight elegance reductions to a constraint tree, transforming it to an elegant constraint tree. The results of applying this algorithm to 300,000 general Boolean expressions having from 100 to 500 literals each are presented. Resequencing Constraint TreesSome truth there was, but dash'd and brew'd with lies, - John Dryden
The partial graph traversal algorithm, which is the final phase of the satisfiabilitydetermination algorithm, is extremely sensitive to the ordering of the nodes in the constraint graph, which in turn is determined by the ordering of the lists of children of the And nodes and Or nodes in the constraint tree from which it was derived. If an expression is satisfiable, then at least one of the paths from Start through Stop in the constraint graph derived from that expression is consistent. If the nodes in the constraint graph can be resequenced in order to improve the probability of encountering a consistent path early in the traversal, the efficiency of the satisfiability-determination algorithm can be improved. Several heuristics for resequencing the nodes in a constraint tree or for dynamically resequencing the nodes in a constraint graph have been developed. They vary significantly in cost and effectiveness. Resequencing constraint trees should also be of use in improving the efficiency of algorithms for reducing constraint trees to ENF or other normal forms. The families of resequencing heuristics are described and the results of applying one of these heuristics to 300,000 general Boolean expressions having from 100 to 500 literals each are presented. Boolean Expression MetricsNo man thoroughly understands a truth until he has contended against it. - Ralph Waldo Emerson
Many Boolean expression metrics have been defined. They will be defined and discussed at length in a subsequent paper, but here is a brief list of some of them. They can be grouped according to the data structures from which they can be derived. From the representation of the expression, we can determine the number of literals and the number of variables. From the NNF of the expression, we can determine the degree of negation, the degree of conjunction, and the degree of constraint opposition. From the expression tree, we can determine the eccentricity, which measures how far the tree is from being balanced, the degree of coincidence, which measures how much structure sharing occurs in the tree, the number of clauses in the CNF equivalent of the expression, and the number of clauses in the DNF equivalent of the expression. Many additional metrics can be obtained during the application of satisfiability-determination and reduction algorithms, including the number of nodes visited during partial traversal, the amount of backtracking during partial traversal, the amount of work required to reduce a constraint tree to elegance or some other normal form, the number of applicability tests performed for each reduction, the number of applications of each reduction, and the number of literals removed as the result of applications of each reduction. The collection and analysis of metrics can provide us with insights that can greatly facilitate the processing of Boolean expressions. The Architecture of an Expert System for Processing General Boolean ExpressionsTelling the truth is a pretty hard thing. - Thomas Wolfe
I am developing a processing system that will be able to efficiently process large numbers of general Boolean expressions, performing operations including characterization, satisfiability and falsifiability determination, and simplification to different normal forms. Experimental data suggest that the unenhanced satisfiability-determination algorithm is adequate for determining the satisfiability of nearly all expressions. The customary use of any enhancements to the algorithm would tend to reduce the effectiveness of the system, even though the use of enhancements for challenging expressions is essential. Similarly, enhancements for reduction algorithms may have varying costs and benefits. In order to make an informed decision about the order in which algorithms and enhancements should be applied, the expression must be characterized and classified as to its expected susceptibility to each of the algorithms that the system has at its disposal. The system will need to select those metrics which are sufficiently inexpensive and which will provide enough information to enable the system to make its next decision. We believe that the most effective expression-processing system will feature an expert system interacting with the expression-processing system by requesting measurements and making processing decisions incrementally rather than all at once. The expert system will have at its disposal a statistical database that contains correlation data on expression characteristics and the performance of the available algorithms. The database initially will be the product of Monte Carlo studies of large numbers of randomly generated general expressions. Metrics from the processing of expressions actually of interest to a user may be used to update the database and thereby train the expert system. Some Experimental Results from my 1990 DissertationThe results of applying the Satisfiability-Determination algorithm, as defined and as enhanced by the Reduction-to-Elegance algorithm, to 300,000 randomly generated general Boolean expressions having from 100 to 500 literals each are presented.
Future WorkAreas of particular interest for current and future work include efficient set representation, the efficient random generation of large binary trees, the search for new reductive normal forms and complete sets of reductions for the transformation to those normal forms, efficient algorithms for reduction to ENF and other reductive normal forms, better heuristics for constraint tree resequencing, expression equality testing, design and implementation of the expert system, and the translation of challenging problem instances from other NP-complete problems to Boolean satisfiability.
     |
| Home >
Jabberwocky      |
|
| Craig Holman Research Jabberwocky Quotations Contact | |
| Patterncraft and Constraint and Consequence are trademarks of Craig Stewart Holman Copyright © 2008 Craig Stewart Holman All Rights Reserved. |
|
| | |
![]() |
|