![]() |
|
| | |
| Craig Holman Research Jabberwocky Quotations Contact | |
| Home >
Jabberwocky      |
|
Jabberwocky for June 15, 2008 |
|
Calculating the maximum number of CNF and DNF clauses for an expression    
The following is adapted from a piece I wrote in August, 2001. It describes how I calculated two metrics for my doctoral work on boolean expressions. It was originally entitled "Calculating the Maximum Number of Clauses in the CNF- and DNF-Equivalents of General Boolean Expressions".
Introduction
Several algorithms for processing Boolean expressions work best with multilevel expressions that have a compact structure as opposed to expressions that are in Conjunctive Normal Form (CNF) or Disjunctive Normal Form (DNF). Since a very small percentage of all expressions having L literals happen to be in CNF or DNF, and since conversion to CNF or DNF usually involves a large increase in the number of literals, we are interested in calculating the number of clauses in the CNF- and DNF-equivalents of a general Boolean expression without actually performing the conversions. Clauses in CNF (DNF) expressions are the largest subexpressions that don't contain an And (Or). I found two simple algorithms for performing these calculations around 1987. Without loss of generality, we shall consider only Boolean expressions that contain junctors in { And, Or } and are in Negative Normal Form (negators only before literals). Since only the junctor structure of an expression is relevant to the calculation of the number of clauses in the CNF- or DNF-equivalents of the expression, each literal will be represented by 'L' in the example that follows. We are calculating the number of clauses that are in the CNF- and DNF-equivalent expressions that are produced by a 'naive', non-optimal conversion that is accomplished by the rules specified in the proof. These values are the maximum values for the number of clauses in the CNF- and DNF-equivalents produced by a more cunning conversion. In order to avoid operator precedence problems, the first step of each algorithm is to fully parenthesize the expression. This operation results in there being exactly one pair of parenthesis associated with each junctor in the expression - the left parenthesis is the symbol immediately before the left subexpression of the junctor and the right parenthesis is the symbol immediately after the right subexpression of the junctor. This process is illustrated in the example that follows.
Algorithm for calculating the maximum number of clauses in the CNF-equivalent
To calculate the maximum number of clauses in the CNF-equivalent of a general Boolean expression having junctors in {And, Or} and in Negative Normal Form, fully parenthesize the expression, replace each 'And' by '+', replace each 'Or' by '*', replace each possibly negated literal by '1', and evaluate the resulting expression.
Algorithm for calculating the maximum number of clauses in the DNF-equivalent
To calculate the maximum number of clauses in the DNF-equivalent of a general Boolean expression having junctors in {And, Or} and in Negative Normal Form, fully parenthesize the expression, replace each 'And' by '*', replace each 'Or' by '+', replace each possibly negated literal by '1', and evaluate the resulting expression.
Example
Original expression:
(((Not C And D Or B) And (C Or Not D)) And (Not A Or C Or B And Not D))
Expression after replacing each possibly negated literal by 'L':
(((L And L Or L) And (L Or L)) And (L Or L Or L And L))
Fully parenthesized expression:
((((L And L) Or L) And (L Or L)) And ((L Or L) Or (L And L)))
The maximum number of clauses in the CNF-equivalent of the expression:
((((1 + 1) * 1) + (1 * 1)) + ((1 * 1) * (1 + 1))) = 5
The maximum number of clauses in the DNF-equivalent of the expression:
((((1 * 1) + 1) * (1 + 1)) * ((1 + 1) + (1 * 1))) = 12
Proof
Without loss of generality, we shall only consider Boolean expressions that contain junctors in { And, Or }. Let A, B, and C represent subexpressions. Let L represent an arbitrary, possibly negated literal. An expression is in CNF if it has at least one And junctor and no junctor is dominated by and no And junctor is dominated by an Or junctor. An expression is in CNF if it has at least one And junctor and no junctor is dominated by and no And junctor is dominated by an Or junctor. In a naive conversion of an expression to CNF, every occurrence of ( A And B ) Or C must be converted to ( A Or C ) And ( B Or C ), and every occurrence of A Or ( B And C ) must be converted to ( A Or B ) And ( A Or C ). An expression is in DNF if it has at least one Or junctor and no Or junctor is dominated by an And junctor. In a naive conversion of an expression to DNF, every occurrence of ( A Or B ) And C must be converted to ( A And C ) Or ( B And C ), and every occurrence of A And ( B Or C ) must be converted to ( A And B ) Or ( A And C ). Let CNF( A ) represent the maximum number of clauses in the CNF-equivalent of A. Let DNF( A ) represent the maximum number of clauses in the DNF-equivalent of A. We have the following:
CNF( L ) = 1
CNF( A And B ) = CNF( A ) + CNF( B )
CNF( A Or B ) = CNF( A ) * CNF( B )
CNF( ( A And B ) Or C )
= ( ( CNF( A ) + CNF( B ) ) * CNF( C )
= CNF( A ) * CNF( C ) + CNF( B ) * CNF( C )
= CNF( ( A Or C ) And ( B Or C ) )
CNF( A Or ( B And C )
= CNF( A ) * ( CNF( B ) + CNF( C ) )
= CNF( A ) * CNF( B ) + CNF( A ) * CNF( C )
= CNF ( ( A Or B ) And ( A Or C ) )
DNF( L ) = 1
DNF( A Or B ) = DNF( A ) + DNF( B )
DNF( A And B ) = DNF( A ) * DNF( B )
DNF( ( A Or B ) And C )
= ( ( DNF( A ) + DNF( B ) ) * DNF( C )
= DNF( A ) * DNF( C ) + DNF( B ) * DNF( C )
= DNF( ( A And C ) Or ( B And C ) )
DNF( A And ( B Or C ) )
= DNF( A ) * ( DNF( B ) + DNF( C ) )
= DNF( A ) * DNF( B ) + DNF( A ) * DNF( C )
= DNF( ( A And B ) Or ( A And C ) )
The last two statements in each group of five establish that the quantity that we are computing is invariant under conversion transformations - we are free to calculate the number of clauses without actually performing the conversion to CNF or DNF.
Note: Rather than perform the calculation by operating on the string containing an expression, I performed it on the binary constraint tree that was constructed by the application of the Propagate Truth Value algorithm. It could also be performed on constraint trees.     
Problem-solving systems    
Recap We've begun to consider how to use effectively a set of tools to solve instances of a single problem. This will help us to maximize the throughput and minimize the cost of solving problems on a single problem-solving engine. Our approach, so far, is as follows:
So far, so good. Most of this problem-solving approach can be fully automated today, and the devising and implementing of new metrics should be automatable in the future.
Transforming instances between problems
All problems in the NP-Complete complexity class have the following characteristics:
It isn't that we transform one problem into another but that we transform an instance of one problem into an instance of another, solve that other-problem instance, and transform the solution back into the original problem context. For example, suppose that we have a boolean expression whose satisfiability we want to determine but our program for determining boolean satisfiability is broken. We do, however, have a very good program for solving instances of the k-Clique Exists problem. We can transform the boolean satisfiability problem instance into a k-clique exists problem instance, solve it, and port the solution back into the boolean satisfiabity realm. There are a great many NP-complete problems, including some of my favorites (e.g. Boolean Satisfiability, k-Clique Exists), as well as classics such as the Traveling Salesman problem, Bin Packing, Multiprocessor Scheduling, Timetable Design, and Clustering.
A problem-solving system
I'd like to have a problem-solving system that can solve any of the NP-complete problems, as well as many other challenging problems. This system would offer a catalog of problems that it can solve, although it may not have a good engine for many problems. This system would need to accept a problem instance to be solved, determine which problem-solving engine(s) are likely to be the least expensive solver(s) for this instance (including the likely cost of transforming it between problems), execute the solution plan that it has devised, and returning the solution to the client. At the heart of this system is a directed multigraph (of course!), each node of which is a problem in the catalog, and each directed edge of which represents an instance transformation from one problem to another. Each of the problem-solving engines that are available are associated with its problem's node. Each node may have zero, one, or more problem-solving engines associated with it. Given an instance I of problem P to solve, the system needs to determine which nodes have engines that are worth considering as solvers, given the expected cost of transforming I to an instance of each node's problem. Each of better candidate engines (note the handwaving here) will then be asked to provide an inexpensive estimate of the cost of its solving the transformed instance of I. The system will then manage the top-level of solution of I by the engine(s) that it has selected. Several remarks:
Some of this should sound familiar, for we face some of the same issues when deciding which tools to apply and how to apply them in solving instances of a single problem. This is simply another layer that presents some of the same challenges.
Engines
We can also think of an entire problem-solving system as an engine, one that is able to solve a catalog of problems rather than a single problem. Problem-solving engines, problem-instance-transforming engines, and problem-solving-system engines present some of the same challenges and should share many capabilities. I'd like to abstract 'engine' a bit and explore what they have in common:
This is leading us towards an approach to choosing and wielding tools to solve problems that I call the Adaptive Market pattern. I'll devote a few posts to what the requirements for such a pattern, the pattern itself, and some thoughts on architecture and design.
     |
| Home >
Jabberwocky      |
|
| Craig Holman Research Jabberwocky Quotations Contact | |
| Patterncraft and Constraint and Consequence are trademarks of Craig Stewart Holman Copyright © 2008 Craig Stewart Holman All Rights Reserved. |
|
| | |
![]() |
|