Constraint and Consequence
 
   
  Craig Holman   Research   Jabberwocky   Quotations   Contact
 
  Home > Jabberwocky           Previous      Next
 
 
 

Jabberwocky for June 8, 2008

 
 

Post to Graphs@YahooGroups.com regarding my Graph Isomorphism work

          Next in thread

I recently joined the Graphs group at YahooGroups and look forward to participating in that group. It is located at tech.groups.yahoo.com/group/Graphs/.

In response to a question from the group regarding my work on the graph isomorphism problem, I posted the following last Friday morning:

    Hi, David and all,

    Most of my math / algorithm interests are tied to graph theory in some way.

    Graph isomorphism...  I'm nearly done with a general algorithm and a  
    proof of its complexity, but I've been 'nearly done' for some time.   
    I've been taking a break from it for the past few months.  I'll get  
    back to it again this weekend - I had coffee with a mathematician that  
    I've been discussing it with yesterday and he gently urged me to put  
    other things aside and finish it.

    In starting the blog, I made the decision to do my research openly and  
    to withhold little - that 'little' is my graph isomorphism work, and  
    the reason is that I think I'm within months of being done.  If I  
    don't carry it to completion by the fall, I'll present what I've done  
    and continue to work on the last detail.

    I've found what appears to be a complete invariant for graphs (with a  
    twist) that allows me to construct a canonical representation of a  
    graph (remember the twist).  For most graphs, a single phase of  
    processing is needed to construct this canonical representation. The  
    twist is that, for some graphs, a second phase of processing is  
    required to construct this canonical representation, and I haven't  
    finished the algorithm or its implementation for this second phase -  
    I've got some debugging to do.  There is an easily detectable  
    structural property in the graphs that will require this second phase  
    of processing.

    For the graphs without this structural property, the first phase of  
    processing is sufficient, and the algorithm is complete and provably  
    quite efficient (in the vicinity of O(N^3) or O(N^4)).

    There are two forms of the algorithm - one serial and one parallel.

    The serial algorithm constructs the canonical representation of  
    graph1, constructs the canonical representation of graph2, and then  
    compares them - if identical, the two graphs are isomorphic;  
    otherwise, they are not.

    The parallel algorithm constructs the canonical representations of  
    graph1 and graph2 in parallel, performing the same operations for each  
    graph in tandem, stopping as soon as a difference in the evolving  
    representations is discovered.  If stopped early, the graphs are not  
    isomorphic; if allowed to run to completion, the graphs are isomorphic.

    I've been working with an implementation of the serial algorithm and  
    will continue to do so until it is done and the proofs related to it  
    are complete.  At that point, I'll adapt the parallel algorithm from  
    the serial algorithm.

    I think that the second-phase processing for those graphs that require  
    it will also be quite efficient, but I'm not quite there yet.

    The algorithm, when completed, will work on all pairs of undirected  
    unweighted graphs.  Some initial work I've done suggests that it will  
    be easily extended to directed and/or weighted graphs as well.

    I've also spent many hours considering how this work could be extended  
    to the subgraph isomorphism problem.  I have a bit of hope, but no  
    conclusions.

    Here are a few data points for the performance of my graph isomorphism  
    algorithm, implemented as a single-threaded application, running on an  
    Intel Core 2 CPU 6400 @ 2.13 GHz with 3.25 GB of RAM.  They were  
    gathered quickly in order to give my computer scientist friend some idea  
    of performance.

    The columns specify the probability of an edge in the randomly  
    generated graphs.

    The rows specify the number of nodes in each graph. The values in the  
    intersection are the number of seconds required to determine whether  
    two isomorphic graphs are isomorphic. Each value represents the time  
    for processing a single pair of isomorphic graphs.

    The determination was correct and verified in all cases.  The zone  
    where the problem instances occur where my algorithm requires the  
    incomplete extension all appear to fall in the  
    high-probability-of-an-edge range (e.g. for N=30, around probEdge=0.97).

    Time in seconds to determine whether a pair of isomorphic
    graphs are isomorphic

    ProbEdge   0.01     0.1     0.3     0.5     0.7    0.99

         N

      1000     0.13    0.19    0.28    0.36    0.42    0.48
      3000     1.17    1.89    2.64    3.20    3.92    4.45
      5000     4.43    5.23    7.22    9.27   10.83   12.61
      7000     8.33   10.06   14.03   17.41   20.75   32.99
      9000    19.16   17.50   24.27   30.84   36.91   41.61
     11000    28.44   25.63   36.03   45.70   54.70   62.45

    I'm sorry to still be holding this one close to my chest - I'm so  
    close I can taste it.  If I am not mistaken, I'll be able to prove  
    that the Graph Isomorphism problem is polynomial worst-case and  
    low-order polynomial at that (O(N^4) in time is my current estimate).

    I'll publish what I've got within a few months.

    I've got a lot to do this weekend.  Should be (mostly) fun.

I'll hold off on details on the algorithm for some time, but I will offer up information as I can.

I've started to collect timing data for randomly generated pairs of isomorphic graphs having from 100 to 10,000 nodes. As this collection of data develops, I'll post snapshots of where it is stands.

When generating pairs of isomorphic graphs, I start by randomly generating a graph of N nodes having the probability of an edge ProbEdge. I then randomly map the vector of the node labels of that graph to a second vector of node labels, transforming a copy of the original graph into an isomorphic graph having the labels specified in the second vector.

When the algorithm constructs an isomorphism map from the labels of the first graph to the labels of the second, the second graph is relabelled according to that map and the equality of the original graph and the relabelled second graph is verified.

          Next in thread

The Reduction of Constraint Trees

          First in thread      Previous in thread      Next in thread

This work is derived from my doctoral dissertation and was written in 2001.

Introduction

Truth is the most valuable thing we have. Let us economize it.

    - Mark Twain

Each of the models of a Boolean expression must conform to at least one of the consistent selection sets that are produced by the Traverse-Graph algorithm during an application of the constraint calculus to the expression. We are free to modify the constraint tree in any way that does not alter the model-generability of the set of selection sets produced by traversal. For example, we may apply transformations to the constraint tree that have the effect of deleting inconsistent selection sets, deleting subsumable selection sets, or replacing a group of selection sets by one which one which subsumes each of them but which generates no new interpretations. Transformations that reduce some specified measure of a structure are known as reductions. The concept of a complete set of reductions is introduced and the advantages of reducing structures using a complete set of reductions is presented. If reduction results in the deletion of the entire constraint tree, then the expression is a contradiction. If reduction results in a single And node with an empty set of constraints, the expression is a tautology. In addition to providing a useful tool for facilitating the determination of the satisfiabilty of difficult expressions, the reduction of constraint trees using a complete set of reductions is an effective method for simplifying Boolean expressions.

Constraint Tree Terminology

The following constraint tree was often used when exploring transformations and may be useful as a reference constraint tree.

 

 

The root of the tree is the '[ ]' at the top of the diagraph.

An And node is represented by a pair of square brackets that contain the guard set of constraints associated with that node.

An Or node is represented by a circle.

The distance between two nodes will be measured in the number of edges that lie on a directed path from one node to the other, including both the origin and the terminus.

A branch of a constraint tree is the set of nodes which lie on a path from the root of tree to one of its terminal nodes. The branch set associated with a branch is the union of the guard sets of the And nodes which are in the branch.

Node A is said to dominate node B if A and B are distinct, A and B are in the same branch, and A is closer to the root of the tree than is B. The dominant set of a node is the union of the guard sets of the And nodes which dominate that node.

A broom centered on a node is the union of all branches which contain that node. The center of a broom is the node upon which the broom is centered. The handle of a broom is the set containing the center of the broom and all nodes which dominate the center. The handle set associated with a handle is the union of the guard sets of the And nodes which are in the handle. The brush of a broom is the set containing the center of the broom and all of the nodes which it dominates.

Node A is said to command node B if A is a childless AND node, A is the child of an ancestor of B, and A and B are not in the same branch. The unit-command set of node B is the union of the guard sets of all single-constraint nodes that command B.

Constraint Tree Reduction Terminology

To delete a node from a constraint tree is to remove all of the selections which contain that node from the constraint tree. In order to delete a node B from a constraint tree, we must first identify, if it exists, the ancestor A of B which is an And node, has at least one sibling, and which is the closest such ancestor to B. If no such node exists, then the entire tree containing B should be released; otherwise, remove A from its parent's list of children and release the subtree rooted at A. The node A, if it exists, will be referred to as the deletion-separation node (DSN) for B. Deletion will be treated as a supporting transformation, and the constraint tree resulting from its application to node X will be denoted by <Deletion:X>. To delete a broom is to delete the center of the broom.

A constraint-tree transformation is a reduction if it reduces a particular measure of the constraint tree. The reductions which we shall consider will reduce the weight of constraint trees.

A constraint is raised by a transformation if it is moved closer to the root of the constraint tree by the transformation.

An application (APP) is an ordered pair consisting of a transformation X and a constrainttree node Y at which the application may be applied and will be denoted by X:Y. A sequence of applications will be denoted by a list of applications surrounded by a pair of angle brackets, e.g. <X:Y>, <APP1>, <W:X, Y:Z>. If more than one application is in the list, each application is applied in turn, starting with the leftmost application. For example, T<W:X, Y:Z> represents the constraint tree resulting from the application of transformation W to point-of-application X in T, and then the application of transformation Y to point-of-application Z in T<W:X>.

The point of application (POA) is the node to which the application is applied.

The critical set is the set of nodes which, if modified or released by another transformation, could affect the subsequent applicability of this transformation to the POA, or could alter the effect that the application of the transformation to the POA would have. The modification set is the set of nodes which could be either modified (either its guard set or its list of children) or released by the application of the transformation to the POA. The obviation criteria identify the conditions under which the application of the transformation to the POA is rendered unnecessary.

Two sequences of applications Q1 and Q2 are equivalent if, for every tree T to which they both can be applied, Q1 applied to T equals Q2 applied to T.

Two applications will be said to interact if and only if the intersection of the critical set of one of the applications and the modification set of the other application is not empty. If APP1 and APP2 do not interact, then <APP1, APP2> = <APP2, APP1>.

Complete Sets of Reductions

In order to explore some issues that arise with the reduction of constraint trees, we will first consider rewriting strings of characters. The reduction of strings isn't any less complex than the reduction of constraint trees but it is easier to illustrate in examples.

Given a vocabulary V = { 'a', 'b', 'c' }, let L be the language consisting of the strings of finite length consisting of letters in V. For example, "aababaaa" Î L. Let the weight of a string be three times the number of a's plus two times the number of b's plus the number of c's that are in the string. For example, the weight of "aababaaa" is 3*6+2*2+1*0 =22. Our goal is to reduce any string in L to a string having minimum weight using a set of reductions R.

A reduction will have the form "string1string2" and indicates that string1 should be replaced by string2. The application of a reduction R1 to the substring starting at position n in a string will be denoted by <R1:n>.

Suppose that we want to reduce a string S : "aaaaab" using only two reductions R1 : ab → bc and R2 : ab → cb. The optimal (least-weight) reduction of S is "bccccc" which is obtained by exhaustively applying R1 until it can't be applied any more, as follows:

aaaaab
aaaabc
aaabcc
aabccc
abcccc
bccccc

Any application of R2 is terminal - it precludes any further reduction:

aaaaab
aaaabc
aacbcc

We could discover the optimal reduction sequence for "aaaaab" by inspection and a little thought. Whether applying R1 exhaustively to any string in L is an optimal reduction strategy isn't as obvious.

For a more complex example, let R = { R1 : ac → b , R2 : bba → ca, R3 : bca → ab } be a set of reductions of strings in L.

For a particular string such as "acabbaccaccababbacbacabcabbacabac", it isn't obvious which sequence of applications will result in a string having the least possible weight. Our choices for first application are R1:0, R1:5, R1:8, R1:16, R1:19, R1:27, R1:31, R2:3, R2:14, R2:25, and R3:22. Which one should we pick as the first step in our quest for an optimal reduction of this string? What criteria could we consider to guide the second, third, and subsequent applications of a reduction to the string? Are these criteria heuristics or hard rules that guarantee optimality? Are any of these heuristics or hard rules applicable to any string in L? For any string S . L, how can we perform an optimal reduction on S? If these questions aren't challenging enough, add another reduction to L and try again.

Consider the set of reductions R = { R1 : a → b, R2 : a → c }. It is easy to see that, for every string in L, the optimal strategy is to apply R2 exhaustively, and that each optimally reduced string is unique - the order in which we apply R2 does not matter. If R1 is applied, it blocks optimal reduction - there is no sequence of subsequent applications that can arrive at the minimal-weight reduced string. The application of R1 and the application of R2 result in a divergence, and there is no reduction available that can result in subsequence convergence.

Consider the set of reductions R' = { R1 : a → b, R2 : a → c , R3 : b &rarr c }. The addition of R3 completes the set of reductions in this sense: not only can we reduce a string further, but the order in which we apply the reductions doesn't matter. If we apply the reductions in R' to a string until no application can be applied, we will have produced a reduced string that is unique. Every string S . L reduces under R' to a string having the same length as S that consists only of c's.

A set of reductions that has these properties is known as a complete set of reductions - a finite set of reductions, only finite sequences of applications are possible, and the exhaustive application of reductions in any order results in a unique reduced structure.

More formally, a set of transformations is a complete set of reductions (CSR) if and only if it is both noetherian and confluent. A set of transformations is said to be noetherian if every possible sequence of transformation applications to every structure in the set's domain is finite in length. One method for establishing the noetherianity of a set is to show that each transformation in the set necessarily reduces a specified measure in the structures in the set's domain. Such transformations are referred to as reductions. A set of transformations is said to be confluent if an exhaustive application of the transformations in the set to a structure in the set's domain will result in a unique canonical form for that structure regardless of the order in which the transformations are applied.

Demonstrating that a set of transformations is a complete set of reductions is highly desirable for, once a set has been shown to be a CSR, we need only be concerned with the efficiency of the algorithm for applying the transformations exhaustively.

Definition 1 - A set of transformations is a complete set of reductions if it is both noetherian and confluent.

Definition 2 - A rewriting system is locally confluent iff

" M, M1, M2 [ ( M → M1 Ù M → M2 ) implies $ P ( M1 →* P Ù M2 →* P ) ]

where M, M1, M2, and P are terms, '→' represents a single application of a rewriting rule, and '→*' represents a sequence of zero or more applications of rewriting rules (Le Chenadac 1986:17).

Observe that, while M1 is not equal to M2, P may be M1, M2, or neither. It is possible for the application of one transformation to both obviate and preclude the application of a second application.

For example, let R = { R1: a → b , R2: a → c } and R' = { R1: a → b, R2: a → c, R3: b → c } . R is not locally confluent because there is no equivalent pair of application sequences in which one starts with R1:x and the other with R2:x for some site x. R' is locally confluent in part because <R1:x, R3:x> = <R2:x> for some site x.

For our purposes, a constraint tree is a term. Another way of interpreting Definition 2 is as follows: a set of transformations E are locally confluent iff, for every pair R1 and R2 of transformations selected from E, every constraint tree T, every POA1 in T at which R1 may be applied, and every POA2 in T at which R2 may be applied, there exist two sequences of applications, S1 and S2, such that the first application in S1 is R1:POA1, the first application in S2 is R2:POA2, S1 and S2 both contain zero or more additional applications of a transformation in E to a POA in T, and S1 applied to T is equal to S2 applied to T.

Lemma 1 (Newman) - Let R be a noetherian rewriting system, then: R is confluent iff R is locally confluent.

A discussion of Huet's proof of Newman's lemma may be found in (Le Chenandec 1986:17).

Proving That a Set of Constraint Tree Transformations is a Complete Set of Reductions

In order to prove that a set of constraint tree transformations R is a complete set of reductions, it is necessary to prove that R is noetherian and that R is locally confluent.

Proving that R is noetherian is simple – this can be accomplished by showing that each transformation in R reduces the weight of a constraint tree.

Proving that R is locally confluent can be accomplished by considering the transformations in R a pair at a time and proving that, for all constraint trees and all pairs of points of application in each constraint tree, there exist equivalent pairs of application sequences, each headed by one of the pair of transformations.

References

Le Chenandec, P., Canonical Forms in Finitely Presented Algebras. John Wiley & Sons, New York, N.Y., 1986.

          First in thread      Previous in thread      Next in thread

 
 
 
  Home > Jabberwocky           Previous      Next
 
  Craig Holman   Research   Jabberwocky   Quotations   Contact  
 
 
 
  Patterncraft™ and Constraint and Consequence™
are trademarks of Craig Stewart Holman

Copyright © 2008 Craig Stewart Holman

Copyright and Legal Notices

All Rights Reserved.

 
   
 
  Patterncraft logo