![]() |
|
| | |
| Craig Holman Research Jabberwocky Quotations Contact | |
| Home >
Jabberwocky      |
|
Jabberwocky for June 8, 2008 |
|
Post to Graphs@YahooGroups.com regarding my Graph Isomorphism work    
I recently joined the Graphs group at YahooGroups and look forward to participating in that group. It is located at tech.groups.yahoo.com/group/Graphs/. In response to a question from the group regarding my work on the graph isomorphism problem, I posted the following last Friday morning:
Hi, David and all,
Most of my math / algorithm interests are tied to graph theory in some way.
Graph isomorphism... I'm nearly done with a general algorithm and a
proof of its complexity, but I've been 'nearly done' for some time.
I've been taking a break from it for the past few months. I'll get
back to it again this weekend - I had coffee with a mathematician that
I've been discussing it with yesterday and he gently urged me to put
other things aside and finish it.
In starting the blog, I made the decision to do my research openly and
to withhold little - that 'little' is my graph isomorphism work, and
the reason is that I think I'm within months of being done. If I
don't carry it to completion by the fall, I'll present what I've done
and continue to work on the last detail.
I've found what appears to be a complete invariant for graphs (with a
twist) that allows me to construct a canonical representation of a
graph (remember the twist). For most graphs, a single phase of
processing is needed to construct this canonical representation. The
twist is that, for some graphs, a second phase of processing is
required to construct this canonical representation, and I haven't
finished the algorithm or its implementation for this second phase -
I've got some debugging to do. There is an easily detectable
structural property in the graphs that will require this second phase
of processing.
For the graphs without this structural property, the first phase of
processing is sufficient, and the algorithm is complete and provably
quite efficient (in the vicinity of O(N^3) or O(N^4)).
There are two forms of the algorithm - one serial and one parallel.
The serial algorithm constructs the canonical representation of
graph1, constructs the canonical representation of graph2, and then
compares them - if identical, the two graphs are isomorphic;
otherwise, they are not.
The parallel algorithm constructs the canonical representations of
graph1 and graph2 in parallel, performing the same operations for each
graph in tandem, stopping as soon as a difference in the evolving
representations is discovered. If stopped early, the graphs are not
isomorphic; if allowed to run to completion, the graphs are isomorphic.
I've been working with an implementation of the serial algorithm and
will continue to do so until it is done and the proofs related to it
are complete. At that point, I'll adapt the parallel algorithm from
the serial algorithm.
I think that the second-phase processing for those graphs that require
it will also be quite efficient, but I'm not quite there yet.
The algorithm, when completed, will work on all pairs of undirected
unweighted graphs. Some initial work I've done suggests that it will
be easily extended to directed and/or weighted graphs as well.
I've also spent many hours considering how this work could be extended
to the subgraph isomorphism problem. I have a bit of hope, but no
conclusions.
Here are a few data points for the performance of my graph isomorphism
algorithm, implemented as a single-threaded application, running on an
Intel Core 2 CPU 6400 @ 2.13 GHz with 3.25 GB of RAM. They were
gathered quickly in order to give my computer scientist friend some idea
of performance.
The columns specify the probability of an edge in the randomly
generated graphs.
The rows specify the number of nodes in each graph. The values in the
intersection are the number of seconds required to determine whether
two isomorphic graphs are isomorphic. Each value represents the time
for processing a single pair of isomorphic graphs.
The determination was correct and verified in all cases. The zone
where the problem instances occur where my algorithm requires the
incomplete extension all appear to fall in the
high-probability-of-an-edge range (e.g. for N=30, around probEdge=0.97).
Time in seconds to determine whether a pair of isomorphic
graphs are isomorphic
ProbEdge 0.01 0.1 0.3 0.5 0.7 0.99
N
1000 0.13 0.19 0.28 0.36 0.42 0.48
3000 1.17 1.89 2.64 3.20 3.92 4.45
5000 4.43 5.23 7.22 9.27 10.83 12.61
7000 8.33 10.06 14.03 17.41 20.75 32.99
9000 19.16 17.50 24.27 30.84 36.91 41.61
11000 28.44 25.63 36.03 45.70 54.70 62.45
I'm sorry to still be holding this one close to my chest - I'm so
close I can taste it. If I am not mistaken, I'll be able to prove
that the Graph Isomorphism problem is polynomial worst-case and
low-order polynomial at that (O(N^4) in time is my current estimate).
I'll publish what I've got within a few months.
I've got a lot to do this weekend. Should be (mostly) fun.
I'll hold off on details on the algorithm for some time, but I will offer up information as I can. I've started to collect timing data for randomly generated pairs of isomorphic graphs having from 100 to 10,000 nodes. As this collection of data develops, I'll post snapshots of where it is stands. When generating pairs of isomorphic graphs, I start by randomly generating a graph of N nodes having the probability of an edge ProbEdge. I then randomly map the vector of the node labels of that graph to a second vector of node labels, transforming a copy of the original graph into an isomorphic graph having the labels specified in the second vector. When the algorithm constructs an isomorphism map from the labels of the first graph to the labels of the second, the second graph is relabelled according to that map and the equality of the original graph and the relabelled second graph is verified.
    
The Reduction of Constraint Trees    
This work is derived from my doctoral dissertation and was written in 2001.
IntroductionTruth is the most valuable thing we have. Let us economize it. - Mark Twain
Each of the models of a Boolean expression must conform to at least one of the consistent selection sets that are produced by the Traverse-Graph algorithm during an application of the constraint calculus to the expression. We are free to modify the constraint tree in any way that does not alter the model-generability of the set of selection sets produced by traversal. For example, we may apply transformations to the constraint tree that have the effect of deleting inconsistent selection sets, deleting subsumable selection sets, or replacing a group of selection sets by one which one which subsumes each of them but which generates no new interpretations. Transformations that reduce some specified measure of a structure are known as reductions. The concept of a complete set of reductions is introduced and the advantages of reducing structures using a complete set of reductions is presented. If reduction results in the deletion of the entire constraint tree, then the expression is a contradiction. If reduction results in a single And node with an empty set of constraints, the expression is a tautology. In addition to providing a useful tool for facilitating the determination of the satisfiabilty of difficult expressions, the reduction of constraint trees using a complete set of reductions is an effective method for simplifying Boolean expressions. Constraint Tree TerminologyThe following constraint tree was often used when exploring transformations and may be useful as a reference constraint tree. |
|
|
|
The root of the tree is the '[ ]' at the top of the diagraph. An And node is represented by a pair of square brackets that contain the guard set of constraints associated with that node. An Or node is represented by a circle.
The distance between two nodes will be measured in the number of edges that lie on a directed path from one node to the other, including both the origin and the terminus. A branch of a constraint tree is the set of nodes which lie on a path from the root of tree to one of its terminal nodes. The branch set associated with a branch is the union of the guard sets of the And nodes which are in the branch. Node A is said to dominate node B if A and B are distinct, A and B are in the same branch, and A is closer to the root of the tree than is B. The dominant set of a node is the union of the guard sets of the And nodes which dominate that node. A broom centered on a node is the union of all branches which contain that node. The center of a broom is the node upon which the broom is centered. The handle of a broom is the set containing the center of the broom and all nodes which dominate the center. The handle set associated with a handle is the union of the guard sets of the And nodes which are in the handle. The brush of a broom is the set containing the center of the broom and all of the nodes which it dominates. Node A is said to command node B if A is a childless AND node, A is the child of an ancestor of B, and A and B are not in the same branch. The unit-command set of node B is the union of the guard sets of all single-constraint nodes that command B. Constraint Tree Reduction TerminologyTo delete a node from a constraint tree is to remove all of the selections which contain that node from the constraint tree. In order to delete a node B from a constraint tree, we must first identify, if it exists, the ancestor A of B which is an And node, has at least one sibling, and which is the closest such ancestor to B. If no such node exists, then the entire tree containing B should be released; otherwise, remove A from its parent's list of children and release the subtree rooted at A. The node A, if it exists, will be referred to as the deletion-separation node (DSN) for B. Deletion will be treated as a supporting transformation, and the constraint tree resulting from its application to node X will be denoted by <Deletion:X>. To delete a broom is to delete the center of the broom. A constraint-tree transformation is a reduction if it reduces a particular measure of the constraint tree. The reductions which we shall consider will reduce the weight of constraint trees. A constraint is raised by a transformation if it is moved closer to the root of the constraint tree by the transformation. An application (APP) is an ordered pair consisting of a transformation X and a constrainttree node Y at which the application may be applied and will be denoted by X:Y. A sequence of applications will be denoted by a list of applications surrounded by a pair of angle brackets, e.g. <X:Y>, <APP1>, <W:X, Y:Z>. If more than one application is in the list, each application is applied in turn, starting with the leftmost application. For example, T<W:X, Y:Z> represents the constraint tree resulting from the application of transformation W to point-of-application X in T, and then the application of transformation Y to point-of-application Z in T<W:X>. The point of application (POA) is the node to which the application is applied. The critical set is the set of nodes which, if modified or released by another transformation, could affect the subsequent applicability of this transformation to the POA, or could alter the effect that the application of the transformation to the POA would have. The modification set is the set of nodes which could be either modified (either its guard set or its list of children) or released by the application of the transformation to the POA. The obviation criteria identify the conditions under which the application of the transformation to the POA is rendered unnecessary. Two sequences of applications Q1 and Q2 are equivalent if, for every tree T to which they both can be applied, Q1 applied to T equals Q2 applied to T. Two applications will be said to interact if and only if the intersection of the critical set of one of the applications and the modification set of the other application is not empty. If APP1 and APP2 do not interact, then <APP1, APP2> = <APP2, APP1>. Complete Sets of ReductionsIn order to explore some issues that arise with the reduction of constraint trees, we will first consider rewriting strings of characters. The reduction of strings isn't any less complex than the reduction of constraint trees but it is easier to illustrate in examples. Given a vocabulary V = { 'a', 'b', 'c' }, let L be the language consisting of the strings of finite length consisting of letters in V. For example, "aababaaa" Î L. Let the weight of a string be three times the number of a's plus two times the number of b's plus the number of c's that are in the string. For example, the weight of "aababaaa" is 3*6+2*2+1*0 =22. Our goal is to reduce any string in L to a string having minimum weight using a set of reductions R. A reduction will have the form "string1 → string2" and indicates that string1 should be replaced by string2. The application of a reduction R1 to the substring starting at position n in a string will be denoted by <R1:n>. Suppose that we want to reduce a string S : "aaaaab" using only two reductions R1 : ab → bc and R2 : ab → cb. The optimal (least-weight) reduction of S is "bccccc" which is obtained by exhaustively applying R1 until it can't be applied any more, as follows:
aaaabc aaabcc aabccc abcccc bccccc
Any application of R2 is terminal - it precludes any further reduction:
aaaabc aacbcc
We could discover the optimal reduction sequence for "aaaaab" by inspection and a little thought. Whether applying R1 exhaustively to any string in L is an optimal reduction strategy isn't as obvious. For a more complex example, let R = { R1 : ac → b , R2 : bba → ca, R3 : bca → ab } be a set of reductions of strings in L. For a particular string such as "acabbaccaccababbacbacabcabbacabac", it isn't obvious which sequence of applications will result in a string having the least possible weight. Our choices for first application are R1:0, R1:5, R1:8, R1:16, R1:19, R1:27, R1:31, R2:3, R2:14, R2:25, and R3:22. Which one should we pick as the first step in our quest for an optimal reduction of this string? What criteria could we consider to guide the second, third, and subsequent applications of a reduction to the string? Are these criteria heuristics or hard rules that guarantee optimality? Are any of these heuristics or hard rules applicable to any string in L? For any string S . L, how can we perform an optimal reduction on S? If these questions aren't challenging enough, add another reduction to L and try again.
Consider the set of reductions R = { R1 : a → b, R2 : a → c }. It is easy to see that, for every string in L, the optimal strategy is to apply R2 exhaustively, and that each optimally reduced string is unique - the order in which we apply R2 does not matter. If R1 is applied, it blocks optimal reduction - there is no sequence of subsequent applications that can arrive at the minimal-weight reduced string. The application of R1 and the application of R2 result in a divergence, and there is no reduction available that can result in subsequence convergence. Consider the set of reductions R' = { R1 : a → b, R2 : a → c , R3 : b &rarr c }. The addition of R3 completes the set of reductions in this sense: not only can we reduce a string further, but the order in which we apply the reductions doesn't matter. If we apply the reductions in R' to a string until no application can be applied, we will have produced a reduced string that is unique. Every string S . L reduces under R' to a string having the same length as S that consists only of c's. A set of reductions that has these properties is known as a complete set of reductions - a finite set of reductions, only finite sequences of applications are possible, and the exhaustive application of reductions in any order results in a unique reduced structure. More formally, a set of transformations is a complete set of reductions (CSR) if and only if it is both noetherian and confluent. A set of transformations is said to be noetherian if every possible sequence of transformation applications to every structure in the set's domain is finite in length. One method for establishing the noetherianity of a set is to show that each transformation in the set necessarily reduces a specified measure in the structures in the set's domain. Such transformations are referred to as reductions. A set of transformations is said to be confluent if an exhaustive application of the transformations in the set to a structure in the set's domain will result in a unique canonical form for that structure regardless of the order in which the transformations are applied. Demonstrating that a set of transformations is a complete set of reductions is highly desirable for, once a set has been shown to be a CSR, we need only be concerned with the efficiency of the algorithm for applying the transformations exhaustively. Definition 1 - A set of transformations is a complete set of reductions if it is both noetherian and confluent. Definition 2 - A rewriting system is locally confluent iff " M, M1, M2 [ ( M → M1 Ù M → M2 ) implies $ P ( M1 →* P Ù M2 →* P ) ] where M, M1, M2, and P are terms, '→' represents a single application of a rewriting rule, and '→*' represents a sequence of zero or more applications of rewriting rules (Le Chenadac 1986:17). Observe that, while M1 is not equal to M2, P may be M1, M2, or neither. It is possible for the application of one transformation to both obviate and preclude the application of a second application. For example, let R = { R1: a → b , R2: a → c } and R' = { R1: a → b, R2: a → c, R3: b → c } . R is not locally confluent because there is no equivalent pair of application sequences in which one starts with R1:x and the other with R2:x for some site x. R' is locally confluent in part because <R1:x, R3:x> = <R2:x> for some site x. For our purposes, a constraint tree is a term. Another way of interpreting Definition 2 is as follows: a set of transformations E are locally confluent iff, for every pair R1 and R2 of transformations selected from E, every constraint tree T, every POA1 in T at which R1 may be applied, and every POA2 in T at which R2 may be applied, there exist two sequences of applications, S1 and S2, such that the first application in S1 is R1:POA1, the first application in S2 is R2:POA2, S1 and S2 both contain zero or more additional applications of a transformation in E to a POA in T, and S1 applied to T is equal to S2 applied to T. Lemma 1 (Newman) - Let R be a noetherian rewriting system, then: R is confluent iff R is locally confluent. A discussion of Huet's proof of Newman's lemma may be found in (Le Chenandec 1986:17). Proving That a Set of Constraint Tree Transformations is a Complete Set of ReductionsIn order to prove that a set of constraint tree transformations R is a complete set of reductions, it is necessary to prove that R is noetherian and that R is locally confluent. Proving that R is noetherian is simple – this can be accomplished by showing that each transformation in R reduces the weight of a constraint tree. Proving that R is locally confluent can be accomplished by considering the transformations in R a pair at a time and proving that, for all constraint trees and all pairs of points of application in each constraint tree, there exist equivalent pairs of application sequences, each headed by one of the pair of transformations. ReferencesLe Chenandec, P., Canonical Forms in Finitely Presented Algebras. John Wiley & Sons, New York, N.Y., 1986.
     |
| Home >
Jabberwocky      |
|
| Craig Holman Research Jabberwocky Quotations Contact | |
| Patterncraft and Constraint and Consequence are trademarks of Craig Stewart Holman Copyright © 2008 Craig Stewart Holman All Rights Reserved. |
|
| | |
![]() |
|