![]() |
|
| | |
| Craig Holman Research Jabberwocky Quotations Contact | |
| Home >
Jabberwocky      |
|
Jabberwocky for June 5, 2009 |
|
Graph Isomorphism 1    
Introduction A graph isomorphism is a one-to-one mapping from the nodes of one graph to the nodes of another graph that preserves edges. Several years ago, a friend and colleague noticed my obsession with the k-Clique Exists problem and told me a cautionary tale of his many-year obsession with the Graph Isomorphism problem. His tale did not have the desired effect - I fell in love with graph isomorphism problem as well. Since then, I have been interested in the graph isomorphism problem - the problem of determining whether two graphs are isomorphic and, if they are, of discovering an isomorphic mapping between them. I was attracted by the cleanness and simplicity of the problem, as well as the challenge of its unknown complexity. While no polynomial algorithm for it is known, there is no proof that such an algorithm could not exist. The problem is known to be in NP but thought not to be in NP-complete. Several complexity researchers place it in its own complexity class floating between P and NP-Complete (if and only if P <> NP). We don't quite know what to do with it. I'm inclined to believe that Graph Isomorphism is in P - that a polynomial-time algorithm for it can be devised. All that one has to do to prove this is to exhibit such an algorithm, and there are fewer people pursuing this goal than are trying to prove that such algorithm exists. Also, I perversely tend to enjoy taking minority positions. In this series of posts, I will begin to present a set of algorithms for determining whether two graphs are isomorphic. I am still working on these algorithms and hope to complete the theory, algorithms, and implementations this year. I will describe the characteristics of the graphs on which the current algorithms work and how to detect the graphs for which they do not yet work. Summaries of metrics from applying the algorithms to a large publicly available database of graph pairs will be provided along the way. Canonical representations of graphs The approach that I pursued involved searching for a canonical representation of a graph. A canonical representation of a graph is a way of describing the graph that depends solely on the structure of the graph and not at all on the labels of the graph. If you took a labeled graph G, made a copy H of it, and randomly relabeled the nodes of H, the cores of the canonical representations of G and H would be identical, with the canonical representations of G and H each having an extension that carries the labeling associations. There are several representations of a labeled unweighted graph. Each representation can be transformed into any of the other representations and back again without losing the information needed to represent the graph. Some of these transformations cost little; some are expensive. Some representations reveal little about the structure of the graph; some make useful information about the structure of a graph readily available. Some representations of a graph are:
A set of maximal cliques provides a great deal of structural information, but it is expensive to construct. None of these representations is canonical. As noted earlier, an isomorphism is a one-to-one mapping between the nodes of one graph and the nodes of another graph that preserves edges. An automorphism is a one-to-one mapping between the nodes of a graph and the nodes of the same graph that preserves edges. Two nodes are automorphic if they map to each other under some automorphism. Many researchers have tried to place the nodes of a graph into equivalence classes having the properties that there is no automorphism that maps a node in one class to a node in another class, and that every node in a class is automorphic with every other node in that class - equivalences classes are automorphic groups. How can we distinguish between nodes that are not automorphic? People have made a sport of finding new invariants - non-label properties of nodes and their neighborhoods that can distinguish between non-automorphic pairs of nodes. The simplest invariant is the degree of a node - two nodes that have different degrees are not automorphic. The problem with invariants is knowing when you have enough to be able to conclude that a pair of nodes must be automorphic because none of the invariants have shown them to be non-automorphic. A complete invariant is a set of invariants that, when used to test for automorphism, will give neither false positives nor false negatives. Finding a complete invariant will allow us to construct canonical representations of graphs. What form would a canonical representation of a graph take? The approach that I favor would construct a sequence of equivalence classes, each of which is identified by an integer in [0 .. NumberOfNodes], has a distinguishing signature, and has a set of member nodes (identified by their labels) that have the same signature as the class. A node can only belong to one class. Not every class will end up in use - at least one class will have no signature and no member nodes. The core of each class consists of its class identifier, its signature, and the number of its member nodes. The extension of each class is the set of its member nodes' labels. For example, leaving aside the issue of signatures for the moment, and not displaying any class that has no members,
Equivalence classes for graph G
Core Extension
________________________________ _______________
Class Number of Set of member
ID Signature members node labels
_____ __________ _________ _______________
0 signature4 1 { 0 }
1 signature8 1 { 6 }
2 signature7 1 { 12 }
4 signature1 1 { 4 }
5 signature3 2 { 1, 3 }
6 signature5 1 { 7 }
7 signature2 1 { 10 }
8 signature9 1 { 2 }
9 signature6 1 { 8 }
13 signature0 4 { 5, 9, 11, 13 }
In order for an algorithm to construct a canonical representation of graphs, it must construct the same representation core for all isomorphic graphs. With regard to the previous example, each graph that is isomorphic to G must have an equivalence-class representation in which class 5 always has signature3 and always has two member nodes. However the algorithm is written, it must perform the same sequence of actions on the classes for each of two isomorphic graphs without regard for node labeling. Assume for the moment that we have an algorithm that can construct a canonical representation of a graph. How can this be used to determine whether two graphs are isomorphic? One approach, the serial approach, is to construct the canonical representation of each graph and then compare the cores of each representation to determine whether they are identical. If they are are identical, the two graphs are isomorphic, and the extensions may be used to construct an isomorphism; otherwise, they are not. Another approach, the parallel approach, maintains two extensions, one for each graph, and performs the same operations on each graph in parallel. If a non-label-related difference is found during the application of the parallel algorithm, the graphs are not isomorphic; otherwise they are and the extensions may be used to construct an isomorphism. For example, the following equivalence classes for isomorphic graphs G and H might be produced by the parallel approach:
Equivalence classes for graphs G and H
Core Extension for G Extension for H
________________________________ _______________ _______________
Class Number of Set of member Set of member
ID Signature members node labels node labels
_____ __________ _________ _______________ _______________
0 signature4 1 { 0 } { 9 }
1 signature8 1 { 6 } { 3 }
2 signature7 1 { 12 } { 1 }
4 signature1 1 { 4 } { 12 }
5 signature3 2 { 1, 3 } { 5, 8 }
6 signature5 1 { 7 } { 13 }
7 signature2 1 { 10 } { 7 }
8 signature9 1 { 2 } { 11 }
9 signature6 1 { 8 } { 0 }
13 signature0 4 { 5, 9, 11, 13 } { 2, 4, 6, 10 }
The parallel approach should be more efficient than the serial approach, in that it constructs and maintains only one core, and it ends as soon as a non-label-related difference is discovered. Incremental elaboration We use labels to identify the nodes of a graph. How can we talk about whether two graphs have the same underlying unlabeled graph without using labels to identify nodes? We can try to use some structural description - some structural invariant - to distinguish and identify nodes. I revisited the graph isomorphism problem occasionally for several years looking for an adequate structural description - one that was compact and efficiently calculated. An incremental approach seemed appropriate - I only needed to elaborate the structural description of a node until it could be distinguished from the other nodes of the graph, if that was possible. Some nodes may remain structurally indistinguishable - these nodes are automorphic. Once again, the problem is how to know when we are done - when all pairs of nodes that haven't been distinguished really are automorphic. An incrementally elaborated structural description seemed appropriate. Which aspect of structure is to be described? The only structural aspect of an unlabeled graph is adjacency. The list of adjacency relationships that each node participates in completely describes the graph. How can we structurally (and incrementally) describe the adjacency relationships of a node? One approach is to replace the marker for a node with a set of nodes that are adjacent to that node. In the sequence of elaborations for a single node that follow, '*' represents a node and '(' ... ')' represents the set of nodes that are adjacent to a node. 1: * [a node] 2: ( * * * ) [a node that is adjacent to three other nodes] 3: ( ( * * ) ( * * ) ( * * * * ) ) 4: ( ( ( * * ) ( * * ) ) ( ( * ) ( * * * ) ) ( ( * ) ( * ) ( * * ) ( * * * * ) ) ) We could continue to construct another elaboration of 4 by expanding some or all of the '*', etc. I envisioned a sequence of elaborations of the structural descriptions of a set of nodes that would end when further elaboration was impossible or unneeded. There are some problems with this approach. The length of the description of a node can grow quite large, and the description of a node can appear many times in different contexts with varying levels of elaboration. On August 26, 2004, after becoming saturated once more with the k-Clique Exists problem, I switched back to the Graph Isomorphism problem and quickly saw what I thought was a way through these obstacles:
For example, this is the list of equivalence classes for graph G with degree and adjacency signatures:
Equivalence classes for graph G
Core Extension
_____________________________________________________ _______________
Class Number of Set of member
ID Signature (adjacency) members node labels
_____ _______________________________ _________ _______________
0 [ 2 5 5 6 8 9 13 13 13 13 ] 1 { 0 }
1 [ 2 4 5 5 6 7 13 13 13 13 ] 1 { 6 }
2 [ 0 1 4 5 5 7 8 13 13 13 13 ] 1 { 12 }
4 [ 1 2 5 5 6 7 9 13 13 13 13 ] 1 { 4 }
5 [ 0 1 2 4 6 7 8 9 13 13 13 13 ] 2 { 1, 3 }
6 [ 0 1 4 5 5 7 8 9 13 13 13 13 ] 1 { 7 }
7 [ 1 2 4 5 5 6 8 9 13 13 13 13 ] 1 { 10 }
8 [ 0 2 5 5 6 7 9 13 13 13 13 ] 1 { 2 }
9 [ 0 4 5 5 6 7 8 13 13 13 13 ] 1 { 8 }
13 [ 0 1 2 4 5 5 6 7 8 9 13 13 13 ] 4 { 5, 9, 11, 13 }
The adjacency signature for class 0 should be interpreted as follows: each node in class 0 is adjacent to one node in class 2, two nodes in class 5, one node in class six, one node in class 8, one node in class 9, and four nodes in class 13. A few comments:
An example Here is an example of the construction of the canonical representation of a graph. This example will be discussed at length and an algorithm presented in the next post in this series. Discussion of 'medial analysis' will be deferred for now. Since the degree signature is also the length of the adjacency signature, it has been omitted where possible for the sake of clarity.
Augmented Adjacency Sets for Graph0
0 : { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14 }
1 : { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14 }
2 : { 0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 14 }
3 : { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 }
4 : { 0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 13, 14 }
5 : { 0, 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 14 }
6 : { 0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 13, 14 }
7 : { 0, 1, 2, 3, 6, 7, 9, 10, 11, 12, 13, 14 }
8 : { 0, 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14 }
9 : { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 }
10 : { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 }
11 : { 1, 2, 3, 5, 7, 8, 9, 10, 11, 13, 14 }
12 : { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14 }
13 : { 0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13 }
14 : { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14 }
Application of the algorithm for constructing the canonical representation of Graph0
distributing nodes into their initial equivalence classes based upon their degree
( the degree of each node is the index of the class it will be placed in )
enqueuing signature-refresh tasks for each of the equivalence cases that are in use
phase 1
equivalence classes
10 : degree sig [ 10 ]
graph0 members { 11, 13 }
11 : degree sig [ 11 ]
graph0 members { 7 }
12 : degree sig [ 12 ]
graph0 members { 2, 4, 5, 6, 8, 12 }
13 : degree sig [ 13 ]
graph0 members { 0, 1, 14 }
14 : degree sig [ 14 ]
graph0 members { 3, 9, 10 }
signature-refresh task queue: ( 10 11 12 13 14 )
begin task 1: refresh the signatures of the 2 nodes in equivalence class 10
moving node 13 to equivalence class 0
moving node 11 to equivalence class 1
equivalence class 10 is no longer in use
enqueuing refresh tasks for equivalence classes ( 0 1 )
because their signatures refer to retired equivalence class 10
end task
equivalence classes
0 : adjacency sig [ 10 11 12 12 12 12 13 14 14 14 ]
graph0 members { 13 }
1 : adjacency sig [ 10 11 12 12 12 13 13 14 14 14 ]
graph0 members { 11 }
11 : degree sig [ 11 ]
graph0 members { 7 }
12 : degree sig [ 12 ]
graph0 members { 2, 4, 5, 6, 8, 12 }
13 : degree sig [ 13 ]
graph0 members { 0, 1, 14 }
14 : degree sig [ 14 ]
graph0 members { 3, 9, 10 }
signature-refresh task queue: ( 11 12 13 14 0 1 )
begin task 2: refresh the signature of the node in equivalence class 11
end task
signature-refresh task queue: ( 12 13 14 0 1 )
begin task 3: refresh the signatures of the 6 nodes in equivalence class 12
moving node 5 to equivalence class 2
moving node 8 to equivalence class 2
moving node 6 to equivalence class 3
moving node 4 to equivalence class 4
moving node 2 to equivalence class 5
moving node 12 to equivalence class 6
equivalence class 12 is no longer in use
enqueuing refresh tasks for equivalence classes ( 2 3 4 5 6 11 )
because their signatures refer to retired equivalence class 12
end task
equivalence classes
0 : adjacency sig [ 10 11 12 12 12 12 13 14 14 14 ]
graph0 members { 13 }
1 : adjacency sig [ 10 11 12 12 12 13 13 14 14 14 ]
graph0 members { 11 }
2 : adjacency sig [ 0 1 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 5, 8 }
3 : adjacency sig [ 0 11 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 6 }
4 : adjacency sig [ 0 12 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 4 }
5 : adjacency sig [ 1 11 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 2 }
6 : adjacency sig [ 11 12 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 12 }
11 : adjacency sig [ 0 1 12 12 12 13 13 13 14 14 14 ]
graph0 members { 7 }
13 : degree sig [ 13 ]
graph0 members { 0, 1, 14 }
14 : degree sig [ 14 ]
graph0 members { 3, 9, 10 }
signature-refresh task queue: ( 13 14 0 1 2 3 4 5 6 11 )
begin task 4: refresh the signatures of the 3 nodes in equivalence class 13
moving node 0 to equivalence class 7
moving node 1 to equivalence class 8
moving node 14 to equivalence class 8
equivalence class 13 is no longer in use
enqueuing refresh tasks for equivalence classes ( 7 8 )
because their signatures refer to retired equivalence class 13
end task
equivalence classes
0 : adjacency sig [ 10 11 12 12 12 12 13 14 14 14 ]
graph0 members { 13 }
1 : adjacency sig [ 10 11 12 12 12 13 13 14 14 14 ]
graph0 members { 11 }
2 : adjacency sig [ 0 1 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 5, 8 }
3 : adjacency sig [ 0 11 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 6 }
4 : adjacency sig [ 0 12 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 4 }
5 : adjacency sig [ 1 11 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 2 }
6 : adjacency sig [ 11 12 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 12 }
7 : adjacency sig [ 0 2 2 3 4 5 6 11 13 13 14 14 14 ]
graph0 members { 0 }
8 : adjacency sig [ 1 2 2 3 4 5 6 11 13 13 14 14 14 ]
graph0 members { 1, 14 }
11 : adjacency sig [ 0 1 12 12 12 13 13 13 14 14 14 ]
graph0 members { 7 }
14 : degree sig [ 14 ]
graph0 members { 3, 9, 10 }
signature-refresh task queue: ( 14 0 1 2 3 4 5 6 11 7 8 )
begin task 5: refresh the signatures of the 3 nodes in equivalence class 14
all nodes in equivalence class 14 have the same new signature
end task
signature-refresh task queue: ( 0 1 2 3 4 5 6 11 7 8 )
begin task 6: refresh the signature of the node in equivalence class 0
end task
signature-refresh task queue: ( 1 2 3 4 5 6 11 7 8 )
begin task 7: refresh the signature of the node in equivalence class 1
end task
signature-refresh task queue: ( 2 3 4 5 6 11 7 8 )
begin task 8: refresh the signatures of the 2 nodes in equivalence class 2
moving node 8 to equivalence class 9
moving node 5 to equivalence class 15
equivalence class 2 is no longer in use
enqueuing refresh tasks for equivalence classes ( 0 1 9 14 15 )
because their signatures refer to retired equivalence class 2
end task
equivalence classes
0 : adjacency sig [ 1 2 2 3 4 7 11 14 14 14 ]
graph0 members { 13 }
1 : adjacency sig [ 0 2 2 5 8 8 11 14 14 14 ]
graph0 members { 11 }
3 : adjacency sig [ 0 11 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 6 }
4 : adjacency sig [ 0 12 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 4 }
5 : adjacency sig [ 1 11 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 2 }
6 : adjacency sig [ 11 12 12 12 12 12 13 13 13 14 14 14 ]
graph0 members { 12 }
7 : adjacency sig [ 0 2 2 3 4 5 6 11 13 13 14 14 14 ]
graph0 members { 0 }
8 : adjacency sig [ 1 2 2 3 4 5 6 11 13 13 14 14 14 ]
graph0 members { 1, 14 }
9 : adjacency sig [ 0 1 2 3 4 6 7 8 8 14 14 14 ]
graph0 members { 8 }
11 : adjacency sig [ 0 1 12 12 12 13 13 13 14 14 14 ]
graph0 members { 7 }
14 : adjacency sig [ 0 1 2 2 3 4 5 6 7 8 8 11 14 14 ]
graph0 members { 3, 9, 10 }
15 : adjacency sig [ 0 1 2 4 5 6 7 8 8 14 14 14 ]
graph0 members { 5 }
signature-refresh task queue: ( 3 4 5 6 11 7 8 0 1 9 14 15 )
begin task 9: refresh the signature of the node in equivalence class 3
end task
signature-refresh task queue: ( 4 5 6 11 7 8 0 1 9 14 15 )
begin task 10: refresh the signature of the node in equivalence class 4
end task
signature-refresh task queue: ( 5 6 11 7 8 0 1 9 14 15 )
begin task 11: refresh the signature of the node in equivalence class 5
end task
signature-refresh task queue: ( 6 11 7 8 0 1 9 14 15 )
begin task 12: refresh the signature of the node in equivalence class 6
end task
signature-refresh task queue: ( 11 7 8 0 1 9 14 15 )
begin task 13: refresh the signature of the node in equivalence class 11
end task
signature-refresh task queue: ( 7 8 0 1 9 14 15 )
begin task 14: refresh the signature of the node in equivalence class 7
end task
signature-refresh task queue: ( 8 0 1 9 14 15 )
begin task 15: refresh the signatures of the 2 nodes in equivalence class 8
all nodes in equivalence class 8 have the same new signature
end task
signature-refresh task queue: ( 0 1 9 14 15 )
begin task 16: refresh the signature of the node in equivalence class 0
end task
signature-refresh task queue: ( 1 9 14 15 )
begin task 17: refresh the signature of the node in equivalence class 1
end task
signature-refresh task queue: ( 9 14 15 )
begin task 18: refresh the signature of the node in equivalence class 9
end task
signature-refresh task queue: ( 14 15 )
begin task 19: refresh the signatures of the 3 nodes in equivalence class 14
all nodes in equivalence class 14 have the same new signature
end task
signature-refresh task queue: ( 15 )
begin task 20: refresh the signature of the node in equivalence class 15
end task
equivalence classes
0 : adjacency sig [ 1 3 4 7 9 11 14 14 14 15 ]
graph0 members { 13 }
1 : adjacency sig [ 0 5 8 8 9 11 14 14 14 15 ]
graph0 members { 11 }
3 : adjacency sig [ 0 4 5 6 7 8 8 9 11 14 14 14 ]
graph0 members { 6 }
4 : adjacency sig [ 0 3 5 6 7 8 8 9 14 14 14 15 ]
graph0 members { 4 }
5 : adjacency sig [ 1 3 4 6 7 8 8 11 14 14 14 15 ]
graph0 members { 2 }
6 : adjacency sig [ 3 4 5 7 8 8 9 11 14 14 14 15 ]
graph0 members { 12 }
7 : adjacency sig [ 0 3 4 5 6 8 8 9 11 14 14 14 15 ]
graph0 members { 0 }
8 : adjacency sig [ 1 3 4 5 6 7 8 9 11 14 14 14 15 ]
graph0 members { 1, 14 }
9 : adjacency sig [ 0 1 3 4 6 7 8 8 14 14 14 15 ]
graph0 members { 8 }
11 : adjacency sig [ 0 1 3 5 6 7 8 8 14 14 14 ]
graph0 members { 7 }
14 : adjacency sig [ 0 1 3 4 5 6 7 8 8 9 11 14 14 15 ]
graph0 members { 3, 9, 10 }
15 : adjacency sig [ 0 1 4 5 6 7 8 8 9 14 14 14 ]
graph0 members { 5 }
20 tasks were executed in phase 1
performing medial analysis
identifying medial components
there are no medial components of the class graph
identifying medial classes
there are no medial classes
20 tasks were executed overall
The final list of equivalence classes in example is a canonical representation of Graph0. The graph in this example was particularly simple - I wanted to show the basic behavior of the algorithm without getting caught up in some complications that will be dealt with later. In particular, an analysis that I refer to as 'medial analysis' was performed at the end of phase 1 and its outcome rendered a second phase unnecessary. These issues, medial analysis, the need for a third signature component, and what happens during a second phase will be discussed in future posts in this series.
     |
| Home >
Jabberwocky      |
|
| Craig Holman Research Jabberwocky Quotations Contact | |
| Patterncraft and Constraint and Consequence are trademarks of Craig Stewart Holman Copyright © 2009 Craig Stewart Holman All Rights Reserved. |
|
| | |
![]() |
|