Constraint and Consequence
 
   
  Craig Holman   Research   Jabberwocky   Quotations   Contact
 
  Home > Jabberwocky           Previous      Next
 
 
 

Jabberwocky for May 27, 2008

 
 

The k-Clique Exists Problem

This is my favorite problem in graph theory. I have spent more time thinking about it than any other and have developed at least six new algorithms for solving it.

Does a graph contain a k-clique (a fully connected subgraph having k nodes)?

If it does, what is an example of a clique of size at least k?

This problem is in complexity class NP-Complete.

The Search Approach to the k-Clique Exists Problem

          Next in thread

This is the first in a series of posts that presents a parameterized algorithm, or an algorithm framework, that solve the k-Clique Exists problem (Clique) by means of the reduction of a search space and, if necessary, a backtracking search of the resulting space. They are derived from a draft of a paper on the subject that I wrote last spring.

The problem is to determine whether a labelled strict graph of N nodes contains a clique of size k, where k < N.

One approach to determining whether a k-clique exists in a graph is to search for one. A naïve search involves visiting nodes exhaustively to see if a k-clique can be found. Rather than visit nodes in any order, the nodes may be labelled 0 .. N-1 and nodes may be visited in ascending order only. By carefully constructing a map from the original labels to the new labels, we may signficantly reduce the work that must be done.

Let G denote the original graph, H denote a copy of the original graph with relabelled nodes, and M denote the 1-1 correspondence that maps the labels of G onto the labels of H. Since we will only be able to visit nodes in ascending order during a search, it will be convenient to construct a directed acyclic graph D such that all of the nodes of H are nodes of D and every edge in H corresponds to a directed edge in D, proceeding from a lower-numbered node to a higher-numbered node.

We will investigate the idea of a search sequence of D, which is an ascending-ordered sequence of k distinct integers, each in 0 .. N-1, such that each pair of nodes in adjacent positions in the sequence corresponds to a directed edge of D. Let S denote an arbitrary search sequence of D.

The question arises, then, as to which nodes of D can occur in each position of a search sequence of D. There is an exact answer to this question, but we do not need to find it in order to determine whether a k-clique exists. The approach that we shall take is to maintain a record of which nodes have not been proven not to occur in each position in a search sequence, and then to reduce this search space by removing node-position matches as we are able to prove that node Y can not occupy position X in S.

The structure that we will use to model this is the text of D, a two-dimensional grid that is N cells tall and K cells wide. Each cell in the text represents the state of our knowledge about whether a node (row) can occupy a position (column) in a search sequence and, if so, under which constraints.

The following is a visualization of the text of a randomly generated graph (N = 50, k = 30, edgeProbability = 0.96) after the PropagateConstraints algorithm has been executed.

The blue cursor in the lower-left corner surrounds cell (0, 0).

Gray cells are Impossible. Cells that are a shade of red are Possible. The brighter the color of a red cell, the larger the cardinality of its LocalAAS - ranging from black cells (cardinality K) to bright red cells (cardinality N).

Initially, we shall assume that all nodes can occupy all positions in S. This corresponds to all cells in the text being marked as Possible. We can quickly mark many cells as Impossible by imposing the ascending-search rule. A second structural rule marks most cells as impossible if they are in the same row or column as a complete node, a node that is adjacent to every other node. The following is a visualization of the text of a randomly generated graph (N = 50, k = 30, edgeProbability = 0.96) after these first two rules have been applied.

Our task is to determine whether a k-clique exists in H by reducing the search space implicit in the text of D and, if necessary, searching the reduced text for a k-clique.

We can construct a search sequence S from the text as follows: For each column of the text, select a cell at (Row, Column) that is Possible and is in a row higher than that of the selected cell of the previous column, if any, and place Row into position Column of S.

If a search sequence S has the property that the intersection of the graph-H augmented adjacency sets for the nodes in S has cardinality of at least k, it will be called a k-sufficient search sequence of D.

We can search the text for a k-clique as follows: Perform a backtracking search of the text, starting with column 0 and ending with column k-1, seeking to construct a k-sufficient search sequence S . If such a search sequence can be found, the nodes in S are a k-clique of H; otherwise, neither H nor G has a k-clique.

One useful algorithm, ProbeForKClique, peforms a very restricted, non-backtracking search of the text that attempts to construct a single k-sufficient search sequence of D.

It is in our interest to further reduce the text prior to conducting a search of the text. As with many reductive processes, text reduction has the effect of burning away irrelevancies, increasingly revealing the structure that we are seeking. It will often be the case that various levels of text reduction, each followed by a probe, will be sufficient to determine whether a k-clique exists without actually performing a search of the reduced text.

Many rules can be woven into text-reduction algorithms. Some of these rules follow.

  • As long as there is at least one Possible cell in each column of the text, we have not proven that a k-clique is impossible. Although an exhaustive search may conclude that a k-clique is impossible, we are concerned at the moment with text-reduction rules that do not involve searching.
  • If any column of the text loses its last Possible cell, we have proven that a k-clique does not exist, and all cells in the text may be marked as Impossible.
  • If a row of the text loses its last Possible cell, we have proven that the H-node associated with that row does not participate in a k-clique of H.
  • If the number of rows that contains at least one Possible cell falls below K, we have proven that a k-clique does not exist.
  • A node is a singularity if it is the only node that occurs in a particular position in S.
  • If node Row has the only cell in column Column, node Row is presumed to be a singularity and all other cells in row Row are marked as Impossible.

    Seven rules for reducing the text of D or resolving whether a k-clique exists have been mentioned. In order to progress beyond this point, we need to consider the adjacency relationships between the nodes of H. Rules for text reduction and resolution will be discussed at length.

    The following is a sketch of the KCliqueExistsSearch algorithm:

  • Construct a map M that will optimize a graph for text reduction and search
  • Apply the map M to the labels of a copy of graph G, resulting in graph H
  • Construct the directed acyclic graph D from H
  • Construct the text of D
  • Probe for a k-clique, terminating and reporting a k-clique if one is found
  • Reduce the text of D, terminating and reporting the resolution of the question, if resolved
  • Probe for a k-clique, terminating and reporting a k-clique if one is found
  • Search the reduced text of D, reporting the resolution of the question

    This algorithm has three major parameter algorithms:

  • ConstructMap - An algorithm to construct the map M
  • ReduceText - An algorithm to reduce the text of D
  • SearchForKClique - An algorithm to search the reduced text of D for a k-clique

    My current algorithm for ConstructMap will be presented later - it is lengthy and complex. It is quite effective but can most certainly can be improved upon.

    I have been investigating several algorithms for text reduction. One of these, TightenText, will be presented in this paper as a stand-in for ReduceText. Many rules that may be used in the design of text-reduction algorithms will also be presented. The design of algorithms for text reduction will be a very fruitful area of research.

    A fairly straightforward SearchForKClique algorithm will be presented.

    The KCliqueExistsSearch algorithm shows promise as a framework in which to design algorithms for solving the k-Clique Exists problem. Further investigation of this framework and the design of improved algorithms for its parameters are among my active research interests.

    Search space construction

    The search space of algorithm SearchForKClique

    While we could treat the directed acyclic graph D as the search space through which SearchForKClique searches for a k-clique, it is rather coarse - D can only be reduced by the elimination of a directed edge or a node. D doesn't represent everything that we may know about which nodes of D can occupy which positions in a search sequence of D.

    Constructing the text of D, reducing it rather than D, and treating the reduced text of D as the search space for SearchForKClique is both more effective and more efficient than working with D directly. The text of D can reduced to reflect the deletion of a directed edge of D and the deletion of a node of D, but it can also be reduced by the deletion of a cell-edge and the deletion of a cell (by marking it as Impossible).

    Reduction of the search space by relabelling nodes

    A directed edge from X to Y is a transition of length (Y-X). Each search sequence of D contains (k-1) transitions having a combined length of less than N. In general, the longer a transition, the fewer search sequences of D contain it. Some transitions are too long to participate in any search sequence of D.

    The more short transitions that can be 'occupied' by a non-edge of D, the fewer the number of search sequences that D will have.

    The construction of the map M from V(G) onto V(H) is a valuable opportunity to reduce the number of search sequences of D. The major goal of map construction is maximize the number of short transitions, especially transitions of length 1, that are occupied by the non-edges of D and thereby forbidden to the edges of D. The algorithm ConstructMap is one of the parameters of this algorithm.

    The adjacency set of node X of G is the set of all nodes in G that have an edge with X and is denoted by AS(X) The augmented adjacency set of a node X in G consists of AS(X) ? {X}, and is denoted by AAS(X).

    A node X of G is complete iff |AAS(X)| = N.

    One requirement that we will place on ConstructMap is that, if there are P complete nodes in G, the map that it constructs will map the labels of all of the complete nodes of G onto the nodes labelled 0 .. P-1 in H.

    Constructing a map that reduces the search space

    My current algorithm for constructing M does so in two phases. The following is an example summary of its operation on a randomly generated graph (N = 50, k = 27, edgeProbability = 0.96):

                Number of transitions occupied by non-edges
    
    Transition              After       After
    Length      Originally  Phase 1     Phase 2
    
       1          2          26          32
       2          1           1           1
       3          3           2           3
       4          1           3           1
       5          4           2           2
       6          3           1           2
       7          1           2           1
       8          3           1	
       9          5           1           2
      10          2           2	
      11                      2	
      12          2           2           1
      13                      1	
      14          3           1           1
      15			    	
      16          2   	
      17          1           1           2
      18          2
      19
      20          2                       1
      21          1           1	
      22          3		
      23					
      24                      2	
      25          4           1	
      26					
      27                                  1
      28          3           1           1
      29          1           1           2
      30                                  1
      31					
      32          1           1	
      33          1		
      34                      1           1
      35                                  1
      36				
      37          1	
      38				
      39          1	
      40          2	
      41				
      42          1	
    

    This algorithm is quite effective at constructing a map that minimizes the number of 1-transitions that are occupied by edges of H, which is all that it attempts.

    It is worth noting that the relative unavailability of 1-transitions to edges that results from the construction and application of map M is responsible for the efficacy of the constraint-propagation algorithm at carving away at the edges of the Possible text, resulting in much steeper top and bottom edges. This can be seen in the following pair of images:

        (Random graph: N = 50, k = 30, edgeProbability = 0.96)
    After Initialization After constraint propagation

    To be continued...

              Next in thread

    Two Identities from the Structure of the N-th Power of Integers

    Twenty-five years ago I was captured by Fermat's Last Theorem and spent a lot of time exploring the levels of differences between n-th powers of integers.

    Several identities came out of this exploration, two of which I found very suggestive as a pair. I hoped that they would provide constraints that would make a proof of Fermat's Last Theorem approachable, but I didn't know how to use them in that way. Nonetheless, I think they are beautiful.

  •  
     
     
      Home > Jabberwocky           Previous      Next
     
      Craig Holman   Research   Jabberwocky   Quotations   Contact  
     
     
     
      Patterncraft™ and Constraint and Consequence™
    are trademarks of Craig Stewart Holman

    Copyright © 2008 - 2009 Craig Stewart Holman

    Copyright and Legal Notices

    All Rights Reserved.

     
       
     
      Patterncraft logo