Constraint and Consequence
 
   
  Craig Holman   Research   Jabberwocky   Quotations   Contact
 
  Home > Jabberwocky           Previous      Next
 
 
 

Jabberwocky for June 3, 2008

 
 

Some Favorite Problems

I wrote this piece on January 29, 2007 for some friends. I wanted to introduce them to some of the things that interest me.

Problem - Factoring a composite integer into to two integer factors

We all know how to multiply two positive integers. Most of us can still do so without using a calculator.

Given two positive integers X and Y, we know how to calculate X * Y = Z.

What about the reverse operation? I don't mean division; I mean factoring.

Give the equation X * Y = Z, we'll refer to Z as the 'product' of X and Y, and to X and Y as the 'factors' of Z.

Remember that a positive integer is 'prime' if it is has no factors other than 1 and itself. 1 is not considered to be prime for technical reasons. The smallest prime number is 2.

The factoring problem involves finding two factors X and Y of an integer Z greater than 1 such that X and Y are each greater than 1. If either X or Y are not prime, there is more than one way to factor Z into two factors.

Think of factoring as reverse multiplication.

Factoring 15 is pretty easy. We can do it in our heads: 15 = 3 * 5 .

What about factoring 391? Not so easy. Its factors are 17 and 23, both of which are prime.

What about factoring 356971? Not easy at all. Its factors are 487 and 733, both of which are prime.

While multiplying two numbers is easy, factoring numbers is hard, and no one has found a very efficient algorithm for factoring numbers. The difficulty in factoring numbers has been taken advantage of in cryptography, where the secureness of some cryptographic methods is based upon the difficulty of factoring the product of two large primes into its prime factors.

Anytime you hear someone talking about 128-bit keys, 256-bit keys, 512-bit keys, or 1024-bit keys, they are most likely talking about a cryptographic method that uses difficulty-of-factoring the basis of its security. The number of bits that they are talking about is the number of binary digits of Z, where Z is represented in base 2 and consists only of ones and zeroes. The larger the number of bits in Z, the more difficult Z is to factor (Z is never even or a multiple of 5 - its two factors are usually large and often about the same size).

RSA Labs uses this approach in its cryptographic products. Since no one knows how difficult factoring really is yet, RSA Labs has been holding factoring competitions in order to reassure its customers. For the current competition, they published a list of eight large numbers, each of which is the product of two prime numbers. Be the first to factor any of these eight numbers, and you get some money.

[Note: RSA Labs ended the competiton in 2007.]

Finding an efficient method to factor large integers would break most Internet security as well as many other encryption mechanisms.

Here's a link to the RSA Labs page on this challenge: Factoring Challenge

I've been playing with factoring for several years. I've only undertaken an aggressive research program on factoring for the past three years. I'm currently redesigning and rewriting a factoring program that implements my algorithm.

Here's the next unfactored RSA challenge number, RSA-704, [which was] worth $ 30,000 :

74037563479561712828046796097429573142593188889231
28908493623263897276503402826627689199641962511784
39958943305021275853701189680982867331732731089309
00552505116877063299072396380786710086096962537934
650563796359

Here's the largest unfactored RSA challenge number, RSA-2048, [which was] worth $ 200,000 :

25195908475657893494027183240048398571429282126204
03202777713783604366202070759555626401852588078440
69182906412495150821892985591491761845028084891200
72844992687392807287776735971418347270261896375014
97182469116507761337985909570009733045974880842840
17974291006424586918171951187461215151726546322822
16869987549182422433637259085141865462043576798423
38718477444792073993423658482382428119816381501067
48104516603773060562016196762561338441436038339044
14952634432190114657544454178424020924616515723350
77870774981712577246796292638635637328991215483143
81678998850404453640235273819513786365643912120103
97122822120720357

I'll write up and post the details of my factoring algorithm late this year [perhaps next year]. I encourage anyone who finds this problem interesting to have fun trying to crack it.

 

Background - Graphs

Get a piece of paper out and draw five dots on the page.

Connect some of the pairs of dots with a line (but at most one line per pair of dots, and no line connecting a dot with itself).

There. You've just drawn a graph.

Most of my favorite problems involve graphs.

We'll call the dots 'nodes' and the lines between dots 'edges'.

This graph has no way of identifying nodes, so it is called an 'unlabeled graph'.

Write a different number by each of the five nodes.

Now the graph is a labeled graph - each of the nodes has a label that uniquely identifies it.

There are many varieties of graphs, but this is the one that I'll be focusing on.

A strict graph has a set of nodes and a set of edges connecting pairs of nodes such that no two nodes have more than one edge between them and no edge connects a node with itself.

We'll be dealing with labeled strict graphs.

A few other terms will help us out.

The 'degree' of a node is the number of edges that touch it - that have it as an end-node.

Two nodes are 'adjacent' if there is an edge between them.

A 'clique' is a set of nodes in a graph such that every node in the clique is adjacent to (has an edge with) every other node in the clique.

You can often think of a graph as though it were a room of people, some of whom are mutual acquaintances and some of whom are mutual strangers. Each person corresponds to a node in the graph, and two people know each other if their corresponding nodes are connected by an edge; otherwise, they don't know each other.

A clique in the room analogy is a group of people all of whom know each other.

Problem - Graph Isomorphism

The Graph Isomorphism problem is very easy to describe but has resisted efficient solution for a long time.

The 'graph' part of the name refers to the thing we're playing with - labeled strict graphs.

The 'isomorphism part of the name indicates what we're trying to do. Think of isomorphism as 'same shape' or 'same structure' and you won't go too far wrong.

The Graph Isomorphism problem: Determine whether two graphs are identical except for the way in which their nodes are labeled.

Suppose that we have two graphs, Red and Blue. Graph Red is labeled with numbers, starting with '0'. Graph Blue is labeled with letters, starting with 'A', and has 26 or fewer nodes.

How can we tell whether graphs Red and Blue are the same except for labeling?

I'll say that we have a 'positive solution' if we can prove that Red and Blue are the same except for labeliing, and that we have a 'negative solution' if we can prove that they aren't.

We'll know that we have a positive solution if we can build a table listing all of the labels of nodes in the Red graph in the first column and all of the labels of nodes in the Blue graph in the second column such that there is an edge between a pair of nodes in the Red graph if and only if there is an edge between the corresponding pair of nodes in the Blue graph.

 

For example,

Red Blue
0 E
1 B
2 A
3 C
4 F
5 D

How do we find a solution, positive or negative?

First of all, the graphs had better have the same number of nodes; otherwise, there is a negative solution.

Then, they'd better have the same number of edges; otherwise, there is a negative solution.

Then, they'd better have the same number of nodes of degree X, for X having values of 0 on up; otherwise, there is a negative solution.

At this point, things get more challenging.

People have proposed all sorts of other tests to perform, but they haven't been able to come up with a set of tests that are able to settle the matter efficiently in all cases.

Of course, it is possible to test all possible labelings of the nodes in the Blue graph to see if any of the resulting graphs corresponds to the Red graph, but this approach becomes prohibitively expensive as the size of the graphs grow.

This problem has many applications in the real world, especially in determining whether two chemical structures are essentially the same.

I've been working on this problem off and on for several years and believe that I'm very close to an algorithm that efficiently tests whether two graphs are isomorphic. I'm not quite ready to publish yet - I'm still working out some details. I'm optimistic, but I've been wrong before.

 

Problem - k-Clique Exists

I think it is obvious from some recent posts in this thread that while the problems that fascinate me are easily explained (I so maintain and will demonstrate), there is a lot of theory and jargon surrounding them.

I'll endeavor to explain enough of the theory and jargon to make sense of the problems and their significance soon, but I've decided to post the problems first and wander around the forest with my machete later.

It is my honor and privilege to introduce you to my favorite problem, k-clique Exists.

I've already written a bit about graphs. One term that I'd like to introduce is 'subgraph'. Take a graph and get rid of some of its nodes. When you get rid of a node, you also have to get rid of each of the edges that connect to that node. What you've got left is a subgraph of the original graph.

Actually, there are two flavors of subgraph. One is the one that I just described - you get rid of some nodes and all of the edges that connect to those nodes, and that's all. In the other flavor of subgraph, you are free to remove as many additional edges as you wish.

Unless I say otherwise, I'll always be talking about the first flavor of subgraph. A subgraph is also a graph on its own. I'll allow the original graph to be an honorary subgraph of itself.

I said that you can often think about graph problems in terms of people in a room. A clique is a group of people who all know each other. A k-clique is a group of k people who all know each other.

In graph terms, a clique is a complete graph - it is a graph in which each node has an edge with every other node of the graph. A complete graph has every possible edge.

A k-clique is a complete graph having k nodes.

So the k-clique Exists problem comes down to this: Given a graph of N nodes, does the graph contain a k-clique?, where k is some integer in the range 2 through N.

For certain values of k, the problem is easy to answer. If the graph contains any edge, then a 2-clique exists. If the graph contains all possible edges, then an N-clique exists.

The problem becomes much more interesting when k isn't such an extreme value. For example, think about a graph containing 100 nodes - does a 37-clique exist?

One could try to exhaustively examine each subset of the 100 nodes that contains 37 nodes and check each subset to see if the subgraph induced by that subset is a 37- clique.

Unfortunately, there are 3,420,029,547,493,938,143,902,737,600 of these subsets that would have to be examined. This the number of combinations of size 37 picked from 100: 100! / ( 37! * (100 - 37)! ) . The ! operator is called 'factorial'. 5! = 5 * 4 * 3 * 2 * 1 = 120.

So much for that approach.

Seriously, think about this for a minute. You can imagine a room with 100 people in it. Would you have imagined that there are that many ways to pick 37 people from that room?

It doesn't seem right that this should be that hard.

The problem is easily described. It is not easily solved.

It turns out that k-Clique Exists is one of a special group of problems (the NP-complete problems) that all have the same worst-case degee of difficulty in being solved, but no one knows how difficult that is. I'll write more about this soon, but it is tremendously important in computer science.

If you could design an algorithm that could solve every instance of the k-clique Exists problem efficiently (in polynomial time, later discussion), you would win $1,000,000.

You'd get the same award for being the first to design an algorithm that could solve every instance of any of the other NP-complete problems in polynomial time.

You'd also get the same award for proving that no such algorithm is possible.

I'll explain more about this soon and provide a link to the competition. For now, I'm just using the competition to point out how important some of these problems are.

I said that the k-Clique Exists problem is my favorite problem. I think this is because it is damned important, it is so easily explained, and of all the NP-complete problems, it is the one that most easily fits in my head. For other NP-complete problems, I have to resort to sketching things on paper, whereas, for this problem, I can do most of it in my head.

I've lain for hours on my bed staring at the ceiling, drawing graphs with my fingers and muttering to myself. I can walk around and try to work through things while I walk. I've often woken up from a dream with a question answered or a new obstacle raised.

We've already established that I'm weird. No need to dwell on it.

Over the past 14 years, I've designed seven new algorithms for this problem. One of them is showing enough promise that I'm focusing on it exclusively for now. I will be writing up some of this work and will provide links to it as it becomes available.

For me, it is the perfect problem.

 

Problem - Constructing One or All of the Maximum Cliques of a Graph

Recall that a clique is a complete graph (containing all possible edges) or, in the people-in-a-room analogy, a group of people who all know each other.

The size of a clique is the number of nodes in that clique. A graph usually contains cliques of many different sizes - some are larger than others.

A 'maximum clique' of a graph is one of the largest cliques in the graph (there may be more than one of the same size). There are no cliques in the graph larger than a maximum clique.

The Maximum Clique problem is to produce one of the maximum cliques of a graph.

The Maximum Cliques problem is to produce all of the maximum cliques of a graph.

Problem - Constructing All of the Maximal Cliques of a Graph

A 'maximal clique' is not the same thing as a maximum clique, though they are often confused.

A clique of a graph is a complete subgraph of the graph. Since it contains all possible edges, it can be simp ly specified by a set of nodes. I'll often cheat a bit and say that a clique is a set of nodes, but it is really the complete subgraph induced by that set of nodes.

The following uses my cheat (viewing cliques as sets of nodes) - it is not quite correct but pretty clear:

A maximal clique is a clique of a graph that is not a subset of any other clique of the graph.

More correctly, but a little less clearly:

A maximal clique is a clique of a graph that is not a subgraph of any other clique of the graph.

The Maximal Cliques problem is to list each of the maximal cliques of a graph.

The output of an algorithm for this problem can be quite long.

The list of maximal cliques is one way of representing the graph. From a list of maximal cliques, you could completely reconstruct the original graph.

I'll provide an example of a solution to one of these problems once I introduce different ways of representing graphs.

 

Problem - Subgraph Isomorphism

Recall that the problem of Graph Isomorphism is to determine whether two graphs of the same size (the same number of nodes) are exactly the same except for how they are labeled.

The Subgraph Isomorphism problem is similar, but more difficult.

The Subgraph Isomorphism problem is to determine whether graph A is isomorphic to a subgraph of graph B, where graph B has more nodes than graph A.

In other words, for each of the subgraphs of graph B that are the same size as graph A, is that subgraph of B isomorphic to A?

Subgraph Isomorphism is another one of those NP-Complete problems that we'll be talking about soon.

My work on Subgraph Isomorphism is very preliminary. I have ideas on how to approach the problem, but have not developed an algorithm for it yet.

Background - Representing Graphs

There are several ways of representinging graphs, including diagrams, matrices, and lists.

In the representations we will discuss, we'll assume that there are N nodes in the graph and that these nodes are labeled with the numbers 0 through N-1.

One graph representation is the adjacency matrix, which is simply N rows of N zeroes or ones. Although the the rows columns are each labeled from 0 through N-1, the labels are usually not shown. Row 0 is the top row in the matrix.

If the intersection of row A and column B has a 1 in it, that means that A and B are adjacent, that there is an edge between nodes A and B; otherwise, there is not an edge between nodes A and B.

0 1 1 1 1 1 1 0 0 0 1 0 0 1 1
1 0 1 0 0 1 1 1 1 1 1 1 1 0 1
1 1 0 1 1 1 0 1 1 0 1 1 1 1 1
1 0 1 0 1 1 1 1 0 1 1 1 1 1 0
1 0 1 1 0 0 1 1 0 1 1 1 1 0 1
1 1 1 1 0 0 1 1 1 0 0 1 1 1 0
1 1 0 1 1 1 0 0 1 1 1 1 1 1 0
0 1 1 1 1 1 0 0 1 0 1 0 1 1 1
0 1 1 0 0 1 1 1 0 1 0 1 1 1 1
0 1 0 1 1 0 1 0 1 0 1 1 1 1 0
1 1 1 1 1 0 1 1 0 1 0 1 0 0 1
0 1 1 1 1 1 1 0 1 1 1 0 1 1 0
0 1 1 1 1 1 1 1 1 1 0 1 0 0 1
1 0 1 1 0 1 1 1 1 1 0 1 0 0 0
1 1 1 0 1 0 0 1 1 0 1 0 1 0 0

Another graph representation is the list of adjacency sets. Each row is labeled with a node label and is followed by the set of labels of nodes that are adjacent to the node having the row label. It is easier to read than the adjacency matrix but is exactly equivalent to it.

0 : { 1, 2, 3, 4, 5, 6, 10, 13, 14 }
1 : { 0, 2, 5, 6, 7, 8, 9, 10, 11, 12, 14 }
2 : { 0, 1, 3, 4, 5, 7, 8, 10, 11, 12, 13, 14 }
3 : { 0, 2, 4, 5, 6, 7, 9, 10, 11, 12, 13 }
4 : { 0, 2, 3, 6, 7, 9, 10, 11, 12, 14 }
5 : { 0, 1, 2, 3, 6, 7, 8, 11, 12, 13 }
6 : { 0, 1, 3, 4, 5, 8, 9, 10, 11, 12, 13 }
7 : { 1, 2, 3, 4, 5, 8, 10, 12, 13, 14 }
8 : { 1, 2, 5, 6, 7, 9, 11, 12, 13, 14 }
9 : { 1, 3, 4, 6, 8, 10, 11, 12, 13 }
10 : { 0, 1, 2, 3, 4, 6, 7, 9, 11, 14 }
11 : { 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 13 }
12 : { 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 14 }
13 : { 0, 2, 3, 5, 6, 7, 8, 9, 11 }
14 : { 0, 1, 2, 4, 7, 8, 10, 12 }

I usually work with a variation on the list of adjacency sets that I call the list of augmented adjacency sets. An augmented adjacency set is simply an adjacency set that also contains the label of the row. I find that this simplifies calculations. You just have to remember that a node isn't really adjacent to itself.

0 : { 0, 1, 2, 3, 4, 5, 6, 10, 13, 14 }
1 : { 0, 1, 2, 5, 6, 7, 8, 9, 10, 11, 12, 14 }
2 : { 0, 1, 2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 14 }
3 : { 0, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13 }
4 : { 0, 2, 3, 4, 6, 7, 9, 10, 11, 12, 14 }
5 : { 0, 1, 2, 3, 5, 6, 7, 8, 11, 12, 13 }
6 : { 0, 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13 }
7 : { 1, 2, 3, 4, 5, 7, 8, 10, 12, 13, 14 }
8 : { 1, 2, 5, 6, 7, 8, 9, 11, 12, 13, 14 }
9 : { 1, 3, 4, 6, 8, 9, 10, 11, 12, 13 }
10 : { 0, 1, 2, 3, 4, 6, 7, 9, 10, 11, 14 }
11 : { 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13 }
12 : { 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14 }
13 : { 0, 2, 3, 5, 6, 7, 8, 9, 11, 13 }
14 : { 0, 1, 2, 4, 7, 8, 10, 12, 14 }

Example - Maximal Cliques of a Graph

Here is an example of some of the output from my maximal cliques algorithm :

Adjacency set representation of the graph :

0 : { 1, 2, 3, 4, 5, 6, 10, 13, 14 }
1 : { 0, 2, 5, 6, 7, 8, 9, 10, 11, 12, 14 }
2 : { 0, 1, 3, 4, 5, 7, 8, 10, 11, 12, 13, 14 }
3 : { 0, 2, 4, 5, 6, 7, 9, 10, 11, 12, 13 }
4 : { 0, 2, 3, 6, 7, 9, 10, 11, 12, 14 }
5 : { 0, 1, 2, 3, 6, 7, 8, 11, 12, 13 }
6 : { 0, 1, 3, 4, 5, 8, 9, 10, 11, 12, 13 }
7 : { 1, 2, 3, 4, 5, 8, 10, 12, 13, 14 }
8 : { 1, 2, 5, 6, 7, 9, 11, 12, 13, 14 }
9 : { 1, 3, 4, 6, 8, 10, 11, 12, 13 }
10 : { 0, 1, 2, 3, 4, 6, 7, 9, 11, 14 }
11 : { 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 13 }
12 : { 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 14 }
13 : { 0, 2, 3, 5, 6, 7, 8, 9, 11 }
14 : { 0, 1, 2, 4, 7, 8, 10, 12 }

The maximal cliques of the graph are :

Maximal 6-cliques :

{ 1, 2, 5, 7, 8, 12 }
{ 1, 2, 5, 8, 11, 12 }
{ 1, 5, 6, 8, 11, 12 }
{ 1, 6, 8, 9, 11, 12 }
{ 1, 2, 7, 8, 12, 14 }
{ 3, 4, 6, 9, 10, 11 }
{ 3, 4, 6, 9, 11, 12 }

Maximal 5-cliques :

{ 0, 2, 3, 4, 10 }
{ 0, 3, 4, 6, 10 }
{ 0, 2, 3, 5, 13 }
{ 0, 3, 5, 6, 13 }
{ 0, 1, 2, 10, 14 }
{ 0, 2, 4, 10, 14 }
{ 1, 6, 9, 10, 11 }
{ 1, 2, 7, 10, 14 }
{ 2, 3, 4, 7, 10 }
{ 2, 3, 4, 10, 11 }
{ 2, 3, 4, 7, 12 }
{ 2, 3, 4, 11, 12 }
{ 2, 3, 5, 7, 12 }
{ 2, 3, 5, 11, 12 }
{ 2, 3, 5, 7, 13 }
{ 2, 5, 7, 8, 13 }
{ 2, 3, 5, 11, 13 }
{ 2, 5, 8, 11, 13 }
{ 2, 4, 7, 10, 14 }
{ 2, 4, 7, 12, 14 }
{ 3, 5, 6, 11, 12 }
{ 3, 5, 6, 11, 13 }
{ 3, 6, 9, 11, 13 }
{ 5, 6, 8, 11, 13 }
{ 6, 8, 9, 11, 13 }

Maximal 4-cliques :

{ 0, 1, 2, 5 }
{ 0, 1, 5, 6 }
{ 0, 1, 6, 10 }
{ 1, 2, 10, 11 }

There are no maximal 2-cliques or 3-cliques - all 2-cliques or 3-cliques are subsets of one of the listed maximal cliques and so are not maximal themselves.

Here are some other questions we could ask of this graph, and the answers that we should get:

Does an 8-clique exist in the graph? No

Does a 5-clique exist in the graph? Yes, and an example is { 0, 2, 3, 4, 10 }

Does a 3-clique exist in the graph? Yes, and an example is { 0, 2, 3 }

What is the size of a maximum clique of the graph? 6

What is a maximum clique of the graph? { 1, 2, 5, 7, 8, 12 }

How many maximum cliques are there in the graph? 7

What are the maximum cliques of the graph?

{ 1, 2, 5, 7, 8, 12 }
{ 1, 2, 5, 8, 11, 12 }
{ 1, 5, 6, 8, 11, 12 }
{ 1, 6, 8, 9, 11, 12 }
{ 1, 2, 7, 8, 12, 14 }
{ 3, 4, 6, 9, 10, 11 }
{ 3, 4, 6, 9, 11, 12 }

 

Problem - Universal Traversal Sequences

Imagine that you have a country that contains N cities and that there is exactly one road between each pair of cities.

People in each of these cities are stubborn and cling to their own idea of how to identify the roads that leave their city. Being contrarians (my kind of people), they won't let anyone use these labels on a map.

Fortunately for us, they have at least agreed to use the numbers in the range 0 through N-2 to label the roads (each city has a road with N-1 other cities).

If city A is connected to city B by a road, A might label the road 17 and B might label the road 38 - there is no way of predicting the label at one end of a road given the label at the other end of the road.

If a person wishes to leave a city, he is confronted by N-1 roads, labeled with 0 through N-2. The road signs don't indic ate which city is at the other end, and they all appear to leave in the same direction.

Oh yes, I almost forgot - all of the cities look alike, they don't have names, and their denizens will immediately erase any mark you might make trying to keep track of whether you've been there before.

In other words, it is entirely possible to wander around the country for the rest of your life without ever visiting the city you really want to visit.

As a frat hazing ordeal, you are dropped into one of these cities blindfolded (we'll let you have a car) and you are told you have to visit each of the cities at least once before you'll be allowed to become a member of the frat.

How do you handle this problem (assuming you really really want to be a member of the frat)?

You calculate a Universal Traversal Sequence for N cities and use it to guide you through the country.

OK, what's a Universal Traversal Sequence for N cities? Its a sequence of numbers, each of which is in the range of 0 through N-2. We can interpret each number as the number on a road sign.

How do you use a Universal Traversal Sequence (UTS)? You start in some city and leave that city by the road that has the first number in the sequence on its sign. That takes you to another city. You leave that city by the road that has the second number in the sequence on its sign. That takes you to another city (possibly the one you started in). An so on...

The goal is to have a sequence that works regardless of how each city labels its outgoing roads. A UTS may be longer than you'd think is needed, but remember that it needs to work for any mishmash of road labelings.

Here are a few Universal Traversal Sequences:

N UTS for N cities
2 0
3 0, 0, 1
3 0, 1, 1
4 0, 0, 1, 1, 2, 0
4 0, 1, 1, 1, 0, 2
4 0, 1, 2, 2, 0, 2
5 0, 1, 2, 1, 0, 1, 0 , 3, 3

It is well worth drawing three dots on a piece of paper, drawing three edges so they form a triangle, assigning outgoing labelings to each and trying each of the UTSs for 3 cities to see how they work. If you have the time and patience, try the 4’s, too.

I became interested in this problem when a teaching colleague gave a presentation on it around 1991. I haven’t done any work on it recently, but I still think it is a lovely problem.

I discovered these UTS results by brute force – nothing elegant, I’m afraid. Still, its nice to have some real sequences to play with.

 

Problem - Ramsey Theory - The Party Problem

This is another problem involving graph theory.

It comes from a branch of mathematics called Ramsey Theory, after the mathematician Frank Ramsey. Ramsey Theory has to do with limits to disorder - it deals with situations in which some order must occur.

There is a quote in a book on Ramsey Theory that captures the essense of the theory well, I think:

    "Complete disorder is impossible."

        - T.S. Motzkin

As you've probably notice, I like problems in graph theory. Ramsey Theory can be investigated through many areas of mathematics, one of which (the most natural for me) is graph theory.

The central problem in Ramsey Theory is known as the Party Problem, which I'll state as:

What is the fewest number of people who need to be in a room before you are guaranteed that either m people are mutual acquaintances or n people are mutual strangers?

We'll use the notation R(m, n) to denote the Ramsey number that is the answer to the previous question.

I like to focus on a variant of this problem in which m = n, which I'll call the Symmetric Party Problem :

What is the fewest number of people who need to be in a room before you are guaranteed that either m people are mutual acquaintances or m people are mutual strangers?

The answer to this question is R(m, m).

A great deal of work has been done on this problem, and there are many theoretical results, some quite beautiful, but few actual Ramsey numbers are known.

Let's start with the simplest cases.

What is R(2, 2)? That is, how many people need to be in a room to ensure that 2 of them are mutual acquaintences or 2 of them are mutual strangers?

That's pretty simple: 2

What is R(3, 3)? That is, how many people need to be in a room to ensure that 3 of them are mutual acquaintences or 3 of them are mutual strangers?

This one isn't as simple - the answer is 6

What is R(4, 4)? 18

What is R(5, 5)? No one knows !

At least, no one knows the exact number. Researchers have narrowed down the range that R(5, 5) has to be in the range [43 through 49]. No one has gotten closer.

Think about it. I can guarantee you that some quality researchers have sweated on this one and that a great many hours of computer computation have been burned in pursuit of it, but no one can yet answer the following question:

How many people need to be in a room to ensure that 5 of them are mutual acquaintances or 5 of them are mutual strangers?

This just floors me. It is also one of the reasons that I'm fascinated by these types of problems - it seems so wrong that this can't be figured out quickly.

Here are a few more results for symmetric Ramsey numbers:

R(6, 6) is in the range [102 through 165]

R(7, 7) is in the range [205 through 540]

R(8, 8) is in the range [282 through 1870]

R(9, 9) is in the range [565 through 6588]

R(10, 10) is in the range [580 through 12677]

R(11, 11) is in the range [1597 through 184755]

R(12, 12) is in the range [1637 through 705431]

Other Ramsey numbers, exact and otherwise, can be found at: http://mathworld.wolfram.com/RamseyNumber.html

What about graph theory? First, a new term...

An 'independent set' is like the complement of a clique: a clique is a graph that contains every possible edge, while an independent set is a graph that contains no edges. I like to call independent sets 'anticliques', but I think I'm the only person to do so.

The graph-theory version of the Party Problem is: How many nodes must a graph have before it must contain an m-clique or an n-independent set?

I find this to be a beautiful problem, but I haven't yet invested much time on it.

 
 
 
  Home > Jabberwocky           Previous      Next
 
  Craig Holman   Research   Jabberwocky   Quotations   Contact  
 
 
 
  Patterncraft™ and Constraint and Consequence™
are trademarks of Craig Stewart Holman

Copyright © 2008 Craig Stewart Holman

Copyright and Legal Notices

All Rights Reserved.

 
   
 
  Patterncraft logo