![]() |
|
| | |
| Craig Holman Research Jabberwocky Quotations Contact | |
| Home >
Jabberwocky      |
|
Jabberwocky for May 27, 2009 |
|
Problem-Solving Services - An Overview     Introduction Many problems that arise in industry and research are challenging to solve correctly, on schedule, and within budget. Reasons for these challenges include the complexity of the problem, the complexity of an algorithm, the availability of several algorithms for solving instances of a problem, and the rate of innovation and improvement in the development of algorithms. Many problems are NP-complete and are thought to present any algorithm with instances that require exponential time to solve, even though most instances may be solved inexpensively. Other problems (e.g. graph isomorphism, factoring) have instances that are expensive to solve using currently available algorithms. Before undertaking the solution of a problem instance, estimation of the resources that may be required and the risk of the solution effort failing may be desired. There are often several parameterized algorithms that may be used in attempts to solve a problem instance. Efficient analysis of the instance may permit the selection of those algorithms and parameters that are most likely to be effective in the solution effort. Some algorithms are complex enough to be difficult to implement correctly and efficiently, and many algorithms are difficult to implement so that they can handle large instances of problems. Adapting algorithms for parallel or distributed processing, so that they can take advantage of frequent improvements in processing platforms, requires expertise in design, implementation, and testing. Few development shops and few library providers are able to provide certifiable implementations of algorithms and few if any are able to provide certifiable implementations of state-of-the-art algorithms. Few are able to make use of algorithms that evaluate a problem instance and dynamically plan for its efficient solution, making use of the best tools available. Few make use of prior experience to inform and tune the development of solution plans. I suggest that offering a problem-solving service is the best means of making problem-solving software for challenging problems available and propose that the design and implementation of such a service be undertaken by a company adequate to the task. I believe that the potential advantages of such a service to both the service provider and its clients are great and that it would be profitable and beneficial to all concerned if well executed. The customer's point of view At its simplest, a client sends to the service an instance of a problem, any parameters or constraints, and a description of the information that it wants returned. The service sends back a proposed budget for the solution effort and a final solution request / budget is negotiated. The service attempts the solution and returns the requested information, including solution(s) if found, to the client. In more detail, here is how a successful solution effort might proceed as seen by a client:
A more complete version of this scenario will be provided later, including some high-level descriptions of what the service does. Which problems are appropriate? Not every problem is appropriate for sending to a problem-solving service for solution. Problems that may not be appropriate for solution by a problem-solving service have one or more of the following characteristics:
The challenges of solving instances of challenging problems The central problem is this: Once we have more than one way of attacking a problem, how do we know which one to use? Let's assume that we have two algorithms, A and B, for solving problem P. Neither is clearly more efficient than the other in general - each is expensive for some problem instances and inexpensive for others. Our goal, then, is to minimize the cost of solving a sequence of problems on an engine. Which algorithm, A or B, should we pick to solve instance I of problem P? It really depends upon what we've learned about the behavior of A and B and what a characterization of I suggests in light of what we know about A and B. We'll work with a problem-instance space - a space of one or more dimensions, each of which is a metric of a problem instance. Each problem instance corresponds to a point in this space. We'll be looking into identifying regions of this space that are 'safe' and ones that are 'risky'. - regions in which all instances are easy for algorithm X and regions in which some instances that are difficult for X are found. Consider boolean expressions and the problem of determining whether a boolean expression is satisfiable (has an interpretation under which it evalutes to True). There are many obvious metrics of boolean expressions (e.g. the number of literals, variables, negators in the Negative Normal Form), several that are less obvious (e.g. the number of clauses in the CNF and DNF equivalents of the expression) and some that I've made up (e.g. the degree of coincidence). Which of these descriptive metrics are good at predicting a property that we're interested in?
An example from boolean expressions
This example is borrowed from my May 31, 2008 post:
I found that a metric that I called 'degree of coincidence', based upon an aspect of the structure of the expression, was a great predictor for whether expressions were tautologies or contradictions. The Java applet (hopefully) running above displays four pieces of information about several hundred randomly generated boolean expressions. The color of each dot is the expression's logical class: tautologies, contradictions, and expressions that are both satisfiable and falsifiable. This is a static view of similar data:
In this view, the horizontal axis is the log10 of the number of clauses in the Conjunctive Normal Form (CNF) equivalent of the expression and the vertical axis is the log10 of the number of clauses in the Disjunctive Normal Form (DNF) equivalent of the expression. The third spatial dimension, projecting out from the screen for the static view, is the Degree of Coincidence for the expression. The important thing to note is how well the Degree of Coincidence separates tautologies from contradictions. This instance space has four dimensions, one for each of the four boolean expression metrics. The problem in this case is to determine the logical class of an expression and three other metrics were chosen as being possible predictive metrics of logical class. A set of metrics doesn't have to be a good predictor in all regions of an instance space to be useful. Other sets of metrics may be better predictors in other regions of the space. Even if some regions of the the space have no good predictors of a property, we'll benefit from the existence of the ones that do. I have often fallen into the trap of searching the perfect, golden algorithm for a challenging problem. To be sure, that remains my ultimate goal, but I've begun to figure out that judging an algorithm on that basis is misguided and disheartening. An algorithm is designed to solve a problem. The algorithm will usually work better for some instances of that problem than others. Some instances may be solvable more efficiently than others, while other instances may not be solvable at all by the algorithm. Many problems that interest me are considered very challenging because they belong to a complexity class called 'NP-complete'. No one knows whether the NP-complete problems have polynomial or exponential time complexity, but we do know that if any of these problems is polynomial, then all of them are polynomial, and that if any of these problems is exponential, then all of them are exponential. Determining and proving the complexity of this class of problems is one of the great problems in mathematics and computer science. We can construct a problem-instance space (of one or more dimensions) that contains all instances of the problem as points. For simplicity, we'll assume for the moment that this space consists of three dimensions, two of which are spatial. The third dimension is a color that reflects the time required to solve the instance. Let tough problem instances (can't be solved within a reasonable time) be colored red and the other instances be colored blue. Our goal is to come up with an algorithm for which the instance space is solid blue. A problem-instance space for Algorithm A and Algorithm B might look like this:
while the problem-instance space to the right shows what happens when we use whichever algorithm is most efficient for each instance. If we follow the simple rule that we should use Algorithm A for any instance in the top half of the space and Algorithm B for any instance in the bottom half of the space, we are guaranteed to use an efficient algorithm on each instance. The world is seldom so kind or simple, but many algorithms can make performance guarantees for regions of a specified problem-instance space. By developing several very different algorithms for solving the same problem and determining the regions of the space for which they are effective, we may be able to greatly reduce, if not eliminate, the region in which red points can appear for an algorithm that makes uses the best algorithm for each instance. In the quest for finding polynomial-time worst-case algorithms for some challenging problems, I have taken to referring to the goal of finding such a patchwork algorithm as the goal of finding 'polynomial coverage'. Similarly for 'tractable coverage'. In this way, algorithms that fail to provide the desired coverage throughout a problem-instance space, but which are able to guarantee the desired coverage for some region of that space, may prove to be quite valuable as part of a patchwork algorithm. An approach involving metrics and algorithms We're faced with the problem of using a collection A of algorithms and a collection B of bound algorithms (algorithms in A bound to a sequence of non-instance parameters for each algorithm, if any) to efficiently and effectively solve instances of a problem P. P and B have the following characteristics:
We need to figure out how use information derived from an instance I of P to solve I as efficiently as we can and, if I is very likely to be intractable to algorithms in B, to determine this and report failure as quickly as possible. We can usually derive many metrics (MI) from an instance of a problem. We may derive additional metrics (MI,b) when we apply an algorithm b in B to the instance, including the amount of work (W) that was required to find a solution, if one was found within a specified work limit. Some metrics in MI may be good predictors of some metrics in MI,b and of W, in particular. They may not be uniformly good predictors, however - they may only be good predictors for some regions of a problem instance space. We often have no idea, however, of which metrics may be good predictors of metrics of interest and the ranges of a problem-instance space for which they are useful predictors. The approach that I propose is:
Most of this problem-solving approach can be fully automated today, and the devising and gathering of new metrics should be automatable in the future. Monte Carlo studies with comprehensive algorithm sampling is time consuming and expensive, but it needs to be done once well and can be augmented with data from the processing of real problem instances and their off-line exploration in the future. Once we have a sufficient understanding of our metric database, we can devise branching plans of attack for problem instances based upon their characteristics. We'll want to use inexpensive metrics on our instance that will begin to move us down the branches of this plan tree, reserving the more expensive metrics for when they are needed, if at all. Next, we take a problem instance that we've been asked to solve, we can begin to follow the plan tree, making measurements of metrics and branching as guided by the plan and its underlying statistical database. Suppose that algorithms C and D each take one real parameter in the range [0.0 .. 1.0]. Assuming that parallel processing is not available to us, we might, guided by our statistical database, execute a sequence of actions as follows for instance I:
The Adaptive Market pattern I've frequently encountered the need to manage the use of available tools to perform a task effectively. I've generalized this into the Adaptive Market design pattern. This pattern regards a community of agents as a free market in which those offering services and those desiring services are free to enter into contracts for the provision of services and for compensation for having done so. The market is adaptive in that many properties of the participants will change over time, including capabilities, capacities, resources, standards, reputation, metrics, tactics, and strategies. Adaptive markets may be nested as deeply as necessary. The knowledge that the many participants in an adaptive market possess is localized and may be largely derived from experience. Implementations of adaptive markets for use in a problem-solving service are likely to be based upon a dynamic statistical database that may be initialized and periodically enriched by significant Monte Carlo studies. A typical sequence of interactions with a market includes putting a task up for bid, evaluating submissions, selecting a contractor, negotiating a contract, executing the contract, and evaluating performance. Problem-solving engines A problem-solving engine is a software component that:
Knowledge about an instance and solution efforts for that instance can be obtained from a blackboard, which should be updated by the engine at the end of its work on the instance or upon suspension. Information from a blackboard may obviate some of the work that the engine would normally perform. Knowledge from the blackboard may be used by adaptive markets to update the statistical and other models maintained at various levels of the system, including the problem-solving engines. More than one occurrence of a problem-solving engine may be running at a time, each of which is tasked with work on a different instance of the problem. Transforming instances between problems All problems in the NP-Complete complexity class have the following characteristics:
It isn't that we transform one problem into another but that we transform an instance of one problem into an instance of another problem, solve that other-problem instance, and transform the solution back into the original problem context. For example, suppose that we have a boolean expression whose satisfiability we want to determine but our program for determining boolean satisfiability is weak or missing. We do, however, have a very good program for solving instances of the k-Clique Exists problem. We can transform the boolean satisfiability problem instance into a k-clique exists problem instance, solve it, and port the solution back into the boolean satisfiabity realm. There are a great many NP-complete problems, including some of my favorites (e.g. Boolean Satisfiability, k-Clique Exists), as well as classics such as the Traveling Salesman problem, Bin Packing, Multiprocessor Scheduling, Timetable Design, and Clustering. Transformation of instances between problems isn't restricted to the NP-complete problems, but it is especially useful for them, for polynomial-time transformability is guaranteed. An instance-transformation engine is a software component that:
An instance-transformation engine may, but is not required to:
Problem graphs A problem graph consists of nodes, each of which represents a different problem, and directed edges between nodes, each of which represents a different transformation of instances of one problem to another. Associated with each node is an adaptive market for solvers of the node's problem, either directly through solution engines at this node or indirectly through transformers and solution engines at other nodes. Associated with each directed edge is an adaptive market for solvers of the source node's problem that includes the cost of transformation. Modification of this graph is straightforward. Due to the localization of knowledge in the graph, nodes, edges, solution engines, and transformation engines may easily be added, removed, or modified. A problem-solving system I'd like to have a problem-solving system that can solve any of the NP-complete problems, as well as many other challenging problems. This system would offer a catalog of problems that it can solve, although it may not have a good engine for many problems. The heart of the system is a problem graph that contains a node for each problem in the catalog. This system would accept a problem instance to be solved, put the solution effort up for bid in the problem's node's adaptive market, choose the best bidder, negotiate a budget with the client, and, if approved, award the effort to the best bidder, returning the results and any solutions to the client. Using a problem-solving service - an example
Some challenging questions Imagine a company that develops and sells access to software that solves challenging problems for customers. It offers a catalog of problems that its software can solve. What are some of the questions that need to be answered satisfactorily before a potential customer decides to use this software?
Some objectives of commercialization There are many objectives that must be considered when designing a company that would provide problem-solving services. I'll focus on a few objectives that I consider fundamental, that address some of the questions that I have raised, and that support a decision to offer software as a service:
Some answered questions Here are some answers to questions that potential customers should ask of a provider of problem-solving services:
Conclusion As software-as-a-service becomes a viable means of providing access to software, providing access to algorithms and problem-solving services through the internet emerges as a promising business opportunity. Many commonly occurring problems present enough challenging instances and financial risk to benefit from the expertise, resources, and innovation available from a premier problem-solving service provider. The approach to a problem-solving service that was sketched in this document, including the Adaptive Market pattern and a problem graph that hosts problem-solving engines and problem-instance-transforming engines, offers a coherent and comprehensive framework could be the basis of a commercially successful enterprise. The path to launching a problem-solving service includes planning, acquiring competence, achieving performance and availability, testing sufficiently to inspire confidence, gaining trust, learning from experience, and ongoing research and innovation.      |
| Home >
Jabberwocky      |
|
| Craig Holman Research Jabberwocky Quotations Contact | |
| Patterncraft and Constraint and Consequence are trademarks of Craig Stewart Holman Copyright © 2009 Craig Stewart Holman All Rights Reserved. |
|
| | |
![]() |
|