Basics of NP P NPC Nphard
NP-Hard Problem:
A problem X is NP-Hard if there is an NP-Complete problem Y such that Y is
reducible to X in polynomial time. NP-Hard problems are at least as hard as
NP-Complete problems. An NP-Hard problem need not be in the class NP.
NP-Complete Problem:
A problem X is NP-Complete if X is in NP and every problem Y in NP is reducible
to X in polynomial time. NP-Complete problems are therefore the hardest problems
in NP. Equivalently, a problem is NP-Complete if it belongs to both NP and
NP-Hard. A non-deterministic Turing machine can solve an NP-Complete problem in
polynomial time.
Difference between NP-Hard and NP-Complete:
NP-Hard                                  NP-Complete
The problem does not have to be in NP.   The problem must be both in NP and NP-Hard.
P (Polynomial) problems
P problems are problems for which an algorithm exists whose running
time is bounded by a polynomial in the input size, i.e. its Big-O is a
polynomial (O(1), O(n), O(n²), etc). These problems are
considered ‘easy’ to solve, and thus do not generally
have immense run times.
Reduction
This one is easiest to explain with an example: we
have two problems, A and B, and we know problem B is a P class
problem. If every instance of problem A can be converted into an instance
of problem B whose answer gives the answer to A, and this conversion takes
a polynomial amount of time, then we say that A is reducible to B, and
A is also a P class problem.
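The idea can be sketched in Python. This is only an illustration under assumptions not in the notes: problem A is "does a list contain a duplicate value?", problem B is sorting, and the function names are hypothetical. A is answered by calling the solver for B and then doing a linear amount of extra work, so A inherits a polynomial running time from B.

```python
def solve_b_sort(items):
    """Problem B (assumed example): sorting, solvable in O(n log n) time."""
    return sorted(items)


def solve_a_has_duplicate(items):
    """Problem A (assumed example): decide whether any value appears twice.

    A is reduced to B: sort the input with the solver for B, then scan
    neighbouring pairs, which takes linear time.
    """
    ordered = solve_b_sort(items)
    # After sorting, equal values sit next to each other.
    return any(x == y for x, y in zip(ordered, ordered[1:]))
```

Because both the call to B and the extra scan are polynomial, the combined procedure for A is polynomial as well.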
NP-Hard Problems
A problem is classified as NP-Hard when an algorithm for solving
it can be translated, in polynomial time, into one that solves any NP
problem. We can then say this problem is at least as hard as any NP
problem, but it could be much harder or more complex.
NP-Complete Problems
NP-Complete problems are problems that live in both the NP and
NP-Hard classes. This means that NP-Complete problems can be
verified in polynomial time and that any NP problem can be
reduced to this problem in polynomial time.
In Computer Science, many problems are solved where the objective is to maximize or
minimize some values, whereas in other problems we try to find whether there is a
solution or not. Hence, the problems can be categorized as follows −
Optimization Problem
Optimization problems are those for which the objective is to maximize or minimize
some values. For example,
● Finding the minimum number of colors needed to color a given graph.
● Finding the shortest path between two vertices in a graph.
Decision Problem
There are many problems for which the answer is a Yes or a No. These types of
problems are known as decision problems. For example,
● Whether a given graph can be colored by only 4-colors.
● Finding a Hamiltonian cycle in a graph is not a decision problem, whereas
checking whether a graph is Hamiltonian is a decision problem.
What is Language?
Every decision problem can have only two answers, yes or no. Hence, an input
belongs to the language of a decision problem if the answer for that input is ‘yes’. A
language is the totality of inputs for which the answer is Yes. Most of the algorithms
discussed in the previous chapters are polynomial time algorithms.
For input size n, if the worst-case time complexity of an algorithm is O(n^k), where k is a
constant, the algorithm is a polynomial time algorithm.
Algorithms such as Matrix Chain Multiplication, Single Source Shortest Path, All Pair
Shortest Path, Minimum Spanning Tree, etc. run in polynomial time. However, there are
many problems, such as traveling salesperson, optimal graph coloring, Hamiltonian
cycles, finding the longest path in a graph, and satisfying a Boolean formula, for which
no polynomial time algorithm is known. These problems belong to an interesting class
of problems, called the NP-Complete problems, for which it is unknown whether a
polynomial time algorithm exists.
In this context, we can categorize the problems as follows −
P-Class
The class P consists of those problems that are solvable in polynomial time, i.e. these
problems can be solved in time O(n^k) in the worst case, where k is a constant.
These problems are called tractable, while others are called intractable or
superpolynomial.
Formally, an algorithm is a polynomial time algorithm if there exists a
polynomial p(n) such that the algorithm can solve any instance of size n in
time O(p(n)).
Problems requiring Ω(n^50) time to solve are essentially intractable for large n. Most
known polynomial time algorithms run in time O(n^k) for a fairly low value of k.
The advantage of considering the class of polynomial-time algorithms is that all
reasonable deterministic single-processor models of computation can be simulated
on each other with at most a polynomial slow-down.
NP-Class
The class NP consists of those problems that are verifiable in polynomial time. NP is
the class of decision problems for which it is easy to check the correctness of a
claimed answer, with the aid of a little extra information. Hence, we aren’t asking for a
way to find a solution, but only to verify that an alleged solution really is correct.
Every problem in this class can be solved in exponential time using exhaustive search.
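The gap between checking and finding can be illustrated with a small Python sketch. It uses graph k-coloring as the example problem; the function names and the representation (vertices numbered 0..n-1, edges as pairs) are assumptions made for this illustration. The verifier runs in polynomial time, while the exhaustive search tries up to k^n candidate colorings.

```python
from itertools import product


def verify_coloring(num_vertices, edges, k, coloring):
    """Polynomial-time check of a claimed k-coloring (the 'extra information')."""
    if len(coloring) != num_vertices:
        return False
    if any(c not in range(k) for c in coloring):
        return False
    # Every edge must join two differently coloured vertices.
    return all(coloring[u] != coloring[v] for u, v in edges)


def is_k_colorable(num_vertices, edges, k):
    """Exhaustive search: tries all k**num_vertices colorings."""
    return any(
        verify_coloring(num_vertices, edges, k, candidate)
        for candidate in product(range(k), repeat=num_vertices)
    )


triangle = [(0, 1), (1, 2), (0, 2)]
print(is_k_colorable(3, triangle, 2))  # a triangle cannot be 2-colored
print(is_k_colorable(3, triangle, 3))  # but it can be 3-colored
```

The search is only practical for very small graphs; the verifier stays cheap no matter how the coloring was obtained.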
P versus NP
Every decision problem that is solvable by a deterministic polynomial time algorithm is
also solvable by a polynomial time non-deterministic algorithm.
All problems in P can be solved with polynomial time algorithms, whereas the problems
in NP - P, if any exist, are intractable.
It is not known whether P = NP. However, many problems are known in NP with the
property that if they belong to P, then it can be proved that P = NP.
If P ≠ NP, there are problems in NP that are neither in P nor in NP-Complete.
The problem belongs to class P if it’s easy to find a solution for the problem. The
problem belongs to NP, if it’s easy to check a solution that may have been very tedious
to find.
Definition of NP-Completeness
A language B is NP-complete if it satisfies two conditions
● B is in NP
● Every A in NP is polynomial time reducible to B.
If a language B satisfies the second property, but not necessarily the first one, it
is known as NP-Hard. Informally, a search problem B is NP-Hard if there
exists some NP-Complete problem A that Turing reduces to B.
A problem in NP-Hard cannot be solved in polynomial time unless P = NP. If a
problem is proved to be NPC, an efficient exact algorithm for it is unlikely to be
found. Instead, we can focus on designing approximation algorithms.
NP-Complete Problems
Following are some NP-Complete problems, for which no polynomial time algorithm is
known.
NP-Hard Problems
The following problems are NP-Hard
TSP is NP-Complete
The traveling salesman problem consists of a salesman and a set of cities. The
salesman has to visit each one of the cities starting from a certain one and returning to
the same city. The challenge of the problem is that the traveling salesman wants to
minimize the total length of the trip.
Proof
To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. We use
the decision version of TSP: given a bound k, is there a tour of total cost at most k?
Given a claimed tour as a certificate, we check that the tour contains each vertex
exactly once. Then the total cost of the edges of the tour is calculated. Finally, we
check that the cost is at most k. This can be completed in polynomial time. Thus TSP
belongs to NP.
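The membership check can be sketched as a short Python verifier for the decision version of TSP (is there a tour of cost at most k?). The function name, the cost-matrix representation, and the sample numbers are assumptions made for this illustration.

```python
def verify_tsp_certificate(cost, tour, k):
    """Check a claimed tour in polynomial time.

    cost is an n x n matrix of edge costs, tour is a list of vertex
    indices, and k is the cost bound of the decision problem.
    """
    n = len(cost)
    # The tour must contain each vertex exactly once.
    if sorted(tour) != list(range(n)):
        return False
    # Sum the edges along the tour, including the edge back to the start.
    total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
    return total <= k


sample_cost = [
    [0, 1, 4, 2],
    [1, 0, 1, 5],
    [4, 1, 0, 1],
    [2, 5, 1, 0],
]
print(verify_tsp_certificate(sample_cost, [0, 1, 2, 3], 5))
```

The verifier does one sort and one pass over the tour, so its running time is polynomial in the number of cities.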
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show
that Hamiltonian cycle ≤p TSP (as we know that the Hamiltonian cycle problem is
NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
Hence, an instance of TSP is constructed. We create the complete graph G' = (V, E'),
where
$$E' = \{(i, j) : i, j \in V \text{ and } i \neq j\}$$
Thus, the cost function is defined as follows −
$$t(i, j) = \begin{cases} 0 & \text{if } (i, j) \in E \\ 1 & \text{otherwise} \end{cases}$$
Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each
edge in h is 0 in G' as each edge belongs to E. Therefore, h has a cost of 0 in G'. Thus,
if graph G has a Hamiltonian cycle, then graph G' has a tour of 0 cost.
Conversely, we assume that G' has a tour h' of cost at most 0. The cost of each edge
in E' is either 0 or 1 by definition. Hence, each edge of h' must have a cost of 0, as
the cost of h' is 0. We therefore conclude that h' contains only edges in E.
We have thus proven that G has a Hamiltonian cycle if and only if G' has a tour of cost
at most 0. Since G' and t can be built in polynomial time, TSP is NP-hard, and together
with TSP being in NP, TSP is NP-complete.
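The construction in this proof can be sketched in Python. The function names and the graph representation (vertices numbered 0..n-1, undirected edges as pairs, at least 3 vertices) are assumptions made for this illustration, and the brute-force tour search is included only so the sketch can be run on tiny graphs; it is not part of the reduction.

```python
from itertools import permutations


def hamiltonian_cycle_to_tsp(num_vertices, edges):
    """Build the TSP instance (cost matrix, bound) from a graph G = (V, E).

    Edges of G get cost 0, all other pairs get cost 1, and the bound is 0.
    The construction fills an n x n matrix, so it takes polynomial time.
    """
    edge_set = {frozenset(e) for e in edges}
    cost = [[0] * num_vertices for _ in range(num_vertices)]
    for i in range(num_vertices):
        for j in range(num_vertices):
            if i != j and frozenset((i, j)) not in edge_set:
                cost[i][j] = 1
    return cost, 0


def has_tour_within(cost, bound):
    """Brute-force TSP decision (exponential; for tiny examples only)."""
    n = len(cost)
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        total = sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if total <= bound:
            return True
    return False


square = [(0, 1), (1, 2), (2, 3), (3, 0)]  # has a Hamiltonian cycle
star = [(0, 1), (0, 2), (0, 3)]            # has none
print(has_tour_within(*hamiltonian_cycle_to_tsp(4, square)))
print(has_tour_within(*hamiltonian_cycle_to_tsp(4, star)))
```

On these two graphs the TSP answer matches whether a Hamiltonian cycle exists, which is the "if and only if" the proof establishes.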
P, NP, NP-Complete and NP-Hard Problems in Computer Science
1. Introduction
In computer science, there exist several famous unresolved problems, and P = NP is one of the
most studied ones. Until now, the answer to that problem is mainly “no”. And, this is accepted by
the majority of the academic world. We probably wonder why this problem is still not resolved.
In this tutorial, we explain the details of this academic problem. Moreover, we also show
both P and NP problems. Then, we also add definitions of NP-Complete and NP-Hard. And in the
end, hopefully, we would have a better understanding of why P = NP is still an open problem.
2. Classification
To explain P, NP, and others, let’s use the same mindset that we use to classify problems in real
life. While we could use a wide range of terms to classify problems, in most cases we use an
“Easy-to-Hard” scale.
Now, in theoretical computer science, the classification and complexity of common problem
definitions have two major sets: P, which is “Polynomial” time, and NP, which is
“Non-deterministic Polynomial” time. There are also the NP-Hard and NP-Complete sets, which we
use to express more sophisticated problems. In the case of rating from easy to hard, we might
label these as “easy”, “medium”, “hard”, and finally “hardest”:
● Easy: P
● Medium: NP
● Hard: NP-Complete
● Hardest: NP-Hard
In the usual diagram of these sets, we assume that P and NP are not the same set, or, in other
words, we assume that P ≠ NP. This is our apparently-true, but yet-unproven assertion. Of course,
another interesting aspect of this diagram is that we’ve got some overlap between NP and NP-Hard.
We call a problem NP-Complete when it belongs to both of these sets.
Alright, so, we’ve mapped P, NP, NP-Complete, and NP-Hard to “easy”, “medium”, “hard” and
“hardest”, but how do we place a given algorithm in each category? For that, we’ll need to get a
bit more formal through the next section.
Through the rest of the article, we generally prefer not to use units like “seconds” or
“milliseconds”. Instead, we prefer expressions that are proportional to the input size n, using
Big-O notation. Those mathematical expressions give us a clue about the algorithmic complexity of
a problem.
3. Problem Definitions
Let’s quickly review some common Big-O values:
● O(1) – constant-time
● O(log n) – logarithmic-time
● O(n) – linear-time
● O(n²) – quadratic-time
● O(n^k) – polynomial-time
● O(k^n) – exponential-time
● O(n!) – factorial-time
where k is a constant and n is the input size. The size of n also depends on the problem
definition. For example, using a number set with a size of n, the search problem has an average
complexity between linear-time and logarithmic-time depending on the data structure in use.
As we talked about earlier, polynomial-time algorithms have a complexity of O(n^k) for some
constant k, and that fact places them all in P. Of course, we don’t always have just one input, n.
But, so long as each input is a polynomial, multiplying them will still be a polynomial. For
example, in graphs, we use E for edges and V for vertices, which gives us O(V · E) for
Bellman-Ford’s shortest path algorithm. Even if the size of the edge set is V², the time
complexity is still a polynomial, O(V³), so we’re still in P.
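To see where the product of the two input sizes comes from, here is a minimal Bellman-Ford sketch in Python. The function name and the edge-list representation (triples of source, target, weight, with vertices numbered from 0) are assumptions made for this illustration. The outer loop runs one fewer time than the number of vertices and the inner loop visits every edge, which gives the polynomial bound.

```python
def bellman_ford(num_vertices, edges, source):
    """Single-source shortest paths; edges is a list of (u, v, weight)."""
    inf = float("inf")
    dist = [inf] * num_vertices
    dist[source] = 0
    # Relax every edge (number of vertices - 1) times.
    for _ in range(num_vertices - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One more pass: any further improvement means a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist


sample_edges = [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 1)]
print(bellman_ford(4, sample_edges, 0))  # [0, 3, 1, 4]
```

With a dense graph the edge list grows quadratically in the number of vertices, so the same two loops give a cubic bound, which is still polynomial.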
We can’t always pinpoint the Big-O for an algorithm. Outside of Big-O, we can think about the
problem description. Consider, for example, the game of checkers. What is the complexity of
determining the optimal move on a given turn? If we constrain the size of the board to 8 × 8,
then this is believed to be a polynomial-time problem, placing it in P. But if we say it’s an
n × n board, it’s no longer in P. In this case, how we constrain the search space affects where
we place it. Similarly, the Hamiltonian-Path problem has polynomial-time solutions for only some
types of input graphs.
Another example is the stable roommate problem; it’s polynomial-time to match without a tie,
but not when ties are allowed or when we include roommate preferences like married couples.
(These variants are actually NP-Complete, which we’ll cover in a moment.) Still another factor to
consider is the size of k relative to n. If the input size n is going to be near k, then O(n^k)
is going to behave more like an exponential.
3.2. NP Algorithms
The second set of problems is not known to be solvable in polynomial time. However, their
solutions can be verified (or certified) in polynomial time. We expect the known algorithms for
these problems to have an exponential complexity such as O(k^n), where k > 1 is a constant and n
is the input size. There are several problems that fit this description. Among them are:
● Integer Factorization, and
● Graph Isomorphism
Both of these have two important characteristics: no polynomial-time algorithm is known for
them, and their results can be verified in polynomial time. The second fact is what places them
in NP, that is, the set of “Non-deterministic Polynomial” algorithms. Now, formally, we also
state that these problems must be decision problems (have a yes or no answer), though note that
practically speaking, all function problems can be transformed into decision problems. This
distinction helps us to nail down what we mean by “verified”.
To speak precisely, then, a decision problem is in NP if a proposed solution to any of its
“yes” instances can be verified in polynomial time by a “Deterministic Turing Machine”. Note
that this definition does not require the problem to be hard: every problem in P is also in NP.
What makes Integer Factorization and Graph Isomorphism interesting is that while they are in NP,
there’s no proof of whether they are in P or in NP-Complete. Normally, all NP-Complete algorithms
are in NP, but they have another property that makes them more complex compared to other NP
problems. Let’s continue with that difference in the next section.
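What “verified” means for Integer Factorization can be sketched in Python. The sketch assumes one common way to phrase the decision version (does n have a non-trivial factor no larger than a given bound?); the function names are hypothetical. Checking a claimed factor is a single division, while the naive search below may try a number of candidates that is exponential in the number of digits of n.

```python
def verify_factor(n, bound, d):
    """Polynomial-time check that d certifies a 'yes' answer."""
    return 1 < d <= bound and d < n and n % d == 0


def find_factor_by_trial_division(n, bound):
    """Naive search for a certificate; returns None if there is none."""
    for d in range(2, min(bound, n - 1) + 1):
        if n % d == 0:
            return d
    return None


print(verify_factor(91, 10, 7))               # 91 = 7 * 13
print(find_factor_by_trial_division(97, 96))  # 97 is prime
```

Faster factoring methods than trial division exist, but none of the known ones runs in polynomial time on a classical computer.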
Well-known problems in this next set include:
● Traveling Salesman
● Knapsack, and
● Graph Coloring
Curiously, what they have in common, aside from being in NP, is that each can be reduced into
the other in polynomial time. These facts together place them in NP-Complete. The major and
primary work on NP-Complete problems belongs to Karp, and his problems are fundamental to this
topic in theoretical computer science. These works are founded on the Cook-Levin theorem, which
proves that the Satisfiability (SAT) problem is NP-Complete.
The last set of problems includes examples such as:
● K-means Clustering
● Traveling Salesman Problem, and
● Graph Coloring
These algorithms have a property similar to ones in NP-Complete: any problem in NP can be
reduced to them. Because of that, these are in NP-Hard and are at least as hard as any other
problem in NP. A problem can be both in NP and NP-Hard, which is another aspect of being
NP-Complete.
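This characteristic has led to a debate about whether or not Traveling Salesman is indeed
NP-Complete. Since NP and NP-Complete problems can be verified in polynomial time, an NP-Hard
problem whose solutions are not known to be verifiable in polynomial time, such as the
optimization version of Traveling Salesman, where we would have to confirm that no shorter tour
exists, is placed in NP-Hard but not in NP-Complete.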
The open question is whether every problem in NP also belongs to P, that is, whether P = NP.
It’s an interesting problem because P = NP would mean, for one, that any NP or NP-Complete
problem can be solved in polynomial time. So far, proving that P ≠ NP has proven elusive.
Because of the intrigue of this problem, it’s one of the Millennium Prize Problems for which
there is a $1,000,000 prize.
For our definitions, we assumed that P ≠ NP; however, P = NP may be possible. If it were so,
aside from NP or NP-Complete problems being solvable in polynomial time, certain algorithms in
NP-Hard would also dramatically simplify. For example, if their verifier is NP or NP-Complete,
then it follows that they must also be solvable in polynomial time, moving them into P as well.
We can conclude that P = NP would mean a radical change in computer science and even in
real-world scenarios. Currently, some security algorithms rely on certain calculations taking too
long to be practical. Many encryption schemes and algorithms in cryptography are based on
integer factorization, for which the best-known algorithms take super-polynomial time. If we
find a polynomial-time algorithm, these schemes become vulnerable to attacks.
5. Conclusion
Within this article, we had an introduction to a famous problem in computer science. Through
the article, we focused on the different problem sets: P, NP, NP-Complete, and NP-Hard. We also
provided a good starting point for future studies and what-if scenarios when P = NP. Briefly,
after reading, we can conclude with the generalized classification used throughout: P as “easy”,
NP as “medium”, NP-Complete as “hard”, and NP-Hard as “hardest”.
As a final note, if P = NP is proven in the future, humankind has to construct a new approach to
the security aspects of the computer era. When this happens, there has to be another complexity
level to identify new hardness levels beyond the ones we have currently.