First, some definitions:
A particular problem is in P if you can compute a solution in time less than n^k
for some k
, where n
is the size of the input. For instance, sorting can be done in n log n
which is less than n^2
, so sorting is polynomial time.
A problem is in NP if there exists a k
such that there exists a solution of size at most n^k
which you can verify in time at most n^k
. Take 3-coloring of graphs: given a graph, a 3-coloring is a list of (vertex, color) pairs which has size O(n)
and you can verify in time O(m)
(or O(n^2)
) whether all neighbors have different colors. So a graph is 3-colorable only if there is a short and readily verifiable solution.
An equivalent definition of NP is "problems solvable by a Nondeterministic Turing machine in Polynomial time". While that tells you where the name comes from, it doesn't give you the same intuitive feel of what NP problems are like.
Note that P is a subset of NP: if you can find a solution in polynomial time, there is a solution which can be verified in polynomial time--just check that the given solution is equal to the one you can find.
Why is the question P =? NP
interesting? To answer that, one first needs to see what NP-complete problems are. Put simply,
Note that the instance of L must be polynomial-time computable and have polynomial size, in the size of L'; that way, solving an NP-complete problem in polynomial time gives us a polynomial time solution to all NP problems.
Here's an example: suppose we know that 3-coloring of graphs is an NP-hard problem. We want to prove that deciding the satisfiability of boolean formulas is an NP-hard problem as well.
For each vertex v, have two boolean variables v_h and v_l, and the requirement (v_h or v_l): each pair can only have the values {01, 10, 11}, which we can think of as color 1, 2 and 3.
For each edge (u, v), have the requirement that (u_h, u_l) != (v_h, v_l). That is,
not ((u_h and not u_l) and (v_h and not v_l) or ...)
enumerating all the equal configurations and stipulation that neither of them are the case.
AND
'ing together all these constraints gives a boolean formula which has polynomial size (O(n+m)
). You can check that it takes polynomial time to compute as well: you're doing straightforward O(1)
stuff per vertex and per edge.
If you can solve the boolean formula I've made, then you can also solve graph coloring: for each pair of variables v_h and v_l, let the color of v be the one matching the values of those variables. By construction of the formula, neighbors won't have equal colors.
Hence, if 3-coloring of graphs is NP-complete, so is boolean-formula-satisfiability.
We know that 3-coloring of graphs is NP-complete; however, historically we have come to know that by first showing the NP-completeness of boolean-circuit-satisfiability, and then reducing that to 3-colorability (instead of the other way around).