[computer-science] What's "P=NP?", and why is it such a famous question?

First, some definitions:

  • A particular problem is in P if you can compute a solution in time less than n^k for some k, where n is the size of the input. For instance, sorting can be done in n log n which is less than n^2, so sorting is polynomial time.

  • A problem is in NP if there exists a k such that there exists a solution of size at most n^k which you can verify in time at most n^k. Take 3-coloring of graphs: given a graph, a 3-coloring is a list of (vertex, color) pairs which has size O(n) and you can verify in time O(m) (or O(n^2)) whether all neighbors have different colors. So a graph is 3-colorable only if there is a short and readily verifiable solution.

An equivalent definition of NP is "problems solvable by a Nondeterministic Turing machine in Polynomial time". While that tells you where the name comes from, it doesn't give you the same intuitive feel of what NP problems are like.

Note that P is a subset of NP: if you can find a solution in polynomial time, there is a solution which can be verified in polynomial time--just check that the given solution is equal to the one you can find.

Why is the question P =? NP interesting? To answer that, one first needs to see what NP-complete problems are. Put simply,

  • A problem L is NP-complete if (1) L is in P, and (2) an algorithm which solves L can be used to solve any problem L' in NP; that is, given an instance of L' you can create an instance of L that has a solution if and only if the instance of L' has a solution. Formally speaking, every problem L' in NP is reducible to L.

Note that the instance of L must be polynomial-time computable and have polynomial size, in the size of L'; that way, solving an NP-complete problem in polynomial time gives us a polynomial time solution to all NP problems.

Here's an example: suppose we know that 3-coloring of graphs is an NP-hard problem. We want to prove that deciding the satisfiability of boolean formulas is an NP-hard problem as well.

For each vertex v, have two boolean variables v_h and v_l, and the requirement (v_h or v_l): each pair can only have the values {01, 10, 11}, which we can think of as color 1, 2 and 3.

For each edge (u, v), have the requirement that (u_h, u_l) != (v_h, v_l). That is,

not ((u_h and not u_l) and (v_h and not v_l) or ...) enumerating all the equal configurations and stipulation that neither of them are the case.

AND'ing together all these constraints gives a boolean formula which has polynomial size (O(n+m)). You can check that it takes polynomial time to compute as well: you're doing straightforward O(1) stuff per vertex and per edge.

If you can solve the boolean formula I've made, then you can also solve graph coloring: for each pair of variables v_h and v_l, let the color of v be the one matching the values of those variables. By construction of the formula, neighbors won't have equal colors.

Hence, if 3-coloring of graphs is NP-complete, so is boolean-formula-satisfiability.

We know that 3-coloring of graphs is NP-complete; however, historically we have come to know that by first showing the NP-completeness of boolean-circuit-satisfiability, and then reducing that to 3-colorability (instead of the other way around).

Examples related to computer-science

HTML5 Canvas background image What exactly does big ? notation represent? Fixed point vs Floating point number What are the differences between a program and an application? What do we mean by Byte array? How to determine the longest increasing subsequence using dynamic programming? What is "entropy and information gain"? What are the differences between NP, NP-Complete and NP-Hard? What is the difference between statically typed and dynamically typed languages? What is “2's Complement”?

Examples related to theory

while-else-loop Using IS NULL or IS NOT NULL on join conditions - Theory question Why are C++ inline functions in the header? Difference Between Cohesion and Coupling What is a database transaction? What good are SQL Server schemas? How to program a fractal? What is an NP-complete in computer science? Way to go from recursion to iteration What's "P=NP?", and why is it such a famous question?

Examples related to complexity-theory

Differences between time complexity and space complexity? Determining complexity for recursive functions (Big O notation) How to find time complexity of an algorithm How can building a heap be O(n) time complexity? HashMap get/put complexity Big-oh vs big-theta Is log(n!) = T(n·log(n))? Time complexity of accessing a Python dict What are the differences between NP, NP-Complete and NP-Hard? What's the fastest algorithm for sorting a linked list?

Examples related to np-complete

What are the differences between NP, NP-Complete and NP-Hard? What is an NP-complete in computer science? What's "P=NP?", and why is it such a famous question?

Examples related to p-np

What's "P=NP?", and why is it such a famous question?