What's the definition of NP mean in Complexity and Computability? - time-complexity

We're learning Complexity in our CS class and one of the topics is NP and P complexity. I know that NP stands for Non-deterministic Polynomial time and P stands for Polynomial time. Now I'm trying to grasp what these mean.
What does "Non-deterministic Polynomial" mean? I'm pretty sure that it relates to the run-time of a problem but I'm not a 100% sure. I've reads Wikipedia articles and some other blogs but they're explanation is at a higher level than what we did, so if the answer can be "dumbed down" a little bit.

Dumbed down version: If you have both a problem and a solution and you can check if the solution is right in polynomial time, it's in NP. If you have only a problem and you can find the right solution in polynomial time, it's in P. P is a subset of NP.
If you really want to understand what it means, you need some basics in automata theory and find out what deterministic and non-deterministic Turing machines are. That's not a thing someone can really dumb down for you easily.

Related

Time complexity of log

I was studying for runtime analysis.
One of the question that I encountered is this.
Which function would grow the slowest?
n^0.5
log(n^0.5)
(log(n))^0.5
log(n) * log(n)
I thought the answer was 2 after I draw the graph.
However, the answer is 3. I am not sure why this is the answer.
Could someone explain why this is the case?
Thank you
2.log(n^0.5)=0.5log(n)=log(n)
3.(log(n))^0.5= sqrt(log(n)) which is smaller in magnitude than 2.
I have explained in very layman terms in the picture below with a little mathematics.
Tell me if you still have confusion.
PS: Both are log so I don't think we worry about time complexity of such functions in this modern age.

Complexity of Integer vs. Binary Constraints in CPLEX

Recently, I have been trying to learn a bit about CPLEX and was hoping someone could help me understand the complexity when solving for integer vs. binary constraints.
For example, say we are trying to allocate a pie around 10 people for maximum utility, where each person has a utility that is linear with the amount of pie they receive. However, we want to introduce the constraint that at least 3 people have to get a bit of pie.
What's the difference between thinking of this as a single integer constraint (number_of_people_with_pie >= 3) vs. 10 binary variables (person_1_has_pie + person_2_has_pie + ... person_10_has_pie >= 3)? I would imagine the former is simplest but wonder if there is any benefits to forming the problem in terms of binary variables?
In addition to this, any recommended reading for better understanding MIP and CPLEX would be greatly appreciated, especially in better understanding where the problem becomes NP or in what situations simplex struggles to find the global minima.
Thanks!
I agree with Alex and Erwin's comment that this really depends on what you want to model. For this particular model I disagree with Alex: to me it makes more sense to use one decision variable per person, otherwise it may become hard to figure out which person gets how much of the pie.
A problem becomes NP hard as soon as you add integrality or SOS constraints. A good reading for MIP in general is Alex Schrijver's "Theory of Integer and Linear Programming". That should cover all the topics you need for an in-depth understanding of things.
It really depends on the case but in yours I would use 1 decision variable rather than 10.
Sometimes, that's not obvious and trying and measuring can prove oneself right or wrong. And that's one of the reason why using high modeling languages can help. (Abstract modeling languages such as OPL)
I recommend a MOOC on cognitive class : https://cognitiveclass.ai/courses/mathematical-optimization-for-business-problems/
and the OPL language manual : https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.studio.help/pdf/opl_languser.pdf

Confusion about NP-hard and NP-Complete in Traveling Salesman problems

Traveling Salesman Optimization(TSP-OPT) is a NP-hard problem and Traveling Salesman Search(TSP) is NP-complete. However, TSP-OPT can be reduced to TSP since if TSP can be solved in polynomial time, then so can TSP-OPT(1). I thought for A to be reduced to B, B has to be as hard if not harder than A. As I can see in the below references, TSP-OPT can be reduced to TSP. TSP-OPT is supposed to be harder than TSP. I am confused...
References: (1)Algorithm, Dasgupta, Papadimitriou, Vazirani Exercise 8.1 http://algorithmics.lsi.upc.edu/docs/Dasgupta-Papadimitriou-Vazirani.pdf https://cseweb.ucsd.edu/classes/sp08/cse101/hw/hw6soln.pdf
http://cs170.org/assets/disc/dis10-sol.pdf
I took a quick look at the references you gave, and I must admit there's one thing I really dislike in your textbook (1st pdf) : they address NP-completeness while barely mentioning decision problems. The provided definition of an NP-complete problem also somewhat deviates from what I'd expect from a textbook. I assume that was a conscious decision to make the introduction more appealing...
I'll provide a short answer, followed by a more detailed explanation about related notions.
Short version
Intuitively (and informally), a problem is in NP if it is easy to verify its solutions.
On the other hand, a problem is NP-hard if it is difficult to solve, or find a solution.
Now, a problem is NP-complete if it is both in NP, and NP-hard. Therefore you have two key, intuitive properties to NP-completeness. Easy to verify, but hard to find solutions.
Although they may seem similar, verifying and finding solutions are two different things. When you use reduction arguments, you're looking at whether you can find a solution. In that regard, both TSP and TSP-OPT are NP-hard, and it is difficult to find their solutions. In fact, the third pdf provides a solution to excercise 8.1 of your textbook, which directly shows that TSP and TSP-OPT are equivalently hard to solve.
Now, one major distinction between TSP and TSP-OPT is that the former is (what your textbook call) a search problem, whereas the latter is an optimization problem. The notion of verifying the solution of a search problem makes sense, and in the case of TSP, it is easy to do, therefore it is NP-complete. For optimization problems, the notion of verifying a solution becomes weird, because I can't think of any way to do that without first computing the size of an optimal solution, which is roughly equivalent to solving the problem itself. Since we do not know an efficient way to verify a solution for TSP-OPT, we cannot say that it is in NP, thus we cannot say that it is NP-complete. (More on this topic in the detailed explanation.)
The tl;dr is that for TSP-OPT, it is both hard to verify and hard to find solutions, while for TSP it is easy to verify and hard to find solutions.
Reductions arguments only help when it comes to finding solutions, so you need other arguments to distinguish them when it comes to verifying solutions.
More details
One thing your textbook is very brief about is what a decision problem is.
Formally, the notion of NP-completeness, NP-hardness, NP, P, etc, were developed in the context of decision problems, and not optimization or search problems.
Here's a quick example of the differences between these different types of problems.
A decision problem is a problem whose answer is either YES or NO.
TSP decision problem
Input: a graph G, a budget b
Output: Does G admit a tour of weight at most b ? (YES/NO)
Here is the search version
TSP search problem
Input: a graph G, a budget b
Output: Find a tour of G of weight at most b, if it exists.
And now the optimization version
TSP optimization problem
Input: a graph G
Output: Find a tour of G with minimum weight.
Out of context, the TSP problem could refer to any of these. What I personally refer to as the TSP is the decision version. Here your textbook use TSP for the search version, and TSP-OPT for the optimization version.
The problem here is that those various problems are strictly distinct. The decision version only ask for existence, while the search version asks for more, it needs one example of such a solution. In practice, we often want to have the actual solution, so more practical approaches may omit to mention decision problems.
Now what about it? The definition of an NP-complete problem was meant for decision problems, so it technically does not apply directly to search or optimization problems. But because the theory behind it is well defined and useful, it is handy to still apply the term NP-complete/NP-hard to search/optimization problem, so that you have an idea of how hard these problems are to solve. So when someone says the travelling salesman problem is NP-complete, formally it should be the decision problem version of the problem.
Obviously, many notions can be extended so that they also cover search problems, and that is how it is presented in your textbook. The differences between decision, search, and optimization, are precisely what the exercises 8.1 and 8.2 try to cover in your textbook. Those exercises are probably meant to get you interested in the relationship between these different types of problems, and how they relate to one another.
Short Version
The decision problem is NP-complete because you can both have a polynomial time verifier for the solution, as well as the fact that the hamiltonian cycle problem is reducible to TSP_DECIDE in polynomial time.
However, the optimization problem is strictly NP-hard, because even though TSP_OPTIMIZE is reducible from the hamiltonian (HAM) cycle problem in polynomial time, you don't have a poly time verifier for a claimed hamiltonian cycle C, whether it is the shortest or not, because you simply have to enumerate all possibilities (which consumes the factorial order space & time).
What the given reference define is, bottleneck TSP
The Bottleneck traveling salesman problem (bottleneck TSP) is a problem in discrete or combinatorial optimization. The problem is to find the Hamiltonian cycle in a weighted graph which minimizes the weight of the most weighty edge of the cycle.
The problem is known to be NP-hard. The decision problem version of this, "for a given length x is there a Hamiltonian cycle in a graph G with no edge longer than x?", is NP-complete. NP-completeness follows immediately by a reduction from the problem of finding a Hamiltonian cycle.
This problem can be solved by performing a binary search or sequential search for the smallest x such that the subgraph of edges of weight at most x has a Hamiltonian cycle. This method leads to solutions whose running time is only a logarithmic factor larger than the time to find a Hamiltonian cycle.
Long Version
The mistake is to say that the TSP is NP complete. Truth is that TSP is NP hard. Let me explain a bit:
The TSP is a problem defined by a set of cities and the distances
between each city pair. The problem is to find a circuit that goes
through each city once and that ends where it starts. This in itself
isn't difficult. What makes the problem interesting is to find the
shortest circuit among all those that are possible.
Solving this problem is quite simple. One merely need to compute the length of all possible circuits, then keep the shortest one. Issue is that the number of such circuits grows very quickly with the number of cities. If there are n cities then this number is factorial of n-1 = (n-1)(n-2)...3.2.
A problem is NP if one can easily (in polynomial time) check that a proposed solution is indeed a solution.
Here is the trick.
In order to check that a proposed tour is a solution of the TSP we need to check two things, namely
That each city is is visited only once
That there is no shorter tour than the one we are checking
We didn't check the second condition! The second condition is what makes the problem difficult to solve. As of today, no one has found a way to check condition 2 in polynomial time. It means that the TSP isn't in NP, as far as we know.
Therefore, TSP isn't NP complete as far as we know. We can only say that TSP is NP hard.
When they write that TSP is NP complete, they mean that the following decision problem (yes/no question) is NP complete:
TSP_DECISION : Given a number L, a set of cities, and distance between all city pairs, is there a tour visiting each city exactly once of length less than L?
This problem is indeed NP complete, as it is easy (polynomial time) to check that a given tour leads to a yes answer to TSPDECISION.

P NP and NP complete clarfication?

This is an answer I found on stack overflow
NP is a complexity class that represents the set of all decision problems for which the instances where the answer is "yes" have proofs that can be verified in polynomial time.
This means that if someone gives us an instance of the problem and a certificate (sometimes called a witness) to the answer being yes, we can check that it is correct in polynomial time.
My question is who is the "we" that checks to see if the solution is correct in polynomial time? Is it a program or does it literally mean a human sitting down and working it out on paper?
In the classical definition, it is a Turing machine. I believe it has been shown that the computers we use today are more-or-less the same to Turing machines in the complexity theory sense (polynomial time on one is polynomial time on the other), see: https://en.wikipedia.org/wiki/Turing_machine#Comparison_with_real_machines

NP verifier-based definition

i'm a computer science student and i'm having some problem understanding the verifier based definition of NP problems.
The definition says that a problem is in NP if can be verified in polinomial time by a deterministic turing machine, given a "certificate".
But what happens, if the certificate is exactly the problem solution? It's only a bit, and it's obviuosly polinomially limited by the input size, and it's obviously verifiable in constant, thus polinomial time.
Therefore, each decision problem would belong to NP.
Where am i wrong?
But what happens, if the certificate is exactly the problem solution? It's only a bit, and it's obviuosly polinomially limited by the input size, and it's obviously verifiable in constant, thus polinomial time.
Why "obviously"? You might have to spend an exponential amount of time verifying the solution, in which case the problem need not be in NP. The point is that, even though the certificate is a single bit for a decision problem, you don't know whether that bit should be zero or one to solve the problem. (If you always did know that, then any decision problem in P or in NP would be solvable in constant time.)
Not all problems can be verified in polynomial time even though the solution is polynomial in length. Lets consider the Travelling Salesman Problem. Given a solution you can only verify whether the given solution is a tour of the cities but you cannot tell whether it is the minimum length tour, unless you explore all possible tours.
Hence, in most of the cases the decision problem is NP-Complete (e.g. to find whether the set of cities contain a tour) while the optimization problems are NP-Hard (e.g. finding the minimum length tour)