Optimization of "static" loops

I'm writing a compiled language for fun, and I've recently gotten on a kick for making my optimizing compiler very robust. I've figured out several ways to optimize certain things: for instance, 2 + 2 is always 4, so we can do that math at compile time; if(false){ ... } can be removed entirely; and so on. But now I've gotten to loops. After some research, I think that what I'm trying to do isn't exactly loop unrolling, but it is still an optimization technique. Let me explain.
Take the following code.
String s = "";
for(int i = 0; i < 5; i++){
    s += "x";
}
output(s);
As a human, I can sit here and tell you that this is 100% of the time going to be equivalent to
output("xxxxx");
So, in other words, this loop can be "compiled out" entirely. It's not loop unrolling, but what I'm calling "fully static": there are no inputs that would change the behavior of the segment. My idea is that anything fully static can be resolved to a single value, while anything that relies on input or produces conditional output of course can't be optimized further. So, from the machine's point of view, what do I need to consider? What makes a loop "fully static"?
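To illustrate the idea, here's a rough Python sketch of what I imagine the compiler doing (all names are hypothetical): once analysis shows that everything the loop touches is a compile-time constant, the compiler can simply execute the loop itself, with a fuel limit so compilation always terminates:
FUEL_LIMIT = 1_000_000

def try_fold_static_loop(init, cond, step, body, env):
    # init/step/body mutate env; cond reads it. The caller has already
    # verified that the loop touches compile-time constants only.
    env = dict(env)              # work on a copy: failure has no side effects
    init(env)
    for _ in range(FUEL_LIMIT):
        if not cond(env):
            return env           # loop terminated: env is the folded state
        body(env)
        step(env)
    return None                  # fuel exhausted: emit the loop unchanged

# The example from above: s = ""; for (i = 0; i < 5; i++) s += "x";
folded = try_fold_static_loop(
    init=lambda e: e.update(i=0),
    cond=lambda e: e["i"] < 5,
    step=lambda e: e.update(i=e["i"] + 1),
    body=lambda e: e.update(s=e["s"] + "x"),
    env={"s": ""},
)
print(folded)  # {'s': 'xxxxx', 'i': 5} -> the compiler can emit output("xxxxx")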
I've come up with three types of loops that I need to figure out how to categorize:
1. Loops that will always end up in the same machine state after every run, regardless of inputs.
2. Loops that will NEVER complete.
3. Loops that I can't figure out one way or the other.
In the case that I can't figure it out (the loop conditionally changes how many times it runs based on dynamic inputs), I'm not worried about optimizing. Loops that are infinite will be a compile error/warning unless specifically suppressed by the programmer, and loops that are the same every time should just skip directly to putting the machine in the proper state, without looping.
The main case to optimize, of course, is the loop with a static iteration count, where all the function calls inside are also static. Determining whether a loop has dynamic components is easy enough, and if it's not dynamic, it has to be static. The thing I can't figure out is how to detect whether it's going to be infinite or not. Does anyone have any thoughts on this? I know this is a subset of the halting problem, but I feel it's solvable: the halting problem is only unsolvable because for some subsets of programs you just can't tell whether they will run forever or not. I don't want to consider those cases; I only want to consider the cases where the loop provably WILL halt or provably WILL NOT halt. But first I have to distinguish between the three states.

This looks like a job for a kind of symbolic solver that can be defined for several classes of loops, but not in general.
Let's restrict the requirements a bit: no numeric overflow, only for loops (a while loop can sometimes be transformed into a full for loop, except when it uses continue etc.), no breaks, and no modification of the control variable inside the loop body.
for (var i = S; E(i); i = U(i)) ...
where E(i) and U(i) are expressions that can be symbolically manipulated. There are several classes that are relatively easy:
U(i) = i + CONSTANT : n-th cycle the value of i is S + n * CONSTANT
U(i) = i * CONSTANT : n-th cycle the value of i is S * CONSTANT^n
U(i) = i / CONSTANT : n-th cycle the value of i is S * CONSTANT^-n (assuming exact division)
U(i) = (i + CONSTANT) % M : n-th cycle the value of i is (S + n * CONSTANT) % M
and some other quite easy combinations (and some very difficult ones).
Determining whether the loop terminates amounts to searching for the smallest n for which E(i(n)) is false.
This can be done by symbolic manipulation for a lot of cases, but there is a lot of work involved in building the solver.
E.g.
for(int i = 0; i < 5; i++),
i(n) = 0 + n * 1 = n, E(i(n)) => not(n < 5) =>
n >= 5 => stops for n = 5
for(int i = 0; i < 5; i--),
i(n) = 0 + n * -1 = -n, E(i(n)) => not(-n < 5) => -n >= 5 =>
n <= -5 - since n is a non-negative whole number this is never true - never stops
for(int i = 0; i < 5; i = (i + 1) % 3),
E(i(n)) => not(n % 3 < 5) => n % 3 >= 5 => this is never true => never stops
for(int i = 10; i + 10 < 500; i = i + 2 * i) =>
for(int i = 10; i < 490; i = 3 * i),
i(n) = 10 * 3^n,
E(i(n)) => not(10 * 3^n < 490) => 10 * 3^n >= 490 => 3^n >= 49 => n >= log3(49) => n >= 3.5... =>
since n is whole => it will stop for n = 4
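To make the easy classes concrete, here is a minimal Python sketch (names and structure are my own, restricted to conditions of the form i < B): it computes the trip count in closed form for the affine update, and by an O(log) simulation for the geometric one:
def affine_trip_count(S, B, C):
    # Trip count of: for (i = S; i < B; i = i + C), integers, no overflow.
    # Returns the number of iterations, or None if the loop never stops.
    if S >= B:
        return 0                 # condition false on entry
    if C <= 0:
        return None              # i never moves toward B: infinite
    return (B - S + C - 1) // C  # smallest n with S + n*C >= B

def geometric_trip_count(S, B, F):
    # Trip count of: for (i = S; i < B; i = i * F), with S >= 1 and F >= 2.
    # Simulated in O(log B) steps; a solver would solve F^n >= B/S directly.
    if S >= B:
        return 0
    n, v = 0, S
    while v < B:
        v *= F
        n += 1
    return n

print(affine_trip_count(0, 5, 1))        # 5    -> runs exactly 5 times
print(affine_trip_count(0, 5, -1))       # None -> flag as infinite
print(geometric_trip_count(10, 490, 3))  # 4    -> matches the example above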
For other cases it would be good if they could be transformed into ones you can already solve...
Many tricks for symbolic manipulation date back to the Lisp era and are not too difficult. The types described above (or variants of them) are the most common in practice, but there are many scenarios that are more difficult, or outright impossible, to solve.


What is the time complexity (Big-O) of this while loop (Pseudocode)?

This is written in pseudocode.
We have an array A of length n (n >= 2).
int i = 1;
while (i < n) {
    if (A[i] == 0) {
        break;       // terminates the while-loop
    }
    i = i * 2;       // doubles i
}
I am new to this whole subject and to coding, so I am having a hard time grasping it and need an "explain like I'm 5".
I know the code doesn't make a lot of sense, but it is just an exercise; I have to determine the best case and the worst case.
So in the best case, Big O would be O(1), if the value at A[1] is 0.
For the worst-case scenario I thought the time complexity of this loop would be O(log(n)), as i doubles.
Is that correct?
Thanks in advance!
For Big O notation you take the worst-case scenario. For the case where A[i] never evaluates to zero, your loop is like this:
int i = 1;
while (i < n) {
    i *= 2;
}
i is doubled on each iteration, i.e. exponential growth.
Given an example of n = 16, the values of i would be:
1
2
4
8
(it wouldn't get to 16)
That is 4 iterations, and 2^4 = 16.
To work out the power, you take the log base 2 of n, i.e. log2(16) = 4.
So the worst case would be log2(n) iterations,
and the complexity would be stated as O(log(n)).
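If you want to convince yourself, here is a small Python sketch (my addition, not part of the original answer) that counts the iterations directly and compares the count with ceil(log2(n)):
import math

def doubling_iterations(n):
    # Count iterations of: i = 1; while i < n: i *= 2  (the worst case above)
    i, count = 1, 0
    while i < n:
        i *= 2
        count += 1
    return count

for n in (2, 16, 17, 1000, 10**6):
    print(n, doubling_iterations(n), math.ceil(math.log2(n)))
# The last two columns always agree: the worst case is ceil(log2(n))
# iterations, i.e. O(log(n)).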

What is Pseudo-polynomial complexity?

Yes, I've seen this answer - What is pseudopolynomial time? How does it differ from polynomial time? - but I still don't understand.
Why does the representation in bits make a difference only sometimes?
For this program for example
function isPrime(n):
    for i from 2 to n - 1:
        if (n mod i) = 0, return false
    return true
it says the complexity is not polynomial, because n requires log n bits to write out, so the complexity is O(2^(4*log n)). But if I use that reasoning on every other problem, then everything could also be pseudopolynomial, right? (Unless I'm getting it all wrong here.) What makes this program so special that it is measured in the number of bits required to write out n?
You have linked to other questions where this is explained fairly well for someone who understands the concept, so here comes a very brief version.
for i from 2 to n - 1:
can be rewritten as
i = 2
while i <= n - 1:
    if n % i == 0:
        return False
    i = i + 1
Very often, we assume that the operations i <= n - 1, i = i + 1 and n % i are O(1). But this is not necessarily true. It is usually true for small values, and on a 32-bit machine, a "small value" is on the order of a billion.
A number that requires more than 32 bits to represent will take more time to perform operations on than a number that fits in 32 bits. And it will take even more if it requires more than 64 bits.
In practice, this rarely matters.
A very simple way to visualize this is to imagine that you get the task to implement the common mathematical operations where the operands are represented as strings. Here is a simple Python function that takes two strings representing binary numbers and returns their sum as a string. It was quickly hacked together and assumes both strings have the same length. It can most likely be refined, but it demonstrates the point: the function adds two numbers, but it takes longer for longer numbers.
def binadd(a, b):
    # Schoolbook binary addition on strings: one full-adder step per bit,
    # so the cost grows linearly with the length of the numbers.
    carry = '0'
    result = list('0' * (len(a) + 1))
    for i in range(len(a) - 1, -1, -1):
        # sum bit = a XOR b XOR carry
        xor = '1' if (a[i] == '1') != (b[i] == '1') else '0'
        val = '1' if (xor == '1') != (carry == '1') else '0'
        # carry out = (a AND b) OR (carry AND (a XOR b))
        carry = '1' if (carry == '1' and xor == '1') or (a[i] == '1' and b[i] == '1') else '0'
        result[i + 1] = val
    result[0] = carry
    return ''.join(result)
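For example, a quick sanity check of the function above against Python's built-in arbitrary-precision integers:
a, b = '0110', '0011'                  # 6 and 3
s = binadd(a, b)
print(s, int(s, 2))                    # 01001 9
assert int(s, 2) == int(a, 2) + int(b, 2)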
What makes this program so special to be measured in the amount of bits required to write out n?
There's nothing special about this particular program, at least not theoretically. In practice it is special in the sense that this kind of loop is run on VERY big numbers: the same trial division that tests primality also finds factors, and if there existed a very fast algorithm for factoring very big numbers, it would basically break encryption as we know it today.

Determining when to stop a possibly infinite iteration?

As a beginner programmer, a common problem I encounter is when to stop an iteration. For example, if I were to program a function to determine whether an integer is happy or not (by brute force), when would I stop? Another example: concerning something like the Mandelbrot set, how would I know when to stop an iteration and firmly say that a number diverges or converges? Does it depend on the problem you're dealing with, or is there a general method for things like this?
With a brute-force method, you have to find a base case that terminates your program, just as with recursion. For happy numbers, the base case is finding a loop (a repeated value) in your iteration.
Code
# sum of squares of the digits of n
def numSquareSum(n):
    squareSum = 0
    while n:
        squareSum += (n % 10) * (n % 10)
        n //= 10
    return squareSum

# returns True if n is a happy number
def isHappyNumber(n):
    seen = []
    while True:
        n = numSquareSum(n)
        if n == 1:
            return True
        if n in seen:        # cycle detected: n will never reach 1
            return False
        seen.append(n)

# main code
n = 7
if isHappyNumber(n):
    print(n, "is a Happy number")
else:
    print(n, "is not a Happy number")
Hope this will be helpful.
I believe what you are describing is the Halting Problem. The main takeaway from the Halting Problem is that computers cannot solve every problem; one example is detecting arbitrary infinite loops. Whether infinite loops can be detected depends on the specific problem you are trying to solve.
If you have any more questions, please refer to the Wikipedia article on the Halting Problem.
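The Mandelbrot set mentioned in the question is a good example of a problem-specific answer: mathematics guarantees that once |z| exceeds 2 the orbit diverges, so you can stop early with certainty, and if an iteration cap is reached without escaping you only suspect the point stays bounded. A minimal Python sketch:
def mandelbrot_escapes(c, max_iter=1000):
    # Escape-time test for the iteration z -> z*z + c. Once |z| > 2 the
    # orbit is guaranteed to diverge, so early exit is safe; reaching
    # max_iter only means "did not escape yet".
    z = 0
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n         # provably diverges
    return None              # undecided within the iteration budget

print(mandelbrot_escapes(1))   # 2    -> c = 1 provably diverges
print(mandelbrot_escapes(-1))  # None -> c = -1 never escapes (it is in the set)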
In NASA's The Power of 10: Rules for Developing Safety-Critical Code there is a rule "All loops must have fixed bounds. This prevents runaway code."
In this trivial example this would be:
# sum of squares of the digits of n, with a fixed loop bound
def numSquareSum(n):
    count = 0
    squareSum = 0
    while n:
        squareSum += (n % 10) * (n % 10)
        n //= 10
        count += 1
        if count > 100000000:
            raise Exception('loop bound exceeded')
    return squareSum

Time complexity of for loops, I cannot really understand a thing

So these are the for loops for which I have to find the time complexity, but I really don't understand how to calculate it.
for (int i = n; i > 1; i /= 3) {
    for (int j = 0; j < n; j += 2) {
        ... ...
    }
    for (int k = 2; k < n; k = k * k) {
        ...
    }
}
For the first line, (int i = n; i > 1; i /= 3) keeps dividing i by 3, and when i is no longer greater than 1 the loop stops, right?
But what is the time complexity of that? I think it is n, but I am not really sure.
The reason I think it is n is: if I assume that n is 30, then i will be 30, 10, 3, 1 and then the loop stops. It runs n times, doesn't it?
And for the last for loop, I think its time complexity is also n, because
k starts at 2 and keeps multiplying itself by itself until k is greater than n.
So if n is 20, k will be 2, 4, 16 and then stop. It runs n times too.
I don't think I really understand these kinds of questions: time complexity can be log(n) or n^2 etc., but all I see is n.
I don't really know where the log or the square comes from.
Every for loop runs n times, I think. How can log or square be involved?
Can anyone help me understand this? Please.
The two inner loops are independent of each other, so we can analyse each loop separately and combine the results at the end (the two inner counts add up, and the outer count multiplies them).
1. i loop
A classic logarithmic loop. There are countless examples on SO, this being a similar one. Using the result given on that page and replacing the division constant:
The exact number of times that this loop will execute is ceil(log3(n)).
2. j loop
As you correctly figured, this runs O(n / 2) times;
The exact number is ceil(n / 2) (equal to n / 2 for even n).
3. k loop
Another classic known result - the log-log loop. The code just happens to be an exact replicate of this SO post;
The exact number is ceil(log2(log2(n)))
Combining the above steps, the total number of inner-loop executions is
ceil(log3(n)) * (ceil(n / 2) + ceil(log2(log2(n))))
for a total time complexity of O(n log n). Note that the j-loop overshadows the k-loop.
Numerical tests for confirmation
JavaScript code:
T = function(n) {
    var m = 0;
    for (var i = n; i > 1; i /= 3) {
        for (var j = 0; j < n; j += 2)
            m++;
        for (var k = 2; k < n; k = k * k)
            m++;
    }
    return m;
};
M = function(n) {
    return Math.ceil(Math.log(n) / Math.log(3)) *
           (Math.ceil(n / 2) + Math.ceil(Math.log2(Math.log2(n))));
};
M(n) is what the math predicts that T(n) will exactly be (the number of inner loop executions):
n T(n) M(n)
-----------------------
100000 550055 550055
105000 577555 577555
110000 605055 605055
115000 632555 632555
120000 660055 660055
125000 687555 687555
130000 715055 715055
135000 742555 742555
140000 770055 770055
145000 797555 797555
150000 825055 825055
M(n) matches T(n) perfectly as expected. A plot of T(n) against n log n (the predicted time complexity) [not reproduced here] shows a convincing straight line.
tl;dr: I describe a couple of examples first, then analyze the complexity of OP's stated problem at the bottom of this post.
In short, the big O notation tells you something about how a program is going to perform if you scale the input.
Imagine a program (P0) that counts to 100. No matter how often you run the program, it's going to count to 100 equally fast each time (give or take). Obvious, right?
Now imagine a program (P1) that counts to a number that is variable, i.e. it takes a number as an input, to which it counts. We call this variable n. Now each time P1 runs, its performance depends on the size of n. If we make n 100, P1 will run very quickly. If we make n equal to a googolplex, it's going to take a little longer.
Basically, the performance of P1 is dependent on how big n is, and this is what we mean when we say that P1 has time-complexity O(n).
Now imagine a program (P2) where we count to the square of n, rather than to n itself. Clearly the performance of P2 is going to be worse than that of P1, because the numbers to which they count differ immensely (especially for larger n's (= scaling)). You'll know by intuition that P2's time-complexity is O(n^2) if P1's complexity is O(n).
Now consider a program (P3) that looks like this:
var length = input.Length;
for (var i = 0; i < length; i++) {
    for (var j = 0; j < length; j++) {
        Console.WriteLine($"Product is {input[i] * input[j]}");
    }
}
There's no n to be found here, but as you might realise, this program still depends on an input, called input here. Simply because the program depends on some kind of input, we refer to that input as n when we talk about time-complexity. If a program takes multiple inputs, we simply give them different names, so a time-complexity could be expressed as O(n * n2 + m * n3), where this hypothetical program would take 4 inputs (n, n2, m and n3).
For P3, we can discover its time-complexity by first analyzing the number of different inputs, and then analyzing how its performance depends on them.
P3 uses 3 variables, called length, i and j. The first line of code does a simple assignment whose performance does not depend on any input, meaning the time-complexity of that line is O(1), i.e. constant time.
The second line of code is a for loop, implying we're going to do something that might depend on the length of something. And indeed, this first for loop (and everything in it) will be executed length times. If we increase the size of length, this line of code will do linearly more work, so this line's time complexity is O(length) (called linear time).
The next line of code is another for loop that takes O(length) time, following the same logic as before; however, since we execute this loop every time the outer loop runs, the time complexities are multiplied: O(length) * O(length) = O(length^2).
The insides of the inner for loop do not depend on the size of the input (even though the input is necessary), because indexing into the input (for arrays!) does not become slower if we increase the size of the input. This means the insides run in constant time = O(1). Since this runs inside the other for loops, we again multiply to obtain the total time complexity of the nested lines of code: outer for-loops * current block of code = O(length^2) * O(1) = O(length^2).
The total time-complexity of the program is just the sum of everything we've calculated: O(1) + O(length^2) = O(length^2) = O(n^2). The first line of code was O(1) and the for loops were analyzed to be O(length^2). You will notice 2 things:
1. We rename length to n: we do this because we express time-complexity in terms of generic parameters, not the names that happen to live inside the program.
2. We remove O(1) from the equation: we do this because we're only interested in the biggest (= fastest growing) terms. Since O(n^2) is way 'bigger' than O(1), the time-complexity is defined to be equal to it (this only works for terms (e.g. split by +), not for factors (e.g. split by *)).
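To see the quadratic count directly, here is a small Python sketch (my own, mirroring P3) that counts how often the innermost line executes:
def count_inner_executions(items):
    # Mirrors P3: two nested loops over the same input
    length = len(items)
    count = 0
    for i in range(length):
        for j in range(length):
            count += 1       # stands in for the Console.WriteLine line
    return count

for size in (10, 100, 1000):
    print(size, count_inner_executions(list(range(size))))
# Prints 100, 10000, 1000000: growing the input 10x grows the work 100x,
# exactly the O(n^2) behavior derived above.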
OP's problem
Now we can consider your program (P4), which is a little trickier because its variables are defined less clearly than the ones in my examples.
for (int i = n; i > 1; i /= 3) {
    for (int j = 0; j < n; j += 2) {
        ... ...
    }
    for (int k = 2; k < n; k = k * k) {
        ...
    }
}
If we analyze we can say this:
The outer for loop is executed O(log3(n)) times, where log3 is the logarithm to base 3: since i is divided by 3 on every iteration, it takes about log3(n) iterations before i is smaller than or equal to 1. (Note: repeated division gives a logarithm, not a cube root.)
The second for loop is executed O(n / 2) times, because j is increased by 2 rather than the 'normal' 1. Since O(n / 2) = O(n), and it is nested in the first for, its body is executed O(log3(n)) * O(n) = O(n * log(n)) times (first for * the nested for).
The third for is also nested in the first for, but since it is not nested in the second for, we're not going to multiply it by the second one (obviously, because it is only executed once each time the first for is executed). Here, k is bounded by n, but since it is increased by a factor of itself each time (we square it), its increase is defined by a variable rather than by a constant. Squaring from 2 gives k = 2^(2^m) after m steps, and 2^(2^m) >= n once m >= log2(log2(n)); deducing this is easy if you understand how log works, and if you don't, you need to understand that first. So this loop runs O(log2(log2(n))) times per outer iteration, and the total contribution of the third for is O(log3(n)) * O(log2(log2(n))) = O(log(n) * log(log(n))).
The total time-complexity of the program is now calculated as the sum of the different sub-complexities: O(n * log(n)) + O(log(n) * log(log(n))).
As we saw before, we only care about the fastest growing term when we talk about big O notation, so the time-complexity of your program is equal to O(n * log(n)), which agrees with the numerical tests in the other answer.

Please explain this code for Merkle–Hellman knapsack cryptosystem?

This is the code snippet from a program that implements Merkle–Hellman knapsack cryptosystem.
// Generates keys based on input data size
private void generateKeys(int inputSize)
{
    // Generating values for w
    // The first value of the private key (w) is set to 1
    w.addNode(new BigInteger("1"));
    for (int i = 1; i < inputSize; i++)
    {
        w.addNode(nextSuperIncreasingNumber(w));
    }
    // Generate value for q
    q = nextSuperIncreasingNumber(w);
    // Generate value for r
    Random random = new Random();
    // Generate a value of r such that r and q are coprime
    do
    {
        r = q.subtract(new BigInteger(random.nextInt(1000) + ""));
    }
    while ((r.compareTo(new BigInteger("0")) > 0) && (q.gcd(r).intValue() != 1));
    // Generate b such that b = w * r mod q
    for (int i = 0; i < inputSize; i++)
    {
        b.addNode(w.get(i).getData().multiply(r).mod(q));
    }
}
Just tell me what is going on in the following lines:
do
{
    r = q.subtract(new BigInteger(random.nextInt(1000) + ""));
}
while ((r.compareTo(new BigInteger("0")) > 0) && (q.gcd(r).intValue() != 1));
(1) Why is random number generated with upper bound 1000?
(2) Why is it subtracted from q?
The code is searching for a value that is co-prime with the already selected value q. In my opinion, it's doing so rather poorly, but you mention it's a simulator? I'm not sure what that means, but maybe it just means the code is quick and dirty rather than slow and secure.
Answering your questions directly:
Why is random number generated with upper bound 1000?
The Merkle-Hellman algorithm does indicate that r should be 'random'. The implementation of that here is pretty haphazard, which might be what's thrown you off. Technically the code is not even an algorithm, because the loop is not guaranteed to terminate: in theory, the pseudo-random candidate selection of r could produce an arbitrarily long sequence of numbers that aren't coprime to q, resulting in an infinite loop.
The upper bound of 1000 presumably ensures that the chosen r is still large: in general, large keys are harder to break than small keys, so if q is large, this code will only pick large r.
A more deterministic way to get a random coprime would be to test each number lower than q, build the list of coprimes, and select one of them at random, as sketched below. This would probably also be more secure: an attacker who knows that q and r are within 1000 of each other has a significantly reduced search space.
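A minimal Python sketch of that deterministic approach (the function name is mine, and plain ints stand in for the Java BigIntegers):
import random
from math import gcd

def pick_coprime(q):
    # All r with 1 < r < q and gcd(r, q) == 1, chosen uniformly.
    # O(q) work: fine for toy parameters, impractical for real key sizes.
    candidates = [r for r in range(2, q) if gcd(r, q) == 1]
    return random.choice(candidates)

print(pick_coprime(881))  # e.g. 588; any r coprime with 881 is possible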
Why is it subtracted from q?
The subtraction is important because r must be less than q; the Merkle-Hellman algorithm specifies it that way. I'm not convinced that it strictly needs to be: the public key is generated by multiplying each element in w by r and taking the modulus q, and if r were very large, larger than q, it seems like it would further obfuscate q and each element in w.
The decryption step of Merkle-Hellman, on the other hand, depends on the modular inverse of r: each encrypted value a is decrypted as a * r^(-1) mod q. That operation might be hampered by having r > q, though it seems like it could still work out.
However, if nextInt can return 0, that iteration of the loop is wasted, since q and r must be different (gcd(a, a) is just a).
Breaking down the code:
do
Try it at least once; r is presumably null before the method is called.
r = q.subtract(new BigInteger(random.nextInt(1000) + ""));
Find a candidate value between q - 999 and q (random.nextInt(1000) returns 0 to 999, which is subtracted from q).
while ((r.compareTo(new BigInteger("0")) > 0) && (q.gcd(r).intValue() != 1));
Keep looping while the candidate r is still positive and not yet coprime with q; in other words, stop once r:
Drops to zero or below, i.e. r.compareTo(new BigInteger("0")) > 0 fails (a bad outcome), or
Is coprime with q, i.e. q.gcd(r).intValue() == 1 (the intended case). Obviously, a randomly selected number is not guaranteed to be coprime with another number, so the randomly generated candidate might not work for this q.
Does that clear it up? I have to admit that I'm not an expert on Merkle-Hellman.