How to optimize code for finding Amicable Pairs - optimization

Please see the code I've used to find what I believe are all Amicable Pairs (n, m), n < m, 2 <= n <= 65 million. My code: http://tutoree7.pastebin.com/wKvMAWpT. The found pairs: http://tutoree7.pastebin.com/dpEc0RbZ.
I'm finding that each additional million now takes 24 minutes on my laptop. I'm hoping there are substantial numbers of n that can be filtered out in advance. This comes close, but no cigar: odd n that don't end in '5'. There is only one counterexample pair so far, but that's one too many: (34765731, 36939357). That as a filter would filter out 40% of all n.
I'm hoping for some ideas, not necessarily the Python code for implementing them.

Here is a nice article that summarizes all optimization techniques for finding amicable pairs
with sample C++ code
It finds all amicable numbers up to 10^9 in less than a second.

#include<stdio.h>
#include<stdlib.h>
int sumOfFactors(int );
int main(){
int x, y, start, end;
printf("Enter start of the range:\n");
scanf("%d", &start);
printf("Enter end of the range:\n");
scanf("%d", &end);
for(x = start;x <= end;x++){
for(y=end; y>= start;y--){
if(x == sumOfFactors(y) && y == sumOfFactors(x) && x != y){
printf("The numbers %d and %d are Amicable pair\n", x,y);
}
}
}
return 0;
}
int sumOfFactors(int x){
int sum = 1, i, j;
for(j=2;j <= x/2;j++){
if(x % j == 0)
sum += j;
}
return sum;
}

def findSumOfFactors(n):
    # Sum of the proper divisors of n (1 is counted, n itself is not).
    total = 1
    for i in range(2, n // 2 + 1):
        if n % i == 0:
            total += i
    return total

start = int(input())
end = int(input())
for i in range(start, end + 1):
    # Only consider j > i, so each pair is printed once and i != j is implied.
    for j in range(i + 1, end + 1):
        if findSumOfFactors(i) == j and findSumOfFactors(j) == i:
            print(i, j)
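The brute-force versions above recompute the divisor sum for every (x, y) combination, which is what makes them slow. A common alternative (not necessarily the technique the linked article uses) is to precompute the sum of proper divisors for every number up to the limit with a sieve, then look each partner up once. The sketch below assumes both members of a pair lie within `limit`; the names `proper_divisor_sums` and `amicable_pairs` are made up for this illustration.

```python
def proper_divisor_sums(limit):
    """Return a list where sums[n] is the sum of proper divisors of n."""
    sums = [0] * (limit + 1)
    # Add each divisor d to all of its multiples greater than d itself.
    for d in range(1, limit // 2 + 1):
        for multiple in range(2 * d, limit + 1, d):
            sums[multiple] += d
    return sums


def amicable_pairs(limit):
    """Return all amicable pairs (n, m) with n < m <= limit."""
    sums = proper_divisor_sums(limit)
    pairs = []
    for n in range(2, limit + 1):
        m = sums[n]
        # n < m keeps each pair once and rules out perfect numbers (m == n).
        if n < m <= limit and sums[m] == n:
            pairs.append((n, m))
    return pairs


if __name__ == "__main__":
    print(amicable_pairs(3000))
```

The sieve does roughly limit * ln(limit) additions in total instead of one trial-division loop per candidate, at the cost of holding one integer per number in memory.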

Related

Time complexity for an intersection (worst case)

I'm having trouble finding the worst-case time complexity of this code, which computes the intersection of two sorted arrays of the same size (n).
I'm not sure how to count the while loop with two conditions, or how to count the if and else if statements.
I know the big O would be N+N, but I have no idea how to show the worst case.
void printIntersection(int arr1[], int arr2[], int n) {
int i = 0, j = 0;
while (i < n && j < n) {
if (arr1[i] < arr2[j])
i++;
else if (arr2[j] < arr1[i])
j++;
else /* if arr1[i] == arr2[j] */ {
cout << arr2[j] << " ";
i++;
j++;
}
}
}
To prove that the loop makes fewer than 2N iterations in the worst case, you can use the following argument.
Given the two indices i and j, at each step:
if arr1[i] < arr2[j] then i is incremented by 1
if arr2[j] < arr1[i] then j is incremented by 1
if arr1[i] == arr2[j] then both i and j are incremented by 1
Each iteration increments at least one of i and j, so i + j grows by at least 1 per iteration. The loop stops as soon as either index reaches n, so i + j is at most 2n - 2 when the last iteration starts, which bounds the number of iterations by 2n - 1.
Each iteration does a constant amount of work, so the worst-case time complexity is O(n).
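A quick way to see the bound in action is to count loop iterations on an input where the two arrays interleave and share no element, so only one index advances per step. The following is a small sketch in Python rather than the C++ of the question; `intersection_with_count` is a name invented for this example.

```python
def intersection_with_count(arr1, arr2):
    """Intersect two sorted lists; also return the number of loop iterations."""
    i = j = 0
    iterations = 0
    common = []
    while i < len(arr1) and j < len(arr2):
        iterations += 1
        if arr1[i] < arr2[j]:
            i += 1
        elif arr2[j] < arr1[i]:
            j += 1
        else:
            common.append(arr1[i])
            i += 1
            j += 1
    return common, iterations


if __name__ == "__main__":
    n = 5
    odds = [2 * k + 1 for k in range(n)]   # 1, 3, 5, 7, 9
    evens = [2 * k + 2 for k in range(n)]  # 2, 4, 6, 8, 10
    common, iterations = intersection_with_count(odds, evens)
    print(common, iterations)
```

With interleaved inputs like these the loop runs 2n - 1 times, while two identical arrays need only n iterations.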

Algorithm to group consecutive words minimizing length per group

From an input of space-delimited words, how to concatenate consecutive words so that:
each group has a minimum length L (spaces don't count)
longest group length is minimal (spaces don't count)
Example input:
would a cat eat a mouse
Example minimum length:
L = 5
Naive algorithm that solves the first condition but not the second one:
while length of a group is less than L, concatenate next word to group
if last group is shorter than L, concatenate last two groups together
This naive algorithm produces:
group 1: would
group 2: acateat
group 3: amouse
longest group length: 7
Second condition is not solved because a better solution would be:
group 1: woulda
group 2: cateat
group 3: amouse
longest group length: 6
Which algorithm would solve the second condition (minimal longest group) with relatively fast execution as a program? (by fast, I'd like to avoid testing all possible combinations)
I know C, ObjC, Swift, Javascript, Python, but pseudocode is fine.
This can be done with a dynamic programming approach. Let's compute a function F(i): the minimum length of the longest group over all valid divisions of the first i words into groups.
F(0) = 0
F(i) = Min(Max(F(j), totalLen(j+1, i))), for j in [0..i-1]
Where
totalLen(i, j) = total length of words from i to j, if the length is at least L
totalLen(i, j) = MAX, if total length is less than L
The answer is the value of F(n). To get the groups themselves we can save the indices of the best j for every i.
Here is an implementation from scratch in C++:
const vector<string> words = {"would", "a", "cat", "eat", "a", "mouse"};
const int L = 5;
int n = words.size();
vector<int> prefixLen = countPrefixLen(words);
vector<int> f(n+1);
vector<int> best(n+1, -1);
int maxL = prefixLen[n];
f[0] = 0;
for (int i = 1; i <= n; ++i) {
f[i] = maxL;
for (int j = 0; j < i; ++j) {
int totalLen = prefixLen[i] - prefixLen[j];
if (totalLen >= L) {
int maxLen = max(f[j], totalLen);
if (f[i] > maxLen) {
f[i] = maxLen;
best[i] = j;
}
}
}
}
output(f[n], best, words);
Preprocessing and output details:
vector<int> countPrefixLen(const vector<string>& words) {
int n = words.size();
vector<int> prefixLen(n+1);
for (int i = 1; i <= n; ++i) {
prefixLen[i] = prefixLen[i-1] + words[i-1].length();
}
return prefixLen;
}
void output(int answer, const vector<int>& best, const vector<string>& words) {
cout << answer << endl;
int j = best.size()-1;
vector<int> restoreIndex(1, j);
while (j > 0) {
int i = best[j];
restoreIndex.push_back(i);
j = i;
}
reverse(restoreIndex.begin(), restoreIndex.end());
for (int i = 0; i+1 < restoreIndex.size(); ++i) {
for (int j = restoreIndex[i]; j < restoreIndex[i+1]; ++j) {
cout << words[j] << ' ';
}
cout << endl;
}
}
Output:
6
would a
cat eat
a mouse
Runnable: https://ideone.com/AaV5C8
Further improvement
The complexity of this algorithm is O(N^2). If that is too slow for your data, I can suggest a simple optimization:
Let's reverse the direction of the inner loop. First, this lets us get rid of the prefixLen array and its preprocessing, because we now add words to the group one by one (we could have dropped this preprocessing in the initial version too, but at the expense of simplicity). More importantly, we can break out of the loop as soon as totalLen is not less than the already computed f[i], because further iterations can never lead to an improvement. The code of the inner loop could be changed to:
int totalLen = 0;
for (int j = i-1; j >= 0; --j) {
totalLen += words[j].length();
if (totalLen >= L) {
int maxLen = max(f[j], totalLen);
if (f[i] > maxLen) {
f[i] = maxLen;
best[i] = j;
}
}
if (totalLen >= f[i]) break;
}
This can drastically improve performance when L is not very large.
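For readers more comfortable with Python, the same recurrence with the early break can be sketched as follows. This is an illustrative port, not the code behind the ideone link; the function name `min_longest_group` and the choice to return `None` when no valid grouping exists are assumptions of this sketch.

```python
def min_longest_group(words, min_len):
    """Split words into consecutive groups, each of length >= min_len,
    minimizing the longest group. Returns (longest_length, groups)."""
    n = len(words)
    INF = float("inf")
    f = [INF] * (n + 1)    # f[i]: best achievable maximum for the first i words
    best = [-1] * (n + 1)  # best[i]: start index of the last group in that solution
    f[0] = 0
    for i in range(1, n + 1):
        total = 0
        # Grow the last group backwards, one word at a time.
        for j in range(i - 1, -1, -1):
            total += len(words[j])
            if total >= f[i]:
                break  # a longer last group cannot improve f[i]
            if total >= min_len and max(f[j], total) < f[i]:
                f[i] = max(f[j], total)
                best[i] = j
    if f[n] == INF:
        return None, []  # assumption: signal "no valid grouping" this way
    groups = []
    i = n
    while i > 0:
        j = best[i]
        groups.append(words[j:i])
        i = j
    groups.reverse()
    return f[n], groups


if __name__ == "__main__":
    longest, groups = min_longest_group("would a cat eat a mouse".split(), 5)
    print(longest)
    for group in groups:
        print(" ".join(group))
```

For the example sentence with L = 5 this yields a longest group of 6, matching the C++ output above.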

TWIN PRIMES BETWEEN 2 VALUES wrong results

I've been working on this program to count how many twin primes lie between two values. It's been specified that twin primes come in the (6n-1, 6n+1) format, with the exception of (3, 5). My code seems to work fine, but it keeps giving me the wrong result: one pair of twin primes fewer than I should get. Between 1 and 40 there should be 5 twin prime pairs, but I'm always getting 4.
What am I doing wrong? Am I not taking into account (3, 5)?
Here's my code:
#include <stdio.h>
int prime (int num) {
int div;
if (num == 2) return 1;
if (num % 2 == 0) return 0;
div = 3;
while (div*div <= num && num%div != 0)
div = div + 2;
if (num%div == 0)
return 0;
else
return 1;
}
int main(void) {
int low, high, i, count, n, m;
printf("Please enter the values for the lower and upper limits of the interval\n");
scanf("%d%d", &low, &high);
printf("THIS IS THE LOW %d\n AND THIS IS THE HIGH %d\n", low, high);
i = low;
count = 0;
while (6*i-1>=low && 6*i+1<=high) {
n = 6*i-1;
m = 6*i+1;
if (prime(n) && prime(m)) ++count;
i = i + 1;
}
printf("Number of twin primes is %d\n", count);
return 0;
}
Your program misses (3 5) because 3 is not trapped as a prime number, and because 4 is not a multiple of 6. Rather than the main loop stepping by (effectively) 6, this answer steps by 1.
#include <stdio.h>
int prime (int num) {
int div;
if (num == 1) return 0; // excluded 1
if (num == 2 || num == 3) return 1; // included 3 too
if (num % 2 == 0) return 0;
div = 3;
while (div*div <= num) {
if (num % div == 0) // moved to within loop
return 0;
div += 2;
}
return 1;
}
int main(void) {
int low, high, i, count, n, m;
printf("Please enter the values for the lower and upper limits of the interval\n");
scanf("%d%d", &low, &high);
printf("THIS IS THE LOW %d\n AND THIS IS THE HIGH %d\n", low, high);
count = 0;
for (i=low; i<=high; i++) {
n = i-1;
m = i+1;
if (prime(n) && prime(m)) {
printf ("%2d %2d\n", n, m);
++count;
}
}
printf("Number of twin primes is %d\n", count);
return 0;
}
Program output
1
40
THIS IS THE LOW 1
AND THIS IS THE HIGH 40
3 5
5 7
11 13
17 19
29 31
Number of twin primes is 5
Next run:
3
10
THIS IS THE LOW 3
AND THIS IS THE HIGH 10
3 5
5 7
Number of twin primes is 2
https://primes.utm.edu/lists/small/100ktwins.txt
The five twin primes under forty are (3,5), (5,7), (11,13), (17,19), (29,31) so if you know that your code isn't counting (3,5) then it is working correctly, counting (5,7), (11,13), (17,19), and (29,31).
A possible fix would be to add an if-statement which adds 1 to "count" if the starting number is less than 4. I'm not really that used to reading C syntax so I had trouble getting my head around your formulas, sorry.
edit: since comments don't format code snippets:
i = low;
count = 0;
if (low <= 3 && high >= 5){
count ++; // accounts for the (3,5) twin primes if the range includes both 3 and 5
}
You have a problem in your prime function. This is its output for the first ten values:
for(i=1;i<=10;i++) printf("%d\t%d\n",i,prime(i));
1 1
2 1
3 0
4 0
5 1
6 0
7 1
8 0
Note the prime() function from Weather Vane, you should include 3 as prime (and exclude 1).
From [1], twin primes are the ones that have a prime gap of two, differing by two from another prime.
Examples are (3,5), (5,7), (11,13). The format (6n-1, 6n+1) holds for every twin prime pair except (3,5), as you stated. Your program is almost right: it counts the twin primes in the interval that follow that format, which leaves out (3,5). You can handle it as a special case (for example, if low <= 3 and high >= 5, add 1 to the total count), or use another algorithm to count twin primes (for example, check whether i is prime, then measure the distance from i to the next prime; if the distance is 2, they are twin primes).
[1] http://en.wikipedia.org/wiki/Twin_prime
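Putting the pieces from the answers together, here is a minimal Python sketch that keeps the (6k-1, 6k+1) stepping and treats (3,5) as a special case. It assumes a pair counts only when both of its members lie in [low, high]; the function names are illustrative and not taken from any of the answers.

```python
def is_prime(num):
    """Trial division by odd numbers up to sqrt(num)."""
    if num < 2:
        return False
    if num < 4:
        return True  # 2 and 3
    if num % 2 == 0:
        return False
    div = 3
    while div * div <= num:
        if num % div == 0:
            return False
        div += 2
    return True


def count_twin_primes(low, high):
    """Count twin prime pairs (p, p + 2) with low <= p and p + 2 <= high."""
    count = 0
    # (3, 5) is the only pair that is not of the form (6k - 1, 6k + 1).
    if low <= 3 and high >= 5:
        count += 1
    k = 1
    while 6 * k + 1 <= high:
        if 6 * k - 1 >= low and is_prime(6 * k - 1) and is_prime(6 * k + 1):
            count += 1
        k += 1
    return count


if __name__ == "__main__":
    print(count_twin_primes(1, 40))
```

Starting k at 1 regardless of low, and filtering on 6k - 1 >= low inside the loop, avoids the original program's mix-up between the loop counter and the interval bounds.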

Finding the big theta bound

Give big theta bound for:
for (int i = 0; i < n; i++) {
if (i * i < n) {
for (int j = 0; j < n; j++) {
count++;
}
}
else {
int k = i;
while (k > 0) {
count++;
k = k / 2;
}
}
}
So here's what I think, though I'm not sure it's right:
The first for loop will run for n iterations. Then the for loop within the first for loop will run for n iterations as well, giving O(n^2).
For the else statement, the while loop will run for n iterations and the k = k / 2 will run for log n time, giving O(n log n). So the entire thing will look like n^2 + n log n, and by taking the bigger run time, the answer would be theta n^2?
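I would say the result is Θ(n√n). The condition i*i < n holds only for roughly the first √n values of i, and each of those iterations runs the inner for loop n times, which costs about n√n in total. For every other i the else branch runs, costing about log i per iteration, so O(n log n) over the whole loop. Since n√n grows faster than n log n, the if branch dominates even though it is taken rarely.
Example:
n = 10000
after i = 100 the else part is executed instead of the inner for loop; the first 100 iterations already contribute 100 * 10000 = 1,000,000 increments, while each of the remaining 9,900 iterations contributes at most 14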
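One way to sanity-check a proposed bound is to run the loop and compare the final value of count against candidate growth rates. The sketch below is a direct Python transcription of the loop from the question; `count_operations` is a name chosen for this example, and the ratios it prints are only an empirical hint, not a proof.

```python
import math


def count_operations(n):
    """Return the final value of count for the loop in the question."""
    count = 0
    for i in range(n):
        if i * i < n:
            count += n  # the inner for loop increments count n times
        else:
            k = i
            while k > 0:  # halving loop: about log2(i) iterations
                count += 1
                k //= 2
    return count


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        c = count_operations(n)
        # Compare against n*sqrt(n) and n*log2(n) to see which ratio stabilizes.
        print(n, c, c / (n * math.sqrt(n)), c / (n * math.log2(n)))
```

If the first ratio settles toward a constant while the second keeps growing, that points to n√n as the dominant term.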

Total number of values - sum

I need to write a function "long int sum(int n)" that computes the alternating sum:
1! − 2! + 3! − ... ± n!
I've succeeded in writing the sum for:
1 - 3 + 5 - ... ± (2n + 1)
float sum (int n) {
    int max = 2*n + 1, i, sum = 0, ch = 2;
    for (i = 1; i <= max; i += 2) {   /* step through the odd numbers */
        if ((ch % 2) == 0) {
            sum += i;
        }
        else {
            sum = sum - i;
        }
        ch++;
    }
    return sum;   /* return only after the loop has finished */
}
But I don't know/have an idea how to make it for a factorial sum.
it's useful to make another function that does the factorial and one that does the sum of the alternating series . . .
int factorial(int n)
{
int sum = 1;
if (n > 0)
for (int i = n; i > 1; --i)
sum *= i;
else if (n <= 0)
return 0;
return sum;
}
int alternatingSeriesSum(int nStart)
{
    if (nStart < 1) return 0;
    int sum = 0;
    for (int i = 1; i <= nStart; ++i)   // <= so that the n! term is included
        sum += (factorial(i) * ((i % 2) == 0 ? -1 : 1)); // multiply by -1 for even i
    return sum;
}
the factorial is pretty straightforward: multiply by the value, decrement by one and iterate until it reaches 1.
the alternating series sum is similar: it calls factorial on each iteration (except this time the index increases), and creates an alternating sign by multiplying by -1 every time the index is even. this is how we produce 1! - 2! + 3! - 4! + . . . ± n!
i hope that helps . . .
if you cannot split it into functions, try writing this all in one main function . . . i tested this code in C and it works. feel free to play with the code and try to read what each line does. good luck.
Split it into two functions. Instead of
sum += i;
and
sum = sum - i;
try:
sum += factorial(i);
and
sum = sum - factorial(i);
where factorial is some method that computes factorial:
long int factorial(int n) {
long int fact = n;
while ( n > 1) {
n--;
fact *= n;
}
return fact;
}
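Both answers call a separate factorial function for every term, which redoes the same multiplications each time. As a variation (not something either answer proposes), the factorial can be carried along as a running product so that each term costs a single multiplication. Here is a minimal Python sketch; `alternating_factorial_sum` is a hypothetical name.

```python
def alternating_factorial_sum(n):
    """Return 1! - 2! + 3! - ... +/- n! (0 for n < 1)."""
    total = 0
    factorial = 1  # running value of i!
    sign = 1       # +1 for odd i, -1 for even i
    for i in range(1, n + 1):
        factorial *= i
        total += sign * factorial
        sign = -sign
    return total


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, alternating_factorial_sum(n))
```

Python integers do not overflow, but the same loop in C would exceed a 32-bit int from 13! onward, so the long int return type in the question matters there.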