How to solve Project Euler Problem 303 faster? - optimization

The problem is:
For a positive integer n, define f(n) as the least positive multiple of n that, written in base 10, uses only digits ≤ 2.
Thus f(2)=2, f(3)=12, f(7)=21, f(42)=210, f(89)=1121222.
To solve it in Mathematica, I wrote a function f which calculates f(n)/n :
f[n_] := Module[{i}, i = 1;
While[Mod[FromDigits[IntegerDigits[i, 3]], n] != 0, i = i + 1];
Return[FromDigits[IntegerDigits[i, 3]]/n]]
The principle is simple: enumerate all numbers whose digits are only 0, 1, and 2 by writing a counter in base 3, until one of those numbers is divisible by n.
It correctly gives 11363107 for 1~100, and I tested it for 1~1000 (the calculation took roughly a minute and gave 111427232491), so I started to calculate the answer to the problem.
However, this method is too slow. The computer has been calculating the answer for two hours and still hasn't finished.
How can I improve my code to calculate faster?

hammar's comment makes it clear that the calculation time is disproportionately spent on values of n that are multiples of 99. I would suggest finding an algorithm that targets those cases (I have left this as an exercise for the reader) and using Mathematica's pattern matching to direct the calculation to the appropriate one.
f[n_Integer?Positive]/; Mod[n,99]==0 := (* magic here *)
f[n_] := (* case for all other numbers *) Module[{i}, i = 1;
While[Mod[FromDigits[IntegerDigits[i, 3]], n] != 0, i = i + 1];
Return[FromDigits[IntegerDigits[i, 3]]/n]]
Incidentally, you can speed up the fast easy ones by doing it a slightly different way, but that is of course a second-order improvement. You could perhaps set the code up to use ff initially, breaking the While loop if i reaches a certain point, and then switching to the f function you have already provided. (Notice I'm returning n i not i here - that was just for illustrative purposes.)
ff[n_] :=
Module[{i}, i = 1; While[Max[IntegerDigits[n i]] > 2, i++];
Return[n i]]
Table[Timing[ff[n]], {n, 80, 90}]
{{0.000125, 1120}, {0.001151, 21222}, {0.001172, 22222}, {0.00059,
11122}, {0.000124, 2100}, {0.00007, 1020}, {0.000655,
12212}, {0.000125, 2001}, {0.000119, 2112}, {0.04202,
1121222}, {0.004291, 122220}}
This is at least a little faster than your version (reproduced below) for the short cases, but it's much slower for the long cases.
Table[Timing[f[n]], {n, 80, 90}]
{{0.000318, 14}, {0.001225, 262}, {0.001363, 271}, {0.000706,
134}, {0.000358, 25}, {0.000185, 12}, {0.000934, 142}, {0.000316,
23}, {0.000447, 24}, {0.006628, 12598}, {0.002633, 1358}}

A simple thing that you can do is to compile your function to C and make it parallelizable.
Clear[f, fCC]
f[n_Integer] := f[n] = fCC[n]
fCC = Compile[{{n, _Integer}}, Module[{i = 1},
While[Mod[FromDigits[IntegerDigits[i, 3]], n] != 0, i++];
Return[FromDigits[IntegerDigits[i, 3]]]],
Parallelization -> True, CompilationTarget -> "C"];
Total[ParallelTable[f[i]/i, {i, 1, 100}]]
(* Returns 11363107 *)
The problem is that eventually your integers will be larger than a long integer and Mathematica will revert to the non-compiled arbitrary precision arithmetic. (I don't know why the Mathematica compiler does not include an arbitrary precision C library...)
As ShreevatsaR commented, the Project Euler problems are often designed to run quickly if you write smart code (and think about the math), but take forever if you want to brute force them. See the about page. Also, spoilers posted on their message boards are removed, and it's considered bad form to post spoilers on other sites.
Aside:
You can test that the compiled code is using 32-bit longs by running
In[1]:= test = Compile[{{n, _Integer}}, {n + 1, n - 1}];
In[2]:= test[2147483646]
Out[2]= {2147483647, 2147483645}
In[3]:= test[2147483647]
During evaluation of In[53]:= CompiledFunction::cfn: Numerical error encountered at instruction 1; proceeding with uncompiled evaluation. >>
Out[3]= {2147483648, 2147483646}
In[4]:= test[2147483648]
During evaluation of In[52]:= CompiledFunction::cfsa: Argument 2147483648 at position 1 should be a machine-size integer. >>
Out[4]= {2147483649, 2147483647}
and similar for the negative numbers.

I am sure there must be better ways to do this, but this is as far as my inspiration got me.
The following code finds all values of f[n] for n 1-10,000 except the most difficult one, which happens to be n = 9999. I stop the loop when we get there.
ClearAll[f];
i3 = 1;
divNotFound = Range[10000];
While[Length[divNotFound] > 1,
i10 = FromDigits[IntegerDigits[i3++, 3]];
divFound = Pick[divNotFound, Divisible[i10, divNotFound]];
divNotFound = Complement[divNotFound, divFound];
Scan[(f[#] = i10) &, divFound]
] // Timing
Divisible may work on lists for both arguments, and we make good use of that here. The whole routine takes about 8 min.
For 9999 a bit of thinking is necessary. It is not brute-forceable in a reasonable time.
Let P be the factor we are looking for and T (consisting only of 0's, 1's and 2's) the result of multiplying P by 9999, that is,
9999 P = T
then
P(10,000 - 1) = 10,000 P - P = T
==> 10,000 P = P + T
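Let P1, ..., PL be the digits of P and Ti the digits of T. Writing out the addition P + T = 10,000 P column by column (T has four more digits than P, and the sum is P followed by four zeros), we have Pi = Pi-4 + Ti plus any carry, with Pi-4 taken as 0 for i ≤ 4.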
The last four zeros in the sum originate of course from the multiplication by 10,000. Hence TL+1,...,TL+4 and PL-3,...,PL are each other's complement. Where the former consists only of 0, 1, 2, the latter allows:
last4 = IntegerDigits[#][[-4 ;; -1]] & /@ (10000 - FromDigits /@ Tuples[{0, 1, 2}, 4])
==> {{0, 0, 0, 0}, {9, 9, 9, 9}, {9, 9, 9, 8}, {9, 9, 9, 0}, {9, 9, 8, 9},
{9, 9, 8, 8}, {9, 9, 8, 0}, {9, 9, 7, 9}, ..., {7, 7, 7, 9}, {7, 7, 7, 8}}
There are only 81 allowable sets, with 7's, 8's, 9's and 0's (not all possible combinations of them) instead of 10,000 numbers, a speed gain of a factor of 120.
One can see that P1-P4 can only be ternary digits, each being the sum of a ternary digit and zero. There can be no carry from the addition of T5 and P1. A further reduction comes from realizing that P1 cannot be 0 (the first digit must be nonzero), and if it were a 2, multiplication by 9999 would produce an 8 or a 9 (if a carry occurs) in T, which is not allowed either. It must be a 1 then. Twos may also be excluded for P2-P4.
Since P5 = P1 + T5 it follows that P5 < 4 as T5 < 3, same for P6-P8.
Since P9 = P5 + T9 it follows that P9 < 6, same for P10-P11.
In all these cases the additions don't need to include a carry, as none can occur (Pi+Ti always < 8). This may not be true for P12 if L = 16. In that case we can have a carry from the addition of the last 4 digits. So P12 < 7. This also excludes P12 from being in the last block of 4 digits. The solution must therefore be at least 16 digits long.
Combining all this we are going to try to find a solution for L=16:
Do[
If[Max[IntegerDigits[
9999 FromDigits[{1, 1, 1, 1, i5, i6, i7, i8, i9, i10, i11, i12}~
Join~l4]]
] < 3,
Return[FromDigits[{1, 1, 1, 1, i5, i6, i7, i8, i9, i10, i11, i12}~Join~l4]]
],
{i5, 0, 3}, {i6, 0, 3}, {i7, 0, 3}, {i8, 0, 3}, {i9, 0, 5},
{i10, 0, 5}, {i11, 0, 5}, {i12, 0, 6}, {l4,last4}
] // Timing
==> {295.372, 1111333355557778}
and indeed 1,111,333,355,557,778 x 9,999 = 11,112,222,222,222,222,222
We could have guessed this as
f[9] = 12,222
f[99] = 1,122,222,222
f[999] = 111,222,222,222,222
The pattern apparently being that the number of 1's increases by 1 each step and the number of consecutive 2's by 4.
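The guessed pattern is easy to sanity-check. Here is a small Python sketch (not part of the Mathematica session above; the helper name pattern_candidate is made up for illustration) that builds k ones followed by 4k twos and confirms the result is a multiple of 10^k - 1. It only checks divisibility, not that the candidate is the least such multiple.

```python
def pattern_candidate(k):
    """Hypothetical helper: k ones followed by 4*k twos, as an integer."""
    return int("1" * k + "2" * (4 * k))


for k in range(1, 5):
    n = 10**k - 1                      # 9, 99, 999, 9999
    candidate = pattern_candidate(k)
    assert candidate % n == 0          # divisibility only, not minimality
    print(n, candidate, candidate // n)
```

For k = 4 the quotient printed is 1111333355557778, the same factor the Do loop found.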
At 13 min in total, this is over the 1 min limit for Project Euler. Perhaps I'll look into it some time soon.

Try something smarter.
Build a function F(N) which finds the smallest number with digits in {0, 1, 2} that is divisible by N.
For a given N, the number we are looking for can be written as SUM = 10^n * dn + 10^(n-1) * dn-1 + ... + 10^1 * d1 + 1 * d0 (where the di are the digits of the number).
So you have to find the digits such that SUM % N == 0.
Basically, each digit contributes (10^i * di) % N to SUM % N.
I am not giving any more hints, but the next hint would be to use DP (dynamic programming). Try to figure out how to use DP to find the digits.
For all numbers between 1 and 10000 it took under 1 second in total in C++.
Good luck.
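One possible reading of this hint, sketched in Python rather than C++ and not necessarily what the answerer had in mind: only the remainder mod N of a partial number matters, so a breadth-first search over the N possible remainders, appending digits 0, 1, 2 in increasing order, reaches remainder 0 first via the smallest valid number. The function name least_multiple_012 and the parent table are illustrative choices.

```python
from collections import deque


def least_multiple_012(n):
    """Smallest positive multiple of n whose decimal digits are all 0, 1 or 2."""
    # parent[r] = (previous remainder or None, digit appended to reach r)
    parent = {}
    queue = deque()
    for d in (1, 2):                   # leading digit must be nonzero
        r = d % n
        if r not in parent:
            parent[r] = (None, d)
            queue.append(r)
    while queue:
        r = queue.popleft()
        if r == 0:
            # Walk back through the parents to recover the digits.
            digits = []
            while r is not None:
                r, d = parent[r]
                digits.append(str(d))
            return int("".join(reversed(digits)))
        for d in (0, 1, 2):            # increasing order keeps numbers sorted
            nr = (10 * r + d) % n
            if nr not in parent:
                parent[nr] = (r, d)
                queue.append(nr)
    return None                        # not expected for positive n


print(least_multiple_012(89))
print(sum(least_multiple_012(n) // n for n in range(1, 101)))
```

Each remainder is visited at most once, so the work per N is proportional to N, which is why hard cases such as 9999 stop being a problem.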

Related

Octave: summing indexed elements

The easiest way to describe this is via example:
data = [1, 5, 3, 6, 10];
indices = [1, 2, 2, 2, 4];
result = zeros(1, 5);
I want result(1) to be the sum of all the elements in data whose index is 1, result(2) to be the sum of all the elements in data whose index is 2, etc.
This works but is really slow when applied (changing 5 to 65535) to 64K element vectors:
result = result + arrayfun(@(x) sum(data(indices==x)), 1:5);
I think it's creating 64K vectors with 64K elements that's taking up the time. Is there a faster way to do this? Or do I need to figure out a completely different approach?
for i = [1:5]
idx = indices(i);
result(idx) = result(idx) + data(i);
endfor
But that's a very non-octave-y way to do it.
Seeing how MATLAB is very similar to Octave, I will provide an answer that was tested on MATLAB R2016b. Looking at the documentation of Octave 4.2.1 the syntax should be the same.
All you need to do is this:
result = accumarray(indices(:), data(:), [5 1]).'
Which gives:
result =
1 14 0 10 0
Reshaping to a column vector (arrayName(:) ) is necessary because of the expected inputs to accumarray. Specifying the size as [5 1] and then transposing the result was done to avoid some MATLAB error.
accumarray is also described in depth in the MATLAB documentation
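For readers who want to see what accumarray computes here, this is a rough pure-Python sketch of the same accumulation in a single pass. It is an illustration only, not Octave code; the function name accumulate_by_index is invented, and the indices are kept 1-based to match the Octave example.

```python
def accumulate_by_index(indices, data, size):
    """Sum data[i] into slot indices[i] (1-based, as in Octave)."""
    result = [0] * size
    for idx, value in zip(indices, data):
        result[idx - 1] += value
    return result


print(accumulate_by_index([1, 2, 2, 2, 4], [1, 5, 3, 6, 10], 5))
# prints [1, 14, 0, 10, 0]
```

Each element is touched once, so the cost grows with the length of data instead of with the number of slots times the length of data.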

Maximum contiguous subsequence sum of x elements

So I came up with a question that I've searched for without finding an answer: what's the best (and by best, I mean the fastest) way to get the maximum contiguous subsequence sum of x elements?
Imagine that I have: A[] = {2, 4, 1, 10, 40, 50, 22, 1, 24, 12, 40, 11, ...}.
And then I ask:
"What is the maximum contiguous subsequence on array A with 3 elements?"
Please imagine this in an array with more than 100000 elements... Can someone help me?
Thank you for your time and your help!
I Googled it and found this:
Using Divide and Conquer approach, we can find the maximum subarray sum in O(nLogn) time. Following is the Divide and Conquer algorithm.
The Kadane’s Algorithm for this problem takes O(n) time. Therefore the Kadane’s algorithm is better than the Divide and Conquer approach
See the code:
Initialize:
    max_so_far = 0
    max_ending_here = 0
Loop for each element of the array
    (a) max_ending_here = max_ending_here + a[i]
    (b) if (max_ending_here < 0)
            max_ending_here = 0
    (c) if (max_so_far < max_ending_here)
            max_so_far = max_ending_here
return max_so_far
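Note that Kadane's algorithm finds the best sum over subarrays of any length. If the question is read as asking for exactly x consecutive elements, a sliding window is a natural O(n) alternative. The following Python sketch assumes that reading; the function name max_window_sum is illustrative, and the sample data is only the part of A listed in the question.

```python
def max_window_sum(values, x):
    """Largest sum of exactly x consecutive elements, in O(n) time."""
    if x <= 0 or x > len(values):
        raise ValueError("x must be between 1 and len(values)")
    window = sum(values[:x])           # sum of the first window
    best = window
    for i in range(x, len(values)):
        # Slide right: add the entering element, drop the leaving one.
        window += values[i] - values[i - x]
        best = max(best, window)
    return best


a = [2, 4, 1, 10, 40, 50, 22, 1, 24, 12, 40, 11]
print(max_window_sum(a, 3))
# prints 112 (from 40 + 50 + 22)
```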

Create a new generation using replication and crossover in genetic algorithm

Hi all, I am studying genetic algorithms and how to create a new generation. I have a problem with the following exercise:
This question refers to Genetic Algorithms. Assume you have a population made of 10 individuals. Each individual is made of 5 bits. Here is the initial population.
x1 = (1, 0, 0, 1, 1)
x2 = (1, 1, 0, 0, 1)
x3 = (1, 1, 0, 1, 1)
x4 = (1, 1, 1, 1, 1)
x5 = (0, 0, 0, 1, 1)
x6 = (0, 0, 1, 1, 1)
x7 = (0, 0, 0, 0, 1)
x8 = (0, 0, 0, 0, 0)
x9 = (1, 0, 1, 1, 1)
x10 = (1, 0, 0, 1, 0)
Individuals are ranked according to fitness value (x1 has the greatest fitness value, x2 the second best, etc.). Assume that when sampling, you get individuals in the same order as they are ranked. Create a new generation of solutions assuming the following:
Replication is 20%. Cross over is 80% (assume a crossover mask as follows: 11100; pair examples in the same order as ranked). No mutation is done.
My solution: replication is 20%, which means the first two individuals (x1 and x2) are carried over unchanged. Next, given the crossover mask 11100, I take the three leading 1s to mean that the first three bits are kept. So, starting with x3 and x4, I keep the first 3 bits of both x3 and x4 the same, swap the remaining two bits between x3 and x4, and so generate the new individuals. I follow the same rule for x5 and x6, x7 and x8, and x9 and x10. I am not sure whether this answer is correct. Can anybody help me, please?
I don't know the background of the implementation you are using so I may not be correct, but from a genetic algorithm point of view most of your answer seems correct.
As far as I can see, the only issue in your reasoning is with the crossover. After replication has taken place you use the remaining chromosomes for crossover. This seems inherently flawed from a genetic algorithms point of view. Genetic algorithms generally use the best chromosome in crossover. You've already saved the best and seem to then exclude them from any recombination. This idea goes against the idea of genetic algorithms, which is to evolve the population by means of recombination of the fittest individuals. At the very least, the fittest chromosomes should be included.
Generally, most implementations involve an element of randomness in selection with the fittest chromosomes given more weighting. Since your question explicitly states that pairs are selected in order of ranking, and therefore no randomness, I'd assume crossover is to be performed on chromosomes 1 to 8.
Your understanding of the crossover mask seems correct from the question.
Again, I know nothing of the implementation in question so I'm not sure how good my understanding is. I'd be interested to know the source since the genetic algorithm seems highly unusual.
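To make the mask mechanics concrete, here is a minimal Python sketch. It assumes the reading used in the question (a mask bit of 1 keeps the individual's own bit, a mask bit of 0 takes the partner's bit), which the exercise itself does not spell out. The function name crossover is illustrative, and the pair x1, x2 is used only as an example.

```python
def crossover(parent_a, parent_b, mask):
    """Mask bit 1 keeps the parent's own bit; mask bit 0 takes the partner's bit."""
    child_a = tuple(a if m else b for a, b, m in zip(parent_a, parent_b, mask))
    child_b = tuple(b if m else a for a, b, m in zip(parent_a, parent_b, mask))
    return child_a, child_b


mask = (1, 1, 1, 0, 0)
x1 = (1, 0, 0, 1, 1)
x2 = (1, 1, 0, 0, 1)
print(crossover(x1, x2, mask))
# prints ((1, 0, 0, 0, 1), (1, 1, 0, 1, 1))
```

Which pairs are fed to the function (x1 to x8 as suggested above, or x3 to x10 as in the question) is a separate choice that the sketch does not settle.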

Modulo expression arithmetic equation

How do I solve:
(2a^2-195) mod 26 = 1
I have tried the following: let x = (2a^2-195);
if x mod 26 = 1 then x = 27, 53, 79, 105, ...
but I could not find an answer. How can I solve this mathematically?
Thank you!
Well, 2a²-195 = 1 (26) is the same as
2a² = 196 (26) <==> 2a² = 14 (26) <==> a² = 7 (13).
I'm sure you can take it from there...
Spoiler: no value of a satisfies the congruence since 7 is not a square mod 13. You can check this fact by calculating 7⁶ and finding that 7⁶ = -1 (13), or by enumerating the squares mod 13 (there are six: 1, 4, 9, 3, 12 and 10) and observing that 7 is not in the list.
You could have found there are no solutions also by testing the original equation with a = 0, 1, .., 25. None of them satisfy the congruence, and you arrive at the same conclusion.
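Both checks are short enough to run directly. A small Python sketch of the brute-force test and of the list of squares mod 13 (the variable names are illustrative):

```python
# Try every residue a = 0..25 in the original congruence.
solutions = [a for a in range(26) if (2 * a * a - 195) % 26 == 1]

# Nonzero squares modulo 13.
squares_mod_13 = sorted({(a * a) % 13 for a in range(1, 13)})

print(solutions)        # prints []
print(squares_mod_13)   # prints [1, 3, 4, 9, 10, 12]
print(pow(7, 6, 13))    # prints 12, i.e. -1 mod 13
```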

Modular arithmetic in VB

I have a number x which I need to find such that (40 mod x) = 1
a possible answer for x is 3, or 39 as it goes into the number 40 and leaves a remainder of 1.
What kind of code would I need if I was to find all possible answers of x?
Mathematically, to solve (a mod x) = b, just find all of the divisors of a-b that aren't divisors of a (more precisely, those greater than b, since a remainder is always smaller than the divisor). E.g. for (40 mod x) = 1, find the divisors of 40 - 1 (i.e. 39), which are 3, 13, and 39. The divisors of 40 are 2, 4, 5, 8, 10, 20, 40. None of the numbers in the first set are in the second, so the solutions are 3, 13, and 39.
For (40 mod x) = 5, you find the divisors of 40 - 5 (i.e. 35), which are 5, 7, and 35. 5 is on the list of divisors of 40, but the other two aren't, so the solutions are 7 and 35.
Of course, for such small numbers, it's more work to find all of the factors of a and a-b than it is to simply do all of the trial divisions of a by x, so the right way to solve your problem is to take exactly the question you asked and put it into code (forgive my VB, I haven't written any in the past 15 years or so...)
For x = 2 To 39
    If (40 Mod x) = 1 Then
        MsgBox(x)
    End If
Next
Enumerable.Range(1, 40).Where(Function(x) 40 Mod x = 1)
The answer to that question is the set of unique integer factors of 39 other than 1 (40 mod 1 is 0).
You can find them by looping from 1 to Math.Sqrt(39) and checking divisibility.
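The divisor approach can be sketched as follows, in Python rather than VB. It assumes a > b >= 0 and uses the fact that a remainder is always smaller than the divisor, so only divisors of a - b that exceed b qualify. The function name mod_solutions is made up for illustration.

```python
import math


def mod_solutions(a, b):
    """All x with a % x == b, found as the divisors of a - b that exceed b."""
    target = a - b
    if b < 0 or target <= 0:
        raise ValueError("this sketch assumes a > b >= 0")
    found = set()
    # Each divisor d <= sqrt(target) pairs with target // d.
    for d in range(1, math.isqrt(target) + 1):
        if target % d == 0:
            found.update((d, target // d))
    return sorted(x for x in found if x > b)


print(mod_solutions(40, 1))   # prints [3, 13, 39]
print(mod_solutions(40, 5))   # prints [7, 35]
```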