Project Euler question #38 wrong answer? I found an answer that seems better than the one the question provides

Using my program, I arrived at an answer of 9999, which gives a pandigital product of 999919998, and this is the largest possible pandigital number. However, this answer is wrong. Can someone explain why?
Link to the problem statement: https://projecteuler.net/problem=38

Your number is not pandigital:
In mathematics, a pandigital number is an integer that in a given base has among its significant digits each digit used in the base at least once.
Your answer is missing the digits 2, 3, 4, 5, 6 and 7, so it is not pandigital.
Be aware that Project Euler calls this the concatenated product, not the pandigital product as stated in your question. I guess this is where the confusion came from.
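To make the failure concrete, here is a minimal sketch (in Python; the function name is my own) of the 1-9 pandigital check the problem requires:

```python
# A 9-digit number is 1-9 pandigital iff its digits are exactly 1..9,
# each used once (length 9 plus the full digit set guarantees this).
def is_1_to_9_pandigital(n):
    s = str(n)
    return len(s) == 9 and set(s) == set("123456789")

print(is_1_to_9_pandigital(999919998))  # False: only the digits 1, 8, 9 appear
print(is_1_to_9_pandigital(918273645))  # True: the example from the problem
```

918273645 (the concatenated product of 9 and (1,2,3,4,5) given in the problem statement) passes the check, while 999919998 fails because it repeats digits instead of using each of 1 through 9 exactly once.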


Number of divisors for very large number

Is there any fast way to find the total number of divisors of a very large number, say around 10^18?
I have tried a method which is O(n^(1/3)).
Forgive me for asking a direct question without providing any background.
The fastest factoring algorithm for numbers of that size would be the general number field sieve (GNFS), which probably fits your question.
On this question there is some ongoing discussion about how to efficiently find all divisors. GNFS is used for factorization, so it only gives you the prime factors; you'll have to derive all divisors from those, if you need them. See: How do I find all divisors from a prime factorization?
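To make that last step concrete, here is a minimal Python sketch (names my own) of deriving the divisor count from a prime factorization: if n = p1^a1 * ... * pk^ak, then n has (a1+1)(a2+1)...(ak+1) divisors. Trial division is used for simplicity; for n near 10^18 you would swap in a faster factoring method.

```python
# Count divisors of n via its prime factorization.
# If n = p1^a1 * ... * pk^ak, the divisor count is (a1+1) * ... * (ak+1).
def divisor_count(n):
    count = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            exp = 0
            while n % p == 0:
                n //= p
                exp += 1
            count *= exp + 1
        p += 1
    if n > 1:          # a prime factor > sqrt(original n) remains
        count *= 2
    return count

print(divisor_count(60))  # 12, since 60 = 2^2 * 3 * 5 -> (2+1)*(1+1)*(1+1)
```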
Further reading: A beginner's guide to the Number Field Sieve; Factoring Integers with the Number Field Sieve

Limiting chosen variables solved for in OpenSolver [closed]

I've got a linear program of 17 equations and 506 variables that solves for a minimum summation of the total variables. This works fine so far, but the solution is a result of a combination of 19 variables.
In the end, though, I want to limit the number of chosen variables to 10, without knowing in advance which ones are the optimal ones (the solver figures that out for me, as well as their ratio).
I figured I could set a boolean = 1 if the value becomes larger than 0 (meaning the variable is picked) and 0 if the variable is not picked for the optimal solution, and then require the sum of the booleans to be at most 10.
However, this seems a bit elaborate, and I was wondering whether there is a built-in option in OpenSolver, for I think it is quite a common problem to solve a large set with a subset.
So does anyone have a suggestion on:
Whether (and how badly) my elaborate way decreases performance? (I have no intrinsic comprehension of the OpenSolver algorithms yet.)
A way, more easily or within the OpenSolver options, to account for my desired maximum of 10 solution variables?
Based on the information provided below, I first scaled down the size of the problem.
I have three lists of data with 18 entries each, in columns:
W7:W23, AC7:AD23
which manually (with W28 = 6000, AC28 = 600, W29 = 1, AC29 = 1), in a linear combination, equal/exceed the target list:
EGM34:EGM50
So what I did was put the decision variables in W28:W29 and AC28:AD29.
I added the constraint W28,AC28:AD28 = integer in the solver (both in the original Excel solver and in OpenSolver),
and I added the constraint W29,AC29:AD29 = boolean in the solver (both in the original Excel solver and in OpenSolver).
Then I multiply integer * boolean to get the actual multiplication factor for the above lists (W7:W23, etc.).
In order to limit the number of chosen variables I have also tried, in addition to the described constraints, limiting the cell =SUM(W29,AC29:AD29) to <= 10 (effectively reducing the number of booleans set to true to below 11, or so I thought, but the booleans aren't evaluated as booleans by the solver).
These new multiplied lists are placed in W34:W50, AC34:AD50, and the summation is situated in EGY34:EGY50. Hence the final check is added as a constraint:
EGY34:EGY50 >= EGM34:EGM50
I also had a question about how the linear solver evaluates these constraints. Does it:
a. think the sum of EGY34:EGY50 must be greater than or equal to the sum of EGM34:EGM50, or
b. think "for every row x, EGYx must be greater than or equal to EGMx"?
So far I've observed b., but I would like to make sure.
But my main question concerns:
After using the evolutionary algorithm, as was kindly suggested in the comments below, how/why does it try values such as 0.99994 for the decision variables designated as booleans?
The introduction of binary variables is indeed the standard way to implement such constraints. Unfortunately, it transforms the problem from a linear programming problem into an integer programming problem (specifically, a mixed integer linear programming problem). A standard approach to such problems is the branch and bound algorithm. This is what Excel's built-in solver seems to use; I'm not sure about OpenSolver. In the best case (where there is a lot of bounding) it will run fairly rapidly, even with problems of your size. In the worst case, for your problem, it could be little better than running the simplex algorithm C(506,10) = 2.8 x 10^20 times (once for each possible set of 10 decision variables). In other words, it might be computationally infeasible. Integer programming is known to be NP-hard.
If an exact solution is infeasible, you could always use a heuristic algorithm such as an evolutionary approach.
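For concreteness, here is a minimal sketch of the binary-variable formulation described above, written in Python with the PuLP library rather than in OpenSolver (all sizes, names, and bounds here are illustrative assumptions, not the asker's actual workbook):

```python
import pulp

n = 20          # toy size; the real problem has 506 variables
max_picked = 3  # the real limit is 10
M = 10000       # big-M: any valid upper bound on a single x[i]

prob = pulp.LpProblem("limited_selection", pulp.LpMinimize)
x = [pulp.LpVariable(f"x{i}", lowBound=0) for i in range(n)]    # amounts
b = [pulp.LpVariable(f"b{i}", cat="Binary") for i in range(n)]  # picked?

prob += pulp.lpSum(x)                  # minimize the total of the variables
for i in range(n):
    prob += x[i] <= M * b[i]           # x[i] > 0 forces b[i] = 1
prob += pulp.lpSum(b) <= max_picked    # at most max_picked variables chosen
# ...plus the original covering constraints would go here, e.g.
# prob += pulp.lpSum(a[j][i] * x[i] for i in range(n)) >= target[j]

prob.solve()                           # branch and bound on the binaries
```

The x[i] <= M*b[i] linking constraint is exactly the "boolean = 1 if the value becomes larger than 0" idea from the question, expressed in a form a MILP solver can branch on.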

Neural Network Input and Output Data formatting

Hello, and thanks for reading my thread.
I have read some of the previous posts on formatting/normalising input data for a Neural Network, but cannot find something that addresses my queries specifically. I apologise for the long post.
I am attempting to build a radial basis function network for analysing horse racing data. I realise that this has been done before, but the data that I have is "special" and I have a keen interest in racing/sportsbetting/programming so would like to give it a shot!
Whilst I think I understand the principles for the RBFN itself, I am having some trouble understanding the normalisation/formatting/scaling of the input data so that it is presented in a "sensible manner" for the network, and I am not sure how I should formulate the output target values.
For example, in my data I look at the "class change", which compares the class of race that the horse is running in now with the race before, and can have a value between -5 and +5. I expect that I need to rescale these to between -1 and +1 (right?!), but I have noticed that many more runners have a class change of 1, 0 or -1 than any other value, so I am worried about over-representation. It is not possible to gather more data for the higher/lower class changes because that's just 'the way the data comes'. Would it be best to use the data as-is after scaling, or should I trim the extreme values, or something else?
Similarly, there are "continuous" inputs, like the "Days Since Last Run". It can have a value between 1 and about 1000, but values in the range of 10-40 vastly dominate. I was going to scale these values to be between 0 and 1, but even if I trim the most extreme values before scaling, I am still going to have a huge representation of a certain range. Is this going to cause me an issue? How are problems like this usually dealt with?
Finally, I am having trouble understanding how to present the "target" values for training to the network. My existing results data has the "win/lose" (0 or 1?) and the odds at which the runner won or lost. If I just use the "win/lose", it treats all wins and all losses the same, when really they're not: I would be quite happy with a network that ignored all the small winners but was highly profitable from picking 10-1 shots. Similarly, a network could be forgiven for "losing" on a 20-1 shot, but losing a bet at 2/5 would be a bad loss. I considered making the results (+1 * odds) for a winner and (-1 / odds) for a loser to capture the issue above, but this will mean that my results are not a continuous function, as there will be a "discontinuity" between short-priced winners and short-priced losers.
Should I have two outputs to cover this - one for bet/no bet, and another for "stake"?
I am sorry for the flood of questions and the long post, but this would really help me set off on the right track.
Thank you for any help anyone can offer me!
Kind regards,
Paul
The documentation that came with your RBFN is a good starting point to answer some of these questions.
Trimming data, aka "clamping" or "winsorizing", is something I use for similar data. For example, "days since last run" for a horse could be anything from just one day to several years, but tends to centre in the region of 20 to 30 days. Some experts use a figure of, say, 63 days to indicate a "spell", so you could have an indicator variable like "> 63 = 1, else 0", for example. One approach is to look at the outliers, say the upper or lower 5% of any variable, and clamp these.
If you use odds/dividends anywhere, make sure you use the implied probabilities, i.e. 1/(odds+1), and a useful idea is to normalize these to sum to 100%.
The odds or parimutuel prices tend to swamp other predictors, so one technique is to develop separate models: one for the market variables (the market model) and another for the non-market variables (often called the "fundamental" model).
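As a rough sketch of these preprocessing ideas (Python with NumPy; all thresholds and values here are illustrative, not taken from the asker's data):

```python
import numpy as np

days = np.array([1, 12, 18, 25, 31, 40, 63, 200, 950], dtype=float)

# Winsorize/clamp the extreme 5% on each side, then rescale to [0, 1].
lo, hi = np.percentile(days, [5, 95])
scaled = (np.clip(days, lo, hi) - lo) / (hi - lo)

# Indicator variable for a "spell" (more than 63 days since the last run).
spell = (days > 63).astype(float)

# Implied probabilities from fractional odds, normalized to sum to 100%.
odds = np.array([2.0, 5.0, 10.0, 20.0])   # 2/1, 5/1, 10/1, 20/1
probs = 1.0 / (odds + 1.0)
probs /= probs.sum()                      # removes the bookmaker's overround
```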

Explain, based on knowledge of the hardware, why "certain floating-point values cannot be exactly represented inside the computer’s memory"?

In the second line of the program’s output, notice that the value of 331.79, which is assigned to floatingVar, is actually displayed as 331.790009. The reason for this inaccuracy is the particular way in which numbers are internally represented inside the computer. You have probably come across the same type of inaccuracy when dealing with numbers on your calculator. If you divide 1 by 3 on your calculator, you get the result .33333333, with perhaps some additional 3s tacked on at the end. The string of 3s is the calculator’s approximation to one third. Theoretically, there should be an infinite number of 3s. But the calculator can hold only so many digits, thus the inherent inaccuracy of the machine. The same type of inaccuracy applies here: certain floating-point values cannot be exactly represented inside the computer’s memory.
The above quote comes from Programming in Objective-C, 4th edition.
This post answered a small part of it, but it is not the kind of answer I'm looking for.
I will try to find another book about this later in the day.
Anyway, if anyone would like to answer this question, thanks!
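For what it's worth, the book's exact symptom is easy to reproduce. A minimal Python demonstration (using NumPy to get a 32-bit float, since Python's own floats are 64-bit doubles):

```python
import numpy as np

x = np.float32(331.79)     # the nearest 32-bit float to 331.79
print(f"{x:.6f}")          # 331.790009, matching the book's output

print(f"{331.79:.20f}")    # the double-precision error is smaller but nonzero
```

331.79 has no finite binary expansion (just as 1/3 has no finite decimal expansion), so the hardware stores the nearest representable value instead.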

Abstract Binary Search

In the article http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binarySearch,
the author says:
Careful readers may note that binary search can also be used when a
predicate yields a series of yes answers followed by a series of no
answers. This is true and complementing that predicate will satisfy
the original condition. For simplicity we'll deal only with predicates
described in the theorem.
I couldn't get what he meant. Could someone please explain?
Thanks
Imagine you're doing a binary search on a set of numbers: for the search to work, you need to put the numbers in order, so that the question "is this number less than the number I'm searching for" gives yeses followed by nos.
Example: searching for number 8 in the sequence [1,1,2,3,5,8,13,21]
is 1 less than 8 ? "yes"
is 1 less than 8 ? "yes"
is 2 less than 8 ? "yes"
is 3 less than 8 ? "yes"
is 5 less than 8 ? "yes"
is 8 less than 8 ? "no"
is 13 less than 8 ? "no"
is 21 less than 8 ? "no"
This means if you looked at, say, the middle number in the sequence, you could tell instantly whether your target number was before or after that mid-point (if you get a 'no', look before; if you get a 'yes', look after). You can then exclude the unwanted half of the series and repeat the process with the remaining half...
This way of halving the search field at each step is the key to binary search, and guarantees you will find the target in O(log n) time.
Looking at the second part of your paragraph:
complementing that predicate will satisfy the original condition
To complement the predicate means to swap 'yes' and 'no', which would give us 'a series of no answers followed by a series of yes answers', which is referred to in the previous paragraph (the original condition).
So in summary, your quote is saying that 'yes followed by no' will work just as well as 'no followed by yes'.
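As a quick illustration, here is a minimal Python sketch (names my own) of binary search over such a predicate: given a series of "yes" answers followed by a series of "no" answers, it finds the index of the first "no" in O(log n) steps:

```python
# Find the first index i where pred(xs[i]) is False, assuming the predicate
# yields a series of True ("yes") answers followed by False ("no") answers.
def first_no(xs, pred):
    lo, hi = 0, len(xs)       # the answer lies somewhere in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if pred(xs[mid]):     # "yes": the boundary must be after mid
            lo = mid + 1
        else:                 # "no": the boundary is at mid or before it
            hi = mid
    return lo

xs = [1, 1, 2, 3, 5, 8, 13, 21]
print(first_no(xs, lambda v: v < 8))   # 5 -> xs[5] == 8, the first "no"
```

Complementing the predicate (swapping True and False) and searching for the first "yes" handles the 'no followed by yes' case from the theorem.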
He is talking about formal logic, and using terms from formal logic.
Directly from the article:
"Behind the cryptic mathematics I am really stating that if you had a yes or no question (the predicate), getting a yes answer for some potential solution x means that you'd also get a yes answer for any element after x. Similarly, if you got a no answer, you'd get a no answer for any element before x. As a consequence, if you were to ask the question for each element in the search space (in order), you would get a series of no answers followed by a series of yes answers."
I think you're going to need to use some elbow grease and buff up on those terms. I'm not sure which part you're having trouble with.