I have been struggling with this for several hours now, and I can't seem to find the answer here either (there are many posts about binary heaps, but I did not find this particular problem).
The problem is:
For a Binary Heap with 1492 nodes, the number of nodes of height two is _187_.
I understand that with 1492 nodes, the binary heap has depth floor(log(1492)/log(2)) = 10,
so height two should have 2^(10-2) nodes, which would be 256.
Why is the answer 187?
In case someone needs to know: I found out the formula is n / 2^(h+1), rounded up, so 1492 / 2^(2+1) = 186.5, which rounds up to 187.
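A quick way to check the figure is to count heights directly. The sketch below is an illustration, not part of the original exercise: it assumes the usual 1-indexed array layout (children of node i are 2i and 2i+1) and uses a counting argument of my own, namely that the nodes of height at least h are exactly nodes 1..floor(n / 2^h). The function names are hypothetical.

```python
def nodes_of_height(n: int, h: int) -> int:
    """Closed form: nodes with height >= h are nodes 1..floor(n / 2^h)."""
    return (n >> h) - (n >> (h + 1))


def nodes_of_height_brute(n: int, h: int) -> int:
    """Brute force: walk down the leftmost path from every node."""
    count = 0
    for i in range(1, n + 1):
        height, node = 0, i
        while 2 * node <= n:  # the left child exists
            node *= 2
            height += 1
        if height == h:
            count += 1
    return count


def ceil_bound(n: int, h: int) -> int:
    """The rounded-up formula n / 2^(h+1) from the post."""
    return -(-n // 2 ** (h + 1))


n, h = 1492, 2
print(nodes_of_height(n, h), nodes_of_height_brute(n, h), ceil_bound(n, h))
```

The 2^(10-2) = 256 estimate only holds when the bottom level is completely full. Here the last level holds just 1492 - 1023 = 469 of its 1024 possible nodes, so many nodes two levels above the bottom have height 1 rather than 2.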
This is a part of the engine-log output that I get from a small-scale mixed integer linear optimization problem that I solved in CPLEX 12.7.0
Nodes Cuts/
Node Left Objective IInf Best Integer Best Bound ItCnt Gap
0 0 280.0338 78 280.0338 72
0 0 428.8558 28 Cuts: 89 137
0 0 429.5221 34 Cuts: 2 142
0 0 429.7745 34 MIRcuts: 2 143
* 0+ 0 460.9166 429.7745 6.76%
0 2 429.7745 34 460.9166 429.8666 143 6.74%
Elapsed time = 0.49 sec. (31.07 ticks, tree = 0.01 MB, solutions = 1)
* 35 8 integral 0 438.1448 435.6381 211 0.57%
Cover cuts applied: 17
Implied bound cuts applied: 10
Flow cuts applied: 11
Mixed integer rounding cuts applied: 9
Gomory fractional cuts applied: 24
Root node processing (before b&c):
Real time = 0.45 sec. (31.09 ticks)
Sequential b&c:
Real time = 0.08 sec. (20.80 ticks)
Total (root+branch&cut) = 0.53 sec. (51.89 ticks)
What I understand from this is that the best integer solution (for the objective function) found has the value 438.1448, whereas the relaxed solution (non-integer values) has the value 435.6381 as the best bound.
( 438.1448 / 435.6381 ) - 1 = 0.57% GAP
Does this mean that the solution still has that small gap, however it is proven to be the optimal solution? I had the (maybe wrong) idea that optimality is proven by a 0% gap.
I'm not sure how to interpret it correctly. Thanks for your help in advance.
Your understanding of the best bound isn't 100% correct. You can think of the best bound as the best objective value an integer solution could potentially have, based on information the solver has discovered so far. In your case there might actually be a better solution than the one you found, but if there is, it won't have an objective value better than 435.6381.
A more technical definition of the best bound is the best relaxed-but-region-constrained solution for any region that has not yet been eliminated from the search space. Solvers like CPLEX search for an optimal solution by splitting the search space into sub-regions and then ruling out sub-regions that can't possibly contain the optimal integer-feasible solution. These sub-regions get split into sub-sub-regions, and so on. Within each region, the original problem is modified to force variables to fall within the region. The relaxed solution to this modified problem is the best bound for the region. The best of these region-specific best bounds is the best bound for the problem as a whole.
The best bound changes as regions are ruled out. If the best bound does not equal the best solution, then by definition, there is still at least one region other than the region holding the current incumbent that could potentially hold a better solution. Exploring one of these regions might uncover an even better solution than your current incumbent, or it might lead to the region being ruled out. You don't know which until the region is explored. Only when the best solution equals the best bound do you know for sure that there isn't a better solution hiding in a remaining region.
Yes, you are right. Optimality is proven if the upper bound and the lower bound evaluate to the same value, i.e. CPLEX could prove an optimality gap of 0%.
Since CPLEX stops with a solution that has a gap of 0.57%, I would assume that you configured a MIP gap tolerance below 1%. If you are interested in a proven optimal solution, you should set the MIPGap parameter to zero.
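To make the gap column concrete, here is a minimal sketch that recomputes it from the two log lines marked with `*`. It assumes the relative gap is defined as |best bound - best integer| / (1e-10 + |best integer|), which is how I recall the CPLEX documentation stating it. The function names and the `tolerance` argument are hypothetical and are not CPLEX API calls.

```python
def relative_mip_gap(best_integer: float, best_bound: float) -> float:
    """Relative gap between the incumbent and the best bound (assumed formula)."""
    return abs(best_bound - best_integer) / (1e-10 + abs(best_integer))


def within_tolerance(best_integer: float, best_bound: float, tolerance: float) -> bool:
    """True if the solver would be allowed to stop at this gap."""
    return relative_mip_gap(best_integer, best_bound) <= tolerance


# Values copied from the two incumbent lines of the engine log.
first = relative_mip_gap(460.9166, 429.7745)
final = relative_mip_gap(438.1448, 435.6381)
print(f"{first:.2%}", f"{final:.2%}")

# With a tolerance of 0.0 the search may only stop once bound and incumbent meet.
print(within_tolerance(438.1448, 435.6381, 0.01))
print(within_tolerance(438.1448, 435.6381, 0.0))
```

Dividing by the incumbent instead of the bound, as here, gives a slightly different number than ( 438.1448 / 435.6381 ) - 1, but both round to 0.57% for these values.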
I have an assignment with the following prompt:
The page size for a virtual memory system is 8KB.
The instruction TLB is direct-mapped with 2 sets and each block contains one translation.
^(I don't believe this is relevant for the following 3 questions, as there are two more questions about the TLB)
The number of bits in a virtual address is 20.
The number of bits in a physical address is 15.
(1) What is the number of virtual pages?
I think I have this one figured out.
Page size = 8 * 2^10 = 8192, so the offset is 13 bits.
Virtual page number = 20 - 13 = 7 bits
Virtual pages = 2^7 pages
(2) What is the number of physical pages?
Here's where I'm a little confused. I think I'm supposed to add in the valid, dirty, and reference bits to the physical page number (which is 2, from 15 - 13). However 5 * 2^7 = 640 bytes, which seems incredibly small.
(3) How many bits are used in the virtual address for the page offset?
Answered above, it appears to be 13 bits.
Could anyone point me in the right direction? Thanks!
The valid, dirty, and reference bits are in a page table entry but are not part of the address bits. Therefore, using your results, there are 2^2 or 4 physical pages.
Yes, this does seem small, but realize that there are only 2^15 bytes, or 32 KB, of physical memory.
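The arithmetic for all three questions can be checked with a short sketch. It only assumes that the page size is a power of two; the function name and dictionary keys are made up for illustration.

```python
def paging_layout(page_size_bytes: int, virtual_bits: int, physical_bits: int) -> dict:
    """Split virtual and physical addresses into page-number and offset bits."""
    if page_size_bytes & (page_size_bytes - 1):
        raise ValueError("page size must be a power of two")
    offset_bits = page_size_bytes.bit_length() - 1  # log2 of the page size
    return {
        "offset_bits": offset_bits,
        "virtual_pages": 2 ** (virtual_bits - offset_bits),
        "physical_pages": 2 ** (physical_bits - offset_bits),
        "physical_memory_bytes": 2 ** physical_bits,
    }


layout = paging_layout(page_size_bytes=8 * 1024, virtual_bits=20, physical_bits=15)
print(layout)
```

The status bits (valid, dirty, reference) affect how wide each page table entry is, not how many pages exist, which is why they do not appear in this calculation.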
I was trying to solve Google Code Jam problems and there is one of them that I don't understand. Here is the question (World Finals 2013 - problem C): https://code.google.com/codejam/contest/2437491/dashboard#s=p2&a=2
And here follows the problem analysis: https://code.google.com/codejam/contest/2437491/dashboard#s=a&a=2
I don't understand why we can use binary search. In order to use binary search the elements have to be sorted. In other words: for a given element e, we can't have any element less than e at its right side. But that is not the case in this problem. Let me give you an example:
Suppose we do what the analysis tells us to do: we start with a left bound angle of 90° and a right bound angle of 0°. Our first search will be at angle of 45°. Suppose we find that, for this angle, X < N. In this case, the analysis tells us to make our left bound 45°. At this point, we can have discarded a viable solution (at, let's say, 75°) and at the same time there can be no more solutions between 0° and 45°, leading us to say that there's no solution (wrongly).
I don't think Google's solution is wrong =P. But I can't figure out why we can use a binary search in this case. Does anyone know?
I don't understand why we can use binary search. In order to use
binary search the elements have to be sorted. In other words: for a
given element e, we can't have any element less than e at its right
side. But that is not the case in this problem.
A binary search works in this case because:
the values vary by at most 1
we only need to find one solution, not all of them
the first and last value straddle the desired value (X .. N .. 2N-X)
I don't quite follow your counter-example, but here's an example of a binary search on a sequence with the above constraints. Looking for 3:
1 2 1 1 2 3 2 3 4 5 4 4 3 3 4 5 4 4
[ ]
[ ]
[ ]
[ ]
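The same search can be written out as code. This is a minimal sketch of the idea, not the contest solution: it assumes integer values, neighbours that differ by at most 1, and end values that straddle the target. The function name is hypothetical.

```python
def find_value(seq, target):
    """Bisection on an unsorted sequence whose neighbours differ by at most 1.

    Invariant: seq[lo] <= target <= seq[hi]. Because values change in steps
    of at most 1, the target must occur somewhere between lo and hi.
    """
    lo, hi = 0, len(seq) - 1
    if not (seq[lo] <= target <= seq[hi]):
        raise ValueError("end values must straddle the target")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if seq[mid] == target:
            return mid
        if seq[mid] < target:
            lo = mid  # target is still straddled by seq[mid] .. seq[hi]
        else:
            hi = mid  # target is still straddled by seq[lo] .. seq[mid]
    return lo if seq[lo] == target else hi


seq = [1, 2, 1, 1, 2, 3, 2, 3, 4, 5, 4, 4, 3, 3, 4, 5, 4, 4]
index = find_value(seq, 3)
print(index, seq[index])
```

On this sequence the search probes indices 8, 4, 6 and 7 and stops at index 7. It skips the other occurrences of 3 (for example at index 5), which is fine because only one solution is needed.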
I have read the problem and in the meantime thought about the solution. When I read the solution I saw that they had mostly done the same as I would have; however, I did not think of some minor optimizations they were using, as I was still digesting the task.
Step 1: They choose medians so that each of the two lines splits the set in half. Two opposite provinces will therefore have x mines each, while the other two will have 2 * N - x mines each, because each line leaves 2 * N mines on either side and
2 * x + 2 * (2 * N - x) = 2 * x + 4 * N - 2 * x = 4 * N.
If x = N, then we were lucky and accidentally found a solution.
Step 2: They are taking advantage of the "fact" that no three mines are collinear. I believe they are wrong, as the task did not tell us this is the case; they relied on this "fact" because they assumed that the task is solvable. However, the task clearly asks us to report when it is impossible with the given input. I believe this part is smelly: the task is not necessarily solvable, not to mention that there might be a solution even when three mines are collinear.
Thus, somewhere in between X had to be exactly equal to N!
Not true either, as they have stated in the task that
You should output IMPOSSIBLE instead if there is no good placement of
Step 3: They are still using the "fact" described as un-true in the previous step.
So let us close the book and think ourselves. Their solution is not bad, but they assume something which is not necessarily true. I believe them that all their inputs contained mines corresponding to their assumption, but this is not necessarily the case, as the task did not clearly state this and I can easily create a solvable input having three collinear mines.
Their idea for the median choice is correct, so we must follow this procedure; the problem gets more complicated if we skip this step. Now, we could search for a solution by modifying the angle until we find a solution or reach the border of the period (this was my idea initially). However, we know which provinces have too many mines and which provinces do not have enough. We also know that the period is pi/2, or 90 degrees, because if we rotate alpha by pi/2 in either the positive (counter-clockwise) or negative (clockwise) direction, we have the same problem, but each child gets a different province. That is irrelevant from our point of view: they will still be rivals, I guess, but this does not concern us.
Now, we try and see what happens if we rotate the lines by pi/4. We will see that some mines might have changed borders. We have either not reached a solution yet, or have gone too far and poor provinces became rich and rich provinces became poor. In either case we know in which half the solution should be, so we rotate back/forward by pi/8. Then, with the same logic, by pi/16, until we have found a solution or there is no solution.
Back to the question: we cannot arrive at the situation you describe. If there were a valid solution at 75 degrees, we would see that rotating by only 45 degrees was not enough, because the number of mines that changed provinces tells us which angle interval to continue in. Remember that we have two rich provinces and two poor provinces. Each rich province borders two poor provinces and vice versa. So the poor provinces should gain mines and the rich provinces should lose mines. If, when rotating by 45 degrees, we see that the poor provinces did not get enough mines, then we rotate further until they have gained enough. If they have gained too many mines, then we change direction.
I'm currently working on a project that involves a lot of bit level manipulation of data such as comparison, masking and shifting. Essentially I need to search through chunks of bitstreams between 8kbytes - 32kbytes long for bit patterns between 20 - 40bytes long.
Does anyone know of general resources for optimizing for such operations in CUDA?
There have been at least a couple of questions on SO on how to do text searches with CUDA, that is, finding instances of short byte-strings in long byte-strings. That is similar to what you want to do: a byte-string search is much like a bit-string search where the number of bits in the search string can only be a multiple of 8, and the algorithm only checks for matches every 8 bits. Search on SO for CUDA string searching or matching, and see if you can find them.
I don't know of any general resources for this, but I would try something like this:
Start by preparing 8 versions of each of the search bit-strings, each shifted by a different number of bits. Also prepare start and end masks for the partially filled first and last bytes.
Then, essentially, perform byte-string searches with the different bit-strings and masks.
If you're using a device with compute capability >= 2.0, store the shifted bit-strings in global memory. The start and end masks can probably just be constants in your program.
Then, for each byte position, launch 8 threads that each checks a different version of the 8 shifted bit-strings against the long bit-string (which you now treat like a byte-string). In each block, launch enough threads to check, for instance, 32 bytes, so that the total number of threads per block becomes 32 * 8 = 256. The L1 cache should be able to hold the shifted bit-strings for each block, so that you get good performance.
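To make the shifted-copies idea concrete, here is a CPU-side reference sketch in Python rather than CUDA. It is my own illustration of the preparation and comparison steps, with hypothetical function names and MSB-first bit order assumed. On a GPU, the two nested loops would be replaced by one thread per (byte position, shift) pair.

```python
def shifted_versions(pattern: bytes):
    """Return 8 (shifted_bytes, start_mask, end_mask) tuples, one per bit shift."""
    value = int.from_bytes(pattern, "big")
    versions = []
    for shift in range(8):
        # A non-zero shift spills into one extra trailing byte.
        length = len(pattern) + (1 if shift else 0)
        shifted = (value << ((8 - shift) % 8)).to_bytes(length, "big")
        start_mask = 0xFF >> shift  # pattern bits in the first byte
        end_mask = (0xFF << (8 - shift)) & 0xFF if shift else 0xFF
        versions.append((shifted, start_mask, end_mask))
    return versions


def find_bit_pattern(haystack: bytes, pattern: bytes):
    """Return sorted bit offsets (MSB first) where pattern occurs in haystack."""
    hits = []
    for shift, (shifted, start_mask, end_mask) in enumerate(shifted_versions(pattern)):
        k = len(shifted)
        for i in range(len(haystack) - k + 1):  # one GPU thread per (i, shift)
            window = haystack[i:i + k]
            if (window[0] & start_mask == shifted[0]
                    and window[-1] & end_mask == shifted[-1]
                    and window[1:-1] == shifted[1:-1]):
                hits.append(8 * i + shift)
    return sorted(hits)


# Embed the 16-bit pattern 0xABCD at bit offset 11 of a 48-bit buffer of zeros.
pattern = bytes([0xAB, 0xCD])
haystack = (0xABCD << (48 - 11 - 16)).to_bytes(6, "big")
print(find_bit_pattern(haystack, pattern))
```

Only the first and last bytes need masking, so the middle bytes can be compared as plain bytes, which is what lets an ordinary byte-string search be reused.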
At http://cr.yp.to/primegen.html you can find the source of a program that uses Atkin's sieve to generate primes. As the author says that it may take a few months to answer an e-mail sent to him (I understand that, he sure is a busy man!) I'm posting this question.
The page states that 'primegen can generate primes up to 1000000000000000'. I am trying to understand why that is so. There is of course a limit of 2^64 ~ 2 * 10^19 (the size of long unsigned int) because this is how the numbers are represented. I know for sure that if there were a huge prime gap (> 2^31) then printing of numbers would fail. However, in this range I think there is no such prime gap.
Either the author overestimated the bound (and really it is around 10^19) or there is a place in the source code where the arithmetic operation can overflow or something like that.
The funny thing is that you actually MAY run it for numbers > 10^15:
./primes 10000000000000000 10000000000000100
and if you believe Wolfram Alpha, it is correct.
Some facts I had "reverse-engineered":
numbers are sifted in batches of 1,920 * PRIMEGEN_WORDS = 3,932,160 numbers (see primegen_fill function in primegen_next.c)
PRIMEGEN_WORDS controls how big a single sifting is - you can adjust it in primegen_impl.h to fit your CPU cache,
the implementation of the sieve itself is in primegen.c file - I assume it is correct; what you get is a bitmask of primes in pg->buf (see primegen_fill function)
The bitmask is analyzed and primes are stored in pg->p array.
I see no point where the overflow may happen.
I wish I was on my computer to look, but I suspect you would have different success if you started at 1 as your lower bound.
Just from the algorithm, I would conclude that the upper bound comes from the 32 bit numbers.
The page mentions a Pentium-III as the CPU, so my guess is that it is very old and does not use 64 bit.
2^32 is approx 4 * 10^9. The Sieve of Atkin (which the algorithm uses) requires N^(1/2) bits (it uses a big bitfield). This means that with 2^32 of memory you can reach (conservatively) N approx 10^15. As this number is a rough conservative upper bound (you have the system and other programs occupying memory, reserved address ranges for IO, ...) the real upper bound is/might be higher.
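The figures quoted in this thread can be sanity-checked with a few lines of integer arithmetic. This is only a check of the numbers, not an analysis of the primegen source. The value PRIMEGEN_WORDS = 2048 is inferred from the batch size quoted in the question (3,932,160 / 1,920), not read from primegen_impl.h.

```python
import math

PRIMEGEN_WORDS = 2048            # inferred from the quoted batch size, see above
BATCH = 1920 * PRIMEGEN_WORDS    # numbers sifted per call, per the question
DOCUMENTED_LIMIT = 10 ** 15      # "primes up to 1000000000000000"
TESTED_UPPER = 10 ** 16 + 100    # upper argument of the ./primes run in the question

print(BATCH)
print(DOCUMENTED_LIMIT < 2 ** 64, TESTED_UPPER < 2 ** 64)

# The square root of the documented limit fits comfortably in 32 bits.
root = math.isqrt(DOCUMENTED_LIMIT)
print(root, root < 2 ** 32)

# The largest N whose square root still fits in 32 bits is far above 10^15.
print((2 ** 32 - 1) ** 2 > DOCUMENTED_LIMIT)
```

None of these bounds is tight at 10^15, which is consistent with the observation that the program still produces correct output slightly above the documented limit.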