Say I have some sorted result from a SQL query that looks like:
x y z
0 0 0
0 0 1
0 0 2
0 1 0
0 1 1
0 2 0
0 2 1
Where x, y and z are sort ranks. These sort ranks are always greater than 0, and smaller than 500mil.
Is there a way to combine the values from x, y and z into one "master" sort rank? Sorting the dataset using this "master" sort rank should result in the same ordering.
I'm thinking I can do something with bit shifting but I am not sure...
Assuming that every value in each of the three columns in between 1 and 500 million, you could use the following formula to generate a unique rank:
1000000
z + (500 x 10^6)*y + (500 x 10^6)*(500 x 10^6)*x
To generate this rank you could use the following query:
SELECT
x, y, z,
z + (500 * 1000000)*y + (500 * 1000000)*(500 * 1000000)*x AS master_rank
FROM yourTable;
The reason this works can be seen by examining say the z and y columns. The largest value from z is 500 million, which is guaranteed to be smaller than the smallest value in y, which is 1 billion. This logic applies to the whole formula. This approach is similar to using a bit mask, on a larger scale.
Note that I assume that your version of SQL can tolerate numbers this large. If it doesn't, then you might want to consider another approach here, possibly just ordering as #Gordon mentioned in his answer. Besides this, having 1 bil x 1 bil records would make for a very large table and would have other problems.
Do you mean something like this?
order by x * 10000 + y * 100 + z
(You would adjust the numbers for the width you need.)
I'm not sure why you would want to do that instead of:
order by x, y, z
If you do combine into a single value, be careful about integer overflow.
So there was a puzzle:
This equation is incomplete: 1 2 3 4 5 6 7 8 9 = 100. One way to make
it accurate is by adding seven plus and minus signs, like so: 1 + 2 +
3 – 4 + 5 + 6 + 78 + 9 = 100.
How can you do it using only 3 plus or minus signs?
I'm quite new to Prolog, solved the puzzle, but i wonder how to optimize it
makeInt(S,F,FinInt):-
getInt(S,F,0,FinInt).
getInt(Start, Finish, Acc, FinInt):-
0 =< Finish - Start,
NewAcc is Acc*10 + Start,
NewStart is Start +1,
getInt(NewStart, Finish, NewAcc, FinInt).
getInt(Start, Finish, A, A):-
0 > Finish - Start.
itCounts(X,Y,Z,Q):-
member(XLastDigit,[1,2,3,4,5,6]),
FromY is XLastDigit+1,
numlist(FromY, 7, ListYLastDigit),
member(YLastDigit, ListYLastDigit),
FromZ is YLastDigit+1,
numlist(FromZ, 8, ListZLastDigit),
member(ZLastDigit,ListZLastDigit),
FromQ is ZLastDigit+1,
member(YSign,[-1,1]),
member(ZSign,[-1,1]),
member(QSign,[-1,1]),
0 is XLastDigit + YSign*YLastDigit + ZSign*ZLastDigit + QSign*9,
makeInt(1, XLastDigit, FirstNumber),
makeInt(FromY, YLastDigit, SecondNumber),
makeInt(FromZ, ZLastDigit, ThirdNumber),
makeInt(FromQ, 9, FourthNumber),
X is FirstNumber,
Y is YSign*SecondNumber,
Z is ZSign*ThirdNumber,
Q is QSign*FourthNumber,
100 =:= X + Y + Z + Q.
Not sure this stands for an optimization. The code is just shorter:
sum_123456789_eq_100_with_3_sum_or_sub(L) :-
append([G1,G2,G3,G4], [0'1,0'2,0'3,0'4,0'5,0'6,0'7,0'8,0'9]),
maplist([X]>>(length(X,N), N>0), [G1,G2,G3,G4]),
maplist([G,F]>>(member(Op, [0'+,0'-]),F=[Op|G]), [G2,G3,G4], [F2,F3,F4]),
append([G1,F2,F3,F4], L),
read_term_from_codes(L, T, []),
100 is T.
It took me a while, but I got what your code is doing. It's something like this:
itCounts(X,Y,Z,Q) :- % generate X, Y, Z, and Q s.t. X+Y+Z+Q=100, etc.
generate X as a list of digits
do the same for Y, Z, and Q
pick the signs for Y, Z, and Q
convert all those lists of digits into numbers
verify that, with the signs, they add to 100.
The inefficiency here is that the testing is all done at the last minute. You can improve the efficiency if you can throw out some possible solutions as soon as you pick one of your numbers, that is, testing earlier.
itCounts(X,Y,Z,Q) :- % generate X, Y, Z, and Q s.t. X+Y+Z+Q=100, etc.
generate X as a list of digits, and convert it to a number
if it's so big or small the rest can't possibly bring the sum back to 100, fail
generate Y as a list of digits, convert to number, and pick it sign
if it's so big or so small the rest can't possibly bring the sum to 100, fail
do the same for Z
do the same for Q
Your function is running pretty fast already, even if I search all possible solutions. It only picks 6 X's; 42 Y's; 224 Z's; and 15 Q's. I don't think optimizing will be worth your while.
But if you really wanted to: I tested this by putting a testing function immediately after selecting an X. It reduced the 6 X's to 3 (all before finding the solution); 42 Y's to 30; 224 Z's to 184; and 15 Q's to 11. I believe we could reduce it further by testing immediately after a Y is picked, to see whether X YSign Y is already so large or small there can be no solution.
In PROLOG programs that are more computationally intensive, putting parts of the 'test' earlier in 'generate and test' algorithms can help a lot.
Suppose we have N (in this example N = 3) events that can happen depending on some variables. Each of them can generate certain profit or loses (event1 = 300, event2 = -100, event3 = 200), they are constrained by rules when they happen.
event 1 happens only when x > 5,
event 2 happens only when x = 2 and y = 3
event 3 happens only when x is odd.
The problem is to know the maximum profit.
Assume x, y are integer numbers >= 0
In the real problem there are many events and many dimensions.
(the solution should not be specific)
My question is:
Is this linear programming problem? If yes please provide solution to the example problem using this approach. If no please suggest some algorithms to optimize such problem.
This can be formulated as a mixed integer linear program. This is a linear program where some of the variables are constrained to be integer. Contrary to linear programs, solving the general integer program is NP-hard. However, there are many commercial or open source solvers that can solve efficiently large-scale problems. For up to 300 variables and constraints, you can use excel's solver.
Here is a way to formulate the above constraints:
If you go down this route, you might find this document useful.
the last constraint in an interesting one. I am assuming that x has to be integer, but if x can be either integer or continuous I will edit the answer accordingly.
I hope this helps!
Edit: L and U above should be interpreted as L1 and U1.
Edit 2: z2 needs to changed to (1-z2) on the 3rd and 4th constraint.
A specific answer:
seems more like a mathematical calculation than a programming problem, can't you just run a loop for x= 1->1000 to see what results occur?
for the example:
as x = 2 or 3 = -200 then x > 2 or 3, and if x < 5 doesn't get the 300, so all that is really happening is x > 5 and x = odd = maximum results.
x = 7 = 300 + 200 . = maximum profit for x
A general answer:
I don't see how to answer the question without seeing what the events are and how the events effect X ? Weather it's a linear or functional (mathematical) answer seems rather beside the point of finding the desired solution.
Guys I'm trying to implement deletion algorithm for a Red Black tree and I'm having problem with understanding line three of this algorithm (from a book "Introduction to Algorithms" second edition):
1 if left[z] = nil[T] or right[z] = nil[T]
2 then y ← z
3 else y ← TREE-SUCCESSOR(z)
4 if left[y] ≠ nil[T]
5 then x ← left[y]
6 else x ← right[y]
7 p[x] ← p[y]
8 if p[y] = nil[T]
9 then root[T] ← x
10 else if y = left[p[y]]
11 then left[p[y]] ← x
12 else right[p[y]] ← x
13 if y 3≠ z
14 then key[z] ← key[y]
15 copy y's satellite data into z
16 if color[y] = BLACK
17 then RB-DELETE-FIXUP(T, x)
18 return y
First of all there is nowhere in this book explained what the TREE-SUCCESSOR is suppose to look like (no algorithm for that) but I found this page: and if I feed this algorithm whit 11,2,1,7,5,8,14,15,4 and then try to delete 7 it finds predecessor but if I try to delete 11 it finds successor. And that what I can't understand. Why sometimes it takes predecessor and sometimes successor? What criteria are taken into consideration while making this decision? A node's color?
Thank you.
P.S. I also do not quite understand what it's written in line number 13. Does it mean that if y has three colors (neither black nor red) or something else?
tree successor (as the opposite of tree-predecessor [which is in that book i believe]) is generally defined for binary search trees as the node with the next highest key. How it determines it is dependent on the type (red-black in this case) and Im almost positive your book leaves the successor method as an exercise. (i remember the problem :P)
I think you're reading CLRS 2nd edition.
TREE-SUCCESSOR is introduced in Chapter 12 section 2 - "12.2 Querying binary search tree". And contrary to what Jesse Naugher says, it's not dependent on the type of binary search trees.
Line 13 you quoted is a typo. It should be "if y != z".
You can refer to following code.
http://code.google.com/p/cstl/source/browse/src/c_rb.c
I have a question about the SQL standard which I'm hoping a SQL language lawyer can help with.
Certain expressions just don't work. 62 / 0, for example. The SQL standard specifies quite a few ways in which expressions can go wrong in similar ways. Lots of languages deal with these expressions using special exceptional flow control, or bottom psuedo-values.
I have a table, t, with (only) two columns, x and y each of type int. I suspect it isn't relevant, but for definiteness let's say that (x,y) is the primary key of t. This table contains (only) the following values:
x y
7 2
3 0
4 1
26 5
31 0
9 3
What behavior is required by the SQL standard for SELECT expressions operating on this table which may involve division(s) by zero? Alternatively, if no one behavior is required, what behaviors are permitted?
For example, what behavior is required for the following select statements?
The easy one:
SELECT x, y, x / y AS quot
FROM t
A harder one:
SELECT x, y, x / y AS quot
FROM t
WHERE y != 0
An even harder one:
SELECT x, y, x / y AS quot
FROM t
WHERE x % 2 = 0
Would an implementation (say, one that failed to realize on a more complex version of this query that the restriction could be moved inside the extension) be permitted to produce a division by zero error in response to this query, because, say it attempted to divide 3 by 0 as part of the extension before performing the restriction and realizing that 3 % 2 = 1? This could become important if, for example, the extension was over a small table but the result--when joined with a large table and restricted on the basis of data in the large table--ended up restricting away all of the rows which would have required division by zero.
If t had millions of rows, and this last query were performed by a table scan, would an implementation be permitted to return the first several million results before discovering a division by zero near the end when encountering one even value of x with a zero value of y? Would it be required to buffer?
There are even worse cases, ponder this one, which depending on the semantics can ruin boolean short-circuiting or require four-valued boolean logic in restrictions:
SELECT x, y
FROM t
WHERE ((x / y) >= 2) AND ((x % 2) = 0)
If the table is large, this short-circuiting problem can get really crazy. Imagine the table had a million rows, one of which had a 0 divisor. What would the standard say is the semantics of:
SELECT CASE
WHEN EXISTS
(
SELECT x, y, x / y AS quot
FROM t
)
THEN 1
ELSE 0
END AS what_is_my_value
It seems like this value should probably be an error since it depends on the emptiness or non-emptiness of a result which is an error, but adopting those semantics would seem to prohibit the optimizer for short-circuiting the table scan here. Does this existence query require proving the existence of one non-bottoming row, or also the non-existence of a bottoming row?
I'd appreciate guidance here, because I can't seem to find the relevant part(s) of the specification.
All implementations of SQL that I've worked with treat a division by 0 as an immediate NaN or #INF. The division is supposed to be handled by the front end, not by the implementation itself. The query should not bottom out, but the result set needs to return NaN in this case. Therefore, it's returned at the same time as the result set, and no special warning or message is brought up to the user.
At any rate, to properly deal with this, use the following query:
select
x, y,
case y
when 0 then null
else x / y
end as quot
from
t
To answer your last question, this statement:
SELECT x, y, x / y AS quot
FROM t
Would return this:
x y quot
7 2 3.5
3 0 NaN
4 1 4
26 5 5.2
31 0 NaN
9 3 3
So, your exists would find all the rows in t, regardless of what their quotient was.
Additionally, I was reading over your question again and realized I hadn't discussed where clauses (for shame!). The where clause, or predicate, should always be applied before the columns are calculated.
Think about this query:
select x, y, x/y as quot from t where x%2 = 0
If we had a record (3,0), it applies the where condition, and checks if 3 % 2 = 0. It does not, so it doesn't include that record in the column calculations, and leaves it right where it is.