Find index in bitmask - sql

Hello binary experts out there,
I have a bitmask representing the ON options of a room, say 11001 (options #1, #4, #5).
Another user tries to search with his own bitmask, say 111000 (options #4, #5, #6).
Doing a (Room & Search) != 0 compares the masks and checks whether the same switch is ON in both (in this case #4).
The thing is, switch #4 is the 2nd 'active' switch of the Room, but the 1st of the Search.
I need the user to also find that #4 is the 2nd switch of the Room.
The user has separate bitmasks to check against.
Both Room and User can have only 3 switches ON.
My approach can find which switch is the last (biggest) index. Using UserMask2 (the 3rd option) as an example:
if ((RoomMask & UserMask2) != 0), compute A = RoomMask - UserMask2 = 11001 - 10000 = 01001.
If A < UserMask2, then UserMask2 is the Room's biggest switch, because subtracting it leaves only the lower bits.
This only tells me whether a certain UserMask is indeed the RoomMask's biggest bit.
But I'm not sure how to continue.
This is used for matchmaking search using Photon, which uses SQL, and I probably have to manage to do this with a single WHERE clause (I'm not sure it can store variables):
https://doc.photonengine.com/en-us/pun/current/lobby-and-matchmaking/matchmaking-and-lobby#sql_lobby_type
I hope I'm clear enough
Cheers!

Your question is not very clear, and hence I'm not 100% sure that this answer is what you are looking for. Still, here you go.
You can search the contents of columns using the LIKE keyword, which has two wildcards:
% matches any sequence of characters,
_ matches any single character.
Now, suppose that you want to find a match for switches 1, 4 and 7 in a map of, say, 10 possible switches (with 1 being the leftmost in this example). You could use the clause:
WHERE Bitmap LIKE '1__1__1___'
Of course, this is just an example and you would need to adapt it to your specific context.
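A small sketch of building that pattern programmatically (my own illustration, not from the answer; the column name Bitmap and the 10-switch width just follow the example above):

public class BitmaskPattern {
    // Builds a LIKE pattern such as "1__1__1___" from 1-based switch
    // positions; position 1 is the leftmost character.
    static String buildLikePattern(int totalSwitches, int... onPositions) {
        StringBuilder pattern = new StringBuilder("_".repeat(totalSwitches));
        for (int pos : onPositions) {
            pattern.setCharAt(pos - 1, '1');
        }
        return pattern.toString();
    }

    public static void main(String[] args) {
        // Prints: WHERE Bitmap LIKE '1__1__1___'
        System.out.println("WHERE Bitmap LIKE '" + buildLikePattern(10, 1, 4, 7) + "'");
    }
}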

Related

Binary search start or end is target

Why is it that when I see example code for binary search there is never an if statement to check if the start of the array or end is the target?
import java.util.Arrays;

public class App {
    public static int binary_search(int[] arr, int left, int right, int target) {
        if (left > right) {
            return -1;
        }
        int mid = (left + right) / 2;
        if (target == arr[mid]) {
            return mid;
        }
        if (target < arr[mid]) {
            return binary_search(arr, left, mid - 1, target);
        }
        return binary_search(arr, mid + 1, right, target);
    }

    public static void main(String[] args) {
        int[] arr = { 3, 2, 4, -1, 0, 1, 10, 20, 9, 7 };
        Arrays.sort(arr);
        for (int i = 0; i < arr.length; i++) {
            System.out.println("Index: " + i + " value: " + arr[i]);
        }
        // left bound is index 0 (the original code passed arr[0] here by mistake)
        System.out.println(binary_search(arr, 0, arr.length - 1, -1));
    }
}
In this example, if the target was -1 or 20, the search would enter recursion. But it added an if statement to check if target equals mid, so why not add two more statements also checking if it's left or right?
EDIT:
As pointed out in the comments, I may have misinterpreted the initial question. The answer below assumes that OP meant having the start/end checks as part of each step of the recursion, as opposed to checking once before the recursion even starts.
Since I don't know for sure which interpretation was intended, I'm leaving this post here for now.
Original post:
You seem to be under the impression that "they added an extra check for mid, so surely they should also add an extra check for start and end".
The check "Is mid the target?" is in fact not a mere optimization they added. Recursively checking "mid" is the whole point of a binary search.
When you have a sorted array of elements, a binary search works like this:
Compare the middle element to the target
If the middle element is smaller, throw away the first half
If the middle element is larger, throw away the second half
Otherwise, we found it!
Repeat until we either find the target or there are no more elements.
The act of checking the middle is fundamental to determining which half of the array to continue searching through.
Now, let's say we also add a check for start and end. What does this gain us? Well, if at any point the target happens to be at the very start or end of a segment, we skip a few steps and end slightly sooner. Is this a likely event?
For small toy examples with a few elements, yeah, maybe.
For a massive real-world dataset with billions of entries? Hm, let's think about it. For the sake of simplicity, we assume that we know the target is in the array.
We start with the whole array. Is the first element the target? The odds of that are one in a billion. Pretty unlikely. Is the last element the target? The odds of that are also one in a billion. Pretty unlikely too. You've wasted two extra comparisons to speed up an extremely unlikely case.
We limit ourselves to, say, the first half. We do the same thing again. Is the first element the target? Probably not since the odds are one in half a billion.
...and so on.
The bigger the dataset, the more useless the start/end "optimization" becomes. In fact, in terms of (maximally optimized) comparisons, each step of the algorithm has three comparisons instead of the usual one. VERY roughly estimated, that suggests that the algorithm on average becomes three times slower.
Even for smaller datasets, it is of dubious use since it basically becomes a quasi-linear search instead of a binary search. Yes, the odds are higher, but on average, we can expect a larger amount of comparisons before we reach our target.
The whole point of a binary search is to reach the target with as few wasted comparisons as possible. Adding more unlikely-to-succeed comparisons is typically not the way to improve that.
Edit:
The implementation as posted by OP may also confuse the issue slightly. The implementation chooses to make two comparisons between target and mid. A more optimal implementation would instead make a single three-way comparison (i.e. determine ">", "=" or "<" as a single step instead of two separate ones). This is, for instance, how Java's compareTo or C++'s <=> normally works.
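For illustration, here is a sketch (mine, not from the post) of the loop written with one source-level three-way comparison per step, using Integer.compare:

static int binarySearch3Way(int[] arr, int target) {
    int left = 0, right = arr.length - 1;
    while (left <= right) {
        int mid = (left + right) >>> 1;               // overflow-safe midpoint
        int cmp = Integer.compare(target, arr[mid]);  // one three-way comparison
        if (cmp == 0) {
            return mid;
        } else if (cmp < 0) {
            right = mid - 1;
        } else {
            left = mid + 1;
        }
    }
    return -1;
}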
BambooleanLogic's answer is correct and comprehensive. I was curious about how much slower this 'optimization' made binary search, so I wrote a short script to test the change in how many comparisons are performed on average:
Given an array of integers 0, ... , N
do a binary search for every integer in the array,
and count the total number of array accesses made.
To be fair to the optimization, I made it so that after checking arr[left] against target, we increase left by 1, and similarly for right, so that every comparison is as useful as possible. You can try this yourself at Try it online
Results:
Binary search on size 10: Standard 29 Optimized 43 Ratio 1.4828
Binary search on size 100: Standard 580 Optimized 1180 Ratio 2.0345
Binary search on size 1000: Standard 8987 Optimized 21247 Ratio 2.3642
Binary search on size 10000: Standard 123631 Optimized 311205 Ratio 2.5172
Binary search on size 100000: Standard 1568946 Optimized 4108630 Ratio 2.6187
Binary search on size 1000000: Standard 18951445 Optimized 51068017 Ratio 2.6947
Binary search on size 10000000: Standard 223222809 Optimized 610154319 Ratio 2.7334
so the total number of comparisons does seem to tend toward triple the standard number, implying the optimization becomes increasingly unhelpful for larger arrays. I'd be curious whether the limiting ratio is exactly 3.
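The script itself isn't reproduced above; a minimal Java reconstruction of the experiment might look like the sketch below (the exact counting rules and names are my assumptions, so the absolute numbers won't match the table exactly):

import java.util.stream.IntStream;

public class BinarySearchCost {
    static long accesses;  // global counter of array reads

    static int get(int[] a, int i) {
        accesses++;
        return a[i];
    }

    // Standard binary search, counting array accesses.
    static int standard(int[] a, int t) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int v = get(a, mid);
            if (v == t) return mid;
            if (v < t) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    // "Optimized" variant: probe both ends first, then shrink them, as described above.
    static int optimized(int[] a, int t) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            if (get(a, lo) == t) return lo;
            lo++;                                     // left end ruled out
            if (lo > hi) break;
            if (get(a, hi) == t) return hi;
            hi--;                                     // right end ruled out
            if (lo > hi) break;
            int mid = (lo + hi) >>> 1;
            int v = get(a, mid);
            if (v == t) return mid;
            if (v < t) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        for (int n : new int[] { 10, 100, 1000, 10000, 100000 }) {
            int[] a = IntStream.range(0, n).toArray();
            accesses = 0;
            for (int t = 0; t < n; t++) standard(a, t);
            long std = accesses;
            accesses = 0;
            for (int t = 0; t < n; t++) optimized(a, t);
            System.out.printf("size %d: standard %d optimized %d ratio %.4f%n",
                    n, std, accesses, (double) accesses / std);
        }
    }
}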
Adding extra checks for start and end along with the mid value is not an improvement.
In algorithm design, the main concern is complexity, whether time complexity or space complexity; most of the time, time complexity is treated as the more important aspect.
It is also worth studying the binary search algorithm in different use cases, such as:
when the array does not contain any repeated elements,
when the array has repeated elements and you need to a) return the leftmost index/value or b) return the rightmost index/value,
and many more.
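For instance, a sketch of the "leftmost index" variant (assuming a sorted int array):

// Returns the first index whose value equals target, or -1 if absent.
static int leftmostIndex(int[] arr, int target) {
    int lo = 0, hi = arr.length;      // half-open interval [lo, hi)
    while (lo < hi) {
        int mid = (lo + hi) >>> 1;
        if (arr[mid] < target) {
            lo = mid + 1;             // everything up to mid is too small
        } else {
            hi = mid;                 // arr[mid] >= target: answer is at mid or to its left
        }
    }
    return (lo < arr.length && arr[lo] == target) ? lo : -1;
}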

Accessing chain position or multi-entity subchain in optaplanner?

We are using Optaplanner to come up with music playlists that follow a set of sound musical principles and rules (with respect to key changes, etc):
https://www.youtube.com/watch?v=Iqya6xqc1jY
We’re using chained planning variables to avoid disrupting an otherwise solid “streak” of playlist tracks, but many of our rules involve some aspect of temporal reasoning across a subchain of X previous tracks. As an example, I’m trying to implement a rule that requires the key to change at least every five songs (to keep the playlist from getting too boring/monotonous). What I’ve come up with works, but I’m wondering if there’s a less awkward way of doing it.
Here’s the rule as we have it right now, which I feel is ugly from a DRY and configurability perspective:
https://github.com/spotfire-io/spotfire-solver/blob/1c0fcda5256c337e214b33043a27fc25f615d0ef/src/main/resources/io/spotfire/solver/rules/rules.drl#L79-L88
rule "Should change key at least once every five songs"
when
$t0: RestPlaylistTrack(keyDistance == 0) // previousTrack is a chained variable
$t1: RestPlaylistTrack(keyDistance == 0, previousTrack == $t0)
$t2: RestPlaylistTrack(keyDistance == 0, previousTrack == $t1)
$t3: RestPlaylistTrack(keyDistance == 0, previousTrack == $t3)
$t4: RestPlaylistTrack(keyDistance == 0, previousTrack == $t3)
then
scoreHolder.addSoftConstraintMatch(kcontext, 0, new BigDecimal(-2));
end
Another example would be implementing a rule that batches tracks of the same genre together (e.g. play 4 jazz tracks in a row followed by 4 rock tracks), or ensuring that we avoid playing the same artist 5 tracks from the last time we played that artist.
In this example, Is there a better way to keep track of the distance between two tracks and then specify a constraint on that? Some potential options we’ve considered include…
Provide a way to extract X-length sub-chains programmatically and apply the rules to that subchain.
Create a shadow variable that represents the position of the track relative to the anchor. Then we could create constraints like RestPlaylistTrack(position < $t.position, position > $t.position - 5) to apply to any tracks within 5 tracks of $t.
Using some sort of Drools aggregate expression that accumulates previous tracks via a map-reducey thing until reaching a certain maximum number of tracks.
The challenge we perceive with the first two solutions is that a chain swap move involves changes to three planning variables. If we have a chain that looks like A <- B <- C <- D, a swap between B and D involves a change to point D to A, B to C, and C to D. At the Drools or shadow variable level, I think there’s a risk of doing a bunch of intermediate calculations before the move is complete. This might make score calculation pretty inefficient. For the third option, we’re just not sure how something like that would work mechanically.
If anyone (especially #geoffrey-de-smet) has examples on how this could be done, that would be greatly appreciated. If this is legitimately tricky in the current version of Optaplanner, we think adding a native position mechanism to chained planning modes would be super helpful as a future feature.
It sounds like consecutive shift constraints in nurse rostering. Detecting "no more than n shifts in a row" is non-trivial in hand-written DRL. In nurse rostering, we use insertLogical to deal with those, but I would recommend not using that (it kills performance). I guesstimate that approach 1) (which gives up incremental calculation) is still faster than any insertLogical approach, unless you're queuing up thousands of songs.
In ConstraintStreams, approach 1 could maybe one day look like this:
constraintFactory.from(Shift.class)
    .groupBy(Shift::getEmployee,
        sort(Comparable.from(Shift::getStartDateTime, Shift::getId))) // BiConstraintStream<Employee, List<Shift>>
    .penalize((employee, sortedShiftList) -> ...); // One match for all bad subsequences of 1 employee
Approach 2) is interesting. Try it out and let us know if it works well enough for you.
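For what it's worth, a rough sketch of what approach 2) might look like with a custom shadow variable. This is a sketch under assumptions, not a tested implementation: the entity below is simplified, TrackPositionListener is a hypothetical VariableListener that walks the chain tail from the changed track and recomputes positions, and getters/setters are omitted.

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.variable.CustomShadowVariable;
import org.optaplanner.core.api.domain.variable.PlanningVariable;
import org.optaplanner.core.api.domain.variable.PlanningVariableGraphType;
import org.optaplanner.core.api.domain.variable.PlanningVariableReference;

@PlanningEntity
public class RestPlaylistTrack {

    @PlanningVariable(graphType = PlanningVariableGraphType.CHAINED,
            valueRangeProviderRefs = {"trackRange", "playlistRange"})
    private RestPlaylistTrack previousTrack;

    // Shadow variable: 1 for the track right after the anchor, 2 for the next, etc.
    // TrackPositionListener (hypothetical) would walk the chain tail from the
    // changed track and recompute positions incrementally.
    @CustomShadowVariable(variableListenerClass = TrackPositionListener.class,
            sources = @PlanningVariableReference(variableName = "previousTrack"))
    private Integer position;

    // getters and setters omitted
}

With something like that in place, the position-based constraint from the question (position < $t.position, position > $t.position - 5) becomes expressible directly; the open question remains how many redundant listener updates a single chain move triggers.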
Approach 3) is what I am thinking of adding to ConstraintStreams at some point. This is incremental. Something like:
constraintFactory.from(Shift.class)
    .forEachSortedSubList(Shift::getEmployee,
        Comparable.from(Shift::getStartDateTime, Shift::getId),
        (employee, sortedShiftSubList) -> ...)
    .penalize(...); // One match per bad subsequence
If you have any suggestions on how you'd like to use the API for approach 3), or what using it could look like, please put them on our Google Group discussion forum. It could help move the work along.

Enumerable.Count not working

I am monitoring the power of a laser and I want to know when n consecutive measurements are outside a safe range. I have a Queue(Of Double) which has n items (2 in my example) at the time it's being checked. I want to check that all items in the queue satisfy a condition, so I pass the items through a Count() with a predicate. However, the count function always returns the number of items in the queue, even if they don't all satisfy the predicate.
ElseIf _consecutiveMeasurements.AsEnumerable().Count(Function(m) m <= Me.CriticalLowLevel) = ConsecutiveCount Then
    _ownedISetInstrument.Disable()
    ' do other things
(Debugger screenshot: a view of the debugger with the execution moving into the If.)
Clearly, there are two measurements in the queue, and they are both greater than the CriticalLowLevel, so the count should be zero. I first tried Enumerable.Where(predicate).Count() and I got the same result. What's going on?
Edit:
Of course the values are below the CriticalLowLevel, which I had mistakenly set to 598 instead of 498 for testing. I had over-complicated the problem by focusing my attention on the code when it was my test case that was faulty. I guess I couldn't see the forest for the trees, as they say. Thanks Eric for pointing it out.
Based on your debug snapshot, it looks like both of your measurements are less than the critical level of 598.0, so I would expect the count to match the queue length.
Both data points are <= Me.CriticalLowLevel.
Can you share an example where one of the data points is > Me.CriticalLowLevel that still exhibits this behavior?
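As an aside, here is the same predicate logic in a Java sketch (the question is VB.NET, but the idea is identical): counting matches and comparing against the queue length is equivalent to an all-match check, which states the intent more directly. The values below mirror the 498/598 mix-up from the edit.

import java.util.ArrayDeque;
import java.util.Queue;

public class AllBelowCheck {
    public static void main(String[] args) {
        Queue<Double> measurements = new ArrayDeque<>();
        measurements.add(500.0);
        measurements.add(497.0);

        double criticalLowLevel = 498.0;

        // Count items satisfying the predicate, as in the question...
        long count = measurements.stream().filter(m -> m <= criticalLowLevel).count();
        System.out.println(count);   // 1

        // ...or express "all consecutive measurements are out of range" directly.
        boolean allLow = measurements.stream().allMatch(m -> m <= criticalLowLevel);
        System.out.println(allLow);  // false
    }
}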

Understanding Google Code Jam 2013 - X Marks the Spot

I was trying to solve Google Code Jam problems and there is one of them that I don't understand. Here is the question (World Finals 2013 - problem C): https://code.google.com/codejam/contest/2437491/dashboard#s=p2&a=2
And here follows the problem analysis: https://code.google.com/codejam/contest/2437491/dashboard#s=a&a=2
I don't understand why we can use binary search. In order to use binary search the elements have to be sorted. In other words: for a given element e, we can't have any element less than e on its right side. But that is not the case in this problem. Let me give you an example:
Suppose we do what the analysis tells us to do: we start with a left bound angle of 90° and a right bound angle of 0°. Our first probe will be at an angle of 45°. Suppose we find that, for this angle, X < N. In this case, the analysis tells us to make 45° our left bound. At this point, we may have discarded a viable solution (at, let's say, 75°), and at the same time there may be no more solutions between 0° and 45°, leading us to say (wrongly) that there's no solution.
I don't think Google's solution is wrong =P. But I can't figure out why we can use a binary search in this case. Anyone knows?
I don't understand why we can use binary search. In order to use
binary search the elements have to be sorted. In other words: for a
given element e, we can't have any element less than e on its right
side. But that is not the case in this problem.
A binary search works in this case because:
the values vary by at most 1
we only need to find one solution, not all of them
the first and last value straddle the desired value (X .. N .. 2N-X)
I don't quite follow your counter-example, but here's an example of a binary search on a sequence with the above constraints. Looking for 3:
1 2 1 1 2 3 2 3 4 5 4 4 3 3 4 5 4 4
[                                 ]
[               ]
        [       ]
            [   ]
              *
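A sketch of that search in code (mine, not from the answer): it assumes adjacent values differ by at most 1 and the endpoint values straddle the target, which is exactly the X .. N .. 2N-X situation.

// Binary search over a sequence whose neighbors differ by at most 1 and whose
// endpoint values straddle the target.
static int findStraddled(int[] seq, int target) {
    int lo = 0, hi = seq.length - 1;
    while (hi - lo > 1) {
        int mid = (lo + hi) >>> 1;
        if (seq[mid] == target) {
            return mid;
        }
        // Keep whichever half still has endpoint values straddling the target;
        // by a discrete intermediate value theorem, that half must contain it.
        if (seq[mid] < target ? seq[hi] >= target : seq[hi] <= target) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    if (seq[lo] == target) return lo;
    if (seq[hi] == target) return hi;
    return -1;
}

On the sequence above, searching for 3, this probes indices 8, 4 and 6 before finding the 3 at index 7, matching the shrinking intervals in the diagram.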
I have read the problem and in the meantime thought about the solution. When I read the analysis, I saw that they had mostly done the same as I would have; however, I did not think about some minor optimizations they were using, as I was still digesting the task.
Solution:
Step 1: They choose medians so that each of the two lines splits the set in half; therefore two provinces will have x mines, while the other two provinces will have 2 * N - x mines, respectively, because each of the two lines splits the set in half and
2 * x + 2 * (2 * N - x) = 2 * x + 4 * N - 2 * x = 4 * N.
If x = N, then we were lucky and accidentally found a solution.
Step 2: They take advantage of the "fact" that no three mines are collinear. I believe they are wrong here: the task did not tell us this is the case, and they relied on it because they assumed the task is solvable. However, the task explicitly asks us to report when a placement is impossible for the given input, so the task is not necessarily solvable; not to mention that there might be a solution even when three mines are collinear. I believe this part is smelly.
Thus, somewhere in between X had to be exactly equal to N!
Not true either, as they have stated in the task that
You should output IMPOSSIBLE instead if there is no good placement of
borders.
Step 3: They are still using the "fact" shown to be untrue in the previous step.
So let us close the book and think for ourselves. Their solution is not bad, but they assume something which is not necessarily true. I believe that all their test inputs contained mines satisfying their assumption, but this is not necessarily the case, as the task did not clearly state it, and I can easily create a solvable input having three collinear mines.
Their idea for the median choice is correct, so we must follow this procedure; the problem gets more complicated if we skip this step. Now, we could search for a solution by modifying the angle until we find a solution or reach the border of the period (this was my initial idea). However, we know which provinces have too many mines and which do not have enough. Also, we know that the period is pi/2, or in other terms 90 degrees, because if we rotate by pi/2 in either the positive (counter-clockwise) or negative (clockwise) direction, we get the same partition, except that each child gets a different province, which is irrelevant from our point of view; they will still be rivals, I guess, but this does not concern us.
Now, we try and see what happens if we rotate the lines by pi/4. We will see that some mines might have changed borders. We have either not reached a solution yet, or have gone too far and poor provinces became rich and rich provinces became poor. In either case we know in which half the solution should be, so we rotate back/forward by pi/8. Then, with the same logic, by pi/16, until we have found a solution or there is no solution.
Back to the question: we cannot arrive at the situation you describe, because if there were a valid solution at 75 degrees, then after rotating only 45 degrees we would see that we had not rotated the lines enough; based on the number of mines which have changed borders, we can determine the correct angle interval. Remember that we have two rich provinces and two poor provinces. Each rich province borders two poor provinces and vice versa. So the poor provinces should gain mines and the rich provinces should lose mines. If, after rotating by 45 degrees, we see that the poor provinces did not gain enough mines, then we rotate further until they have gained enough. If they have gained too many mines, then we change direction.
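To make the rotate-and-halve procedure concrete, here is a heavily simplified sketch (mine, not the analysis's code). countPoor is a hypothetical helper returning how many mines the two poor provinces hold for median-centered lines at a given angle; the straddle comes from the fact that rotating by pi/2 swaps the rich and poor provinces.

// Bisection over the rotation angle in [0, pi/2). countPoor(angle) is a
// hypothetical helper (not implemented here). Rotating by pi/2 swaps poor
// and rich, so the endpoint counts straddle the balanced value n, and the
// count only changes by single mines as the angle sweeps past each mine.
static double findBalancedAngle(int n) {
    double lo = 0.0, hi = Math.PI / 2;
    int atLo = countPoor(lo);
    while (hi - lo > 1e-12) {
        double mid = (lo + hi) / 2;
        int atMid = countPoor(mid);
        if (atMid == n) {
            return mid;              // balanced: every province gets n mines
        }
        if ((atLo < n) != (atMid < n)) {
            hi = mid;                // [lo, mid] still straddles n
        } else {
            lo = mid;                // [mid, hi] still straddles n
            atLo = atMid;
        }
    }
    return Double.NaN;               // no balanced angle found: IMPOSSIBLE
}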

Practice of checking 'trueness' or 'equality' in conditional statements - does it really make sense?

I remember many years back, when I was in school, one of my computer science teachers taught us that it was better to check for 'trueness' or 'equality' of a condition and not the negative stuff like 'inequality'.
Let me elaborate - If a piece of conditional code can be written by checking whether an expression is true or false, we should check the 'trueness'.
Example: Finding out whether a number is odd - it can be done in two ways:
if ( num % 2 != 0 )
{
    // Number is odd
}
or
if ( num % 2 == 1 )
{
    // Number is odd
}
(Please refer to the marked answer for a better example.)
When I was beginning to code, I knew that num % 2 == 0 implies the number is even, so I just put a ! there to check if it is odd. But he said: 'Don't check NOT conditions. Get into the practice of checking the trueness or equality of conditions whenever possible.' And he recommended that I use the second piece of code.
I am not for or against either but I just wanted to know - what difference does it make? Please don't reply 'Technically the output will be the same' - we ALL know that. Is it a general programming practice or is it his own programming practice that he is preaching to others?
NOTE: I used C#/C++ style syntax for no reason. My question is equally applicable when using the IsNot, <> operators in VB etc. So readability of the '!' operator is just one of the issues. Not THE issue.
The problem occurs when, later in the project, more conditions are added. One of the projects I'm currently working on has steadily collected conditions over time (and then some of those conditions were moved into Struts tags, then some to JSTL...). One negative isn't hard to read, but 5+ is a nightmare, especially when someone decides to reorganize and negate the whole thing. Maybe on a new project, you'll write:
if (authorityLvl != Admin) {
    doA();
} else {
    doB();
}
Check back in a month, and it's become this:
if (!(authorityLvl != Admin && authorityLvl != Manager)) {
    doB();
} else {
    doA();
}
Still pretty simple, but it takes another second.
Now give it another 5 to 10 years to rot.
(x%2!=0) certainly isn't a problem, but perhaps the best way to avoid the above scenario is to teach students not to use negative conditions as a general rule, in the hopes that they'll use some judgement before they do - because just saying that it could become a maintenance problem probably won't be enough motivation.
As an addendum, a better way to write the code would be:
userHasAuthority = (authorityLvl == Admin);
if (userHasAuthority) {
    doB();
} else {
    doA();
}
Now future coders are more likely to just add "|| authorityLvl==Manager", userHasAuthority is easier to move into a method, and even if the conditional is reorganized, it will only have one negative. Moreover, no one will add a security hole to the application by making a mistake while applying De Morgan's Law.
I will disagree with your old professor - checking for a NOT condition is fine as long as you are checking for a specific NOT condition. It actually meets his criteria: you would be checking that it is TRUE that a value is NOT something.
I grok what he means, though: mostly, the true conditions will be orders of magnitude fewer than the NOT conditions, and therefore easier to test for, since you are checking a smaller set of values.
I've had people tell me that it's to do with how "visible" the ping (!) character is when skim reading.
If someone habitually "skim reads" code - perhaps because they feel their regular reading speed is too slow - then the ! can be easily missed, giving them a critical misunderstanding of the code.
On the other hand, if someone actually reads all of the code all of the time, then there is no issue.
Two very good developers I've worked with (and respect highly) will each write == false instead of using ! for similar reasons.
The key factor in my mind is less to do with what works for you (or me!), and more with what works for the guy maintaining the code. If the code is never going to be seen or maintained by anyone else, follow your personal whim; if the code needs to be maintained by others, better to steer more towards the middle of the road. A minor (trivial!) compromise on your part now, might save someone else a week of debugging later on.
Update: On further consideration, I would suggest that factoring out the condition into a separate predicate function gives still greater maintainability:
if (isOdd(num))
{
    // Number is odd
}
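where the predicate could be defined as follows (one possible body; the post doesn't spell it out):

// Using != 0 rather than == 1 also keeps this correct for negative numbers,
// as the next answer points out.
static boolean isOdd(int num) {
    return num % 2 != 0;
}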
You still have to be careful about things like this:
if ( num % 2 == 1 )
{
    // Number is odd
}
If num is negative and odd, then depending on the language or implementation, num % 2 could equal -1. On that note, there is nothing wrong with checking for falseness if it simplifies at least the syntax of the check. Also, using != is clearer to me than !-ing the whole thing, as the ! may blend in with the parentheses.
To only check the trueness you would have to do:
if ( num % 2 == 1 || num % 2 == -1 )
{
    // Number is odd
}
That is just an example, obviously. The point is that if using a negation allows for fewer checks or makes the syntax of the checks clearer, then that is clearly the way to go (as with the above example). Locking yourself into checking for trueness does not suddenly make your conditional more readable.
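To make the negative-remainder point concrete, a quick check (Java shown here; C99 and C# truncate division the same way):

public class RemainderSign {
    public static void main(String[] args) {
        System.out.println(-3 % 2);        // -1: the remainder keeps the dividend's sign
        System.out.println(-3 % 2 == 1);   // false: the == 1 test misses negative odds
        System.out.println(-3 % 2 != 0);   // true: the != 0 test handles them
    }
}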
I remember hearing the same thing in my classes as well. I think it's more important to always use the more intuitive comparison, rather than always checking for the positive condition.
Really a very inconsequential issue. However, one downside to checking in this sense is that it only works for binary comparisons. If you were, for example, checking some property of a ternary numerical system, you would be limited.
Replying to Bevan (it didn't fit in a comment):
You're right. !foo isn't always the same as foo == false. Let's see this example, in JavaScript:
var foo = true,
    bar = false,
    baz = null;

foo == false; // false
!foo;         // false

bar == false; // true
!bar;         // true

baz == false; // false (!)
!baz;         // true
I also disagree with your teacher in this specific case. Maybe he was so attached to the generally good lesson to avoid negatives where a positive will do just fine, that he didn't see this tree for the forest.
Here's the problem. Today, you listen to him, and turn your code into:
// Print black stripe on odd numbers
int zebra(int num) {
    if (num % 2 == 1) {
        // Number is odd
        printf("*****\n");
    }
}
Next month, you look at it again and decide you don't like magic constants (maybe he teaches you this dislike too). So you change your code:
#define ZEBRA_PITCH 2

[snip pages and pages, these might even be in separate files - .h and .c]

// Print black stripe on non-multiples of ZEBRA_PITCH
int zebra(int num) {
    if (num % ZEBRA_PITCH == 1) {
        // Number is not a multiple of ZEBRA_PITCH
        printf("*****\n");
    }
}
and the world seems fine. Your output hasn't changed, and your regression test suite passes.
But you're not done. You want to support mutant zebras, whose black stripes are thicker than their white stripes. You remember from months back that you originally coded it such that your code prints a black stripe wherever a white stripe shouldn't be - on the non-even numbers. So all you have to do is to divide by, say, 3 instead of by 2, and you should be done. Right? Well:
#define DEFAULT_ZEBRA_PITCH 2

[snip pages and pages, these might even be in separate files - .h and .c]

// Print black stripe on non-multiples of pitch
int zebra(int num, int pitch) {
    if (num % pitch == 1) {
        // Number is odd
        printf("*****\n");
    }
}
Hey, what's this? You now have mostly-white zebras where you expected them to be mostly black!
The problem here is how you think about numbers. Is a number "odd" because it isn't even, or because when dividing by 2 the remainder is 1? Sometimes your problem domain will suggest a preference for one, and in those cases I'd suggest you write your code to express that idiom, rather than fixating on simplistic rules such as "don't test for negations".