Query Search Algorithm using priority array and ignoring conditions

I am facing this problem in my company's logistics:
Given array of mixed values
arr = [
v1 => a,
v2 => b,
v3 => c,
v4 => d
]
ordered by priority ASC (v1 is more important than v2, ... etc)
I need to search values in a table t like this:
select * from t where
... And
t.v1 = a And
t.v2 = b And
t.v3 = c And
t.v4 = d
The three dots in the query stand for the fixed conditions.
If the query returns no rows, perform the same query ignoring the least important value from the array:
select * from t where
... And
t.v1 = a And
t.v2 = b And
t.v3 = c
Again, if nothing is found, perform the query ignoring the next least important value:
select * from t where
... And
t.v1 = a And
t.v2 = b And
t.v4 = d
As the last attempt, the query can ignore all elements of the array:
select * from t where ...
Repeat this until a query returns at least one row, or until all elements of the array have been ignored and still nothing was found (return false).
I implemented my search using binary numbers. It works like this:
binaryValue = (2 ^ arr.length) - 1 // in this case: 2^4 - 1 = 15
Convert 15 to an array of booleans: exploded = [1, 1, 1, 1] (15 in binary is 1111).
Then build the query: if position 0 of exploded is true, include the first value of arr as a condition, and so on for the other positions.
Loop from 15 down to 0 (iteration 0 is when the boolean array is [0,0,0,0] and every value is ignored).
The sequence will be like this with this logic:
[a, b, c, d] // [1,1,1,1] = 15 (select * from t where ... and v1=a and v2=b and v3=c and v4=d)
[a, b, c] // [1,1,1,0] = 14 (select * from t where ... and v1=a and v2=b and v3=c)
[a, b, d] // [1,1,0,1] = 13 (select * from t where ... and v1=a and v2=b and v4=d)
...
[] // [0,0,0,0] = 0 (select * from t where ...)
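The mask loop described above can be sketched in Python as follows. This is only an illustration under assumptions: the function names `candidate_filters` and `where_clause`, the `?` placeholder style and the `t.` column prefix are made up here, and the fixed conditions (the `...`) are left out.

```python
def candidate_filters(values):
    """Yield the condition sets to try, most specific first.

    `values` maps column names to wanted values, most important first.
    """
    names = list(values)
    n = len(names)
    # Count down from 2**n - 1 (all conditions) to 0 (no conditions).
    for mask in range(2 ** n - 1, -1, -1):
        # The most significant bit belongs to the most important column.
        yield {
            name: values[name]
            for i, name in enumerate(names)
            if mask & (1 << (n - 1 - i))
        }


def where_clause(conditions):
    """Build the optional part of the WHERE clause with ? placeholders."""
    return " AND ".join(f"t.{name} = ?" for name in conditions)


filters = list(candidate_filters({"v1": "a", "v2": "b", "v3": "c", "v4": "d"}))
print(where_clause(filters[1]))  # t.v1 = ? AND t.v2 = ? AND t.v3 = ?
```

The caller would run one query per yielded dictionary and stop at the first one that returns rows.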
The search algorithm works fine, but I have a huge performance problem.
The number of queries performed in this search is arr.length! (factorial). So when the array has length 4, the worst case performs 24 queries.
If arr.length is 6 (the length I am dealing with now in production), the worst case performs 720 queries, which is unacceptable.
I need a way to improve this search. Can someone help me?
Thanks in advance.

I arrived at a solution using the concept of row desirability suggested by #RBarryYoung.
Instead of performing multiple queries, I fetch all relevant rows into a DataTable (one query) and give each row a score based on its desirability (on the code side).
Dim dicFields As New Dictionary(Of String, String) From { _
{"v1", a}, _
{"v2", b}, _
{"v3", c}, _
{"v4", d}
}
Dim intScore As Integer = dicFields.Keys.Count
For Each pair As KeyValuePair(Of String, String) In dicFields
lstRow _
.Where(Function(p) p.Item(pair.Key) = pair.Value) _
.ToList() _
.ForEach(Sub(p) p.Item("__Score") += intScore)
intScore -= 1
Next
return lstRow _
.OrderByDescending(Function(p) p.Item("__Score")) _
.FirstOrDefault()
This reduced the work from n! queries to just 1 query and n filters. The system is up and running smoothly. Thanks for the help #RBarryYoung
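For comparison, here is a minimal Python sketch of the same single-pass scoring idea. It is not the code above: the names `best_row` and `wanted` are hypothetical, rows are assumed to be plain dictionaries, and the weights are powers of two rather than n, n-1, ..., 1. With linear weights, a row matching only v2 and v3 (3 + 2 = 5) would outrank a row matching only v1 (4); power-of-two weights keep the ranking identical to the order of the binary masks in the question.

```python
def best_row(rows, wanted):
    """Return the row whose matches rank highest, or None if rows is empty.

    `wanted` maps column names to values, most important first.
    """
    names = list(wanted)
    n = len(names)

    def score(row):
        # A match on the i-th most important column sets bit n-1-i,
        # so the score equals the binary mask of matched columns.
        return sum(
            1 << (n - 1 - i)
            for i, name in enumerate(names)
            if row.get(name) == wanted[name]
        )

    return max(rows, key=score, default=None)


wanted = {"v1": "a", "v2": "b", "v3": "c", "v4": "d"}
rows = [
    {"id": 1, "v1": "z", "v2": "b", "v3": "c", "v4": "d"},  # mask 0111 = 7
    {"id": 2, "v1": "a", "v2": "x", "v3": "x", "v4": "x"},  # mask 1000 = 8
]
print(best_row(rows, wanted)["id"])  # 2
```

A row with score 0 matches none of the optional values, which corresponds to the last query in the question (fixed conditions only).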

Related

Fortran indexing similar as Matlab

I have a 60000x8 matrix T on which I want to perform a sorting operation.
In Matlab I can create a sub-matrix by picking out the rows that have the same value in column 8.
a1 = max(T(:,8)); x = [1.0:1:a1];
for i = 1.0:1:a1
T1 = T(T(:, 8)== x(i), :);
end
This works perfectly and does the job.
But I want to perform a similar operation using Fortran.
I have tried the following:
read(7,*) *(height(i,j),j=1,8)
k=maxval(height(:,8))
a1 = int(height(:,8))
allocate(x(k), T1(k,8))
do i=1,k
x(i) = i
end do
do i = 1, k
T1 = height((a1(i)== x(i)),:)
end do
Compiling this gives me the error:
Error: Array index at (1) must be of INTEGER type, found LOGICAL

Fortran is not Matlab ;) Matlab can extract subarrays using logical (boolean) masks; Fortran cannot. However, Fortran does support indexing with a vector of integer subscripts.
Your code has several flaws, and I have to assume that height and T1 are real arrays. You can obtain the desired result (at least as I understand it) with:
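integer :: i
integer, allocatable :: a1(:), x(:), idx(:)
real, allocatable :: T1(:,:)
a1 = nint(height(:,8))          ! column 8 rounded to integers
x = [(i, i=1,size(a1))]         ! 1, 2, ..., number of rows
idx = pack(x, mask=(a1 == x))   ! row numbers i where a1(i) == i
T1 = height(idx,:)              ! keep only those rows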
Explanation, for instance:
a1 : [4 2 3 1 5]
x : [1 2 3 4 5]
(a1 == x) : [F T T F T]
idx : [ 2 3 5] ! 3 elements
T1 will be made of rows 2, 3 and 5 of height
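The worked example can be replayed in a few lines of Python (chosen only for illustration; plain lists stand in for the Fortran arrays, and `matching_rows` is a hypothetical helper name). It mimics what `pack` does with the mask.

```python
def matching_rows(a1):
    """Return the 1-based positions i where a1[i] equals i, like pack(x, a1 == x)."""
    x = range(1, len(a1) + 1)
    mask = [value == position for value, position in zip(a1, x)]
    return [position for position, keep in zip(x, mask) if keep]


a1 = [4, 2, 3, 1, 5]
idx = matching_rows(a1)
print(idx)  # [2, 3, 5]

# Selecting rows by an index list, the analogue of height(idx, :).
height = [[10 * row + col for col in range(1, 9)] for row in range(1, 6)]
t1 = [height[i - 1] for i in idx]
```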

i am new, program gives error "there are no type variables left in list"

How the game works is that there is a 3-digit number, and you have to guess it. If you guess a digit in the right spot, you get a strike, and if you guess a digit but in the wrong spot you get a ball. I've coded it like this.
x = random.randint(1, 9)
y = random.randint(1, 9)
z = random.randint(1, 9)
userguessunlisted = input('What number do you want to guess?')
numbertoguess = list[x, y, z]
userguess = list(userguessunlisted)
b = 0
s = 0
while 0 == 0:
    if userguess[0] == numbertoguess[0]:
        s = s + 1
    if userguess[0] == numbertoguess[1]:
        b = b + 1
    if userguess[0] == numbertoguess[2]:
        b = b + 1
    if userguess[1] == numbertoguess[0]:
        b = b + 1
    if userguess[1] == numbertoguess[1]:
        s = s + 1
    if userguess[1] == numbertoguess[2]:
        b = b + 1
    if userguess[2] == numbertoguess[0]:
        b = b + 1
    if userguess[2] == numbertoguess[1]:
        b = b + 1
    if userguess[2] == numbertoguess[2]:
        s = s + 1
    print(s + "S", b + "B")
    if s != 3:
        b = 0
        s = 0
    else:
        print('you win!')
        break

When you wrote list[x, y, z] on line 5, you used square brackets directly after list, which Python interprets as type-annotation syntax (parameterizing the list type). For example, if I wanted to specify that a variable is a list of ints, I could say
my_list_of_ints: list[int] = [1, 2, 3]
I think what you meant to do is create a new list from x, y, and z. One way to do this is
numbertoguess = list([x, y, z])
which is probably what you meant to write. This is valid because the list function takes an iterable as its one and only argument.
However, the list portion is redundant; square brackets on the right-hand side of an assignment statement already means "create a list with this content," so instead you should simply say
numbertoguess = [x, y, z]
A few other notes:
1. input will return a string, but you are comparing that string to integers further down, so none of the comparisons will ever be true. What you want to say is something like the following:
while True:
    try:
        userguessunlisted = int(input('What number do you want to guess?'))
    except ValueError:
        continue
    break
What this code does is attempt to parse the string returned from input into an int. If it fails to do so, which would happen if the user entered something other than a valid integer, an exception would be thrown, and the except block would be entered. continue means go to the top of the loop, so the input line runs repeatedly until a valid int is entered. When that happens, the except block is skipped, so break runs, which means "exit the loop."
2. userguessunlisted is only ever going to contain 1 number as written, so userguess will be a list of length 1, and all of the comparisons using userguess[1] and userguess[2] will throw an IndexError. Try to figure out how to wrap the code from (1) in another loop to gather multiple guesses from the user. Hint: use a for loop with range.
It might also be that you meant for the user to input a 3-digit number all at once. In that case, you can use a list comprehension to grab each character from the input and parse it into a separate int. This is probably a bit complicated for a beginner, so I'll help you out:
[int(char) for char in input('What number do you want to guess?')]
3. print(s + "S", b + "B") will throw TypeError: unsupported operand type(s) for +: 'int' and 'str'. There are lots of ways to combine non-string types with strings, but the most modern way is using f-strings. For example, to combine s with "S", you can say f"{s}S".
4. When adding some amount to a variable, instead of saying e.g. b = b + 1, you can use the += operator to more concisely say b += 1.
5. It's idiomatic in Python to use snake_case for variables and PascalCase for classes. So instead of writing e.g. numbertoguess, you should use number_to_guess. This makes your code more readable and familiar to other Python programmers.
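Putting these notes together, the scoring part of the game could look roughly like the sketch below. The function names `parse_guess`, `score_guess` and `play` are made up for this example, and it assumes the user types all three digits at once; the counting rules are the same nine comparisons as in the question.

```python
import random


def parse_guess(text):
    """Turn a string such as '123' into [1, 2, 3]; raise ValueError otherwise."""
    if len(text) != 3:
        raise ValueError("expected exactly three digits")
    return [int(char) for char in text]


def score_guess(secret, guess):
    """Count strikes (right digit, right spot) and balls (right digit, wrong spot)."""
    strikes = sum(1 for i in range(3) if guess[i] == secret[i])
    balls = sum(
        1
        for i in range(3)
        for j in range(3)
        if i != j and guess[i] == secret[j]
    )
    return strikes, balls


def play():
    """Ask for guesses until the secret number is found."""
    secret = [random.randint(1, 9) for _ in range(3)]
    while True:
        try:
            guess = parse_guess(input("What number do you want to guess? "))
        except ValueError:
            continue  # not three digits: ask again
        strikes, balls = score_guess(secret, guess)
        print(f"{strikes}S {balls}B")
        if strikes == 3:
            print("you win!")
            break
```

Calling `play()` starts the game; keeping the scoring in its own function makes it easy to check without typing guesses by hand.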
Happy coding!

Using factorials to find combinations

So, this is a two part question but based on the same project. I am trying to write a small program that can illustrate how a computer can quickly crack a password, using a brute force attack. It only has three inputs: A check box to denote if it should use integers, a check box to denote if it should use letters, and a textbox to enter the password to be cracked. It then outputs the number of combinations. Here is my code:
dim a,b,c,d,P as double
'Using the following formula:
'P(n,r) = n!/(r!(n-r)!)
'Let's assume we are just using numbers, so n = 10
'r = the count of characters in the textbox.
a = factorial(n)
b = factorial(r)
c = (n - r)
d = factorial(c)
P = a / (b * d)
Output = "With a password of " & r & " characters and " & n & " possible values, the number of combinations are " & P
Me.RichTextBox1.Text = Output & vbCrLf
Function factorial(ByVal n As Integer) As Integer
If n <= 1 Then
Return 1
Else
Return factorial(n - 1) * n
End If
End Function
So, let's assume I'm only looking at the characters 0-9, with the following number of characters in a password, I get:
P(10,1) = 10!/(1! * (10-1)!) = 10
P(10,2) = 10!/(2! * (10-2)!) = 45
P(10,3) = 10!/(3! * (10-3)!) = 120
P(10,4) = 10!/(4! * (10-4)!) = 210
P(10,5) = 10!/(5! * (10-5)!) = 252
P(10,6) = 10!/(6! * (10-6)!) = 210
P(10,7) = 10!/(7! * (10-7)!) = 120
You can see the number of combinations goes down, once it gets past 5. I assume this is right, but wanted to check before I present this. Is this because the total number in the pool remains the same, while the sample increases?
My second question is about how to handle a password that repeats digits. Again, let's assume that we are just pulling from the digits 0-9. If the sample size is two (let's say 15), then there are 45 possible combinations, right? But what if they put in 55? Are there still 45 combinations? I suppose the computer still needs to iterate over every possible combination, so it would still be considered 45 possibilities?
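A small Python check may help with both questions (Python is used here only for illustration; `math.comb` needs Python 3.8 or later). The formula in the code above counts combinations: unordered selections without repetition, which is why the values are symmetric and fall again after r = 5. A brute-force search over passwords is a different count: order matters and digits may repeat, so a password of r characters over n symbols has n^r candidates.

```python
import math

n = 10  # digits 0-9

for r in range(1, 8):
    combos = math.comb(n, r)   # n! / (r! (n - r)!)
    candidates = n ** r        # ordered strings with repetition allowed
    print(r, combos, candidates)

# Symmetry: choosing r items is the same as choosing the n - r left out.
assert math.comb(10, 3) == math.comb(10, 7) == 120

# Both "15" and "55" are among the 10 ** 2 = 100 two-digit candidates.
two_digit = [f"{i:02d}" for i in range(10 ** 2)]
assert "15" in two_digit and "55" in two_digit
```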

Shuffle data in a repeatable way (ability to get the same "random" order again)

This is the opposite of what most "random order" questions are about.
I want to select data from a database in random order. But I want to be able to repeat certain selects, getting the same order again.
Current (random) select:
SELECT custId, rand() as random from
(
SELECT DISTINCT custId FROM dummy
)
Using this, every key/row gets a random number. Ordering those ascending results in a random order.
But I want to repeat this select, getting the very same order again. My idea is to calculate a random number (r) once per session (e.g. "4") and use this number to shuffle the data in some way.
My first idea:
SELECT custId, custId * 4 as random from
(
SELECT DISTINCT custId FROM dummy
)
(in real life "4" would be something like 4005226664240702)
This results in a different number for each line but the same ones every run. By changing "r" to 5 all numbers will change.
The problem is: multiplication is not sufficient here. It just increases the numbers but keeps the order the same. Therefore I need some other kind of arithmetic function.
More abstract
Starting with my data (A-D). k is the key and r is the random number currently used:
k r
A = 1 4
B = 2 4
C = 3 4
D = 4 4
Doing some calculation using k and r in every line I want to get something like:
k r
A = 1 4 --> 12
B = 2 4 --> 13
C = 3 4 --> 11
D = 4 4 --> 10
The numbers can be whatever they want, but when I order them ascending I want to get a different order than the initial one. In this case D, C, A, B.
Setting r to 7 should result in a different order (C, A, B, D):
k r
A = 1 7 --> 56
B = 2 7 --> 78
C = 3 7 --> 23
D = 4 7 --> 80
Every time I use r = 7 should result in the same numbers => same order.
I'm looking for a mathematical function to do the calculation with k and r. Seeding the RAND() function is not suitable because it is not available in some of the databases we support.
Please note that r is already a randomly generated number.
Background
One table, two data consumers. One consumer will get a random 5% of the table, the other one the remaining 95%. They don't just get the data but a generated SQL statement. So there are two SQL statements which must not select the same data twice but must still be random.

You could try and implement the Multiply-With-Carry PseudoRandomNumberGenerator. The C version goes like this (source: Wikipedia):
m_w = <choose-initializer>; /* must not be zero, nor 0x464fffff */
m_z = <choose-initializer>; /* must not be zero, nor 0x9068ffff */
uint get_random()
{
m_z = 36969 * (m_z & 65535) + (m_z >> 16);
m_w = 18000 * (m_w & 65535) + (m_w >> 16);
return (m_z << 16) + m_w; /* 32-bit result */
}
In SQL, you could create a table Random, with two columns to contain w and z, and one ID column to identify each session. Perhaps your vendor supports variables and you need not bother with the table.
Nonetheless, even if we use a table, we immediately run into trouble because ANSI SQL doesn't support unsigned INTs. In SQL Server I could switch to BIGINT; I am unsure whether your vendor supports that.
CREATE TABLE Random (ID INT, [w] BIGINT, [z] BIGINT)
Initialize a new session, say number 3, by inserting 1 into z and the seed into w:
INSERT INTO Random (ID, w, z) VALUES (3, 8921, 1);
Then each time you wish to generate a new random number, do the computations:
UPDATE Random
SET
z = (36969 * (z % 65536) + z / 65536) % 4294967296,
w = (18000 * (w % 65536) + w / 65536) % 4294967296
WHERE ID = 3
(Note how I have replaced the bitwise operators with div and mod operations and how, after computing, you need to mod 4294967296 to stay within the proper 32-bit unsigned int range.)
And select the new value:
SELECT(z * 65536 + w) % 4294967296
FROM Random
WHERE ID = 3
SQLFiddle demo
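For reference, the same generator can be written in a few lines of Python, which makes the 32-bit wrap-around explicit. This is a sketch only; the class name `MultiplyWithCarry` is made up here, and the seed values are the ones from the INSERT above.

```python
MASK32 = 0xFFFFFFFF  # keep results in the unsigned 32-bit range


class MultiplyWithCarry:
    """Multiply-with-carry generator using the constants from the C version."""

    def __init__(self, w, z):
        # Neither seed should be zero (see the comments in the C version).
        self.w = w & MASK32
        self.z = z & MASK32

    def next(self):
        self.z = (36969 * (self.z & 0xFFFF) + (self.z >> 16)) & MASK32
        self.w = (18000 * (self.w & 0xFFFF) + (self.w >> 16)) & MASK32
        return ((self.z << 16) + self.w) & MASK32


first = MultiplyWithCarry(w=8921, z=1)
second = MultiplyWithCarry(w=8921, z=1)
run_a = [first.next() for _ in range(5)]
run_b = [second.next() for _ in range(5)]
print(run_a == run_b)  # True: same seed, same sequence
```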

Not sure if this applies outside SQL Server, but typically when you use a RAND() function, you can specify a seed. Every time you specify the same seed, the randomization will be the same.
So, it sounds like you just need to store the seed number and use that each time to get the same set of random numbers.
MSDN Article on RAND
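The effect of a stored seed is easy to demonstrate outside the database. The Python sketch below is only an analogy to a seeded RAND(): the function name `seeded_order` and the sample ids are invented for the example.

```python
import random


def seeded_order(ids, seed):
    """Return ids in a pseudo-random order that is fully determined by seed."""
    rng = random.Random(seed)  # private generator, does not touch global state
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return shuffled


cust_ids = [101, 102, 103, 104, 105]
print(seeded_order(cust_ids, 4) == seeded_order(cust_ids, 4))  # True
```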

Each vendor has solved this in its own way. Creating your own implementation will be hard, since random number generation is difficult.
Oracle
dbms_random can be initialized with a seed: http://docs.oracle.com/cd/B19306_01/appdev.102/b14258/d_random.htm#i998255
SQL Server
First call to RAND() can provide a seed: http://technet.microsoft.com/en-us/library/ms177610.aspx
MySql
First call to RAND() can provide a seed: http://dev.mysql.com/doc/refman/4.1/en/mathematical-functions.html#function_rand
Postgresql
Use SET SEED or SELECT setseed() : http://www.postgresql.org/docs/8.3/static/sql-set.html

Random Number Generation to Memory from a Distribution using VBA

I want to generate random numbers from a selected distribution in VBA (Excel 2007).
I'm currently using the Analysis Toolpak with the following code:
Application.Run "ATPVBAEN.XLAM!Random", "", A, B, C, D, E, F
Where
A = how many variables that are to be randomly generated
B = number of random numbers generated per variable
C = number corresponding to a distribution
1= Uniform
2= Normal
3= Bernoulli
4= Binomial
5= Poisson
6= Patterned
7= Discrete
D = random number seed
E = parameter of distribution (mu, lambda, etc.) depends on choice for C
(F) = additional parameter of distribution (sigma, etc.) depends on choice for C
But I want to have the random numbers be generated into an array, and NOT onto a sheet.
I understand that where the "" is designates where the random numbers should be printed to, but I don't know the syntax for assigning the random numbers to an array, or some other form of memory storage instead of to a sheet.
I've tried following the syntax discussed at this Analysis Toolpak site, but have had no success.
I realize that VBA is not the ideal place to generate random numbers, but I need to do this in VBA. Any help is much appreciated! Thanks!

Using the inbuilt functions is the key. There is a corresponding worksheet function for each of these distributions except Poisson. In my solution I use an algorithm presented by Knuth to generate a random number from the Poisson distribution.
For Discrete or Patterned you obviously have to write your custom algorithm.
Regarding the seed you can place a Randomize [seed] before filling your array.
Function RandomNumber(distribution As Integer, Optional param1 = 0, Optional param2 = 0)
    Select Case distribution
        Case 1 'Uniform
            RandomNumber = Rnd()
        Case 2 'Normal: param1 = mean, param2 = standard deviation
            RandomNumber = Application.WorksheetFunction.NormInv(Rnd(), param1, param2)
        Case 3 'Bernoulli: 1 with probability param1
            RandomNumber = IIf(Rnd() < param1, 1, 0)
        Case 4 'Binomial: param1 = trials, param2 = success probability
            RandomNumber = Application.WorksheetFunction.Binom_Inv(param1, param2, Rnd())
        Case 5 'Poisson: param1 = lambda
            RandomNumber = RandomPoisson(param1)
        Case 6 'Patterned: not implemented
            RandomNumber = 0
        Case 7 'Discrete: not implemented
            RandomNumber = 0
    End Select
End Function
Function RandomPoisson(ByVal lambda As Double) As Long 'Algorithm by Knuth
    Dim l As Double, p As Double, k As Long
    l = Exp(-lambda)
    k = 0
    p = 1
    Do
        k = k + 1
        p = p * Rnd()
    Loop While p > l
    RandomPoisson = k - 1
End Function
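To show the "fill an array in memory" step the question asks about, here is the same Knuth algorithm as a Python sketch (Python only for illustration; `random_poisson` and `poisson_sample` are hypothetical names). A seeded `random.Random` plays the role of `Randomize [seed]`.

```python
import math
import random


def random_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def poisson_sample(count, lam, seed):
    """Return a list of `count` variates; the same seed gives the same list."""
    rng = random.Random(seed)
    return [random_poisson(lam, rng) for _ in range(count)]


numbers = poisson_sample(1000, 3.0, seed=42)
print(sum(numbers) / len(numbers))  # sample mean, expected to be near lam
```

The values live only in the list `numbers`; nothing is written to a sheet or file.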

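Why not use the inbuilt functions?
Uniform = Rnd
Normal = WorksheetFunction.NormInv(Rnd(), mean, sd)
Bernoulli = IIf(Rnd() < p, 1, 0)
Binomial = WorksheetFunction.CritBinom(n, p, Rnd()) (BinomDist only returns probabilities)
Poisson = WorksheetFunction.Poisson returns probabilities only, so it has to be inverted by hand
Patterned = For ... Next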
Discrete =
Select Case Rnd()
    Case Is < 0.1
        'choice 1
    Case 0.1 To 0.4
        'choice 2
    Case Is > 0.4
        'choice 3
End Select
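The Select Case idea generalizes to any discrete distribution by walking the cumulative probabilities. A minimal Python sketch, with the hypothetical helper `pick_discrete` and made-up probabilities 0.1, 0.3 and 0.6 that match the thresholds above:

```python
import random


def pick_discrete(choices, probabilities, u):
    """Return the choice whose cumulative probability interval contains u."""
    cumulative = 0.0
    for choice, probability in zip(choices, probabilities):
        cumulative += probability
        if u < cumulative:
            return choice
    return choices[-1]  # guard against rounding when u is very close to 1


choices = ["choice 1", "choice 2", "choice 3"]
probabilities = [0.1, 0.3, 0.6]
sample = [pick_discrete(choices, probabilities, random.random()) for _ in range(10)]
```

Passing the uniform number `u` in as an argument keeps the function deterministic, so the thresholds can be checked with fixed values.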
