Multi-parameter matching + weighted random pick in Redis

Multi-parameter matching + weighted random pick in Redis - redis

Let's say I have a set of objects with properties:
Object Quantity Color Shape Kind
----------------------------------------
APPLE 12 RED ROUND FRUIT
APPLE 3 GREEN ROUND FRUIT
ORANGE 6 ORANGE ROUND FRUIT
CARROT 0 RED CONICAL VEGETABLE
RADISH 24 RED ROUND VEGETABLE
Object and all properties except quantity are represented as strings. Quantity is a number.
I must compose a random list of objects, based on user's query.
Query contains values for all string properties (that is, all properties except quantity).
Value in query may be either exact property value, or a wildcard (meaning "any value would do for this property"), or a negation — "NOT this exact property value".
Query result is an object, picked by weighted random from all object with matching properties. Weight for the random pick is the quantity.
For example:
Query -> Probabilities -> Example
random result
-----------------------------------------------------------------------------
* ROUND FRUIT -> APPLE 12 / APPLE 3 = APPLE 15 -> APPLE
!GREEN ROUND FRUIT -> APPLE 12 / ORANGE 6 -> ORANGE
RED * * -> CARROT 0 / APPLE 12 / RADISH 24
= APPLE 12 / RADISH 24 -> RADISH
RED CONICAL VEGETABLE -> CARROT 0
= (none) -> (none)
For self-education purposes, I would like to build this system using Redis for data storage.
The question is — how to do this elegantly and with least amount of application logic (as opposed to in-Redis operations)? Weights and negation kind of spoil the picture. Otherwise it would be nicely doable with sets.
Any hints are welcome.

Since redis can only query keys and not values, a good option is to store the individual values of each object in seperate redis lists.
For example, when you add the object ...
APPLE 12 RED ROUND FRUIT
you would store it as
hmset obj:1 name apple qty 12 color red shape round kind fruit
and then ...
sadd name:apple obj:1,
sadd color:red obj:1
sadd shape:round obj:1
This way you have a way to interrogate sets directly and be able to pick the object using a random number based on, for example, the total number of items in the set returned.
Hope that helps. If you need more explanation, hit me up.

Related

Minimum number if Common Items in 2 Dynamic Stacks

I have a verbal algorithm question, thus I have no code yet. The question is this: How can I possibly create an algorithm such that I have 2 dynamic stacks, both can or can not have duplicate items of strings, for example I have 3 breads, 4 lemons and 2 pens in the first stack, say s1, and I have 5 breads, 3 lemons and 5 pens in the second stack, say s2. I want to find the number of duplicates in each stack, and print out the minimum number of duplicates in both lists, for example:
bread --> 3
lemon --> 3
pen --> 2
How can I traverse 2 stacks and print the number of duplicated occurrences until the end of stacks? If you are confused about anything, I can edit my question depending on your confusion. Thanks.

Re-writing an AVERAGEIFS statement into a STDEV statement

I am looking to rewrite my averageifs statement into a STDEV statement. I currently have an average if statement which looks for the current name "N" within the type "M", and finds the same type and name within columns "A" and "B", and will average the results "C" for those rows.
=AVERAGEIFS(C:C,A:A,M4,B:B,N4)
=AVERAGEIFS(C:C,A:A,M5,B:B,N5)
=AVERAGEIFS(C:C,A:A,M6,B:B,N6)
etc...
I would like to do the same with STDEV, however the inputs requirements are different as STDEVIFS, doesn't exist.
=STDEV(BG:BG,C:C,BL4,K:K,BM4) will give the dev of all the columns. How could I fix this to be the same as my averageifs statement, but for STDEV.
A B C M N O
x x x x x x
x x x x x x
Type Name Mass Type Name AVG Mass
Fruit Apple 3 Fruit Apple 4.25
Veggie Tomato 5 Fruit Orange 6.5
Veggie Lettuce 1 Veggie Tomato 6.333
Veggie Tomato 7 Veggie Lettuce 2.3333
Fruit Orange 6 Fruit Watermelon 5.5
Fruit Apple 5 Veggie Watermelon 4
Fruit Watermelon 5
Veggie Watermelon 3
Fruit Apple 3
Fruit Apple 6
Veggie Watermelon 5
Fruit Watermelon 6
Fruit Orange 7
Veggie Lettuce 3
Veggie Lettuce 3
Veggie Tomato 7
x = non included info

You would us an array form of STDEV with an IF() inside:
=STDEV(IF(($A$4:$A$19=M4)*($B$4:$B$19=N4),$C$4:$C$19))
Being an array formula it must be confirmed with Ctrl-Shift-Enter instead of enter when exiting edit mode. If done correctly the Excel will put {} around the formula.
Array formulas are different than normal formula in that they will calculate every thing in the referenced range and therefore full column references should not be used.
You can do as Ron stated in his answer and name the ranges to limit the references. Or you can use a table which will do the same or just limit the range as I have done here.

You can use an IF function to return either the appropriate value, or a Boolean FALSE which will be ignored. So, given your data sample, and example might be: (entered with ctrl+shift+enter as an array formula)
=STDEV(IF((Type=M5)*(Name=N5),Mass))
Type is the Named Range in Column A
Name is the Named Range in Column B
Mass is the Named Range in Column C
M5 contains the Type you are looking for
N5 contains the Name you are looking for
Note: You can use whole (or partial) column arguments instead of Named Ranges in the formulas above -- it would just be less efficient.

Google Spreadsheet with SQL query - finding best combination

I have a google spreadsheet for my gaming information. It contains 2 sheets - one for monster information, another for team.
Monster information sheet contains the attack value, defend value, and the mana cost of monsters. It's almost like a database of monsters that I can summon.
Team sheet does the following:
Asks for the amount of mana I currently have.
Computes a list of up to 5 monsters that I can summon (it can be less than 5).
Each monster has their own mana cost, therefore total mana cost mustn't exceed the amount of mana I have given in point 1.
The tabulated list should give me a team that have the highest combined attack value. It does not matter how many monsters are summoned. Each monster cannot be summoned twice though.
I have been thinking of using query() function so that I can make use of SQL statements. (so that I can hopefully retrieve the tabulated list directly)
Sample: Monster Info
A B C D
1 Monster Attack Defense Cost
2 MonA 1200 1200 35
3 MonB 1400 1300 50
... ...
Sample: Team
A B C D
1 Mana 120
2
3 Attack Team
4 Monster Attack Cost Total Attack
5 MonB 1400 50 1400
6 MonA 1200 35 2600
7 ... ...
I have these formula in "Team" sheet
A5: =query('Monster Info'!$A$:$D,"SELECT A,B,D ORDER BY B DESC LIMIT 5")
B5: =CONTINUE(A5, 1, 2)
C5: =CONTINUE(A5, 1, 3)
D5: =C5
A6: =CONTINUE(A5, 2, 1)
B6: =CONTINUE(A5, 2, 2)
C6: =CONTINUE(A5, 2, 3)
D6: =D5+C6
That only gets the 5 best attack monsters, regardless of the mana cost consideration. How do I do that such that it takes consideration of both attack value and mana cost value? There is another problem shown in the example below:
Example: (simplified version, without defense value etc)
Monster Attack Cost
MonA 1400 50
MonB 1200 35
MonC 1100 30
MonD 900 25
MonE 500 20
MonF 400 15
MonG 350 10
MonH 250 5
If I have 160 mana, then the obvious team is A+B+C+D+E (5100 Attack).
If I have 150 mana, it becomes A+B+C+D+G (4950 Attack).
If I have 140 mana, it becomes A+B+C+D (4600 Attack).
If I have 130 mana, it becomes B+C+D+E+F (4100 Attack using 125 mana) or A+B+C+F (4100 Attack using all 130 mana).
If I have 120 mana, it becomes B+C+D+E+G (4050 Attack).
If I have 110 mana, it becomes B+C+D+F+H (3850 Attack).
As you can see, there isn't really a pattern within the results.
Any expert willing to share their insights on this?

I've played with the problem for an hour and I only have a workaround here. Your problem seems to be a standard linear programming task which should can easily be solved by a "Solver" software. There used to be a so called "Solver" in google spreadsheet, but unfortunately it was removed from the newest version. If you are not insisting on Google solution, you should try it in one of the Solver-supported spreadsheet manager softwares.
I tried MS Office (it has a Solver add-in, installation guide: http://office.microsoft.com/en-001/excel-help/load-the-solver-add-in-HP010342660.aspx).
Before you run the solver, you should prepare your original dataset a bit, with helper columns and cells.
Add a new column next to the "Cost" column (let's assume it is column "D"), and under it put each row either 0, or 1. This column will tell you if a monster is selected to the attack team or not.
Add two more columns ("E" and "F" respectively). These columns will be products of the Attack and of the Cost respectively. So you should write a function to the E2 cell: =b2*d2, and for the F2 cell: =c2*d2. With this way if a monster is selected (which is told by the D column, remember), the appropriate E and F cells will be non zero values, aotherwise they will be 0.
Create a SUM row under the last row, and create a summarizing function for the D,E,F columns respectively. So in my spreadsheet D10 cell gets its value like this: =sum(d2:d9), and so on.
I created a spreadsheet to show these steps: https://docs.google.com/spreadsheets/d/1_7XRlupEEwat3CthSSz8h_yJ44MysK9hMsj0ijPEn18/edit?usp=sharing
Remember to copy this worksheet to an MS Office worksheet, before you start the Solver.
Now, you are ready to start the Solver. (Data menu, Solver in MS Office). You can see a video here on using the Solver: https://www.youtube.com/watch?v=Oyc0k9kiD7o
It's not that hard as it looks like, but for this case I'll describe what to write where:
Set Objective: you should select the "E10" cell, as that represents the sum of all the attack points.
Check "Max" radiobutton as we would like to maximize the value of the attacks.
By Changing variable cells: Select the "d2:d9" interval as those cells are representing whether a monster is selected or not. The solver will try to adjust these values (0, or 1) in order to maximise the sum attack.
Subject to the Contraints: Here we should add some constraints. Click on the Add button, and then:
First we should ensure that d2:d9 are all binary values. So "Cell reference" should be "d2:d9" and from the dropdown menu, select "bin" as binary.
Another constraint should be that the sum of the selected monsters should not exceed 5. So select the cell where the sum of the selected monsters is represented (D10) and add "<=" and the value "5"
Finally we cannot use more manna that we have, so select the cell in which you store the sum of used manna (F2), and "<=", and add the whole amount of manna we can spend in my case it's in the I2 cell).
Done. It should work, in my case it worked at least.
Hope it helps anyway.

Nearest Neighbor Search on large database table - SQL and/or ArcGis

Sorry for posting something that's probably obvious, but I don't have much database experience. Any help would be greatly appreciated - but remember, I'm a beginner :-)
I have a table like this:
Table.fruit
ID type Xcoordinate Ycoordinate Taste Fruitiness
1 Apple 3 3 Good 1,5
2 Orange 5 4 Bad 2,9
3 Apple 7 77 Medium 1,4
4 Banana 4 69 Bad 9,5
5 Pear 9 15 Medium 0,1
6 Apple 3 38 Good -5,8
7 Apple 1 4 Good 3
8 Banana 15 99 Bad 6,8
9 Pear 298 18789 Medium 10,01
… … … … … …
1000 Apple 1344 1388 Bad 5
… … … … … …
1958 Banana 759 1239 Good 1
1959 Banana 3 4 Medium 5,2
I need:
A table that gives me
The n (eg.: n=5) closest points to EACH point in the original table, including distance
Table.5nearest (please note that the distances are fake). So the resulting table has ID1, ID2 and distance between ID1 and ID2 (can't post images yet, unfortunately).
ID.Fruit1 ID.Fruit2 Distance
1 1959 1
1 7 2
1 2 2
1 5 30
1 14 50
2 1959 1
2 1 2
… … …
1000 1958 400
1000 Xxx Xxx
… … …
How can I do this (ideally with SQL/database management) or in ArcGis or similar? Any ideas?
Unfortunately, my table contains 15000 datasets, so the resulting table will have 75000 datasets if I choose n=5.
Any suggestions GREATLY appreciated.
EDIT:
Thank you very much for your comments and suggestions so far. Let me expand on it a little:
The first proposed method is sort of a brute-force scan of the whole table rendering huge filesizes or, likely, crashes, correct?
Now, the fruit is just a dummy, the real table contains a fix ID, nominal attributes ("fruit types" etc), X and Y spatial columns (in Gauss-Krueger) and some numeric attributes.
Now, I guess there is a way to code a "bounding box" into this, so the distances calculation is done for my point in question (let's say 1) and every other point within a square with a certain edge length. I can imagine (remotely) coding or querying for that, but how do I get the script to do that for EVERY point in my ID column. The way I understand it, this should either create a "subtable" for each record/point in my "Table.Fruit" containing all points within the square around the record/point with a distance field added - or, one big new table ("Table.5nearest"). I hope this makes some kind of sense. Any ideas? THanks again

To get all the distances between all fruit is fairly straightforward. In Access SQL (although you may need to add parentheses everywhere to get it to work :P):
select fruit1.id,
fruit2.id,
sqr(((fruit2.xcoordinate - fruit1.xcoordinate)^2) + ((fruit2.ycoordinate - fruit1.ycoordinate)^2)) as distance
from fruit as fruit1
join fruit as fruit2
on fruit2.id <> fruit1.id
order by distance;
I don't know if Access has the necessary sophistication to limit this to the "top n" records for each fruit; so this query, on your recordset, will return 225 million records (or, more likely, crash while trying)!

Thank you for your comments so far; in the meantime, I have gone for a pre-fabricated solution, an add-in for ArcGis called Hawth's Tools. This really works like a breeze to find the n closest neighbors to any point feature with an x and y value. So I hope it can help someone with similar problems and questions.
However, it leaves me with a more database-related issue now. Do you have an idea how I can get any DBMS (preferably Access), to give me a list of all my combinations? That is, if I have a point feature with 15000 fruits arranged in space, how do I get all "pure banana neighborhoods" (apple, lemon, etc.) and all other combinations?
Cheers and best wishes.

storing weight in a database table

I am playing around with learning MVC and want to create a recipe recorder application to store my recipes.
I am using .net with Sql Server 2008 R2 however I don't think that really matters with what I am trying to do.
I want to be able to record all of the measures I use. In my country we use metric however I want people to be able to use imperial with my application.
How do I structure my table to cope with the differences, I was thinking of storing all of the measurements as ints and have a foreign key to store the kind of weight.
Ideally I would like to be able to share the recipes between people and display the measurements in their preferred way.
Is this the right kind of way
IngredientID PK
Weight int
TypeOfWeight int e.g. tsp=1,tbl=2,kilogram=3,pound=4,litre=5,ounce=6 etc
UserID int
Or is this way off track? Any suggestions would be great!

I think you should store the weights (Kilo/Pound) etc as a single weight type (metric) and simply "display" them in the correct conversion using the user's preference. If the user has there weight settings set to Imperial, values entered into the system would need to be converted as well. This should simplify your data anyway.
Similar to Dates, you could store every date and what timezone it is from, or otherwise store all dates as the same (or no timezone) and then display them in the application using offsets according to the user's preference

If you are storing weights (a non-discrete value) I would strongly suggest using numeric or decimal for this data. You have the right idea with the typeofweight column. Store a reference table somewhere showing what the conversion ratio is for each (to a certain standard).
This gets quite tricky when you want to show ounces as TSP, because the conversion depends on the ingredient itself, so you need a 3rd table - ingredient: id, name, volume-to-weight ratio.
Example typeofweight table, where the standard unit is grams
type | conversion
gram | 1
ounce | 28.35
kg | 1000
tsp | 5 // assuming that 1 tsp = 5 grams of water
pound | 453.59
Example ingredient volume to weight conversion
type | vol-to-weight
water | 1
sugar | 1.4 // i.e. 1 tsp holds 5g of water, but 7g of sugar
So to display 500 ounces of sugar in tsp, you would use the formula
units x ounce.conversion x sugar.vol-to-weight
= 500 x 28.35 x 1.4
Another example with 2 weights
Ingredient is specified as 3 ounces of starch. Show in grams
= 3 x 28.35 (straightforward isn't it)
or
Ingredient is specified as 3 ounces of starch. Show in pounds
= 3 * 28.35 / 453.59

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas