PostgreSQL Combinations - sql

I'm trying to figure out how to find all possible combinations (using SQL) for the following situation:
I have 100 ping pong balls in a bowl (id = 1...100)
Each ball is one of 4 colors (color = red, green, blue, yellow)
I want to pick 9 balls (without replacement) as follows:
1 red ball
2 green balls
3 blue balls
2 yellow balls
1 ball that is green, blue, or yellow
How can I determine all possible combinations using SQL as efficiently as possible?
Below is the best I could come up with, but I don't want order to matter (combinations) and I want no replacement:
SELECT pick1.id, pick2.id, pick3.id, pick4.id, pick5.id, pick6.id, pick7.id, pick8.id, pick9.id
FROM bowl AS pick1, bowl AS pick2, bowl AS pick3, bowl AS pick4, bowl AS pick5, bowl AS pick6,
bowl AS pick7, bowl AS pick8, bowl AS pick9
WHERE
pick1.color = 'red' AND
pick2.color = 'green' AND
pick3.color = 'green' AND
pick4.color = 'blue' AND
pick5.color = 'blue' AND
pick6.color = 'blue' AND
pick7.color = 'yellow' AND
pick8.color = 'yellow' AND
(pick9.color = 'green' OR
pick9.color = 'blue' OR
pick9.color = 'yellow')

I haven't tried this on an actual PostgreSQL server, but here is an idea.
First I would codify the colors in integers:
0 = red
1 = green
2 = blue
3 = yellow
Now, for example, I want to draw 3 balls: 1 red, 1 green, and 1 green or yellow. The corresponding color codes, after sorting, will be used as a filter in the where clause of the final SQL statement:
[0, 1, 1]
[0, 1, 3]
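Then the not in (...) predicates ensure that no id is repeated, and the sorted array of colors is limited to the set that we specified above. Note that this still returns each combination once per ordering of its ids; requiring p1.id < p2.id and p2.id < p3.id instead of the not in (...) predicates would return each combination exactly once.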
CREATE EXTENSION intarray;
select p1.id, p2.id, p3.id
from bowl as p1
cross join bowl as p2
cross join bowl as p3
where
p2.id not in (p1.id)
and p3.id not in (p1.id, p2.id)
-- color is assumed to hold the integer codes listed above
and sort(array[p1.color, p2.color, p3.color]) in (
array[0,1,1],
array[0,1,3]
)
The intarray extension is needed for the sort() function.
A variation not involving array[] nor the intarray extension is also possible as long as you list out all desired combinations of colors in the IN (..) predicate. See link.
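To check the SQL against a small bowl, the same filter can be sketched outside the database. The Python below is only an illustration under assumptions: the `bowl` dict, the `allowed` patterns and the helper name `color_combinations` are made up for the example, and colors use the integer codes above. It relies on `itertools.combinations`, so each set of ids is produced once.

```python
from itertools import combinations

def color_combinations(bowl, allowed):
    """Yield id tuples whose sorted color codes match one of the allowed patterns."""
    allowed = {tuple(sorted(p)) for p in allowed}
    size = len(next(iter(allowed)))  # all patterns are assumed to have the same length
    for ids in combinations(sorted(bowl), size):
        if tuple(sorted(bowl[i] for i in ids)) in allowed:
            yield ids

# Hypothetical bowl: id -> color code (0 = red, 1 = green, 2 = blue, 3 = yellow)
bowl = {1: 0, 2: 1, 3: 1, 4: 3, 5: 2}
# 1 red, 1 green, and 1 green or yellow
allowed = [(0, 1, 1), (0, 1, 3)]
result = list(color_combinations(bowl, allowed))
print(result)  # [(1, 2, 3), (1, 2, 4), (1, 3, 4)]
```

For 100 balls and 9 picks the number of candidate tuples is very large either way, so this is meant for verifying the logic on small inputs, not as a faster alternative.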

Related

How to sort the data row wise in spark

I need to sort the data row-wise in Spark. Below are the input and the expected output.
Input --->
cat,black,dog,apple,red
zoo,apple,red,blue,green
apple,green,zoo,black,walk
Output --->
apple,black,cat,dog,red
apple,blue,green,red,zoo
apple,black,green,walk,zoo
If you have one column per word, you first need to gather them all in one column with the function array. Then you can use the function array_sort:
scala> df.withColumn("as_list", array_sort(array(df.columns.map(col): _*))).show(truncate=false)
+-----+-----+---+-----+-----+--------------------------------+
|_c0 |_c1 |_c2|_c3 |_c4 |as_list |
+-----+-----+---+-----+-----+--------------------------------+
|cat |black|dog|apple|red |[apple, black, cat, dog, red] |
|zoo |apple|red|blue |green|[apple, blue, green, red, zoo] |
|apple|green|zoo|black|walk |[apple, black, green, walk, zoo]|
+-----+-----+---+-----+-----+--------------------------------+
If your words are already in a list, you can directly use array_sort:
scala> df.withColumn("words", array_sort(col("words"))).show(truncate=false)
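To sanity-check the expected output without a Spark session, the row-wise sort can be mimicked in plain Python. This is only a sketch of what sorting each row's words looks like for the sample data, not Spark code; `sort_rows` is a hypothetical helper name.

```python
def sort_rows(lines):
    """Sort the comma-separated words of each line independently."""
    return [",".join(sorted(line.split(","))) for line in lines]

rows = [
    "cat,black,dog,apple,red",
    "zoo,apple,red,blue,green",
    "apple,green,zoo,black,walk",
]
sorted_rows = sort_rows(rows)
for row in sorted_rows:
    print(row)
```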

How to represent samples that can belong to multiple categories of a categorical feature

For example, say I have these data points:
feature: color class
---------------------------------------------
red, green A
yellow, orange B
blue, green, red A
yellow B
The categorical feature column would be [red, blue, green, yellow, orange], but each sample can belong to multiple categories (such as (red, green)).
One approach would be to represent each category (color) as its own column, and then perform a binary encoding on top of that (1 or 0 for true or false).
Would this be the best approach in Tensorflow, or is there a better way to do this?
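The one-column-per-color idea described above is often called multi-hot encoding. A minimal plain-Python sketch, assuming the vocabulary order below and a hypothetical helper `multi_hot` (no TensorFlow API is used here):

```python
VOCAB = ["red", "blue", "green", "yellow", "orange"]

def multi_hot(colors, vocab=VOCAB):
    """Return a 0/1 vector with a 1 for every color present in the sample."""
    present = set(colors)
    unknown = present - set(vocab)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [1 if c in present else 0 for c in vocab]

samples = [
    ["red", "green"],
    ["yellow", "orange"],
    ["blue", "green", "red"],
    ["yellow"],
]
encoded = [multi_hot(s) for s in samples]
for s, row in zip(samples, encoded):
    print(s, row)
```

Each row has the same length as the vocabulary, so the result can be fed to a model as a fixed-width numeric feature.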

How to make points one color when a third column equals zero, and another color otherwise, in Gnuplot?

I need to vary the point color for each row based on the value in one column. The data:
# x y z
1, 3, 0
1, 5, 6
3, 5, 2
4, 5, 0
The points should be one color if the value in the third column is zero and a different color if it is non-zero.
So, I'm assuming:
plot "./file.dat" u 1:2:3 with points palette
as found here: https://stackoverflow.com/a/4115001 will not quite work.
In the above example data, that gnuplot command provides three different colors instead of the two I'm looking for.
This is probably close to what you want:
set palette model RGB defined ( 0 'red', 1 'green' )
plot[0:5][0:6] "file.dat" u 1:2:( $3 == 0 ? 0 : 1 ) with points palette
You could go one step further and remove the "noise":
unset key
unset colorbox
plot[0:5][0:6] "file.dat" u 1:2:( $3 == 0 ? 0 : 1 ) with points pt 7 ps 3 palette
if only the differentiation between zero and non-zero matters.
You can adjust the palette by
set palette defined (-0.1 "blue", 0 "red", 0.1 "blue")

Multi-parameter matching + weighted random pick in Redis

Let's say I have a set of objects with properties:
Object Quantity Color Shape Kind
----------------------------------------
APPLE 12 RED ROUND FRUIT
APPLE 3 GREEN ROUND FRUIT
ORANGE 6 ORANGE ROUND FRUIT
CARROT 0 RED CONICAL VEGETABLE
RADISH 24 RED ROUND VEGETABLE
Object and all properties except quantity are represented as strings. Quantity is a number.
I must compose a random list of objects, based on user's query.
Query contains values for all string properties (that is, all properties except quantity).
Value in query may be either exact property value, or a wildcard (meaning "any value would do for this property"), or a negation — "NOT this exact property value".
Query result is an object, picked by weighted random from all object with matching properties. Weight for the random pick is the quantity.
For example:
Query -> Probabilities -> Example
random result
-----------------------------------------------------------------------------
* ROUND FRUIT -> APPLE 12 / APPLE 3 = APPLE 15 -> APPLE
!GREEN ROUND FRUIT -> APPLE 12 / ORANGE 6 -> ORANGE
RED * * -> CARROT 0 / APPLE 12 / RADISH 24
= APPLE 12 / RADISH 24 -> RADISH
RED CONICAL VEGETABLE -> CARROT 0
= (none) -> (none)
For self-education purposes, I would like to build this system using Redis for data storage.
The question is — how to do this elegantly and with least amount of application logic (as opposed to in-Redis operations)? Weights and negation kind of spoil the picture. Otherwise it would be nicely doable with sets.
Any hints are welcome.
Since Redis can only look things up by key and not by value, a good option is to index each property value of an object in a separate Redis set.
For example, when you add the object ...
APPLE 12 RED ROUND FRUIT
you would store it as
hmset obj:1 name apple qty 12 color red shape round kind fruit
and then ...
sadd name:apple obj:1
sadd color:red obj:1
sadd shape:round obj:1
sadd kind:fruit obj:1
This way you have a way to interrogate sets directly and be able to pick the object using a random number based on, for example, the total number of items in the set returned.
Hope that helps. If you need more explanation, hit me up.
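The set-based layout above can be prototyped in memory before wiring it to Redis. The sketch below is an illustration under assumptions, not Redis code: plain Python sets stand in for the color:*, shape:* and kind:* sets, `*` is a wildcard, a leading `!` means negation, and `matching_ids` / `weighted_pick` are hypothetical helper names. In Redis the intersections and differences would correspond to SINTER and SDIFF, while the weighting by quantity here is done in application code.

```python
import random

# Stand-in for the hashes stored with HMSET (object id -> fields)
OBJECTS = {
    "obj:1": {"name": "APPLE", "qty": 12, "color": "RED", "shape": "ROUND", "kind": "FRUIT"},
    "obj:2": {"name": "APPLE", "qty": 3, "color": "GREEN", "shape": "ROUND", "kind": "FRUIT"},
    "obj:3": {"name": "ORANGE", "qty": 6, "color": "ORANGE", "shape": "ROUND", "kind": "FRUIT"},
    "obj:4": {"name": "CARROT", "qty": 0, "color": "RED", "shape": "CONICAL", "kind": "VEGETABLE"},
    "obj:5": {"name": "RADISH", "qty": 24, "color": "RED", "shape": "ROUND", "kind": "VEGETABLE"},
}

# Stand-in for the index sets built with SADD, e.g. INDEX["color:RED"]
INDEX = {}
for oid, fields in OBJECTS.items():
    for prop in ("color", "shape", "kind"):
        INDEX.setdefault(f"{prop}:{fields[prop]}", set()).add(oid)

def matching_ids(color, shape, kind):
    """Apply exact, wildcard ('*') and negated ('!VALUE') terms to the index sets."""
    ids = set(OBJECTS)
    for prop, term in (("color", color), ("shape", shape), ("kind", kind)):
        if term == "*":
            continue
        if term.startswith("!"):
            ids -= INDEX.get(f"{prop}:{term[1:]}", set())
        else:
            ids &= INDEX.get(f"{prop}:{term}", set())
    return ids

def weighted_pick(ids, rng=random):
    """Pick one object name with probability proportional to its quantity."""
    candidates = [i for i in sorted(ids) if OBJECTS[i]["qty"] > 0]
    if not candidates:
        return None  # nothing in stock matches the query
    weights = [OBJECTS[i]["qty"] for i in candidates]
    return OBJECTS[rng.choices(candidates, weights=weights, k=1)[0]]["name"]

print(weighted_pick(matching_ids("!GREEN", "ROUND", "FRUIT")))   # APPLE or ORANGE
print(weighted_pick(matching_ids("RED", "CONICAL", "VEGETABLE")))  # None
```

Objects with quantity 0 are dropped before the draw, which matches the CARROT example in the question.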

LINQ Query for filter by Selected Items in Checkbox List

Could not find this through Google or in SO questions...
I have a checkbox listbox on my form. I want to filter my List by the list of selected Ids from that listbox that are checked, in SQL I would have done this like "Where TypeId In (1, 4, 5, 7)"... how do I do that in LINQ?
I feel like I am missing a really obvious answer, but cannot get it.
For argument's sake, here is what I have for sample data:
In Colors (List<of currentColors>)
ID, Name, TypeId
1, Red, 1
2, Blue, 1
3, Green, 2
4, Pink, 3
Selected Types 2 and 3 in CheckboxList: filteredColors
filteredResults = (From C In WorkItemMonitor Where ????).ToList()
Expected Items in filteredResults would be:
[3, Green, 2], [4, Pink, 3]
EDIT:
My current query (sorry, I was told it would be a list, but it turns out to be a DataTable I am filtering):
Dim workItemsListing As DataTable
workItemsListing = (From L In WorkItemMonitor.AsEnumerable() _
Where clbStatus.CheckedItems.Contains(L.Item("CurrentStatusId"))).CopyToDataTable()
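' Assumes the checked items are CurrentColor objects that expose a TypeId property
Dim selectedTypeIds = chkListCurrentColors.CheckedItems.Cast(Of CurrentColor)().Select(Function(c) c.TypeId).ToList()
filteredResults = (From C In WorkItemMonitor Where selectedTypeIds.Contains(C.TypeId)).ToList()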
That's about the best I can do with your description. If you need more help, you'll need to show what you add to the CheckedListBox and the type of your colors.