Does anyone know using Optaplanner what the rule would be to evenly spread out the course scheduling?
eg:
I am trying to input 15 courses - some of which need to be in a specific order, then spread these 15 course as evenly as possible over the input time period.
If this type of scenario is documented somewhere, please just forward with the link, I really can't find any eaxmples
TIA
Phil
In the docs, look for fairness/load balancing.
The squared trick also works for spreads, for example optimal schedule:
exam A on 1-FEB
exam B on 4-FEB => 3 days between A and B => -3² = - 9
exam C on 7-FEB => 3 days between B and C => -3² = - 9
Total: - 18
Non optimal schedule:
exam A on 1-FEB
exam B on 3-FEB => 2 days between A and B => -2² = - 4
exam C on 7-FEB => 4 days between B and C => -4² = - 16
Total: - 20 (so worse than -18, which is what we want)
To combine this weighting with other constraints, see the tennis example and read my blog post a couple of times.
Related
I have a column using bits to record status of every mission. The index of bits represents the number of mission while 1/0 indicates if this mission is successful and all bits are logically isolated although they are put together.
For instance: 1010 is stored in decimal means a user finished the 2nd and 4th mission successfully and the table looks like:
uid status
a 1100
b 1111
c 1001
d 0100
e 0011
Now I need to calculate: for every mission, how many users passed this mission. E.g.: for mission1: it's 0+1+1+0+1 = 5 while for mission2, it's 0+1+0+0+1 = 2.
I can use a formula FLOOR(status%POWER(10,n)/POWER(10,n-1)) to get the bit of every mission of every user, but actually this means I need to run my query by n times and now the status is 64-bit long...
Is there any elegant way to do this in one query? Any help is appreciated....
The obvious approach is to normalise your data:
uid mission status
a 1 0
a 2 0
a 3 1
a 4 1
b 1 1
b 2 1
b 3 1
b 4 1
c 1 1
c 2 0
c 3 0
c 4 1
d 1 0
d 2 0
d 3 1
d 4 0
e 1 1
e 2 1
e 3 0
e 4 0
Alternatively, you can store a bitwise integer (or just do what you're currently doing) and process the data in your application code (e.g. a bit of PHP)...
uid status
a 12
b 15
c 9
d 4
e 3
<?php
$input = 15; // value comes from a query
$missions = array(1,2,3,4); // not really necessary in this particular instance
for( $i=0; $i<4; $i++ ) {
$intbit = pow(2,$i);
if( $input & $intbit ) {
echo $missions[$i] . ' ';
}
}
?>
Outputs '1 2 3 4'
Just convert the value to a string, remove the '0's, and calculate the length. Assuming that the value really is a decimal:
select length(replace(cast(status as char), '0', '')) as num_missions as num_missions
from t;
Here is a db<>fiddle using MySQL. Note that the conversion to a string might look a little different in Hive, but the idea is the same.
If it is stored as an integer, you can use the the bin() function to convert an integer to a string. This is supported in both Hive and MySQL (the original tags on the question).
Bit fiddling in databases is usually a bad idea and suggests a poor data model. Your data should have one row per user and mission. Attempts at optimizing by stuffing things into bits may work sometimes in some programming languages, but rarely in SQL.
I would like to count how many people appears during one hour while creating unique package heavier than 5kg.
For example:
X is gathering items into one package. At 4/30/2020 11:10 he picked item with weight 3kg into Y and at 11:30 he picked item with weight 4 kg into Y. That gives us one package Y over 5kg. Then he picked 4kg into new package Z at 11:45 and 2kg at 12:05 -> another package over 5kg. Now I would expect to see:
ID - Packages heavier than 5kg during 1 hour
X - 2
Q - 4
U - 4
SQL sentence that I've used below. I don't know how to count based on mentioned conditions.
SELECT timestamp,
ID,
package
FROM table
GROUP by timestamp,id,package
HAVING SUM(weight) >= 5000
I have a dataframe that looks like this, where the "Date" is set as the index
A B C D E
Date
1999-01-01 1 2 3 4 5
1999-01-02 1 2 3 4 5
1999-01-03 1 2 3 4 5
1999-01-04 1 2 3 4 5
I'm trying to compare the percent difference between two pairs of dates. I think I can do the first bit:
start_1 = "1999-01-02"
end_1 = "1999-01-03"
start_2 = "1999-01-03"
end_2 = "1999-01-04"
Obs_1 = df.loc[end_1] / df.loc[start_1] -1
Obs_2 = df.loc[end_2] / df.loc[start_2] -1
The output I get from - eg Obs_1 looks like this:
A 0.011197
B 0.007933
C 0.012850
D 0.016678
E 0.007330
dtype: float64
I'm looking to build some correlations between Obs_1 and Obs_2. I think I need to create a new dataframe with the labels A-E as one column (or as the index), and then the data series from Obs_1 and Obs_2 as adjacent columns.
But I'm struggling! I can't 'see' what Obs_1 and Obs_2 'are' - have I created a list? A series? How can I tell? What would be the best way of combining the two into a single dataframe...say df_1.
I'm sure the answer is staring me in the face but I'm going mental trying to figure it out...and because I'm not quite sure what Obs_1 and Obs_2 'are', it's hard to search the SO archive to help me.
Thanks in advance
This seems to be a 2 step problem I'm trying to solve.
Let's say we have N records, and we are trying to distribute as evenly as possible into K groups.
The second problem - each group in K can only accept an M amount of records.
For example, if we have 5 records, and 3 groups, then we would distribute 2 into Group K1, 2 into Group K2 and 1 record into Group K3. However, if say in group 1, it only accepts at most 1 record. Then the arrangement would need to be 1 into Group K1, 2 into Group K2, and 2 into Group K3.
I'm not necessary after the solution but what algorithm I might need to use to solve this? Apparently for the distribution, I need to use the Greedy algorithm? But for the second step, this seems to be a bit more complicated
Edit:
The example I'm looking at is:
Number of records: 23
Groups: 10
Max records for each group
G1 = 4
G2 = 1
G3 = 0
G4 = 5
G5 = 0
G6 = 0
G7 = 2
G8 = 4
G9 = 2
G10 = 2
if N=12 and K=3 then in normal situation,you just split it V=12/3=4 for each group. but since you have M limitation, and for example K3 can only accept 1 then the distribution can be 6-5-1 which is not evenly distributed.
So i guess you need to sort K based on the M limitation, so for the example above the groups order become K3-K1-K2.
then if the distributed value V is bigger than the accepted amount M for that group, you need to take the remainder and distribute it again to the remaining group (K3=1, then 4-1=3 must be distributed to K1 and K2).
the implementation might be complicated, i hope you can find more simple solution for this
From what I understood, you need to separate all groups which allows a fixed number of values first and then equally distribute records among remaining groups. Let's take an example, let's say we have 15 records which needs to be distributed among 5 groups (G1, G2, G3, G4 and G5). Also let's assume that G2 and G4 allows max records of 2 and 4 respectively. Now algorithm should go like this:
Get average(ceiling integer) of records based on number of groups (In this example we'll get 3).
Add all max allowed records which are smaller than our average (In this example it's G2 only who's max limit(i.e. 2) is less than our average hence the number comes as 2).
Now subtract our number from step 2 from total records and also subtract the number of groups involved in step 2 from total groups. (remaining total records: 13, remaining total groups 4).
Get the new average(ceiling integer) using remaining records and groups. (New average 4).
Get average (Integer) (i.e. 3) and allot equal number of records to remaining groups - 1.
Get Mod (i.e. 1) and allot that number to the last group.
Now what we finally will have here:
G1(No limit): 4
G2(Limit 2): 2
G3(No limit): 4
G4(Limit 4): 4
G5(No limit): 1
Let me know if you think that this algo might fail for some scenarios.
Formula to get ceiling integer average
floor((#total_records + #total_groups-1) / #total_groups)
I have two tables (sql server), as shown below:
locations
id cubicfeet order
-------------------------------------
1 5 1
2 10 1
3 6 1
items
id cubic feet order
--------------------------------------
1 6 1
2 6 1
3 6 1
I need a query to tell me if all the items will fit into all the locations (for a given order). If all items will not fit into 1 or all locations then I need to create a new location for that given order - and then move any items that DID fit into the locations before to the new location (as many as fit). The new location will only be given a certain amount of cubic feet also - say 17. In this example, sum won't work because all 3 records are 6 so the sum is 18, which is less than the sum of 5,10,6, but the location with volume 5 can't fit any of the items since they are all volume 6 cubic feet.
the only way I think I can do it is creating temp tables in my sp and using a while loop to go through them and update the locations 1 at a time to see if it still fits more...