Fair Allocation to Buckets with Constraints - optimization

I have a vector of n prices (p1, p2, ..., pn).
I want to allocate these into 3 buckets A, B, C such that the averages of the prices in the buckets are as close to each other as possible. The constraint is that each bucket must hold a different, prescribed number of prices.
Using R, I've tried to minimise the squared difference between the prices in each bucket, but I'm not getting the correct results.
Am I approaching this correctly?
So the way I approached it was I had a vector:
x<-c(50,50,50,10,20,40,50,20,4,40,20)
And I want to minimise

A <- | (a1*x[1] + a2*x[2] + ... + an*x[n]) - nA * mean(x) |^2
B <- | (b1*x[1] + b2*x[2] + ... + bn*x[n]) - nB * mean(x) |^2
C <- | (c1*x[1] + c2*x[2] + ... + cn*x[n]) - nC * mean(x) |^2

to get the squared difference of each bucket total from its share of the average price. Each coefficient a_i, b_i, c_i is either 0 or 1, so that each bucket is either allocated or not allocated that specific price. The constraints would be the allocation counts nA, nB, nC for A, B, C, say 10 to A, 5 to B and 12 to C.
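For a vector this small the assignment can even be brute-forced. A minimal sketch in Python (my illustration, not the original R attempt; the bucket sizes are example values chosen to sum to the 11 prices, and the objective here is the squared deviation of each bucket mean from the overall mean):

from itertools import product

x = [50, 50, 50, 10, 20, 40, 50, 20, 4, 40, 20]
sizes = (4, 3, 4)                    # required count per bucket; must sum to len(x)
overall_mean = sum(x) / len(x)

best_cost, best_assign = float("inf"), None
for assign in product(range(3), repeat=len(x)):      # one bucket index per price
    if tuple(assign.count(b) for b in range(3)) != sizes:
        continue                                     # enforce the bucket-size constraint
    cost = 0.0
    for b in range(3):
        bucket = [v for v, a in zip(x, assign) if a == b]
        cost += (sum(bucket) / len(bucket) - overall_mean) ** 2
    if cost < best_cost:
        best_cost, best_assign = cost, assign

print(best_assign, best_cost)

For larger n, the same model can be handed to an integer-programming solver instead of enumerated.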

Related

How to find Lagrange Multiplier word problem constraint and objective function?

What would the objective and constraint be in this word problem?
A company manufactures x units of one item and y units of another. The total cost in dollars, C, of producing these two items is approximated by the function C=6x^2+3xy+7y^2+900.
If the production quota for the total number of items (both types combined) is 220, find the minimum production cost.
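A sketch of the standard Lagrange-multiplier setup for this problem (my own working, not part of the original thread):

% objective and constraint
C(x, y) = 6x^2 + 3xy + 7y^2 + 900, \qquad g(x, y) = x + y - 220 = 0

% first-order conditions \nabla C = \lambda \nabla g
12x + 3y = \lambda, \qquad 3x + 14y = \lambda \;\Rightarrow\; 9x = 11y

% substitute into the constraint
x + y = \tfrac{11y}{9} + y = 220 \;\Rightarrow\; y = 99, \; x = 121, \qquad C(121, 99) = 193290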

Optimizing billboard placement

I have a table of 18,000 billboards with an ID, latitude, longitude, and Impacts (the number of people that see the billboard in a month).
ID    Latitude    Longitude    Impacts
1     107.45      92.45        200,000
2     102.67      96.67        180,000
3     105.12      94.23        160,000
4     106.42      91.87        220,000
5     109.89      93.56        240,000
The idea is that I want to build a model that maximizes the total impacts while keeping a minimum distance between each pair of selected billboards, for a number of billboards chosen by the user.
I can build a matrix with the straight-line distances from each billboard to all the others. So I have the value I want to maximize, which is the impacts; a distance matrix between billboards, which feeds one constraint; and the number of billboards to select, which is another constraint.
Does anyone know a linear programming model that I could implement for this specific case?
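One way to phrase this is as a 0/1 integer program: maximize the impacts of the selected billboards, select exactly k of them, and forbid any pair closer than the minimum distance. A minimal sketch in Python with PuLP (my illustration; the data are the five sample rows above, k and min_dist are made-up parameters, and plain Euclidean distance stands in for a proper geographic distance):

import math
import pulp

billboards = {             # id: (latitude, longitude, impacts)
    1: (107.45, 92.45, 200_000),
    2: (102.67, 96.67, 180_000),
    3: (105.12, 94.23, 160_000),
    4: (106.42, 91.87, 220_000),
    5: (109.89, 93.56, 240_000),
}
k = 2                      # number of billboards chosen by the user
min_dist = 3.0             # minimum allowed distance between selections

def dist(i, j):
    (xi, yi, _), (xj, yj, _) = billboards[i], billboards[j]
    return math.hypot(xi - xj, yi - yj)

prob = pulp.LpProblem("billboards", pulp.LpMaximize)
x = {i: pulp.LpVariable(f"x_{i}", cat="Binary") for i in billboards}

prob += pulp.lpSum(billboards[i][2] * x[i] for i in billboards)  # total impacts
prob += pulp.lpSum(x.values()) == k                              # pick exactly k
for i in billboards:
    for j in billboards:
        if i < j and dist(i, j) < min_dist:
            prob += x[i] + x[j] <= 1    # conflicting pair: at most one of the two

prob.solve()
print([i for i in billboards if x[i].value() == 1])

With 18,000 billboards the number of pairwise constraints can get large, so in practice you would only generate constraints for pairs that actually violate the distance threshold, exactly as the loop above does.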

Performing a sparse sum on Mathematica

I want to evaluate a sum in Mathematica of the form
g[[i,j,k,l,m,n]] * g[[o,p,q,r,s,t]] * (complicated function of the indices)
But all these indices range from 0 to 3, so the total number of cases to sum over is 4^12, which will take an unforgiving amount of time. However, barely any elements of the array g[[i,j,k,l,m,n]] are nonzero -- there are probably around 8 nonzero entries -- so I would like to restrict the sum over {i,j,k,l,m,n,o,p,q,r,s,t} to precisely those combinations of indices for which both factors of g are nonzero.
I can't find a way to do this for summation over multiple indices, where the allowed index choices are particular combinations of {i,j,k,l,m,n} as opposed to specific values of each particular index. Any help appreciated!
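The restriction idea itself is language-independent: extract the nonzero positions of g once, then sum over pairs of those positions only. A sketch of that structure in Python/NumPy (placeholder data and a stand-in index function; in Mathematica the nonzero positions could be obtained once, e.g. from SparseArray):

import numpy as np

rng = np.random.default_rng(0)
g = np.zeros((4,) * 6)                    # 4^6 entries, almost all zero
for _ in range(8):                        # roughly 8 nonzero entries, as in the post
    g[tuple(rng.integers(0, 4, size=6))] = rng.normal()

def f(idx1, idx2):
    return sum(idx1) - sum(idx2)          # stand-in for the complicated function

nonzero = list(zip(*np.nonzero(g)))       # list of 6-tuples where g != 0
total = sum(g[a] * g[b] * f(a, b) for a in nonzero for b in nonzero)
print(total)                              # ~64 terms instead of 4^12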

Allocation via SQL - Retaining repeating decimals for the sum()

I am allocating a single unit across multiple rows using a calculation and storing the results in a table. I then sum() the allocations, and the sums come out as numbers that are not whole. What is going on is that some of the allocations end up as numbers with repeating decimals, and the sum of those does not add back up to the whole number (à la 1/3 + 1/3 + 1/3 != 1).
I have tried casting the numbers to different formats; however, Athena keeps rounding the decimals at some arbitrary precision, which leaves the problem in place.
I would like the sum of the allocations to equal the sum of the original units.
My database is AWS Athena, which I understand uses the Presto SQL dialect.
Example of my allocation:
case
    when count_of_visits = 1 then 1
    when count_of_visits = 2 then 0.5
    when count_of_visits >= 3 then
        case
            when visit_seq_number = min_visit_seq_number then 0.4
            when visit_seq_number = max_visit_seq_number then 0.4
            else 0.2 / (count_of_visits - 2)
        end
    else 0
end as u_shp_alloc_leads
In this allocation, the first and last visits each get 40% of the unit and all the visits in between split the remaining 20%.
A unit that is allocated across 29 visits ends up dividing that 20% by 27, which equals 0.00740740... repeating. The table stores 0.007407407407407408, so when I go to sum the numbers the result is 1.0000000000000004. I would like the result to be 1.
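The effect is easy to reproduce outside Athena; a quick check in Python (my illustration, not from the post):

shares = [0.4, 0.4] + [0.2 / 27] * 27    # 29 visits: first, last, 27 middle shares
print(sum(shares))                       # slightly above 1.0, not exactly 1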
This is a limitation of finite-precision arithmetic in databases, and in computers in general. When you work with fractions like these, some sort of rounding will always take place.
I would apply a reasonable degree of rounding, to the n-th decimal, on the sums you retrieve from your table; that will simply cut off these residual digits at the end.
If that's not sufficient for you, something you can do to at least theoretically keep full precision is to store the numerator and denominator separately in two columns. When computing sum(numerator_column / denominator_column) you would see the same rounding effects, so summing up the numbers has to be a little more involved:
SELECT sum(CAST(numerator_sum AS double) / denominator)   -- one division per distinct denominator
FROM (
    SELECT
        denominator,
        sum(numerator) AS numerator_sum                   -- exact integer arithmetic
    FROM your_allocation_table
    GROUP BY denominator
) t
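To make that concrete with the 29-visit case from the question: store 2/5 for the first and last visits and 1/135 (= 0.2/27) for each of the 27 middle visits. Grouping by denominator then sums to 4/5 + 27/135 = 4/5 + 1/5 = 1, and only two divisions happen instead of 29, so far less rounding error can accumulate.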

Fast way of multiplying two 1-D arrays

I have the following data:
A = [a0 a1 a2 a3 a4 a5 .... a24]
B = [b0 b1 b2 b3 b4 b5 .... b24]
which I then want to multiply element-wise as follows:
C = A * B' = [a0b0 a1b1 a2b2 ... a24b24]
This clearly involves 25 multiplies.
However, in my scenario, only 5 new values are shifted into A per "loop iteration" (and 5 old values are shifted out of A). Is there any fast way to exploit the fact that data is shifting through A rather than being completely new? Ideally I want to minimize the number of multiplication operations (at a cost of perhaps more additions/subtractions/accumulations). I initially thought a systolic array might help, but it doesn't (I think!?)
Update 1: Note B is fixed for long periods, but can be reprogrammed.
Update 2: the shifting of A works like this: a[24] <= a[19], a[23] <= a[18], ..., a[1] <= new01, a[0] <= new00, and so on and so forth each clock cycle.
Many thanks!
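A minimal reference model of the setup described in Update 2 (in Python, my reading of the post): five new samples enter A each clock, everything else shifts up by five positions, and C is the element-wise product of A with the fixed B.

N, S = 25, 5
A = [0.0] * N
B = [float(i % 7) for i in range(N)]       # placeholder coefficients

def clock(new5):
    A[S:] = A[:-S]                         # a[24] <= a[19], ..., a[5] <= a[0]
    A[:S] = new5                           # a[4..0] <= the five new samples
    return [a * b for a, b in zip(A, B)]   # 25 multiplies per clock

C = clock([1.0, 2.0, 3.0, 4.0, 5.0])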
Is there any fast way to exploit the fact that data is shifting through A rather than being completely new?
Even though all you're doing is the shifting and adding new elements to A, the products in C will, in general, all be different since one of the operands will generally change after each iteration. If you have additional information about the way the elements of A or B are structured, you could potentially use that structure to reduce the number of multiplications. Barring any such structural considerations, you will have to compute all 25 products each loop.
Ideally I want to minimize the number of multiplication operations (at a cost of perhaps more additions/subtractions/accumulations).
In theory, you can reduce the number of multiplications to 0 by shifting and adding the array elements to simulate multiplication. In practice, this will be slower than a hardware multiplication so you're better off just using any available hardware-based multiplication unless there's some additional, relevant constraint you haven't mentioned.
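As one concrete instance of such structure (my example, not from the thread): if B happened to be periodic with period 5, i.e. b[i+5] == b[i], then a sample's product never changes as it shifts, so the cached products can shift along with A and only the five fresh samples need multiplying, 5 multiplies per clock instead of 25:

N, S = 25, 5
B = [1.5, -2.0, 0.25, 3.0, 0.5] * 5        # assumed periodic coefficients
A = [0.0] * N
C = [0.0] * N

def clock(new5):
    A[S:] = A[:-S]                         # shift the samples...
    C[S:] = C[:-S]                         # ...and their cached products together
    A[:S] = new5
    C[:S] = [a * b for a, b in zip(new5, B[:S])]  # only 5 new multiplies
    return C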
On the very first 5 data sets you could save up to 50 multiplications, but after that it's a flat road of multiplications, since for every set after the first 5 you need to multiply with a new set of data.
I'll assume all the arrays are initialized to zero.
I don't think those 50 saved multiplications are of much use considering the amount of multiplication on the whole, but I'll still give you a hint on how to save them; maybe you can find an extension to it?
1st data set arrives: multiply the first data set in A with each of the entries in B, save all the products in A, and copy only a[0] to a[4] to C. 25 multiplications here.
2nd data set arrives: multiply only a[0] to a[4] (holding the new data) with b[0] to b[4] respectively, save in a[0] to a[4], and copy a[0] to a[9] to C. 5 multiplications here.
3rd data set arrives: multiply a[0] to a[9] with b[0] to b[9] this time, and copy the corresponding a[0] to a[14] to C. 10 multiplications here.
4th data set: multiply a[0] to a[14] with the corresponding entries of B, and copy the corresponding a[0] to a[19] to C. 15 multiplications here.
5th data set: multiply a[0] to a[19] with the corresponding entries of B, and copy the corresponding a[0] to a[24] to C. 20 multiplications here.
Total saved multiplications: 50.
6th data set: the usual 25 multiplications, and the same from here on, because each block of A now holds a new data set, so the multiplications are unavoidable.
Could you add another array D to flag the changed/unchanged values in A? Each iteration, you would check this array to decide whether to do new multiplications or not.