In linear programming we have:
The maximum number of corner points for a problem with m constraints and n variables is C(n+m, n), i.e. a combination of the number of constraints plus the number of variables, taken the number of variables at a time.
Why is this the case? I have no idea why this is true.
Define:
m = number of rows = number of logical variables (slacks)
n = number of columns = number of structural variables
so the total number of variables is n+m
Further, we have:
number of basic variables = m (solved by linear algebra)
number of non-basic variables = n (temporarily fixed, usually at 0)
The total number of corner points is equal to the number of ways we can choose m basic variables out of n+m total variables.
But we have:
n+m choose m = n+m choose n
Note that in general many of these bases are infeasible.
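To see the counting concretely, here is a minimal sketch (the two-constraint problem is made up for illustration) that enumerates all C(n+m, m) candidate bases of a small problem in standard form, solves each basic system, and counts how many are actually feasible:

import itertools
import numpy as np

# Hypothetical problem with m = 2 constraints and n = 2 structural variables:
#   x1 +   x2 <= 4
#   x1 + 3*x2 <= 6,   x1, x2 >= 0
A = np.array([[1.0, 1.0, 1.0, 0.0],    # columns: [structural | slack], shape m x (n+m)
              [1.0, 3.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
m, total = A.shape                      # m = 2, n + m = 4

bases = list(itertools.combinations(range(total), m))
print(len(bases))                       # C(n+m, m) = C(4, 2) = 6 candidate corner points

feasible = 0
for cols in bases:
    B = A[:, cols]
    if abs(np.linalg.det(B)) < 1e-12:   # singular basis matrix: no basic solution
        continue
    x_B = np.linalg.solve(B, b)         # solve for the m basic variables
    if np.all(x_B >= -1e-9):            # feasible only if every basic variable is >= 0
        feasible += 1
print(feasible)                         # 4 of the 6 bases are feasible here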
I have a database of N billboards giving the IDs of all the people that saw each billboard. I need to find the k billboards that have been seen by the largest number of unique people across the k billboards.
As an example:
I have N = 3 billboards: billboard 1 was seen by persons 'a', 'b', and 'c'; billboard 2 was seen by person 'b'; and billboard 3 was seen by persons 'c' and 'd'.
k = 2
The solution is billboards 1 & 3, which together were seen by four people ('a', 'b', 'c' and 'd')
So each billboard represents a set of values, and I need to find the k billboards from the N available that have the highest number of unique values.
I can't do this with brute force because of the huge number of potential combinations (>10K billboards in my database). Is there an algorithm that can more quickly find an optimal or near-optimal solution? Speed here is more important than getting the answer exactly right.
Preferably I would also like to be able to constrain the algorithm so that the sum of the costs of the selected billboards stays below a certain value, though this isn't strictly required.
I'm thinking this is similar to some of the combinatorial optimisation problems described here, in particular the knapsack problem here, except that these problems are working with sets of numbers rather than sets of sets. My maths skills are sketchy so I haven't been able to work out whether I could modify these equations to suit my needs.
Thank you
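This is essentially the maximum coverage problem (NP-hard in general), but the standard greedy heuristic of repeatedly picking the billboard that adds the most not-yet-covered viewers is fast and is guaranteed to cover at least a (1 - 1/e) fraction of what the optimum covers. A minimal sketch, assuming the data is a dict mapping billboard IDs to sets of viewer IDs; a budget on total cost could be handled by a similar greedy that ranks billboards by new viewers per unit of cost while the budget allows:

def greedy_billboards(viewers_by_billboard, k):
    """viewers_by_billboard: dict mapping billboard ID -> set of person IDs."""
    remaining = dict(viewers_by_billboard)   # copy so the caller's dict is untouched
    covered = set()
    chosen = []
    for _ in range(min(k, len(remaining))):
        # pick the billboard adding the most not-yet-covered viewers
        best = max(remaining, key=lambda b: len(remaining[b] - covered))
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

billboards = {1: {'a', 'b', 'c'}, 2: {'b'}, 3: {'c', 'd'}}
print(greedy_billboards(billboards, k=2))    # billboards [1, 3], covering {'a', 'b', 'c', 'd'}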
I've been using the PuLP library for a side project (daily fantasy sports) where I optimize the projected value of a lineup based on a series of constraints.
I've implemented most of them, but one constraint is that players must come from at least three separate teams.
This paper has an implementation (page 18, 4.2), which I've attached as an image:
It seems that they somehow derive an indicator variable for each team that's one if a given team has at least one player in the lineup, and then it constrains the sum of those indicators to be greater than or equal to 3.
Does anybody know how this would be implemented in PuLP?
Similar examples would also be helpful.
Any assistance would be super appreciated!
In this case you would define a binary variable t that is bounded above by the x variables for each team, so it can only be 1 when that team contributes at least one player to the lineup. In Python I don't like to name variables with a single letter, but as I have nothing else to go on, here is how I would do it in PuLP.
Assume that the variables lineups, players, players_by_team and teams are set somewhere else:

x_index = [(i, p) for i in lineups for p in players]
t_index = [(i, l) for i in lineups for l in teams]
x = LpVariable.dicts("x", x_index, lowBound=0)
t = LpVariable.dicts("t", t_index, cat=LpBinary)
for i in lineups:
    for l in teams:
        # t[i, l] can only be 1 if lineup i uses at least one player from team l
        prob += t[i, l] <= lpSum([x[i, k] for k in players_by_team[l]])
    # every lineup must draw players from at least 3 different teams
    prob += lpSum([t[i, l] for l in teams]) >= 3
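For a self-contained sketch of the same indicator-variable pattern with a single lineup (the player pool, projections, and roster size below are all made up for illustration):

from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

players = {"p1": ("A", 10), "p2": ("A", 9), "p3": ("B", 8),
           "p4": ("B", 7), "p5": ("C", 6), "p6": ("D", 5)}      # name -> (team, projection)
teams = {"A", "B", "C", "D"}
roster_size = 4

prob = LpProblem("lineup", LpMaximize)
x = LpVariable.dicts("x", players, cat=LpBinary)                # 1 if the player is in the lineup
t = LpVariable.dicts("t", teams, cat=LpBinary)                  # 1 if the team contributes a player

prob += lpSum(proj * x[p] for p, (_, proj) in players.items())  # maximize projected value
prob += lpSum(x[p] for p in players) == roster_size             # fixed lineup size

for team in teams:
    # t[team] can only be 1 if at least one selected player belongs to that team
    prob += t[team] <= lpSum(x[p] for p, (tm, _) in players.items() if tm == team)
prob += lpSum(t[team] for team in teams) >= 3                   # players from at least 3 teams

prob.solve()
print([p for p in players if x[p].value() == 1])                # e.g. ['p1', 'p2', 'p3', 'p5']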
What is the denominator in the average rating of a user in adjusted cosine similarity (item-based collaborative filtering)?
Is it the number of all items in the system, or just the number of items rated by the user?
And: is there a function in MATLAB for adjusted cosine?
Thanks
Question 1: Is it the number of all items in the system, or just the number of items rated by the user?
Answer 1: Neither.
If you see this formula:
In the denominator you take, for each user u, the rating of user u for item i minus that user's average (mean) rating, square this difference, sum it over the users, and take the square root; then you do the same thing for item j, and finally multiply the two square roots together.
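Written out in plain notation, the standard adjusted cosine similarity (whose denominator is what the description above walks through) is:

sim(i, j) = sum_u (R[u,i] - mean(R[u])) * (R[u,j] - mean(R[u]))
            / ( sqrt(sum_u (R[u,i] - mean(R[u]))^2) * sqrt(sum_u (R[u,j] - mean(R[u]))^2) )

where the sums run over the users u who rated both item i and item j, and mean(R[u]) is user u's average rating.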
Question 2: Is there a function in MATLAB for adjusted cosine?
Answer 2: By default, no. But it should be relatively easy to write it yourself given that you have the formula.
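The question asked about MATLAB, but just to illustrate how short the implementation is, here is a sketch in Python/NumPy (the ratings matrix and the use of NaN for unrated items are assumptions; the same structure translates directly to MATLAB):

import numpy as np

def adjusted_cosine(R, i, j):
    """Adjusted cosine similarity between item columns i and j of a users-x-items matrix R."""
    user_mean = np.nanmean(R, axis=1)                 # each user's mean over the items they rated
    both = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])    # users who rated both items
    di = R[both, i] - user_mean[both]                 # centred ratings for item i
    dj = R[both, j] - user_mean[both]                 # centred ratings for item j
    return np.sum(di * dj) / (np.sqrt(np.sum(di ** 2)) * np.sqrt(np.sum(dj ** 2)))

R = np.array([[5.0, 3.0, np.nan],
              [4.0, np.nan, 4.0],
              [2.0, 1.0, 3.0]])
print(adjusted_cosine(R, 0, 1))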
I need a simple way to randomly select a letter from the alphabet, weighted by the percentage of the time I want it to come up. For example, I want the letter 'E' to come up in the random function 5.9% of the time, but I only want 'Z' to come up 0.3% of the time (and so on, based on the average occurrence of each letter in the alphabet). Any suggestions? The only way I see is to populate an array with, say, 10000 letters (590 'E's, 30 'Z's, and so on) and then randomly select a letter from that array, but it seems memory intensive and clumsy.
Not sure if this would work, but it seems like it might do the trick:
1. Take your list of letters and frequencies and sort them from smallest frequency to largest.
2. Create a 26-element array where element n contains the running total: the sum of all previous weights plus the weight of element n from the frequency list. Make a note of the sum in the last element of the array.
3. Generate a random number between 0 and the sum you noted above.
4. Do a binary search of the array of sums until you reach the element where that number would fall.
That's a little hard to follow, so it would be something like this:
If you have a 5-letter alphabet with these frequencies (a = 5%, b = 20%, c = 10%, d = 40%, e = 25%), sort them by frequency: a, c, b, e, d.
Keep a running sum of the elements: 5, 15, 35, 60, 100
Generate a random number between 0 and 100. Say it came out 22.
Do a binary search for the element where 22 would fall. In this case it would be between element 2 and 3, which would be the letter "b" (rounding up is what you want here, I think)
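A minimal sketch of this running-sum plus binary search approach in Python, using the 5-letter example above (the bisect module does the binary search):

import bisect
import random

letters = ['a', 'c', 'b', 'e', 'd']                    # sorted by frequency, as in the example
weights = [5, 10, 20, 25, 40]
cumulative = []
total = 0
for w in weights:
    total += w
    cumulative.append(total)                           # 5, 15, 35, 60, 100

def weighted_letter():
    r = random.uniform(0, total)                       # random number in [0, total]
    return letters[bisect.bisect_left(cumulative, r)]  # binary search for the containing interval

print(weighted_letter())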
You've already acknowledged the tradeoff between space and speed, so I won't get into that.
If you can calculate the frequency of each letter a priori, then you can pre-generate an array (or dynamically create and fill an array once) to scale up with your desired level of precision.
Since you used percentages with a single digit of precision after the decimal point, then consider an array of 1000 entries. Each index represents one tenth of one percent of frequency. So you'd have letter[0] to letter[82] equal to 'a', letter[83] to letter[97] equal to 'b', and so on up until letter[999] equal to 'z'. (Values according to Relative frequencies of letters in the English language)
Now generate a random number between 0 and 1 (using whatever favourite PRNG you have, assuming uniform distribution) and multiply the result by 1000. That gives you the index into your array, and your weighted-random letter.
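A sketch of that precomputed-table idea ('a' and 'b' use roughly the proportions quoted above; the remaining entries are lumped into 'z' purely to keep the example short):

import random

counts = {'a': 82, 'b': 15}                  # entries out of 1000, i.e. one per 0.1% of frequency
counts['z'] = 1000 - sum(counts.values())    # pad so the table has exactly 1000 entries

table = []
for letter, count in counts.items():
    table.extend(letter * count)             # append `count` copies of the letter

index = int(random.random() * 1000)          # uniform random index into the table
print(table[index])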
Use the method explained here. Alas this is for Python but could be rewritten for C etc.
https://stackoverflow.com/a/4113400/129202
First you need to make an NSDictionary of the letters and their frequencies.
I'll explain it with an example:
let's say your dictionary is something like this:
{@"a": @0.2, @"b": @0.5, @"c": @0.3}
So the frequencies of your letters cover the interval [0, 1] this way:
a -> [0, 0.2], b -> [0.2, 0.7], c -> [0.7, 1]
You generate a random number between 0 and 1. Then, by checking which interval this random number falls into and returning the corresponding letter, you get what you want.
You seed the random function at the beginning of your program: srand48(time(0));
-(NSString *)weightedRandomForDicLetters:(NSDictionary *)letterFreq
{
    double randomNumber = drand48();
    double endOfInterval = 0;
    for (NSString *letter in letterFreq) {
        // widen the interval by this letter's frequency and check whether
        // the random number falls inside it
        endOfInterval += [[letterFreq objectForKey:letter] doubleValue];
        if (randomNumber < endOfInterval) {
            return letter;
        }
    }
    return nil; // only reached if the frequencies sum to less than 1
}