Generate combinations ordered by an attribute - sum

I'm looking for a way to generate combinations of objects ordered by a single attribute. I don't think lexicographical order is what I'm looking for... I'll try to give an example. Let's say I have a list of objects A,B,C,D with the attribute values I want to order by being 3,3,2,1. This gives A3, B3, C2, D1 objects. Now I want to generate combinations of 2 objects, but they need to be ordered in a descending way:
A3 B3
A3 C2
B3 C2
A3 D1
B3 D1
C2 D1
Generating all combinations and sorting them is not acceptable because the real world scenario involves large sets and millions of combinations. (set of 40, order of 8), and I need only combinations above the certain threshold.
Actually I need count of combinations above a threshold grouped by a sum of a given attribute, but I think it is far more difficult to do - so I'd settle for developing all combinations above a threshold and counting them. If that's possible at all.
EDIT - My original question wasn't very precise... I don't actually need these combinations ordered, just thought it would help to isolate combinations above a threshold. To be more precise, in the above example, giving a threshold of 5, I'm looking for an information that the given set produces 1 combination with a sum of 6 ( A3 B3 ) and 2 with a sum of 5 ( A3 C2, B3 C2). I don't actually need the combinations themselves.
I was looking into subset-sum problem, but if I understood correctly given dynamic solution it will only give you information is there a given sum or no, not count of the sums.
Thanks

Actually, I think you do want lexicographic order, but descending rather than ascending. In addition:
It's not clear to me from your description that A, B, ... D play any role in your answer (except possibly as the container for the values).
I think your question example is simply "For each integer at least 5, up to the maximum possible total of two values, how many distinct pairs from the set {3, 3, 2, 1} have sums of that integer?"
The interesting part is the early bailout, once no possible solution can be reached (remaining achievable sums are too small).
I'll post sample code later.
Here's the sample code I promised, with a few remarks following:
public class Combos {
/* permanent state for instance */
private int values[];
private int length;
/* transient state during single "count" computation */
private int n;
private int limit;
private Tally<Integer> tally;
private int best[][]; // used for early-bail-out
private void initializeForCount(int n, int limit) {
this.n = n;
this.limit = limit;
best = new int[n+1][length+1];
for (int i = 1; i <= n; ++i) {
for (int j = 0; j <= length - i; ++j) {
best[i][j] = values[j] + best[i-1][j+1];
}
}
}
private void countAt(int left, int start, int sum) {
if (left == 0) {
tally.inc(sum);
} else {
for (
int i = start;
i <= length - left
&& limit <= sum + best[left][i]; // bail-out-check
++i
) {
countAt(left - 1, i + 1, sum + values[i]);
}
}
}
public Tally<Integer> count(int n, int limit) {
tally = new Tally<Integer>();
if (n <= length) {
initializeForCount(n, limit);
countAt(n, 0, 0);
}
return tally;
}
public Combos(int[] values) {
this.values = values;
this.length = values.length;
}
}
Preface remarks:
This uses a little helper class called Tally, that just isolates the tabulation (including initialization for never-before-seen keys). I'll put it at the end.
To keep this concise, I've taken some shortcuts that aren't good practice for "real" code:
This doesn't check for a null value array, etc.
I assume that the value array is already sorted into descending order, required for the early-bail-out technique. (Good production code would include the sorting.)
I put transient data into instance variables instead of passing them as arguments among the private methods that support count. That makes this class non-thread-safe.
Explanation:
An instance of Combos is created with the (descending ordered) array of integers to combine. The value array is set up once per instance, but multiple calls to count can be made with varying population sizes and limits.
The count method triggers a (mostly) standard recursive traversal of unique combinations of n integers from values. The limit argument gives the lower bound on sums of interest.
The countAt method examines combinations of integers from values. The left argument is how many integers remain to make up n integers in a sum, start is the position in values from which to search, and sum is the partial sum.
The early-bail-out mechanism is based on computing best, a two-dimensional array that specifies the "best" sum reachable from a given state. The value in best[n][p] is the largest sum of n values beginning in position p of the original values.
The recursion of countAt bottoms out when the correct population has been accumulated; this adds the current sum (of n values) to the tally. If countAt has not bottomed out, it sweeps the values from the start-ing position to increase the current partial sum, as long as:
enough positions remain in values to achieve the specified population, and
the best (largest) subtotal remaining is big enough to make the limit.
A sample run with your question's data:
int[] values = {3, 3, 2, 1};
Combos mine = new Combos(values);
Tally<Integer> tally = mine.count(2, 5);
for (int i = 5; i < 9; ++i) {
int n = tally.get(i);
if (0 < n) {
System.out.println("found " + tally.get(i) + " sums of " + i);
}
}
produces the results you specified:
found 2 sums of 5
found 1 sums of 6
Here's the Tally code:
public static class Tally<T> {
private Map<T,Integer> tally = new HashMap<T,Integer>();
public Tally() {/* nothing */}
public void inc(T key) {
Integer value = tally.get(key);
if (value == null) {
value = Integer.valueOf(0);
}
tally.put(key, (value + 1));
}
public int get(T key) {
Integer result = tally.get(key);
return result == null ? 0 : result;
}
public Collection<T> keys() {
return tally.keySet();
}
}

I have written a class to handle common functions for working with the binomial coefficient, which is the type of problem that your problem falls under. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to perform the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coeffieicent.

Check out this question in stackoverflow: Algorithm to return all combinations
I also just used a the java code below to generate all permutations, but it could easily be used to generate unique combination's given an index.
public static <E> E[] permutation(E[] s, int num) {//s is the input elements array and num is the number which represents the permutation
int factorial = 1;
for(int i = 2; i < s.length; i++)
factorial *= i;//calculates the factorial of (s.length - 1)
if (num/s.length >= factorial)// Optional. if the number is not in the range of [0, s.length! - 1]
return null;
for(int i = 0; i < s.length - 1; i++){//go over the array
int tempi = (num / factorial) % (s.length - i);//calculates the next cell from the cells left (the cells in the range [i, s.length - 1])
E temp = s[i + tempi];//Temporarily saves the value of the cell needed to add to the permutation this time
for(int j = i + tempi; j > i; j--)//shift all elements to "cover" the "missing" cell
s[j] = s[j-1];
s[i] = temp;//put the chosen cell in the correct spot
factorial /= (s.length - (i + 1));//updates the factorial
}
return s;
}

I am extremely sorry (after all those clarifications in the comments) to say that I could not find an efficient solution to this problem. I tried for the past hour with no results.
The reason (I think) is that this problem is very similar to problems like the traveling salesman problem. Until unless you try all the combinations, there is no way to know which attributes will add upto the threshold.
There seems to be no clever trick that can solve this class of problems.
Still there are many optimizations that you can do to the actual code.
Try sorting the data according to the attributes. You may be able to avoid processing some values from the list when you find that a higher value cannot satisfy the threshold (so all lower values can be eliminated).

If you're using C# there is a fairly good generics library here. Note though that the generation of some permutations is not in lexicographic order

Here's a recursive approach to count the number of these subsets: We define a function count(minIndex,numElements,minSum) that returns the number of subsets of size numElements whose sum is at least minSum, containing elements with indices minIndex or greater.
As in the problem statement, we sort our elements in descending order, e.g. [3,3,2,1], and call the first index zero, and the total number of elements N. We assume all elements are nonnegative. To find all 2-subsets whose sum is at least 5, we call count(0,2,5).
Sample Code (Java):
int count(int minIndex, int numElements, int minSum)
{
int total = 0;
if (numElements == 1)
{
// just count number of elements >= minSum
for (int i = minIndex; i <= N-1; i++)
if (a[i] >= minSum) total++; else break;
}
else
{
if (minSum <= 0)
{
// any subset will do (n-choose-k of them)
if (numElements <= (N-minIndex))
total = nchoosek(N-minIndex, numElements);
}
else
{
// add element a[i] to the set, and then consider the count
// for all elements to its right
for (int i = minIndex; i <= (N-numElements); i++)
total += count(i+1, numElements-1, minSum-a[i]);
}
}
return total;
}
Btw, I've run the above with an array of 40 elements, and size-8 subsets and consistently got back results in less than a second.

Related

How do I generate a ranking of columns in Google Sheets? Do I use ArrayFormula?

Here's a picture as an example:
The left side is the raw data I currently have, consisting of columns of teams A, B, C, D, E, F. Each row represents 1 game, and many games are played (this spreadsheet has several thousand rows).
How do I generate a ranking of the teams, like what I manually did on the right side? I want to sort the teams by the number of points they have.
In the first row, Team A scores the most points, so they are ranked 1st; Team C scores the second most points, and thus are in the second column, etc.
I'm having trouble figuring out how to sort by column in each row, rather than sorting by column, which is what the default sort() function does. I've tried multiple methods, including using the google SQL query function, (https://developers.google.com/chart/interactive/docs/querylanguage) but couldn't figure out how to use that either.
Is using ArrayFormula() a good idea in this situation? How would I implement that? This sheet does have several thousand rows, so it would be much better if I can just put an ArrayFormula() in the first row, rather than manually autofilling every row.
You may use costum formula, for cell H2:
=ArraySort(A1:F1,A2:F2001)
here's the code to paste into script editor:
function ArraySort(Arr1, Arr2) {
var w = Arr1[0].length;
var column = [];
var row = [];
var result = [];
for (var i = 0; i < Arr2.length; i++) {
row = [];
for (var k = 0; k < w; k++) {
column[k] = [Arr2[i][k],Arr1[0][k]];
}
column.sort(sortFunction);
for (var j = 0; j < w; j++) {
row[j] = column[j][1];
}
result.push(row);
}
return result;
}
function sortFunction(a, b) {
if (a[0] === b[0]) {
return 0;
}
else {
return (a[0] > b[0]) ? -1 : 1;
}
}
The following formula creates the ranking in the second row:
=array_constrain(transpose(sort(transpose({A$1:F$1; A2:F2}), 2, False)), 1, 6)
It can be copied down to rank the rest of rows. I do not think there is an arrayformula approach that would sort an array of arrays.
Explanation of the formula: it creates a two-row array (team names and scores), transposes it so the scores become the second column, sorts by that column, transposes again, and keeps only the names (the first row).

array lists in java: exclude the first element from `for` loop

I am taking an introduction to Java programing class and I have an array list where I need to exclude the first element from my for loop that finds an average. The first element in the array list is a weight for the average (which is why it needs to be excluded). I also need to drop the lowest value from the remainder of the array list hence my second for loop. I have tried to create a copy of the list and also tried to create a sub list but I cannot get it to work.
public static double Avgerage(ArrayList<Double> inputValues) {
double avg;
double sum = 0;
double weightValue = inputValues.get(0);
double lowest = inputValues.get(0);
for (int i = 1; i > inputValues.size(); i++) {
if (inputValues.get(i) < lowest) {
lowest = inputValues.get(i);
}
}
for (int i = 0; i < inputValues.size(); i++) {
sum = sum + inputValues.get(i);
}
double average = (sum - lowest) / (inputValues.size() - 1);
avg = average * weightValue;
return avg;
}
To start with good programming practice, you should work with interfaces rather than classes, where possible. The appropriate interface here is List<Double>, and when you create it in your class, you should use
List<Double> nameOfList = new ArrayList<Double>();
What we're doing is creating an object which has the behaviour of a List, with the underlying implementation of an ArrayList (more info here.
With regards to the question, you don't appear to be excluding the first element, as you said you wished to - both for loops iterate through all values in the list. Remember to treat the ArrayList like an array - accessing an element does not modify it, like it might in a Queue.
I have edited your code below to demonstrate this, and have also included some other optimisations and corrected the sign error on line 7:
public static double average(List<Double> inputValues) {
double sum = 0;
//Exclude the first element, as it contains the weight
double lowest = inputValues.get(1);
for (int i = 2; i < inputValues.size(); i++) {
lowest = Math.min(inputValues.get(i), lowest);
}
for (int i = 1; i < inputValues.size(); i++) {
sum += inputValues.get(i);
}
double average = (sum - lowest) / (inputValues.size() - 1);
//Scale by the weight
avg *= inputValues.get(0);
return avg;
}
Note: The convention in java is to use camelCase for method names, I have adjusted accordingly.
Also, I don't know your requirements, but optimally, you should be providing logical parameters. If possible do the following before calling the function:
int weight = inputValues.get(0);
inputValues.remove(0);
//And then you would call like this, and update your method signature to match
average(inputValues, weight);
I don't do this inside the method, as the context implies that we would not be modifying values.

Returning the smallest value within an Array List

I need to write a method that returns me the smallest distance (which is a whole number value) within an Array List called "babyTurtles". There are 5 turtles within this array list and they all move a random distance each time the program is ran.
I've been trying to figure out how to do it for an hour and all I've accomplished is making myself frustrated and coming here.
p.s.
In my class we wrote this code to find the average distance moved by the baby turtles:
public double getAverageDistanceMovedByChildren() {
if (this.babyTurtles.size() == 0) {
return 0;
}
double sum = 0;
for (Turtle currentTurtle : this.babyTurtles) {
sum = sum + currentTurtle.getDistanceMoved();
}
double average = sum / this.babyTurtles.size();
return average;
}
That's all I've got to work on, but I just can't seem to find out how to do it.
I'd really appreciate it if you could assist me.
This will give you the index in the array list of the smallest number:
int lowestIndex = distanceList.indexOf(Collections.min(distanceList));
You can then get the value using this:
int lowestDistance = distanceList.get(lowestIndex);

game maker random cave generation

I want to make a cave explorer game in game maker 8.0.
I've made a block object and an generator But I'm stuck. Here is my code for the generator
var r;
r = random_range(0, 1);
repeat(room_width/16) {
repeat(room_height/16) {
if (r == 1) {
instance_create(x, y, obj_block)
}
y += 16;
}
x += 16;
}
now i always get a blank frame
You need to use irandom(1) so you get an integer. You also should put it inside the loop so it generates a new value each time.
In the second statement, you are generating a random real value and storing it in r. What you actually require is choosing one of the two values. I recommend that you use the function choose(...) for this. Here goes the corrected statement:
r = choose(0,1); //Choose either 0 or 1 and store it in r
Also, move the above statement to the inner loop. (Because you want to decide whether you want to place a block at the said (x,y) location at every spot, right?)
Also, I recommend that you substitute sprite_width and sprite_height instead of using the value 16 directly, so that any changes you make to the sprite will adjust the resulting layout of the blocks accordingly.
Here is the code with corrections:
var r;
repeat(room_width/sprite_width) {
repeat(room_height/sprite_height) {
r = choose(0, 1);
if (r == 1)
instance_create(x, y, obj_block);
y += sprite_height;
}
x += sprite_width;
}
That should work. I hope that helps!
Looks like you are only creating a instance if r==1. Shouldn't you create a instance every time?
Variable assignment r = random_range(0, 1); is outside the loop. Therefore performed only once before starting the loop.
random_range(0, 1) returns a random real number between 0 and 1 (not integer!). But you have if (r == 1) - the probability of getting 1 is a very small.
as example:
repeat(room_width/16) {
repeat(room_height/16) {
if (irandom(1)) {
instance_create(x, y, obj_block)
}
y += 16;
}
x += 16;
}
Here's a possible, maybe even better solution:
length = room_width/16;
height = room_height/16;
for(xx = 0; xx < length; xx+=1)
{
for(yy = 0; yy < height; yy+=1)
{
if choose(0, 1) = 1 {
instance_create(xx*16, yy*16, obj_block); }
}
}
if you want random caves, you should probably delete random sections of those blocks,
not just single ones.
For bonus points, you could use a seed value for the random cave generation. You can also have a pathway random generation that will have a guaranteed path to the finish with random openings and fake paths that generate randomly from that path. Then you can fill in the extra spaces with other random pieces.
But in regards to your code, you must redefine the random number each time you are placing a block, which is why all of them are the same. It should be called inside of the loops, and should be an integer instead of a decimal value.
Problem is on the first line, you need to put r = something in the for cycle

Non repeating random numbers in Objective-C

I'm using
for (int i = 1, i<100, i++)
int i = arc4random() % array count;
but I'm getting repeats every time. How can I fill out the chosen int value from the range, so that when the program loops I will not get any dupe?
It sounds like you want shuffling of a set rather than "true" randomness. Simply create an array where all the positions match the numbers and initialize a counter:
num[ 0] = 0
num[ 1] = 1
: :
num[99] = 99
numNums = 100
Then, whenever you want a random number, use the following method:
idx = rnd (numNums); // return value 0 through numNums-1
val = num[idx]; // get then number at that position.
num[idx] = val[numNums-1]; // remove it from pool by overwriting with highest
numNums--; // and removing the highest position from pool.
return val; // give it back to caller.
This will return a random value from an ever-decreasing pool, guaranteeing no repeats. You will have to beware of the pool running down to zero size of course, and intelligently re-initialize the pool.
This is a more deterministic solution than keeping a list of used numbers and continuing to loop until you find one not in that list. The performance of that sort of algorithm will degrade as the pool gets smaller.
A C function using static values something like this should do the trick. Call it with
int i = myRandom (200);
to set the pool up (with any number zero or greater specifying the size) or
int i = myRandom (-1);
to get the next number from the pool (any negative number will suffice). If the function can't allocate enough memory, it will return -2. If there's no numbers left in the pool, it will return -1 (at which point you could re-initialize the pool if you wish). Here's the function with a unit testing main for you to try out:
#include <stdio.h>
#include <stdlib.h>
#define ERR_NO_NUM -1
#define ERR_NO_MEM -2
int myRandom (int size) {
int i, n;
static int numNums = 0;
static int *numArr = NULL;
// Initialize with a specific size.
if (size >= 0) {
if (numArr != NULL)
free (numArr);
if ((numArr = malloc (sizeof(int) * size)) == NULL)
return ERR_NO_MEM;
for (i = 0; i < size; i++)
numArr[i] = i;
numNums = size;
}
// Error if no numbers left in pool.
if (numNums == 0)
return ERR_NO_NUM;
// Get random number from pool and remove it (rnd in this
// case returns a number between 0 and numNums-1 inclusive).
n = rand() % numNums;
i = numArr[n];
numArr[n] = numArr[numNums-1];
numNums--;
if (numNums == 0) {
free (numArr);
numArr = 0;
}
return i;
}
int main (void) {
int i;
srand (time (NULL));
i = myRandom (20);
while (i >= 0) {
printf ("Number = %3d\n", i);
i = myRandom (-1);
}
printf ("Final = %3d\n", i);
return 0;
}
And here's the output from one run:
Number = 19
Number = 10
Number = 2
Number = 15
Number = 0
Number = 6
Number = 1
Number = 3
Number = 17
Number = 14
Number = 12
Number = 18
Number = 4
Number = 9
Number = 7
Number = 8
Number = 16
Number = 5
Number = 11
Number = 13
Final = -1
Keep in mind that, because it uses statics, it's not safe for calling from two different places if they want to maintain their own separate pools. If that were the case, the statics would be replaced with a buffer (holding count and pool) that would "belong" to the caller (a double-pointer could be passed in for this purpose).
And, if you're looking for the "multiple pool" version, I include it here for completeness.
#include <stdio.h>
#include <stdlib.h>
#define ERR_NO_NUM -1
#define ERR_NO_MEM -2
int myRandom (int size, int *ppPool[]) {
int i, n;
// Initialize with a specific size.
if (size >= 0) {
if (*ppPool != NULL)
free (*ppPool);
if ((*ppPool = malloc (sizeof(int) * (size + 1))) == NULL)
return ERR_NO_MEM;
(*ppPool)[0] = size;
for (i = 0; i < size; i++) {
(*ppPool)[i+1] = i;
}
}
// Error if no numbers left in pool.
if (*ppPool == NULL)
return ERR_NO_NUM;
// Get random number from pool and remove it (rnd in this
// case returns a number between 0 and numNums-1 inclusive).
n = rand() % (*ppPool)[0];
i = (*ppPool)[n+1];
(*ppPool)[n+1] = (*ppPool)[(*ppPool)[0]];
(*ppPool)[0]--;
if ((*ppPool)[0] == 0) {
free (*ppPool);
*ppPool = NULL;
}
return i;
}
int main (void) {
int i;
int *pPool;
srand (time (NULL));
pPool = NULL;
i = myRandom (20, &pPool);
while (i >= 0) {
printf ("Number = %3d\n", i);
i = myRandom (-1, &pPool);
}
printf ("Final = %3d\n", i);
return 0;
}
As you can see from the modified main(), you need to first initialise an int pointer to NULL then pass its address to the myRandom() function. This allows each client (location in the code) to have their own pool which is automatically allocated and freed, although you could still share pools if you wish.
You could use Format-Preserving Encryption to encrypt a counter. Your counter just goes from 0 upwards, and the encryption uses a key of your choice to turn it into a seemingly random value of whatever radix and width you want.
Block ciphers normally have a fixed block size of e.g. 64 or 128 bits. But Format-Preserving Encryption allows you to take a standard cipher like AES and make a smaller-width cipher, of whatever radix and width you want (e.g. radix 2, width 16), with an algorithm which is still cryptographically robust.
It is guaranteed to never have collisions (because cryptographic algorithms create a 1:1 mapping). It is also reversible (a 2-way mapping), so you can take the resulting number and get back to the counter value you started with.
AES-FFX is one proposed standard method to achieve this. I've experimented with some basic Python code which is based on the AES-FFX idea, although not fully conformant--see Python code here. It can e.g. encrypt a counter to a random-looking 7-digit decimal number, or a 16-bit number.
You need to keep track of the numbers you have already used (for instance, in an array). Get a random number, and discard it if it has already been used.
Without relying on external stochastic processes, like radioactive decay or user input, computers will always generate pseudorandom numbers - that is numbers which have many of the statistical properties of random numbers, but repeat in sequences.
This explains the suggestions to randomise the computer's output by shuffling.
Discarding previously used numbers may lengthen the sequence artificially, but at a cost to the statistics which give the impression of randomness.
The best way to do this is create an array for numbers already used. After a random number has been created then add it to the array. Then when you go to create another random number, ensure that it is not in the array of used numbers.
In addition to using secondary array to store already generated random numbers, invoking random no. seeding function before every call of random no. generation function might help to generate different seq. of random numbers in every run.