How to program a MIP solver to find balanced Gray code for mixed radices? - optimization

The permutations of a mixed radix number can be ordered to achieve Grayness (in the sense of Gray code) with optimal balance and span length. Each of these constraints will be explained in turn. In my examples, I use a mixed radix number consisting of a base 2 digit, a base 3 digit, and a base 4 digit. This set is called [234], and it has 2 × 3 × 4 = 24 permutations. The permutations are listed below, in ascending order. For compactness, the digits are shown as rows, with the top row corresponding to the set’s first digit. The leftmost column is the first permutation 000, the next column is the second permutation 001, then 002, 003, 010, 011, 012, 013, and so on.
2: 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1
3: 0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2
4: 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3
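As a minimal sketch (not part of my actual program), the ascending listing above can be reproduced in Python. The helper name `mixed_radix_states` is made up for illustration; it assumes the last digit varies fastest, as in the table.

```python
from itertools import product

def mixed_radix_states(radices):
    """Return all digit tuples in ascending order; the last digit varies fastest."""
    return list(product(*(range(r) for r in radices)))

radices = [2, 3, 4]
states = mixed_radix_states(radices)
print(len(states))  # 2 * 3 * 4 = 24

# Print one row per digit, matching the column layout used above.
for place, radix in enumerate(radices):
    print(f"{radix}:", " ".join(str(s[place]) for s in states))
```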
In the above set, multiple digits may change from one permutation to the next. For example, between the fourth and fifth permutations (003 and 010), two digits change at once. To make a Gray set, we must reorder the permutations so that only one digit changes at a time. This constraint also applies to the wraparound from the last permutation back to the first. Below is [234] reordered to be Gray:
2: 0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 1 1 1 0 0 1 1 0 0
3: 0 0 1 1 2 2 2 2 0 0 1 1 1 1 1 1 0 2 2 2 2 0 0 0
4: 0 0 0 0 0 0 1 1 1 1 1 2 2 1 3 3 3 3 3 2 2 2 2 3
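The Gray constraint, including the wraparound, can be checked mechanically. The following is a small Python sketch; `from_rows` and `is_gray_cycle` are hypothetical helper names, and the rows are copied from the two tables above.

```python
def from_rows(*rows):
    """Turn digit rows (one string per place) into a list of state tuples."""
    return list(zip(*(map(int, row.split()) for row in rows)))

def is_gray_cycle(seq):
    """True if every state differs from its successor in exactly one digit,
    including the step from the last state back to the first."""
    n = len(seq)
    return all(
        sum(a != b for a, b in zip(seq[i], seq[(i + 1) % n])) == 1
        for i in range(n)
    )

ascending = from_rows(
    "0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1",
    "0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2",
    "0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3",
)
gray = from_rows(
    "0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 1 1 1 0 0 1 1 0 0",
    "0 0 1 1 2 2 2 2 0 0 1 1 1 1 1 1 0 2 2 2 2 0 0 0",
    "0 0 0 0 0 0 1 1 1 1 1 2 2 1 3 3 3 3 3 2 2 2 2 3",
)
print(is_gray_cycle(ascending))  # False
print(is_gray_cycle(gray))       # True
```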
The above set is Gray, but not balanced. To be balanced, each of the set’s digits must change the same number of times, or as close as possible. In the above set, the 2’s place changes 10 times, the 3’s place changes 7 times, and the 4’s place also changes 7 times. A set’s imbalance is the absolute value of the difference between its minimum and maximum digit changes, in this case 10 – 7 = 3. Below is [234] reordered to have optimal balance; each digit changes 8 times, so the imbalance is now zero:
2: 0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 1 1 1 1 1 0 0 0 0
3: 0 0 1 1 2 2 2 2 0 0 1 1 1 1 1 1 2 0 0 2 2 2 0 0
4: 0 0 0 0 0 0 1 1 1 1 1 2 2 1 3 3 3 3 2 2 2 3 3 2
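To make the balance measure concrete, here is a short Python sketch that counts how often each digit changes around the cycle (wraparound included) and takes the difference between the largest and smallest count. The function names are illustrative only; the rows are the Gray and balanced orderings shown above.

```python
def from_rows(*rows):
    """Turn digit rows (one string per place) into a list of state tuples."""
    return list(zip(*(map(int, row.split()) for row in rows)))

def transition_counts(seq):
    """Count the changes of each digit around the cycle, wraparound included."""
    n = len(seq)
    counts = [0] * len(seq[0])
    for i in range(n):
        for place, (a, b) in enumerate(zip(seq[i], seq[(i + 1) % n])):
            if a != b:
                counts[place] += 1
    return counts

def imbalance(seq):
    """Difference between the most and least frequently changing digits."""
    counts = transition_counts(seq)
    return max(counts) - min(counts)

gray = from_rows(
    "0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 1 1 1 0 0 1 1 0 0",
    "0 0 1 1 2 2 2 2 0 0 1 1 1 1 1 1 0 2 2 2 2 0 0 0",
    "0 0 0 0 0 0 1 1 1 1 1 2 2 1 3 3 3 3 3 2 2 2 2 3",
)
balanced = from_rows(
    "0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 1 1 1 1 1 0 0 0 0",
    "0 0 1 1 2 2 2 2 0 0 1 1 1 1 1 1 2 0 0 2 2 2 0 0",
    "0 0 0 0 0 0 1 1 1 1 1 2 2 1 3 3 3 3 2 2 2 3 3 2",
)
print(transition_counts(gray), imbalance(gray))          # [10, 7, 7] 3
print(transition_counts(balanced), imbalance(balanced))  # [8, 8, 8] 0
```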
The above set is Gray and balanced, but digits get stuck for longer than we’d like. For example, the 4’s place stays zero for the first six permutations. This constitutes a span, with a length of six. In the above set, the maximum span length is six. For optimal granularity, the maximum span should be as short as possible. Below is [234] reordered so that the maximum span length is four instead of six:
2: 0 1 1 0 0 1 1 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0
3: 0 0 1 1 1 1 0 0 0 0 1 2 2 2 2 1 1 1 0 2 2 2 2 0
4: 0 0 0 0 1 1 1 1 2 2 2 2 0 0 2 2 3 3 3 3 1 1 3 3
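The span measure can be sketched the same way. This Python snippet (helper names are hypothetical) treats the sequence as a cycle, so a run that straddles the end and the start counts as one span. It is applied to the last two orderings above.

```python
def from_rows(*rows):
    """Turn digit rows (one string per place) into a list of state tuples."""
    return list(zip(*(map(int, row.split()) for row in rows)))

def max_span(seq):
    """Longest cyclic run of an unchanged digit, over all places."""
    n = len(seq)
    best = 0
    for place in range(len(seq[0])):
        digits = [s[place] for s in seq]
        # Start at a position where the digit changes, so no run straddles
        # the starting point (digits[i - 1] wraps around for i == 0).
        starts = [i for i in range(n) if digits[i] != digits[i - 1]]
        if not starts:
            return n  # this digit never changes at all
        run = 0
        for k in range(n):
            i = (starts[0] + k) % n
            run = run + 1 if k > 0 and digits[i] == digits[i - 1] else 1
            best = max(best, run)
    return best

balanced = from_rows(
    "0 1 1 0 0 1 1 0 0 1 1 1 0 0 0 1 1 1 1 1 0 0 0 0",
    "0 0 1 1 2 2 2 2 0 0 1 1 1 1 1 1 2 0 0 2 2 2 0 0",
    "0 0 0 0 0 0 1 1 1 1 1 2 2 1 3 3 3 3 2 2 2 3 3 2",
)
short_spans = from_rows(
    "0 1 1 0 0 1 1 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0",
    "0 0 1 1 1 1 0 0 0 0 1 2 2 2 2 1 1 1 0 2 2 2 2 0",
    "0 0 0 0 1 1 1 1 2 2 2 2 0 0 2 2 3 3 3 3 1 1 3 3",
)
print(max_span(balanced))     # 6
print(max_span(short_spans))  # 4
```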
The above ordering of [234] is Gray, optimally balanced, and minimizes stuck digits. It’s the best we can do for this particular set. But for larger sets, such as [345], optimal solutions are much harder to find, because my code is too slow. Can a MIP solver do better? The solution should be coded in a language that’s supported by one of the solvers available at NEOS, because these are the only high-quality solvers I have access to (for example CPLEX via AMPL, GAMS, LP, MPS, or NL). The application is atonal music theory, hence only sets with ranges of twelve or less are relevant. The complete list of sets I’m trying to optimize is here.
EDIT: Some commenters asked about my code, so I'm enclosing it below. I use Visual Studio 2012, but this code should compile fairly easily in any C++ compiler. I use x64 (64-bit code).
// Copyleft 2023 Chris Korda
// This program is free software; you can redistribute it and/or modify it
// under the terms of the GNU General Public License as published by the Free
// Software Foundation; either version 2 of the License, or any later version.
// BalaGray.cpp : Defines the entry point for the console application.
// This app computes balanced Gray code sequences, for use in music theory.
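#include "stdafx.h" // precompiled header
#include <stdint.h> // standard sizes
#include <limits.h> // INT_MAX
#include <stdio.h> // printf, fgetc
#include <intrin.h> // _BitScanReverse (MSVC-specific)
#include <vector> // growable array
#include <fstream> // file I/O
#include <assert.h> // debugging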
using namespace std;
#define MORE_PLACES 0 // set non-zero to use more than four places
#define DO_PRUNING 1 // set non-zero to do branch pruning and reduce runtime
class CBalaGray {
public:
// Construction
CBalaGray();
// Attributes
int GetPermCount() const { return static_cast<int>(m_arrPerm.size()); }
// Operations
void Reset();
void Calc(int nPlaces, const uint8_t *arrRange);
protected:
// Constants
enum {
#if MORE_PLACES
MAX_PLACES = 8,
#else
MAX_PLACES = 4,
#endif
MAX_RANGE = 255,
ULONGLONG_BITS = 64,
};
enum { // pruning thresholds may require manual tuning; see notes in set list
PRUNE_MAXTRANS = 18,
PRUNE_IMBALANCE = 3,
};
// Types
union PERM { // permutation; size depends on MAX_PLACES
uint8_t b[MAX_PLACES]; // array of places
#if MORE_PLACES
uint64_t dw; // double word containing all places
#else
uint32_t dw; // double word containing all places
#endif
};
struct STATE { // crawler stack element
uint8_t iPerm; // permutation index
uint8_t iGray; // Gray neighbor index
PERM nTrans; // transition counts, one per place
};
typedef vector<PERM> CPermArray;
typedef vector<STATE> CStateArray;
typedef vector<uint8_t> CPlaceArray; // enough for atonal music theory
// Member data
int m_nPlaces; // number of places
int m_nGrayPerms; // number of Gray permutations reachable from a permutation
int m_nGrayStrideShift; // stride of Gray permutations array, as a shift in bits
CPlaceArray m_arrRange; // array of ranges, one for each place
CPermArray m_arrPerm; // array of permutations
CPlaceArray m_arrGray; // 2D table of permutations reachable from each permutation
CStateArray m_arrState; // array of states; crawler stack
ofstream m_fOut; // output file
// Helpers
int Pack(const PERM& perm) const;
void MakePerms(int nPlaces, const uint8_t *arrRange);
void MakeGrayTable();
void DumpGrayTablePerms() const;
void DumpPerm(const PERM& perm) const;
void DumpPerms() const;
void DumpSet() const;
void WriteBalanceToLog(int nImbalance, int nMaxTrans, int nMaxSpan);
void WriteSequenceToLog(int iDepth);
bool IsGray(PERM p1, PERM p2) const;
int ComputeBalance(int iDepth, int& nMaxTrans, PERM& nTransCounts) const;
int ComputeMaxSpan(int iDepth) const;
};
CBalaGray::CBalaGray()
{
m_fOut.open("BalaGrayIter.txt", ios_base::out); // open output file
assert(m_fOut.is_open()); // a stream can't be portably compared to NULL
Reset();
}
void CBalaGray::Reset()
{
m_nPlaces = 0;
m_arrRange.clear();
m_arrState.clear();
}
int CBalaGray::Pack(const PERM& perm) const
{
int nPacked = perm.b[m_nPlaces - 1]; // init total to last (most significant) place
for (int iPlace = m_nPlaces - 2; iPlace >= 0; iPlace--) { // for each preceding place
nPacked *= m_arrRange[iPlace]; // multiply total by place's range
nPacked += perm.b[iPlace]; // add place to total
}
return nPacked;
}
void CBalaGray::MakePerms(int nPlaces, const uint8_t *arrRange)
{
m_nPlaces = nPlaces;
m_arrRange.resize(nPlaces);
int nPerms = 1;
for (int iPlace = 0; iPlace < nPlaces; iPlace++) { // for each place
assert(arrRange[iPlace] > 1); // radix must be at least binary
m_arrRange[iPlace] = arrRange[iPlace]; // store range
nPerms *= arrRange[iPlace]; // update permutation count
}
m_arrPerm.resize(nPerms);
for (int iPerm = 0; iPerm < nPerms; iPerm++) {
PERM perm;
perm.dw = 0;
int nVal = iPerm;
for (int iPlace = 0; iPlace < nPlaces; iPlace++) { // for each place
int nRange = m_arrRange[iPlace];
perm.b[iPlace] = nVal % nRange;
nVal /= nRange;
}
m_arrPerm[iPerm] = perm;
}
}
void CBalaGray::MakeGrayTable()
{
// Build 2D table of permutations reachable from each permutation.
// One row for each permutation, one column for each Gray neighbor.
// Each element is a permutation index, and must be dereferenced.
int nPlaces = m_nPlaces;
int nGrayPerms = 0;
for (int iPlace = 0; iPlace < nPlaces; iPlace++) { // for each place
nGrayPerms += m_arrRange[iPlace] - 1; // one less than place's range
}
// Compute stride of Gray permutations array; to avoid multiplication,
// round up stride to nearest power of two and convert it to a shift.
unsigned long iFirstBitPos;
_BitScanReverse(&iFirstBitPos, nGrayPerms - 1);
int nStrideShift = static_cast<int>(iFirstBitPos) + 1; // smallest shift with (1 << shift) >= nGrayPerms
m_arrGray.resize(m_arrPerm.size() << nStrideShift);
int nPerms = GetPermCount();
for (int iPerm = 0; iPerm < nPerms; iPerm++) { // for each permutation
int iCol = 0;
PERM rowPerm, colPerm;
rowPerm.dw = m_arrPerm[iPerm].dw;
for (int iPlace = 0; iPlace < nPlaces; iPlace++) { // for each place
int nRange = m_arrRange[iPlace]; // place's range
for (int iVal = 0; iVal < nRange; iVal++) { // for each of place's values
if (iVal != rowPerm.b[iPlace]) { // if value differs from row value
colPerm.dw = rowPerm.dw; // column permutation is same as row
colPerm.b[iPlace] = iVal; // except one place differs (Gray)
m_arrGray[(iPerm << nStrideShift) + iCol] = Pack(colPerm);
iCol++; // next column
}
}
}
}
m_nGrayPerms = nGrayPerms; // save in member var
m_nGrayStrideShift = nStrideShift;
}
void CBalaGray::DumpPerm(const PERM& perm) const
{
printf("[");
for (int iPlace = 0; iPlace < m_nPlaces; iPlace++) { // for each place
printf("%d ", perm.b[iPlace]);
}
printf("]");
}
void CBalaGray::DumpPerms() const
{
int nPerms = GetPermCount();
for (int iPerm = 0; iPerm < nPerms; iPerm++) {
for (int iPlace = 0; iPlace < m_nPlaces; iPlace++) { // for each place
printf("%d ", m_arrPerm[iPerm].b[iPlace]);
}
printf("\n");
}
}
void CBalaGray::DumpGrayTablePerms() const
{
int nPerms = GetPermCount();
for (int iPerm = 0; iPerm < nPerms; iPerm++) { // for each permutation
DumpPerm(m_arrPerm[iPerm]);
printf(": ");
for (int iGray = 0; iGray < m_nGrayPerms; iGray++) { // for each Gray neighbor
int iPerm2 = m_arrGray[(iPerm << m_nGrayStrideShift) + iGray];
DumpPerm(m_arrPerm[iPerm2]);
}
printf("\n");
}
}
void CBalaGray::DumpSet() const
{
printf("[");
for (int iPlace = 0; iPlace < m_nPlaces; iPlace++) { // for each place
printf("%d", m_arrRange[iPlace]);
}
printf("]\n");
}
void CBalaGray::WriteBalanceToLog(int nImbalance, int nMaxTrans, int nMaxSpan)
{
m_fOut << "balance = " << nImbalance << ", maxtrans = " << nMaxTrans << ", maxspan = " << nMaxSpan << '\n';
}
void CBalaGray::WriteSequenceToLog(int iDepth)
{
int nPerms = GetPermCount();
for (int iPlace = 0; iPlace < m_nPlaces; iPlace++) { // for each place
for (int iPerm = 0; iPerm < nPerms; iPerm++) {
m_fOut << int(m_arrPerm[m_arrState[iPerm].iPerm].b[iPlace]) << ' ';
}
m_fOut << '\n';
}
m_fOut << '\n';
}
__forceinline bool CBalaGray::IsGray(PERM p1, PERM p2) const
{
// Returns true if the given permutations differ by exactly one place.
bool bDiff = false;
int nPlaces = m_nPlaces;
for (int iPlace = 0; iPlace < nPlaces; iPlace++) { // for each place
if (p1.b[iPlace] != p2.b[iPlace]) { // if places differ
if (!bDiff) { // if first difference
bDiff = true; // set flag
} else { // not first difference
return false; // not Gray; early out
}
}
}
return bDiff;
}
void CBalaGray::Calc(int nPlaces, const uint8_t *arrRange)
{
assert(nPlaces > 0 && nPlaces <= MAX_PLACES); // Pack() needs at least one place
Reset();
MakePerms(nPlaces, arrRange);
MakeGrayTable();
// DumpPerms();
// DumpGrayTablePerms();
int nPermGrays = m_nGrayPerms;
int nGrayStrideShift = m_nGrayStrideShift;
DumpSet();
int nPerms = GetPermCount();
printf("nPlaces=%d\n", nPlaces);
printf("nPerms=%d\n", nPerms);
int nBestImbalance = INT_MAX;
int nBestMaxTrans = INT_MAX;
int nBestMaxSpan = INT_MAX;
m_arrState.resize(nPerms);
uint64_t nPasses = 0;
uint64_t nPermUsedMask[2] = {0}; // need 128 bits, as number of permutations may exceed 64
int iDepth = 2; // first two levels are constant to save time; all sequences start with 0, 1
m_arrState[1].iPerm = 1;
m_arrState[1].nTrans.b[0] = 1;
nPermUsedMask[0] = 0x3;
int nStartDepth = iDepth;
while (1) {
nPasses++;
int iPrevPerm = m_arrState[iDepth - 1].iPerm;
int iGray = m_arrState[iDepth].iGray;
int iPerm = m_arrGray[(iPrevPerm << nGrayStrideShift) + iGray]; // optimized 2D table addressing
int iUsedMask = iPerm >= ULONGLONG_BITS; // index selects one of two 64-bit masks
uint64_t nPermMask = 1ull << (iPerm & (ULONGLONG_BITS - 1));
if (!(nPermUsedMask[iUsedMask] & nPermMask)) { // if this permutation hasn't been used yet on this branch
m_arrState[iDepth].iPerm = iPerm; // save permutation index on stack
int nMaxTrans;
PERM nTransCounts;
int nImbalance = ComputeBalance(iDepth, nMaxTrans, nTransCounts);
if (iDepth < nPerms - 1) { // if incomplete sequence
#if DO_PRUNING
// these constants may require tuning, see notes below
// if (nMaxTrans > PRUNE_MAXTRANS || nImbalance > PRUNE_IMBALANCE) { // slightly faster
if (nImbalance > PRUNE_IMBALANCE) {
goto lblPrune; // abandon this branch
}
#endif
// crawl one level deeper
nPermUsedMask[iUsedMask] |= nPermMask; // mark this permutation as used
m_arrState[iDepth].nTrans.dw = nTransCounts.dw; // save current transition counts on stack
iDepth++; // increment depth to next permutation
m_arrState[iDepth].iGray = 0; // reset index of Gray neighbors
m_arrState[iDepth].iPerm = 0; // reset permutation index
continue; // equivalent to recursion, but less overhead
} else { // reached a leaf: complete sequence, a potential winner
// if branch doesn't wrap around Gray
if (!IsGray(m_arrPerm[m_arrState[0].iPerm], m_arrPerm[m_arrState[nPerms - 1].iPerm])) {
goto lblPrune; // abandon this branch
}
// if max transition count or imbalance are worse than our current bests
if (nMaxTrans > nBestMaxTrans || nImbalance > nBestImbalance) {
goto lblPrune; // abandon this branch
}
int nMaxSpan = ComputeMaxSpan(iDepth); // compute maximum span length
// if max transition count and imbalance equal our current bests
if (nMaxTrans == nBestMaxTrans && nImbalance == nBestImbalance) {
if (nMaxSpan >= nBestMaxSpan) { // if max span didn't improve
goto lblPrune; // abandon this branch
}
}
// we have a winner, until something better comes along
nBestMaxTrans = nMaxTrans; // update best max transition count
nBestImbalance = nImbalance; // update best imbalance
nBestMaxSpan = nMaxSpan; // update best maximum span length
printf("balance = %d, maxtrans = %d, maxspan = %d\n", nImbalance, nMaxTrans, nMaxSpan);
WriteBalanceToLog(nImbalance, nMaxTrans, nMaxSpan);
WriteSequenceToLog(iDepth);
}
}
m_arrState[iDepth].iGray++; // increment Gray neighbor index
if (m_arrState[iDepth].iGray >= nPermGrays) { // if no more Gray neighbors for this permutation
lblPrune:
if (iDepth <= nStartDepth) { // if we're at same level where we started
break; // exit main loop
} else { // sufficient levels remain above us
iDepth--; // back up a level
// restore bitmask that keeps track of which permutations we've used on this branch
int iPerm = m_arrState[iDepth].iPerm; // number of permutations may exceed 64
int iUsedMask = iPerm >= ULONGLONG_BITS; // index selects one of two 64-bit masks
uint64_t nPermMask = 1ull << (iPerm & (ULONGLONG_BITS - 1));
nPermUsedMask[iUsedMask] &= ~nPermMask; // mark this permutation as available again
m_arrState[iDepth].iGray++; // increment was skipped by continue statement above
if (m_arrState[iDepth].iGray >= nPermGrays) { // if no more Gray neighbors
goto lblPrune; // keep backing up
}
}
}
}
printf("done!\n");
}
__forceinline int CBalaGray::ComputeBalance(int iDepth, int& nMaxTrans, PERM& nTransCounts) const
{
int nPlaces = m_nPlaces;
PERM nTrans;
nTrans.dw = m_arrState[iDepth - 1].nTrans.dw; // load latest transition counts from stack
// compare current state to previous state
PERM sPrev, sCur;
sPrev.dw = m_arrPerm[m_arrState[iDepth - 1].iPerm].dw;
sCur.dw = m_arrPerm[m_arrState[iDepth].iPerm].dw;
for (int iPlace = 0; iPlace < nPlaces; iPlace++) { // for each place
if (sCur.b[iPlace] != sPrev.b[iPlace]) { // if place transitioned
nTrans.b[iPlace]++; // increment place's transition count
}
}
nTransCounts = nTrans; // order matters; counts passed back to caller must exclude wraparound
// account for wraparound; compare current state to initial state, which is assumed to be zero
for (int iPlace = 0; iPlace < nPlaces; iPlace++) { // for each place
if (sCur.b[iPlace]) { // if place transitioned
nTrans.b[iPlace]++; // increment place's transition count
}
}
// now that we have latest transition counts, compute their min and max
int nMin = nTrans.b[0]; // initialize min and max to first transition count
int nMax = nTrans.b[0];
for (int iPlace = 1; iPlace < nPlaces; iPlace++) { // for each transition count, excluding first
int n = nTrans.b[iPlace];
if (n < nMin) // if less than min
nMin = n; // update min
if (n > nMax) // if greater than max
nMax = n; // update max
}
nMaxTrans = nMax;
return nMax - nMin; // return difference
}
__forceinline int CBalaGray::ComputeMaxSpan(int iDepth) const
{
int arrSpan[MAX_PLACES];
int arrFirstSpan[MAX_PLACES];
for (int iPlace = 0; iPlace < m_nPlaces; iPlace++) { // for each place
arrSpan[iPlace] = 1; // initial span length is one
arrFirstSpan[iPlace] = 0; // first span length not set
}
int nMaxSpan = 1;
PERM sFirst, sPrev;
sFirst.dw = m_arrPerm[m_arrState[0].iPerm].dw; // store first state
sPrev.dw = sFirst.dw;
for (int iState = 1; iState <= iDepth; iState++) { // for each state, excluding first
PERM s;
s.dw = m_arrPerm[m_arrState[iState].iPerm].dw; // compare this state to previous state
for (int iPlace = 0; iPlace < m_nPlaces; iPlace++) { // for each place
if (s.b[iPlace] != sPrev.b[iPlace]) { // if place transitioned
if (arrSpan[iPlace] > nMaxSpan) // if span length exceeds max
nMaxSpan = arrSpan[iPlace]; // update max span length
if (!arrFirstSpan[iPlace]) // if first span length hasn't been set
arrFirstSpan[iPlace] = arrSpan[iPlace]; // save first span length
arrSpan[iPlace] = 1; // reset span length
} else { // place didn't transition
arrSpan[iPlace]++; // increment span length
}
}
sPrev = s; // update previous state
}
// wrap around from last to first state
for (int iPlace = 0; iPlace < m_nPlaces; iPlace++) { // for each place
if (sFirst.b[iPlace] != sPrev.b[iPlace]) { // if place transitioned
if (arrSpan[iPlace] > nMaxSpan) // if span length exceeds max
nMaxSpan = arrSpan[iPlace]; // update max span length
} else { // place didn't transition
arrSpan[iPlace] += arrFirstSpan[iPlace]; // compute wrapped span length
if (arrSpan[iPlace] > nMaxSpan) // if span length exceeds max
nMaxSpan = arrSpan[iPlace]; // update max span length
}
}
return nMaxSpan;
}
void test()
{
// All cases want PRUNE_IMBALANCE = 3 unless specified otherwise below.
// Pruning greatly reduces runtime, but the results may not be optimal.
// Proven means exited normally with pruning disabled (DO_PRUNING = 0).
//
// const uint8_t arrRange[] = {2, 10}; // proven
// const uint8_t arrRange[] = {3, 9};
// const uint8_t arrRange[] = {4, 8};
// const uint8_t arrRange[] = {5, 7};
// const uint8_t arrRange[] = {6, 6};
// const uint8_t arrRange[] = {2, 9}; // proven
// const uint8_t arrRange[] = {3, 8};
// const uint8_t arrRange[] = {4, 7};
// const uint8_t arrRange[] = {5, 6};
// const uint8_t arrRange[] = {2, 8}; // proven
// const uint8_t arrRange[] = {3, 7};
// const uint8_t arrRange[] = {4, 6};
// const uint8_t arrRange[] = {5, 5};
// const uint8_t arrRange[] = {2, 7}; // proven
// const uint8_t arrRange[] = {3, 6}; // proven
// const uint8_t arrRange[] = {4, 5}; // proven
// const uint8_t arrRange[] = {2, 6}; // proven
// const uint8_t arrRange[] = {3, 5}; // proven
// const uint8_t arrRange[] = {4, 4}; // proven
// const uint8_t arrRange[] = {2, 5}; // proven
// const uint8_t arrRange[] = {3, 4}; // proven
// const uint8_t arrRange[] = {2, 4}; // proven
// const uint8_t arrRange[] = {3, 3}; // proven
// const uint8_t arrRange[] = {2, 3}; // proven
// const uint8_t arrRange[] = {2, 2}; // proven
// const uint8_t arrRange[] = {2, 2, 8};
// const uint8_t arrRange[] = {2, 3, 7};
// const uint8_t arrRange[] = {2, 4, 6};
// const uint8_t arrRange[] = {2, 5, 5};
// const uint8_t arrRange[] = {3, 3, 6};
// const uint8_t arrRange[] = {3, 4, 5};
// const uint8_t arrRange[] = {4, 4, 4};
// const uint8_t arrRange[] = {2, 2, 7};
// const uint8_t arrRange[] = {2, 3, 6};
// const uint8_t arrRange[] = {2, 4, 5};
// const uint8_t arrRange[] = {3, 3, 5};
// const uint8_t arrRange[] = {3, 4, 4};
// const uint8_t arrRange[] = {2, 2, 6}; // proven
// const uint8_t arrRange[] = {2, 3, 5};
// const uint8_t arrRange[] = {2, 4, 4};
// const uint8_t arrRange[] = {3, 3, 4};
// const uint8_t arrRange[] = {2, 2, 5}; // proven
// const uint8_t arrRange[] = {2, 3, 4}; // proven
// const uint8_t arrRange[] = {3, 3, 3};
// const uint8_t arrRange[] = {2, 2, 4}; // proven
// const uint8_t arrRange[] = {2, 3, 3}; // proven
// const uint8_t arrRange[] = {2, 2, 3}; // proven
// const uint8_t arrRange[] = {2, 2, 2}; // proven
// const uint8_t arrRange[] = {2, 2, 2, 6};
// const uint8_t arrRange[] = {2, 2, 3, 5}; // slow
// const uint8_t arrRange[] = {2, 2, 4, 4};
// const uint8_t arrRange[] = {2, 3, 3, 4}; // slow; wants PRUNE_IMBALANCE = 4
const uint8_t arrRange[] = {3, 3, 3, 3}; // slow
// const uint8_t arrRange[] = {2, 2, 2, 5};
// const uint8_t arrRange[] = {2, 2, 3, 4};
// const uint8_t arrRange[] = {2, 3, 3, 3};
// const uint8_t arrRange[] = {2, 2, 2, 4};
// const uint8_t arrRange[] = {2, 2, 3, 3}; // slow
// const uint8_t arrRange[] = {2, 2, 2, 3}; // proven
// const uint8_t arrRange[] = {2, 2, 2, 2}; // proven
//
// *** following cases require MORE_PLACES to be non-zero ***
//
// const uint8_t arrRange[] = {2, 2, 2, 2, 4}; // wants PRUNE_IMBALANCE = 2
// const uint8_t arrRange[] = {2, 2, 2, 3, 3}; // wants PRUNE_IMBALANCE = 4
// const uint8_t arrRange[] = {2, 2, 2, 2, 3}; // wants PRUNE_IMBALANCE = 2
// const uint8_t arrRange[] = {2, 2, 2, 2, 2};
// const uint8_t arrRange[] = {2, 2, 2, 2, 2, 2};
//
CBalaGray bg;
bg.Calc(_countof(arrRange), arrRange);
fgetc(stdin);
}
int _tmain(int argc, _TCHAR* argv[])
{
test();
return 0;
}
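For cross-checking very small sets only, the same idea can be sketched as a plain exhaustive backtracking search in Python. This is not a port of the program above: it has no pruning, no neighbor table, and it ranks results by imbalance first and maximum span second, which is an assumption about priorities (the C++ code also takes the maximum transition count into account). The name `best_balanced_gray` is made up for this sketch.

```python
from itertools import product

def best_balanced_gray(radices):
    """Exhaustively search cyclic Gray orderings that start at the all-zero
    state; return (imbalance, max_span, sequence) for the best one found."""
    states = list(product(*(range(r) for r in radices)))
    n, places = len(states), len(radices)

    def changed_place(a, b):
        """Index of the single differing digit, or None if not Gray."""
        diff = [p for p in range(places) if a[p] != b[p]]
        return diff[0] if len(diff) == 1 else None

    def cyclic_max_span(seq):
        best_run = 0
        for p in range(places):
            d = [s[p] for s in seq]
            starts = [i for i in range(n) if d[i] != d[i - 1]]
            if not starts:
                return n
            run = 0
            for k in range(n):
                i = (starts[0] + k) % n
                run = run + 1 if k > 0 and d[i] == d[i - 1] else 1
                best_run = max(best_run, run)
        return best_run

    best = None
    path, used, counts = [states[0]], {states[0]}, [0] * places

    def crawl():
        nonlocal best
        if len(path) == n:  # complete sequence: check the wraparound
            p = changed_place(path[-1], path[0])
            if p is None:
                return
            counts[p] += 1
            key = (max(counts) - min(counts), cyclic_max_span(path))
            counts[p] -= 1
            if best is None or key < best[:2]:
                best = key + (list(path),)
            return
        for s in states:
            p = None if s in used else changed_place(path[-1], s)
            if p is None:
                continue
            path.append(s); used.add(s); counts[p] += 1
            crawl()
            counts[p] -= 1; used.discard(s); path.pop()

    crawl()
    return best

print(best_balanced_gray([2, 2])[:2])  # (0, 2)
print(best_balanced_gray([2, 3])[:2])  # (2, 3)
```

Without pruning this is only practical for a handful of states, so it serves as a reference for checking results rather than as a way to solve the larger sets.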

Within CPLEX, I would use CP Optimizer.
For instance, in OPL:
// 2 2 2 2
using CP;
int Size=4;
int r[1..Size]=[2,2,2,2];
int States=prod(i in 1..Size) r[i];
int fig[1..States][1..Size];
execute
{
var index=0;
for(var f1=1;f1<=r[1];f1++)
for(var f2=1;f2<=r[2];f2++)
for(var f3=1;f3<=r[3];f3++)
for(var f4=1;f4<=r[4];f4++)
{
index++;
fig[index][1]=f1;
fig[index][2]=f2;
fig[index][3]=f3;
fig[index][4]=f4;
}
}
dvar int x[1..States] in 1..States; // list of States in the right order
dvar int change[1..States] in 1..Size; // the figure that is different next time
dexpr int nbChanges[i in 1..Size]=count(change,i);
dexpr int inbalance=max(i in 1..Size) nbChanges[i]-min(i in 1..Size) nbChanges[i];
dvar int+ nochangeForThatManyTimes[1..States][1..Size] in 1..maxint;
dexpr int maxspan=max(i in 1..States,j in 1..Size) nochangeForThatManyTimes[i][j];
minimize staticLex(inbalance,maxspan);
subject to
{
x[1]==1;
allDifferent(x);
// Gray
forall(i in 1..States,j in 1..Size)
((fig[x[i]][j]==fig[x[(i<States)?(i+1):1]][j])==(j!=change[i]));
forall(i in 2..States,j in 1..Size)
{
(j==change[i-1]) => (nochangeForThatManyTimes[i][j]==1);
(j!=change[i-1]) => (nochangeForThatManyTimes[i][j]==1+nochangeForThatManyTimes[i-1][j]);
}
forall(j in 1..Size)
(j==change[States]) ==
(nochangeForThatManyTimes[1][j]==1);
}
execute
{
for(var i=1;i<=States;i++)
{
for(var j=1;j<=Size;j++) write(fig[x[i]][j]-1);
writeln();
}
writeln();
writeln("inbalance = ",inbalance);
writeln("maxspan = ",maxspan);
}
gives
0000
1000
1010
1011
1001
1101
0101
0001
0011
0010
0110
0111
1111
1110
1100
0100
inbalance = 0
maxspan = 6
With 3,3,3,3 and a time limit of 12000, I got
OBJECTIVE: 1; 14
0000
1000
2000
2100
2110
2010
2012
0012
0010
0011
2011
1011
1010
1012
1022
1002
0002
0102
0202
0212
0210
2210
2200
2220
0220
0222
1222
2222
2202
2212
2112
1112
0112
0110
0120
0020
0021
0022
0122
0121
0111
0211
0221
1221
1220
1210
1211
1212
1202
1102
1122
1120
1110
1111
1121
1021
1020
2020
2120
2122
2102
2002
2022
2021
2121
2221
2211
2111
2101
2001
1001
0001
0101
1101
1201
2201
0201
0200
1200
1100
0100
inbalance = 1
maxspan = 14
and after 10 hours
OBJECTIVE: 1; 10
0000
0001
0002
0012
0010
0011
0111
0121
0221
1221
1021
0021
0022
0222
0122
0102
0112
0212
0202
0201
0101
1101
1001
1011
1012
1112
1110
0110
2110
2010
2020
2000
2002
1002
1022
1020
1120
0120
2120
2121
2221
2222
2202
2200
0200
1200
1000
1010
1210
2210
0210
0211
1211
1201
1202
1102
1100
0100
2100
2101
2102
2112
2012
2022
2122
1122
1121
1111
2111
2011
2021
2001
2201
2211
2212
1212
1222
1220
2220
0220
0020
inbalance = 1
maxspan = 10
In order to get
inbalance = 1
maxspan = 9
with
0000
1000
1001
1201
1200
1210
0210
2210
2211
2201
2200
2000
2010
1010
0010
0110
0112
0102
0202
0212
0012
1012
1011
1111
0111
0101
1101
1100
1102
1002
1022
1020
1120
1121
1021
0021
0011
0211
0221
0222
0122
1122
2122
2102
2202
1202
1222
1212
1211
1221
2221
2222
2212
2012
2022
0022
0002
2002
2001
2101
2121
0121
0120
0020
0220
1220
2220
2020
2021
2011
2111
2112
1112
1110
2110
2120
2100
0100
0200
0201
0001
I slightly improved the model, as follows:
using CP;
execute
{
cp.param.timelimit=36000; // 10 hours, in seconds
}
int Size=4;
int r[1..Size]=[3,3,3,3];
int maxr=max(i in 1..Size) r[i];
int States=prod(i in 1..Size) r[i];
int fig[1..States][1..Size];
int which[i1 in 1..r[1]][i2 in 1..r[2]][i3 in 1..r[3]][i4 in 1..r[4]];
execute
{
var index=0;
for(var f1=1;f1<=r[1];f1++)
for(var f2=1;f2<=r[2];f2++)
for(var f3=1;f3<=r[3];f3++)
for(var f4=1;f4<=r[4];f4++)
{
index++;
fig[index][1]=f1;
fig[index][2]=f2;
fig[index][3]=f3;
fig[index][4]=f4;
which[f1][f2][f3][f4]=index;
}
}
dvar int x[1..States] in 1..States; // list of States in the right order
dvar int y[1..States] in 1..States;
dvar int change[1..States] in 1..Size; // the figure that is different next time
dvar int move[1..States] in 1..maxr;
dexpr int nbChanges[i in 1..Size]=count(change,i);
dexpr int inbalance=max(i in 1..Size) nbChanges[i]-min(i in 1..Size) nbChanges[i];
dvar int+ nochangeForThatManyTimes[1..States][1..Size] in 1..maxint;
dexpr int maxspan=max(i in 1..States,j in 1..Size) nochangeForThatManyTimes[i][j];
minimize staticLex(inbalance,maxspan);
subject to
{
// inverse(x,y);
// allDifferent(y);
x[1]==1;
forall(i in 1..States) move[i]<=r[change[i]];
//inverse(x,y);
change[1]==1;
allDifferent(x);
// Gray
// forall(i in 1..States,j in 1..Size)
// {
// (j!=change[i]) == (fig[x[i]][j]==fig[x[(i<States)?(i+1):1]][j]);
// (j==change[i]) == ((fig[x[i]][j]+move[i]-1) mod r[j]+1==fig[x[(i<States)?(i+1):1]][j]);
//}
forall(i in 1..States)
x[(i<States)?(i+1):1]
==which
[(fig[x[i]][1]+(1==change[i])*move[i]-1) mod r[1]+1]
[(fig[x[i]][2]+(2==change[i])*move[i]-1) mod r[2]+1]
[(fig[x[i]][3]+(3==change[i])*move[i]-1) mod r[3]+1]
[(fig[x[i]][4]+(4==change[i])*move[i]-1) mod r[4]+1]
;
inferred(change);
inferred(move);
inferred(nochangeForThatManyTimes);
inbalance>=States mod 2;
forall(i in 2..States,j in 1..Size)
{
(j==change[i-1]) => (nochangeForThatManyTimes[i][j]==1);
(j!=change[i-1]) => (nochangeForThatManyTimes[i][j]==1+nochangeForThatManyTimes[i-1][j]);
}
forall(j in 1..Size)
(j==change[States]) ==
(nochangeForThatManyTimes[1][j]==1);
}
execute
{
for(var i=1;i<=States;i++)
{
for(var j=1;j<=Size;j++) write(fig[x[i]][j]-1);
writeln();
}
writeln();
writeln("inbalance = ",inbalance);
writeln("maxspan = ",maxspan);
}
NB: You can use CPLEX for free in the cloud with this OPL API

Well, after some tinkering, I have an integer program running for this that I think is producing quality results. I tried a couple of approaches, each with different limitations.
It is a little grotesque in parts, as the counting of repeated digits is quite cumbersome.
It really bogs down for anything with ~30 states or more, so it's not going to make it to the finish line. :) I think it would be much more nimble if I removed the repeat counting, and I'll tinker a bit more. In the interim, here are some results for the cases not marked as proven on your web page. The (4, 6) run (second run) is an improvement; the other two are now "proven" as stated, perhaps with a different sequence (I didn't cross-check).
I'll update later with any other improvements.
starting run: (3, 7)
WARNING: Initializing ordered Set PR_flat with a fundamentally unordered data
source (type: set). This WILL potentially lead to nondeterministic
behavior in Pyomo
Problem:
- Name: unknown
Lower bound: 1.3
Upper bound: 1.3
Number of objectives: 1
Number of constraints: 3744
Number of variables: 3539
Number of binary variables: 3557
Number of integer variables: 3560
Number of nonzeros: 3
Sense: minimize
Solver:
- Status: ok
User time: -1.0
System time: 353.01
Wallclock time: 305.85
Termination condition: optimal
Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
Statistics:
Branch and bound:
Number of bounded subproblems: 1018
Number of created subproblems: 1018
Black box:
Number of iterations: 667719
Error rc: 0
Time: 306.00031781196594
Solution:
- number of solutions: 0
number of solutions displayed: 0
11
01
02
00
10
20
21
22
12
13
23
03
05
25
26
24
14
04
06
16
15
max imbalance: 1.0
max repeats: 3.0
starting run: (4, 6)
WARNING: Initializing ordered Set PR_flat with a fundamentally unordered data
source (type: set). This WILL potentially lead to nondeterministic
behavior in Pyomo
Problem:
- Name: unknown
Lower bound: 0.2
Upper bound: 0.2
Number of objectives: 1
Number of constraints: 4854
Number of variables: 4619
Number of binary variables: 4640
Number of integer variables: 4643
Number of nonzeros: 3
Sense: minimize
Solver:
- Status: ok
User time: -1.0
System time: 34.21
Wallclock time: 34.89
Termination condition: optimal
Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
Statistics:
Branch and bound:
Number of bounded subproblems: 1
Number of created subproblems: 1
Black box:
Number of iterations: 14167
Error rc: 0
Time: 34.923232078552246
Solution:
- number of solutions: 0
number of solutions displayed: 0
10
13
33
34
14
15
35
32
02
03
23
22
12
11
01
00
30
31
21
24
04
05
25
20
max imbalance: 0.0
max repeats: 2.0
starting run: (5, 5)
WARNING: Initializing ordered Set PR_flat with a fundamentally unordered data
source (type: set). This WILL potentially lead to nondeterministic
behavior in Pyomo
Problem:
- Name: unknown
Lower bound: 1.3
Upper bound: 1.3
Number of objectives: 1
Number of constraints: 5256
Number of variables: 5011
Number of binary variables: 5033
Number of integer variables: 5036
Number of nonzeros: 3
Sense: minimize
Solver:
- Status: ok
User time: -1.0
System time: 915.71
Wallclock time: 634.99
Termination condition: optimal
Termination message: Model was solved to optimality (subject to tolerances), and an optimal solution is available.
Statistics:
Branch and bound:
Number of bounded subproblems: 1764
Number of created subproblems: 1764
Black box:
Number of iterations: 1855323
Error rc: 0
Time: 635.0473001003265
Solution:
- number of solutions: 0
number of solutions displayed: 0
11
01
31
33
34
44
04
03
00
40
41
42
22
23
43
13
12
02
32
30
10
20
21
24
14
max imbalance: 1.0
max repeats: 3.0

Related

OpenCL bad get_global_id output

I am trying to implement matrix multiplication, but get_global_id returns incorrect values.
This is the host code (n, m, TILE_SIZE = 4):
int dimention = 2;
size_t global_item_size[] = {n, m};
size_t local_item_size[] = {TILE_SIZE, TILE_SIZE};
ret = clEnqueueNDRangeKernel(command_queue, kernel, dimention, NULL, global_item_size, local_item_size, 0, NULL, &perf_event);
And part of the kernel:
kernel void mul_tile(uint n, uint m, uint k, global const float *a, global const float *b, global float *c) {
size_t i = get_global_id(0);
size_t j = get_global_id(1);
printf("aa %i %i\n", i, j);
}
This code prints this:
aa 0 0
aa 1 0
aa 2 0
aa 3 0
aa 0 0
aa 1 0
aa 2 0
aa 3 0
aa 0 0
aa 1 0
aa 2 0
aa 3 0
aa 0 0
aa 1 0
aa 2 0
aa 3 0
After some time I realized that get_global_id(0) returns correct index when I call it the first time and zero when I call it the second time:
kernel void mul_tile(uint n, uint m, uint k, global const float *a, global const float *b, global float *c) {
size_t i = get_global_id(0);
size_t j = get_global_id(0);
printf("aa %i %i\n", i, j);
}
So, this kernel prints the same thing.
In some cases get_global_id(2) returns the indices of the second dimension, but when I merely rename variables it starts printing zeroes.
This looks like a driver bug. I use a GeForce GT 745M, Ubuntu 20.04 and the recommended drivers (nvidia-driver-440).
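One possible reading of this output, offered as an assumption rather than something the thread confirms: on a 64-bit device size_t is 64 bits wide, while %i tells printf to consume a 32-bit int. If the implementation lays the arguments out back to back, the second %i would pick up the upper half of i (zero for small indices) instead of j. The plain C sketch below only models that layout; pack_args and read_as_int are hypothetical helpers, and the tightly packed little-endian argument buffer is itself an assumption about the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: two 64-bit arguments stored as four consecutive
   32-bit words, low word first (a little-endian layout is assumed). */
static void pack_args(uint64_t i, uint64_t j, uint32_t words[4]) {
    words[0] = (uint32_t)(i & 0xFFFFFFFFu);
    words[1] = (uint32_t)(i >> 32);
    words[2] = (uint32_t)(j & 0xFFFFFFFFu);
    words[3] = (uint32_t)(j >> 32);
}

/* What the n-th "%i" conversion would see if it consumed 32 bits at a time. */
static int read_as_int(const uint32_t words[4], int n) {
    assert(n >= 0 && n < 4);
    return (int)words[n];
}
```

With i = 3 and j = 2, the first two reads yield 3 and 0, which matches the "aa 3 0" line. If this is what happens, casting inside the kernel, as in printf("aa %i %i\n", (int)i, (int)j), would be a fix worth trying.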

glreadpixels return wrong value on different screen

I have a plugin that renders differently on different machines. On my machine, I pass in an image that I generate with 32 * 32 * 32 values, where the red value goes from 0 to 1 within every run of 32 pixels. Reading it back, I correctly get the red value going from 0 to 1 every 32 pixels.
But with the same code and the same program on my co-worker's machine, the red value goes from 0 to 1 over every 64 pixels instead of every 32.
Is this problem related to screen resolution? How can I read back the same values that I passed in?
Example (red values of the first 32 pixels):
Passed in:
0, 8, 16, 24, 32, 40, ... 248
Correct (my machine):
0, 8, 16, 24, ... 248
Wrong (co-worker's machine):
0, 4, 8, 12, 16, ... 124
Here is my code to create the texture based on the bitmap:
glActiveTexture(GL_TEXTURE0);
glEnable(GL_TEXTURE_RECTANGLE_ARB);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, _texID);
if (_texID) {
glDeleteTextures(1, &_texID);
_texID = 0;
}
glGenTextures(1, &_texID);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_MODULATE);
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA8, (int)_imageSize.width, (int)_imageSize.height, 0, GL_RGBA, GL_UNSIGNED_BYTE, _image_buffer);
glClearColor(0.0, 0.0, 0.0, 0.0);
glClear(GL_COLOR_BUFFER_BIT);
Here is the code to generate the bitmap:
[self setFrameSize:_imageSize];
GLubyte al = 255; // alpha value
double increment = (al+1) / val;
_totalByteSize = val*val*val*4;
_image_buffer = new GLubyte[_totalByteSize];
for (int i = 0; i < (_totalByteSize); i++) {
_image_buffer[i] = 0;
}
vector<GLKVector3> data;
GLubyte bl = 0;
GLubyte gr = 0;
GLubyte re = 0;
for ( int b = 0; b < val; b++) {
gr = 0;
for ( int g = 0; g < val; g++) {
re = 0;
for ( int r = 0; r < val; r++) {
int ind = r + g * val + b * val * val;
_image_buffer[ind * 4 + 0] = re;
_image_buffer[ind * 4 + 1] = gr;
_image_buffer[ind * 4 + 2] = bl;
_image_buffer[ind * 4 + 3] = al;
re+= increment; // 256 / 32 = 8
}
gr+= increment; // 256 / 32 = 8
}
bl+= increment; // 256 / 32 = 8
}
And here is the code to read it back:
int totalByteSize = 32*32*32*3;
GLubyte* bitmap = new GLubyte[totalByteSize];
glReadPixels(0, 0, _imageSize.width, _imageSize.height, GL_RGB, GL_UNSIGNED_BYTE, bitmap);
[_lutData removeAllObjects];
for (int i = 0 ; i <= totalByteSize /3 ; i++) {
double val1 = (double)bitmap[i*3+0] / 256.0;
double val2 = (double)bitmap[i*3+1] / 256.0;
double val3 = (double)bitmap[i*3+2] / 256.0;
[_lutData addObject:@[@(val1), @(val2), @(val3)]];
}
Could this be caused by a high-resolution screen, or by a different setting, making it read back the wrong values?

TWIN PRIMES BETWEEN 2 VALUES wrong results

I've been working on this program to count how many twin primes there are between two values. It's been specified that twin primes come in the (6n-1, 6n+1) format, with the exception of (3, 5). My code seems to work fine, but it keeps giving me the wrong result: one pair of twin primes fewer than I should get. Between 1 and 40 there should be 5 twin prime pairs, but I'm always getting 4.
What am I doing wrong? Am I not taking into account (3, 5)?
Here's my code:
#include <stdio.h>
int prime (int num) {
int div;
if (num == 2) return 1;
if (num % 2 == 0) return 0;
div = 3;
while (div*div <= num && num%div != 0)
div = div + 2;
if (num%div == 0)
return 0;
else
return 1;
}
int main(void) {
int low, high, i, count, n, m;
printf("Please enter the values for the lower and upper limits of the interval\n");
scanf("%d%d", &low, &high);
printf("THIS IS THE LOW %d\n AND THIS IS THE HIGH %d\n", low, high);
i = low;
count = 0;
while (6*i-1>=low && 6*i+1<=high) {
n = 6*i-1;
m = 6*i+1;
if (prime(n) && prime(m)) ++count;
i = i + 1;
}
printf("Number of twin primes is %d\n", count);
return 0;
}
Your program misses (3 5) because 3 is not trapped as a prime number, and because 4 is not a multiple of 6. Rather than the main loop stepping by (effectively) 6, this answer steps by 1.
#include <stdio.h>
int prime (int num) {
int div;
if (num == 1) return 0; // excluded 1
if (num == 2 || num == 3) return 1; // included 3 too
if (num % 2 == 0) return 0;
div = 3;
while (div*div <= num) {
if (num % div == 0) // moved to within loop
return 0;
div += 2;
}
return 1;
}
int main(void) {
int low, high, i, count, n, m;
printf("Please enter the values for the lower and upper limits of the interval\n");
scanf("%d%d", &low, &high);
printf("THIS IS THE LOW %d\n AND THIS IS THE HIGH %d\n", low, high);
count = 0;
for (i=low+1; i<high; i++) { // keeps both n = i-1 and m = i+1 inside [low, high]
n = i-1;
m = i+1;
if (prime(n) && prime(m)) {
printf ("%2d %2d\n", n, m);
++count;
}
}
printf("Number of twin primes is %d\n", count);
return 0;
}
Program output
1
40
THIS IS THE LOW 1
AND THIS IS THE HIGH 40
3 5
5 7
11 13
17 19
29 31
Number of twin primes is 5
Next run:
3
10
THIS IS THE LOW 3
AND THIS IS THE HIGH 10
3 5
5 7
Number of twin primes is 2
https://primes.utm.edu/lists/small/100ktwins.txt
The five twin primes under forty are (3,5), (5,7), (11,13), (17,19), (29,31) so if you know that your code isn't counting (3,5) then it is working correctly, counting (5,7), (11,13), (17,19), and (29,31).
A possible fix would be to add an if-statement which adds 1 to "count" if the starting number is less than 4. I'm not really that used to reading C syntax so I had trouble getting my head around your formulas, sorry.
edit: since comments don't format code snippets:
i = low;
count = 0;
if (low <= 3 && high >= 5){
count ++; // accounts for the (3,5) twin primes if the range includes both 3 and 5
}
You have a problem in your prime function. This is its output for the first few integers:
for(i=1;i<=10;i++) printf("%d\t%d\n",i,prime(i));
1 1
2 1
3 0
4 0
5 1
6 0
7 1
8 0
See the prime() function from Weather Vane: you should treat 3 as prime (and exclude 1).
According to [1], a twin prime is a prime that differs by two from another prime.
Examples are (3,5), (5,7), (11,13). As you stated, the format (6n-1,6n+1) holds for every pair except (3,5). Your program runs almost correctly, since it counts the twin primes in the interval that follow this rule, which doesn't include (3,5). You can add a special case (for example, if low<=3 and high>=5, add 1 to the total count), or use another algorithm to count twin primes (for example, check whether i is prime, then measure the distance from i to the next prime; if the distance is 2, they are twin primes).
[1] http://en.wikipedia.org/wiki/Twin_prime
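The special case for (3,5) and the 6n ± 1 stepping can be combined. The sketch below is one way to do that, not the method of any of the answers: is_prime and count_twin_primes are names chosen here, and it assumes a pair counts only when both of its members lie in [low, high]. It also starts n at the first value with 6n - 1 >= low rather than at n = low.

```c
#include <assert.h>
#include <limits.h>

/* Trial-division primality test; 0, 1 and negative numbers are not prime. */
static int is_prime(int num) {
    if (num < 2) return 0;
    if (num < 4) return 1;                 /* 2 and 3 */
    if (num % 2 == 0) return 0;
    for (int div = 3; div <= num / div; div += 2)
        if (num % div == 0) return 0;
    return 1;
}

/* Count twin prime pairs (p, p + 2) with low <= p and p + 2 <= high. */
static int count_twin_primes(int low, int high) {
    assert(high <= INT_MAX - 6);           /* keeps 6 * n + 1 from overflowing */
    int count = 0;
    if (low <= 3 && high >= 5) ++count;    /* (3,5): the only pair not of the form 6n ± 1 */
    int n = (low > 0) ? low / 6 + 1 : 1;   /* smallest n >= 1 with 6n - 1 >= low */
    for (; 6 * n + 1 <= high; ++n)
        if (is_prime(6 * n - 1) && is_prime(6 * n + 1)) ++count;
    return count;
}
```

With these definitions, count_twin_primes(1, 40) gives 5 and count_twin_primes(3, 10) gives 2, matching the two runs shown earlier in the thread.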

Non-uniform random numbers in Objective-C

I'd like to calculate a non-uniformly distributed random number in the range [0, n - 1]. So the min possible value is zero. The maximum possible value is n-1. I'd like the min-value to occur the most often and the max to occur relatively infrequently with an approximately linear curve between (Gaussian is fine too). How can I do this in Objective-C? (possibly using C-based APIs)
A very rough sketch of my current idea is:
// min value w/ p = 0.7
// some intermediate value w/ p = 0.2
// max value w/ p = 0.1
NSUInteger r = arc4random_uniform(10);
if (r <= 6)
result = 0;
else if (r <= 8)
result = (n - 1) / 2;
else
result = n - 1;
I think you're on basically the right track. There are possible precision or range issues, but in general, if you wanted to randomly pick, say, 3, 2, 1 or 0, and you wanted the probability of picking 3 to be four times as large as the probability of picking 0, then as a paper exercise you might write down a grid filled with:
3 3 3 3
2 2 2
1 1
0
Toss something onto it and read the number it lands on.
The number of options there are for your desired linear scale is:
- 1 if number of options, n, = 1
- 1 + 2 if n = 2
- 1 + 2 + 3 if n = 3
- ... etc ...
It's a simple sum of an arithmetic progression. You end up with n(n+1)/2 possible outcomes. E.g. for n = 1 that's 1 * 2 / 2 = 1. For n = 2 that's 2 * 3 /2 = 3. For n = 3 that's 3 * 4 / 2 = 6.
So you would immediately write something like:
NSUInteger random_linear(NSUInteger range)
{
NSUInteger numberOfOptions = (range * (range + 1)) / 2;
NSUInteger uniformRandom = arc4random_uniform(numberOfOptions);
... something ...
}
At that point you just have to decide which bin uniformRandom falls into. The simplest way is with the most obvious loop:
NSUInteger random_linear(NSUInteger range)
{
NSUInteger numberOfOptions = (range * (range + 1)) / 2;
NSUInteger uniformRandom = arc4random_uniform(numberOfOptions);
NSUInteger index = 0;
NSUInteger optionsToDate = 1; // index 0 owns exactly one outcome
while(1)
{
if(uniformRandom < optionsToDate) return index;
index++;
optionsToDate += index + 1; // index k owns k + 1 outcomes
}
}
Given that you can work out optionsToDate without iterating, an immediately obvious faster solution is a binary search.
An even smarter way to look at it is that uniformRandom is the sum of the boxes underneath a line from (0, 0) to (n, n). So it's the area underneath the graph, and the graph is a simple right-angled triangle. So you can work backwards from the area formula.
Specifically, the area underneath the graph from (0, 0) to (n, n) at position x is (x*x)/2. So you're looking for x, where:
(x-1)*(x-1)/2 <= uniformRandom < x*x/2
=> (x-1)*(x-1) <= uniformRandom*2 < x*x
=> x-1 <= sqrt(uniformRandom*2) < x
In that case you want to take x-1, because the result hasn't yet progressed to the next discrete column of the number grid. So you can get there with a square root operation and simple integer truncation.
So, assuming I haven't muddled my exact inequalities along the way, and assuming all precisions fit:
NSUInteger random_linear(NSUInteger range)
{
NSUInteger numberOfOptions = (range * (range + 1)) / 2;
NSUInteger uniformRandom = arc4random_uniform(numberOfOptions);
return (NSUInteger)sqrtf((float)uniformRandom * 2.0f);
}
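The continuous triangle is only an approximation of the discrete grid: for range = 3 and uniformRandom = 5, truncating sqrt(10) gives 3, which lies outside [0, 2]. A hedged alternative is to invert the triangular numbers exactly, using k(k+1)/2 <= u < (k+1)(k+2)/2, which gives k = floor((sqrt(8u + 1) - 1) / 2). The plain C sketch below does this in integer arithmetic; isqrt64 and linear_index are names invented here, and the uniform draw u is passed in so that the mapping can be checked without arc4random_uniform.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of the square root of v, by binary search over 32-bit candidates. */
static uint64_t isqrt64(uint64_t v) {
    uint64_t lo = 0, hi = 0xFFFFFFFFu;
    while (lo < hi) {
        uint64_t mid = lo + (hi - lo + 1) / 2;
        if (mid * mid <= v) lo = mid; else hi = mid - 1;
    }
    return lo;
}

/* Map u in [0, range*(range+1)/2) to the index k with
   k*(k+1)/2 <= u < (k+1)*(k+2)/2, so index k owns k + 1 outcomes. */
static uint32_t linear_index(uint32_t u, uint32_t range) {
    assert(range > 0);
    assert((uint64_t)u < (uint64_t)range * ((uint64_t)range + 1) / 2);
    return (uint32_t)((isqrt64(8 * (uint64_t)u + 1) - 1) / 2);
}
```

In the Objective-C function, u would come from arc4random_uniform(numberOfOptions). Because the question wants the minimum to be the most frequent value, returning range - 1 - linear_index(u, range) would mirror the distribution accordingly.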
What if you try squaring the return value of arc4random_uniform() (or multiplying two of them)?
int rand_nonuniform(int max)
{
int r = arc4random_uniform(max) * arc4random_uniform(max + 1);
return r / max;
}
I've quickly written a sample program for testing it and it looks promising:
int main(int argc, char *argv[])
{
int arr[10] = { 0 };
int i;
for (i = 0; i < 10000; i++) {
arr[rand_nonuniform(10)]++;
}
for (i = 0; i < 10; i++) {
printf("%2d. = %2d\n", i, arr[i]);
}
return 0;
}
Result:
0. = 3656
1. = 1925
2. = 1273
3. = 909
4. = 728
5. = 574
6. = 359
7. = 276
8. = 187
9. = 113

2nd order IIR filter, coefficients for a butterworth bandpass (EQ)?

Important update: I already figured out the answers and put them in this simple open-source library: http://bartolsthoorn.github.com/NVDSP/ Check it out, it will probably save you quite some time if you're having trouble with audio filters in iOS!
I have created a (realtime) audio buffer (float *data) that holds a few sin(theta) waves with different frequencies.
The code below shows how I created my buffer, and I've tried to do a bandpass filter but it just turns the signals to noise/blips:
// Multiple signal generator
__block float *phases = nil;
[audioManager setOutputBlock:^(float *data, UInt32 numFrames, UInt32 numChannels)
{
float samplingRate = audioManager.samplingRate;
NSUInteger activeSignalCount = [tones count];
// Initialize phases
if (phases == nil) {
phases = new float[10];
for(int z = 0; z <= 10; z++) {
phases[z] = 0.0;
}
}
// Multiple signals
NSEnumerator * enumerator = [tones objectEnumerator];
id frequency;
UInt32 c = 0;
while(frequency = [enumerator nextObject])
{
for (int i=0; i < numFrames; ++i)
{
for (int iChannel = 0; iChannel < numChannels; ++iChannel)
{
float theta = phases[c] * M_PI * 2;
if (c == 0) {
data[i*numChannels + iChannel] = sin(theta);
} else {
data[i*numChannels + iChannel] = data[i*numChannels + iChannel] + sin(theta);
}
}
phases[c] += 1.0 / (samplingRate / [frequency floatValue]);
if (phases[c] > 1.0) phases[c] = -1;
}
c++;
}
// Normalize data with active signal count
float signalMulti = 1.0 / (float(activeSignalCount) * (sqrt(2.0)));
vDSP_vsmul(data, 1, &signalMulti, data, 1, numFrames*numChannels);
// Apply master volume
float volume = masterVolumeSlider.value;
vDSP_vsmul(data, 1, &volume, data, 1, numFrames*numChannels);
if (fxSwitch.isOn) {
// H(s) = (s/Q) / (s^2 + s/Q + 1)
// http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt
// BW 2.0 Q 0.667
// http://www.rane.com/note170.html
//The order of the coefficients are, B1, B2, A1, A2, B0.
float Fs = samplingRate;
float omega = 2*M_PI*Fs; // w0 = 2*pi*f0/Fs
float Q = 0.50f;
float alpha = sin(omega)/(2*Q); // sin(w0)/(2*Q)
// Through H
for (int i=0; i < numFrames; ++i)
{
for (int iChannel = 0; iChannel < numChannels; ++iChannel)
{
data[i*numChannels + iChannel] = (data[i*numChannels + iChannel]/Q) / (pow(data[i*numChannels + iChannel],2) + data[i*numChannels + iChannel]/Q + 1);
}
}
float b0 = alpha;
float b1 = 0;
float b2 = -alpha;
float a0 = 1 + alpha;
float a1 = -2*cos(omega);
float a2 = 1 - alpha;
float *coefficients = (float *) calloc(5, sizeof(float));
coefficients[0] = b1;
coefficients[1] = b2;
coefficients[2] = a1;
coefficients[3] = a2;
coefficients[3] = b0;
vDSP_deq22(data, 2, coefficients, data, 2, numFrames);
free(coefficients);
}
// Measure dB
[self measureDB:data:numFrames:numChannels];
}];
My aim is to make a 10-band EQ for this buffer, using vDSP_deq22, the syntax of the method is:
vDSP_deq22(<float *vDSP_A>, <vDSP_Stride vDSP_I>, <float *vDSP_B>, <float *vDSP_C>, <vDSP_Stride vDSP_K>, <vDSP_Length __vDSP_N>)
See: http://developer.apple.com/library/mac/#documentation/Accelerate/Reference/vDSPRef/Reference/reference.html#//apple_ref/doc/c_ref/vDSP_deq22
Arguments:
float *vDSP_A is the input data
float *vDSP_B are 5 filter coefficients
float *vDSP_C is the output data
I have to make 10 filters (10 times vDSP_deq22). Then I set the gain for every band and combine them back together. But what coefficients do I feed every filter? I know vDSP_deq22 is a 2nd order (butterworth) IIR filter, but how do I turn this into a bandpass?
Now I have three questions:
a) Do I have to de-interleave and interleave the audio buffer? I know that setting the stride to 2 filters just one channel, but how do I filter the other? A stride of 1 would process both channels as one.
b) Do I have to transform/process the buffer before it enters the vDSP_deq22 method? If so, do I also have to transform it back to normal?
c) What values of the coefficients should I set to the 10 vDSP_deq22s?
I've been trying for days now but I haven't been able to figure this one out, please help me out!
Your omega value needs to be normalised, i.e. expressed as a fraction of Fs. It looks like you left out the f0 when you calculated omega, which will make alpha wrong too:
float omega = 2*M_PI*Fs; // w0 = 2*pi*f0/Fs
should probably be:
float omega = 2*M_PI*f0/Fs; // w0 = 2*pi*f0/Fs
where f0 is the centre frequency in Hz.
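For reference, the band-pass coefficients from the question's code can be written out with the corrected ω0. The division by a0 in the last line is my own addition and should be checked against the vDSP documentation: vDSP_deq22 takes only five coefficients, so the recursion has to be normalised so that the a0 term equals 1.

```latex
\begin{aligned}
\omega_0 &= \frac{2\pi f_0}{F_s}, \qquad \alpha = \frac{\sin\omega_0}{2Q},\\
b_0 &= \alpha, \quad b_1 = 0, \quad b_2 = -\alpha,\\
a_0 &= 1+\alpha, \quad a_1 = -2\cos\omega_0, \quad a_2 = 1-\alpha,\\
y[n] &= \frac{b_0}{a_0}\,x[n] + \frac{b_1}{a_0}\,x[n-1] + \frac{b_2}{a_0}\,x[n-2]
        - \frac{a_1}{a_0}\,y[n-1] - \frac{a_2}{a_0}\,y[n-2].
\end{aligned}
```

Here x is the input buffer, y the output buffer, and Q is the same quality factor as in the question.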
For your 10 band equaliser you'll need to pick 10 values of f0, spaced logarithmically, e.g. 25 Hz, 50 Hz, 100 Hz, 200 Hz, 400 Hz, 800 Hz, 1.6 kHz, 3.2 kHz, 6.4 kHz, 12.8 kHz.