Find index of an ordered set of N elements - sequence

Problem description:
A set of lists of N integers i1,i2,....,iN with 0<=i1<=i2<=i3<=....<=iN<=M is created by starting with one integer 0<=i1<=M and repeatedly adding one integer that is greater than or equal to the last integer added.
Once the last integer has been added to get the final set of lists, the index runs from 0 to Binomial[M+N,N]-1.
For example, for M=3, i1=0,1,2,3
so the lists are
{0},{1},...,{3}.
Adding another integer i2>=i1 will result in
{0,0},{0,1},{0,2},{0,3},
{1,1},{1,2},{1,3},
{2,2},{2,3}
{3,3}
with indices
0,1,2,3,
4,5,6,
7,8,
9.
This index can be represented in terms of i1,i2,...,iN and M. If the >= conditions were not present, it would simply be i1*(M+1)^(N-1)+i2*(M+1)^(N-2)+...+iN*(M+1)^(N-N). But in the case above, there is a negative shift in the index due to the restrictions. For example, for N=2 the shift is -i1(i1+1)/2 and the index is i = i1*(M+1)^1 + i2*(M+1)^0 - i1(i1+1)/2.
Question:
Does anyone, especially from a mathematics background, know how to write the index for the general N-element case, or just the final expression? Any help would be appreciated!
Thanks!
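One possible way to compute the index for general N, offered as a sketch rather than a closed-form answer: count how many lists come earlier in the ordering shown above. At position k, each admissible value v smaller than i_k contributes Binomial[(M-v)+(N-k), N-k] lists, because the remaining N-k entries can be any non-decreasing choice from v..M. The function name multiset_rank is hypothetical, and the code assumes the lists are ordered exactly as in the M=3 example.

```python
from math import comb

def multiset_rank(seq, M):
    """Index of a non-decreasing list `seq` whose entries lie in 0..M."""
    N = len(seq)
    rank = 0
    prev = 0  # smallest value allowed at the current position
    for k, value in enumerate(seq, start=1):
        remaining = N - k  # entries still to be placed after position k
        # Every smaller value v at this position accounts for a whole block
        # of lists that precede `seq` in the ordering.
        for v in range(prev, value):
            rank += comb((M - v) + remaining, remaining)
        prev = value
    return rank
```

For M=3 and N=2 this reproduces the indices listed in the question; for instance, multiset_rank((1, 2), 3) gives 5.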

Related

Use of bitwise OR operator in calculating a return value

I am using some code that I found at AllenBrowne.com, which works fine, but I have a question about what it's doing.
The code is designed to return information about any index found on a specific column of a table in MS Access. Index types are identified with a constant, and there are four possible index types (including None):
Private Const intcIndexNone As Integer = 0
Private Const intcIndexGeneral As Integer = 1
Private Const intcIndexUnique As Integer = 3
Private Const intcIndexPrimary As Integer = 7
The relevant piece of code is as follows:
Private Function IndexOnField(tdf As DAO.TableDef, fld As DAO.Field) As Integer
    'Purpose: Indicate if there is a single-field index on this field in this table.
    'Return: The constant indicating the strongest type.
    Dim ind As DAO.Index
    Dim intReturn As Integer
    intReturn = intcIndexNone
    For Each ind In tdf.Indexes
        If ind.Fields.Count = 1 Then
            If ind.Fields(0).Name = fld.Name Then
                If ind.Primary Then
                    intReturn = (intReturn Or intcIndexPrimary)
                ElseIf ind.Unique Then
                    intReturn = (intReturn Or intcIndexUnique)
                Else
                    intReturn = (intReturn Or intcIndexGeneral)
                End If
            End If
        End If
    Next
    'Clean up
    Set ind = Nothing
    IndexOnField = intReturn
End Function
To be truthful, I didn't really understand the concept of a bitwise OR operator, so I've spent the last couple of hours researching that, so now I think I do. And along the way, I noticed that the four possible index values equate to a clear binary pattern:
None: 0
General: 1
Unique: 11
Primary: 111
All of which is good. But I don't understand the use in the function of the OR operator, in the lines:
If ind.Primary Then
    intReturn = (intReturn Or intcIndexPrimary)
ElseIf ind.Unique Then
    intReturn = (intReturn Or intcIndexUnique)
Else
    intReturn = (intReturn Or intcIndexGeneral)
End If
Given that the structure of this code means that only one path can ever be returned, why not just return the actual required constant, without the use of OR? I know that Allen Browne's code is always well crafted, so he won't, I assume, have done this without good reason, but I can't see what it is.
Can someone help, so that I can better understand - and write better code myself in future?
Thanks
What basodre pointed out about the bitwise OR is correct, but not the 2, 4, 8 basis.
When dealing with an index, ALL of the possibilities are possible, hence the 1, 3, 7 (right-most 3 bits).
0000 = No index
0001 = regular index
0011 = unique index
0111 = PRIMARY index
So, the IF block is testing with the HIGHEST QUALIFIER of the type of index.
Any index can be regular, no problem.
Some indexes can be unique, and they could cover a combination of fields that must be unique together but has NOTHING to do with the primary key of the table.
Last IS the primary key of the table, which is ALSO UNIQUE.
So, if the index you are testing against IS the primary, it would also show as true if you asked if it was an index or even if it was a unique index.
So, what it is doing is starting with
intReturn = intcIndexNone
which in essence sets the return value to a default of 0. Then it cycles through all indexes in the table and looks for the ones on the given field (the function as written only counts single-field indexes). A table could have 20 indexes on it, and 5 of them could use the field in question. That one field could be used in a regular, unique or primary key index.
So the loop starts with NONE (0). Then, each time the field is found to be associated with an index, it ORs the type of that index into the result.
So let's say that, as it goes through the indexes, a given field shows up as Unique first, then regular, then Primary, just for grins, to see the result of the OR on each cycle.
    intReturn  0000   (default)
OR  Unique     0011
               ====
               0011   NEW value moving forward

    intReturn  0011
OR  Regular    0001
               ====
               0011   Since unique was the higher classification, no change

    intReturn  0011
OR  Primary    0111
               ====
               0111   Just upgraded to the highest classification index
So now, it's returning the OR'd result of whatever the previous value was. In this case, the highest index association is the "Primary" index.
Does that clarify it for you?
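The same accumulation can be sketched in Python (the original code is VBA). Because the constants 0, 1, 3, 7 are nested bit masks, OR-ing them in any order leaves the strongest one standing. The names below are hypothetical stand-ins for the intcIndex constants.

```python
from functools import reduce
from operator import or_

# Hypothetical Python stand-ins for the VBA constants in the question.
INDEX_NONE, INDEX_GENERAL, INDEX_UNIQUE, INDEX_PRIMARY = 0, 1, 3, 7

def strongest_index(kinds):
    """OR together every index kind seen, starting from INDEX_NONE."""
    return reduce(or_, kinds, INDEX_NONE)
```

With the order used in the walkthrough, strongest_index([INDEX_UNIQUE, INDEX_GENERAL, INDEX_PRIMARY]) returns 7, and for these particular constants the result always equals the largest value seen.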
The bitwise OR is useful in cases where combinations of values can exist, and you'd want to return an additive value. In this specific code block, the code is looping through each of the indexes and setting the flag based on the specific index. If there are two indexes, and one of them is general and the other is primary, you can encode this information in the resulting bit pattern.
I'm confused by the choice of bitmaps, though. By choosing values with all of the bits set to true, you'd lose information about individual items (maybe that's a design element).
Generally, bitmaps might look something like:
Option A = 2 --> 0010
Option B = 4 --> 0100
Option C = 8 --> 1000
If you want both Option A and Option B to be true, the BIT OR would return 6, which is 0110.
Now, if you need to test whether Option A is true, you use the BIT AND operation. If you test (6 BIT AND 2), it will return a value greater than 0. However, if you test (8 BIT AND 6), where 8 is the value for Option C, it will return 0.
Hopefully that adds some clarity. I don't have much information about how Access specifically works with indexes, so I'm just speaking to the general case.
EDIT: So I re-read the function definition, and it seems like the choice of integers is intentional. The function intentionally returns the strongest type of index. So, if there is a primary index, it will only show a primary. Considering this, I'm not sure that the bitwise or is the most self-descriptive option here. Maybe there is another consideration at play.
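For comparison, here is a small Python sketch of the general flag pattern described in this answer, where each option owns its own bit. The names OPTION_A, OPTION_B, OPTION_C and has_option are illustrative only.

```python
# Each option uses a distinct bit, so combinations keep their parts.
OPTION_A = 0b0010  # 2
OPTION_B = 0b0100  # 4
OPTION_C = 0b1000  # 8

def has_option(flags, option):
    """True when every bit of `option` is set in `flags`."""
    return (flags & option) == option

combined = OPTION_A | OPTION_B  # 0b0110, i.e. 6
```

Here has_option(combined, OPTION_A) is true and has_option(combined, OPTION_C) is false. With the nested values 1, 3, 7 from the question, a combined result cannot be split back into its parts this way, which matches the function's stated purpose of returning only the strongest type.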

How to find the set difference of two sorted arrays in numpy?

I would like to use an array rows for indexing rows of another array x. Initially, rows contains the indices of all rows of x (and is therefore sorted). Throughout the program, some indices, exclude, are chosen to be removed from rows. Like rows itself, exclude is a sorted array.
What is the best way of finding the set difference of rows and exclude?
I have thought of a few different options, but I think their complexities are more than O(n + m), where n is the length of rows and m is the length of exclude.
1. new_rows = [r for r in rows if r not in exclude]
   This solution requires a lookup in exclude for every element and is therefore O(mn).
2. new_rows = setdiff1d(rows, exclude, assume_unique=True)
   This will probably take O(n log m), but I'm not sure.
3. Convert exclude to a dict and run option 1. The problem with this approach is that it requires extra memory, but it meets the complexity requirement.
Here are outlines of two O(n+m) options:
1) heapq.merge will combine two sorted sequences in linear time. As the combined sequence is sorted, shared indices will sit next to each other.
2) Since rows, as you describe it, is a "thinned-out range", I assume that the max value of rows is not excessively large. You can therefore allocate an array E of that size (max value plus one; O(1) if we don't initialize it, i.e. use np.empty). Then you use rows and exclude to index into the empty array. For example, you write E[rows] = 1 and then E[exclude] = 0, check back E[rows], and remove all elements of rows at which E has changed from 1 to 0.
Option 2 also works if the two sets are not sorted.
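A minimal sketch of option 1, assuming both inputs are sorted, contain no duplicates, and every element of exclude also occurs in rows. Under those assumptions a value that appears exactly once in the merged stream belongs to the result. The function name sorted_setdiff is hypothetical.

```python
from heapq import merge
from itertools import groupby

def sorted_setdiff(rows, exclude):
    """Elements of `rows` not in `exclude`, in one linear pass.

    Assumes both inputs are sorted and duplicate-free, and that
    `exclude` is a subset of `rows`.
    """
    merged = merge(rows, exclude)  # linear-time merge of sorted inputs
    # Shared values sit next to each other, so they form groups of size 2.
    return [value for value, group in groupby(merged) if len(list(group)) == 1]
```

The result is a plain list; it can be turned back into an array with np.asarray if needed.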

How to make a biased random number generator in VB.NET?

How do I make a biased random number generator (RNG) in VB.NET?
I know I could make it by fiddling with the output of the Randomize()/Rnd methods, but is there a built-in way of doing this?
I want the biased RNG to give me either a 2 or 4 (though using 1 or 2 as a substitute is also OK by me), with 2 occurring on average 90% of the time and 4 occurring on average 10% of the time.
Create a random number generator that returns values from 1 to 10. If the value is between 1 and 9, return a 2; if the value is 10, return a 4.
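The idea is language-independent; here is a sketch in Python rather than VB.NET. The function name biased_pick and the optional rng parameter are assumptions made for illustration.

```python
import random

def biased_pick(rng=random):
    """Return 2 about 90% of the time and 4 about 10% of the time."""
    roll = rng.randint(1, 10)  # uniform integer in 1..10, both ends included
    return 2 if roll <= 9 else 4
```

Passing a seeded random.Random instance as rng makes the sequence reproducible, which helps when checking the 90/10 split.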
You might want to look at this
http://msdn.microsoft.com/en-us/library/vstudio/ctssatww(v=vs.100).aspx?cs-save-lang=1&cs-lang=vb#code-snippet-2
If you want to use a mask (a lookup array) to generate your values, here is what I think you can do:
Dim numbers() As Integer = {2, 2, 2, 2, 4, 2, 2, 2, 2, 2} ' one 4 in ten slots: 10% for 4, 90% for 2
Dim r As New Random()
Return numbers(r.Next(0, 10)) ' Next(0, 10) yields an index from 0 to 9

Speed Enhancements for a Sorted Vector in MATLAB

What is the fastest way to lookup the index of a value in sorted vector in MATLAB?
That is, is there a fast find(vector == myNumber, 1, 'first') for when vector is sorted?
I have a large matrix (200,000 x 4) of locations, each with a unique integer ID recorded in the first column. I want to find the location of a known ID, but thousands of searches take a while.
If you use ismembc2, the loc output should give you what you need. See this for more details:
http://www.mathworks.com/support/solutions/en/data/1-9NIE1N/index.html?product=ML&solution=1-9NIE1N
There are a number of submissions for this on FEX: http://www.mathworks.com/matlabcentral/fileexchange/?term=binary+search+vector
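The search term in that link is binary search, which takes O(log n) per lookup on a sorted vector instead of the O(n) scan that find performs. A sketch of the idea in Python (not MATLAB), using the standard bisect module; find_sorted is a hypothetical name.

```python
from bisect import bisect_left

def find_sorted(ids, target):
    """0-based position of `target` in the sorted list `ids`, or -1 if absent."""
    pos = bisect_left(ids, target)  # first position where target could be inserted
    if pos < len(ids) and ids[pos] == target:
        return pos
    return -1
```

Note that Python positions are 0-based, whereas MATLAB indices start at 1.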
I do not know if it is faster but you may want to try
result=vector(vector(:,1)==myNumber,:)
result will contain the 4-element row for which the first column of vector equals myNumber.

Calculate Rows per Column for CSS in VB.NET?

I am trying to figure out a calculation I can perform in C# to determine the rows per column. Let's say I know I am going to have 3 columns and my record count is 46. I know that I can mod the results to get a remainder, but I would like something more efficient than what I have tried. So I know I will have 16 rows per column with a remainder of 14 for the last column, but what is the best way to loop through the results and keep counts?
Integer division will give you the number of complete rows (46 \ 3 = 15). You then check the modulus to see if you have any leftover (46 Mod 3 = 1; yep, you have one record left over to put in a final extra row).
To loop through, just check the modulus of the current record index (zero-based) with your column count. That modulus is the (zero-based) column index. If it equals 0, you start a new row.
But from your question, it sounds like you already got this far. So am I misunderstanding the question?
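The arithmetic in this answer can be sketched as follows, in Python rather than VB.NET or C#. The function names are hypothetical, and the layout assumed is the answer's: records fill each row left to right.

```python
def layout_counts(record_count, columns):
    """Return (complete_rows, leftover_records) for a row-by-row layout."""
    return divmod(record_count, columns)

def grid_position(index, columns):
    """Return (row, column), both zero-based, for a zero-based record index."""
    return divmod(index, columns)
```

For 46 records in 3 columns, layout_counts(46, 3) gives (15, 1): fifteen complete rows plus one record in a final partial row. A column value of 0 from grid_position marks the start of a new row.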