db2 rowset cursor parametrization - sql

Would there be a way to program a parameter that is not hard coded to this?
In place of :SomeValue host variable in this question/snippet:
EXEC SQL
FETCH NEXT ROWSET FROM C_NJSRD2_cursor_declared_and_opened
FOR :SomeValue ROWS
INTO
:NJCT0022.SL_ISO2 :NJMT0022.iSL_ISO2
etc....
Here is some clarification:
Parametrization of the request like posted in opening question actually works in case I set the host variable :SomeValue to 1 and define host variable arrays for filling from database to size 1 like
struct
??<
char SL_ISO2 ??(1??) ??(3??); // sorry for Z/os trigraphs
etc..
And it also works if I set the host variable arrays to a larger defined integer value (i.e. 20) and hard code the value (:SomeValue) to that value in cursor rowset fetch.
EXEC SQL
FETCH NEXT ROWSET FROM C_NJSRD2
FOR 20 ROWS
INTO
:NJCT0022.SL_ISO2 :NJMT0022.iSL_ISO2
,:NJCT0022.BZ_COUNTRY :NJMT0022.iBZ_COUNTRY
,:NJCT0022.KZ_RISK :NJMT0022.iKZ_RISK
I wish to receive the number of rows from the calling program (COBOL) and ideally set the size of the host variable arrays accordingly. To avoid the variable array sizing problem, oversizing the host variable arrays to a larger value would also be acceptable.
Those combinations return compile errors:
HOST VARIABLE ARRAY "NJCT0022" IS EITHER NOT DEFINED OR IS NOT USABLE

And in good tradition here is an answer to my own question.
The good people of SO will upvote me for sure now.
Rowsets are very fast. Besides turning host variables into host variable arrays (or, for chars, arrays of char arrays), these cursors just require adjusting the program functions that save values and set null values so that they work in loops. They are used like this:
FETCH NEXT ROWSET FROM C_NJSRD2
FOR 19 ROWS
Rowset cursors cannot change the host array (that is, the array of arrays) size dynamically.
Unlike scroll cursors, they cannot jump to a position or go backwards. They can, however, go forward by just a single row instead of by the whole preset rowset number of rows.
FETCH NEXT ROWSET FROM C_NJSRD2 FOR 1 ROWS
INTO
So to answer my question: to make the algorithm accept any number of rows requested for fetches, it is basically just a question of segmenting the request into rowset fetches and, at the end, single-row fetches until the requested number is met. To calculate the loop counters for rowsets and single-row fetches:
if((iRowCount>iRowsetPreset)&&
((iRowCount%iRowsetPreset)!=0))
??<
iOneLinersCount = iRowCount % iRowsetPreset;
iRowsetsCount = (iRowCount - iOneLinersCount)
/ iRowsetPreset;
??>
if ((iRowCount==iRowsetPreset) _OR_
((iRowCount%iRowsetPreset)==0))
??<
iOneLinersCount = 0;
iRowsetsCount = iRowCount / iRowsetPreset;
??>
if (iRowCount<iRowsetPreset)
??<
iOneLinersCount = iRowCount;
iRowsetsCount = 0;
??>
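The three cases above amount to one integer division with remainder. A minimal sketch in Python, assuming a non-negative row count and a positive rowset preset; `split_request` is a hypothetical name, and its arguments stand in for iRowCount and iRowsetPreset:

```python
def split_request(row_count, rowset_preset):
    """Return (rowsets_count, one_liners_count) for a fetch request.

    Assumption: row_count >= 0 and rowset_preset > 0.
    """
    if row_count < 0 or rowset_preset <= 0:
        raise ValueError("row_count must be >= 0 and rowset_preset > 0")
    # Quotient = number of full rowset fetches,
    # remainder = number of single-row fetches still needed.
    rowsets_count, one_liners_count = divmod(row_count, rowset_preset)
    return rowsets_count, one_liners_count


print(split_request(45, 20))  # (2, 5)
```

For a request of 45 rows with a preset of 20, this gives 2 rowset fetches followed by 5 single-row fetches.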

Displaying a pymongo cursor without loading in a list

I am working on a PyQT5 app. I want to display a large dataset extracted from a MongoDB database.
To do so, I extract my collection in 3 cursors (I need to sort the display). However, until now, I was casting the cursors in a list, then emitting it.
But now my database grew significantly in size, and runtime became a major issue. Going through lists is time-consuming. Therefore, I am trying to find a way to directly access the cursor in its entirety without looping through all of it (I know it sounds a bit tricky since the cursor is just a reference to the collection)
An example of what is done now :
D1 = list(self.collection.find({"$and":[ {"location": current_location}, {"vehicle":"Car" }]}))
D2 = list(self.collection.find({"$and":[ {"location": current_location}, {"vehicle":{"$nin":["Car","Truck"]}}]}))
D3 = list(self.collection.find({"$and":[ {"location": current_location}, {"vehicle" : "Truck"}]}).sort([("_id",-1)]).limit(60 - len(D1)))
extracted = D1 + D2 + D3
l_data_extracted.emit(extracted) # send the loaded data to the front
Loading 3 cursors is time-consuming, and then merging them into 1 list makes it even heavier for the app.
I looked for resources about cursor management, but every time, I see answers which involve looping or casting (which I already do).
Would there be a way to directly emit the cursor and a bit like in C, pass the argument by reference to get the pointed data ? Or am I bound to loop/cast in the list due to its particular nature?

Load in many values at once using a query

I am using a program coded in Delphi 7 (sadly I cannot use a newer version for this program) which considers individual persons. For each person, I need to load in a bunch of values (0 to 90 at most, with the exact number depending on the person; non-fixed) which are used later in the code. After trying out a number of things, including loading in via Excel (which was horribly slow) someone suggested loading in the data through Access. I managed to get the following code so far:
MainConnection : TADOConnection;
Table : TADOTable;
StrConnection : String;
//I first open a connection to load the values in from
MainConnection:=TADOConnection.Create(nil);
StrConnection:='Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\test.mdb;Mode=Read;Persist Security Info=False;';
MainConnection.LoginPrompt:=False;
MainConnection.ConnectionString:= StrConnection;
MainConnection.Connected:=True;
Table:=TADOTable.Create(nil);
Table.Connection :=MainConnection;
Table.TableName := 'Sheet1';
Table.Open;
// I get the first three values which I absolutely require
Firstvalue := Table.Fields[0].value;
Secondvalue := Table.Fields[1].value;
Thirdvalue := Table.Fields[2].value;
//Whether I need additional values depends on the first and second values; if the first is a specific value
// I do not need any of the other ones
nrofvaluestoget := round((Secondvalue-Firstvalue));
if (Firstvalue = 100) then nrofvaluestoget:= 0;
if (nrofvaluestoget>0) then begin
  for k := 0 to (nrofvaluestoget) do begin
    Valueholder[k] := Table.Fields[5+k].value; // values for the valueholder
  end;
end;
Table.Next; //Go to next person
This links the Access database and technically does what I want. However, while it is quicker than loading in an Excel file, it's still quite slow due to the "nrofvaluestoget" loop. Skipping that and loading in all values for a person at once would speed up the process quite a bit.
As far as I'm aware this may be possible using a SQL query; something akin to: 'SELECT * FROM Sheet1'. However, I am not familiar with SQL, let alone linking it through Delphi 7. Is it even possible to get all the values at once and assign them immediately to the "Valueholder" with Delphi 7? Or at the very least, is there some way to speed up the code above that I'm not aware of? Any help would be much appreciated.
EDIT:
Per Juan's suggestion I added some additional descriptions with regards to the database.
I posted a picture as an example of the database as I was unable to embed one or create a decent looking table.
Let's say I have three persons. Person 1 would have 15 as a first age, and 16 as the second age.
In the current loop, Valueholder would have the value 2 at index 0, and value 0 at index 1. Person 1 has no further ages with values, so these are not considered in the loop.
When the next person is evaluated, all indices of Valueholder are set to their base value (blank).
Person 2 has 18 as the first age and 20 as the second. Valueholder then gets 3 values, namely: the value 8 at index 0, the value 4 at index 1 and the value 2 at index 2.
For the last person, all indices of Valueholder are again reset to their base value.
Person 3 has 100 as a first age; this is an indication that this person has no values which need to be loaded, so Valueholder is blank
I hope this clarifies the question a bit.
(if this is a one off import)
I would recommend exporting the data to a csv file and using TFileStream to read it in your Delphi program. This will be faster than having to connect to Access or SQL server or any database.

How to add record cell value to array variable (IE sum values in array)

I have a function returning a setof records. This can be seen in this picture.
I have a range of boards of length 2.8m thru to 4.9m (ln28 thru ln49 respectively) they have characteristics that set bits as seen in bincodes (9,2049,4097 etc.) For each given board length, I need to sum the number of boards for each bincode. EG in this case ln28 (bincode 4097) would = 3+17+14 = 34. Where you see brdsource = 128 series is where I intend to store these values, so for row brdsource 128, bincodes 4097, I want to store 34 in ln28.
You will see that I have 0's in ln28 values for all brdsource = 128. I have generated extra records as part of my setof records, and am trying to use a multidimensional array to add the values and keep track of them as seen above with array - Summary[boardlength 0-8][bincode 0-4].
Question 1 - I see that if I add 1 (for argument sake, it can be any number) to an array location, it returns a null value (no error, just nothing in table cell). However if I first set the array location to 0, then add 1, it works perfectly. How can an array defined as type integer hold a null value?
Question 2 - How do I add my respective record (call it rc) board length count to the array. IE I want to do something like this
if (rc.bincode = 4097) then Summary[0][2] := Summary[0][2] + rc.ln28;
and then later, on, when injecting this into my table (during brdsource = 128 phase) :
if (rc.bincode = 4097) then rc.ln28 := Summary[0][2];
Of course I may be going about this in a completely unorthodox way (though to me SQL is just plain unorthodox, sigh). I have made attempts to sum all previous records based on the required conditions (e.g. using a case (when...end) statement), but I proved what I already suspected, i.e. that each returned record is simply a single row of data. There is just no means of accessing data in the previous record lines as returned by the function's FOR LOOP...END LOOP.
A final note is that everything discussed here is occurring inside the function. I am not attempting to add records etc. to data returned by the function.
I am using PostgreSQL 9.2.9, compiled by Visual C++ build 1600, 64-bit. And yes I am aware this is an older version.

How do I find the middle element of an ArrayList?

How do I find the middle element of an ArrayList? What if the size is even or odd?
It turns out that a proper ArrayList object (in Java) maintains its size as a property of the object, so a call to arrayList.size() just accesses an internal integer. Easy.
/**
* Returns the number of elements in this list.
*
* @return the number of elements in this list
*/
public int size() {
return size;
}
It is both the shortest (in terms of characters) and fastest (in terms of execution speed) method available.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/ArrayList.java#ArrayList.0size
So, presuming you want the "middle" element (i.e. item 3 in a list of 5 items, which is index 2 because indexes start at 0 -- 2 items on either side), it'd be this:
Object item = arrayList.get(arrayList.size() / 2);
Now, it gets a little trickier if you are thinking about an even sized array, because an exact middle doesn't exist. In an array of 4 elements, you have one item on one side, and two on the other.
If you accept that the "middle" will be biased toward the end of the array, the above logic also works. Otherwise, you'll have to detect when the number of elements is even and behave accordingly. Wind up your propeller beanie, friends...
Object item = arrayList.get((arrayList.size() / 2) - 1 + (arrayList.size() % 2));
If the ArrayList size is odd: list.get(list.size() / 2);
If the ArrayList size is even: list.get((list.size() / 2) - 1);
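The two conventions for an even-sized list can be captured in one small helper. A sketch in Python, where a plain list stands in for the ArrayList and `middle_index` (with its `prefer_upper` flag) is a hypothetical name introduced only for illustration:

```python
def middle_index(size, prefer_upper=False):
    """Index of the middle element of a list with `size` elements.

    For odd sizes there is one exact middle. For even sizes the result is
    the lower of the two central indexes unless prefer_upper is True.
    """
    if size <= 0:
        raise ValueError("the list must not be empty")
    return size // 2 if prefer_upper else (size - 1) // 2


items = ["a", "b", "c", "d", "e"]
print(items[middle_index(len(items))])  # c
```

With four elements, `middle_index(4)` gives index 1 and `middle_index(4, prefer_upper=True)` gives index 2, which matches the "biased toward the end" case discussed above.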
If you are not allowed to use the arraylist.size() / arraylist.length() method, you can use two iterators. One of them iterates from the beginning to the end of the array, the other iterates from the end to the beginning. When they reach the same index in the ArrayList, you have found the middle element.
Some additional controls might be necessary to ensure the iterators wait for each other before the next iteration, so that you do not miss the meeting point, etc.
While iterating, keep the total number of elements each iterator has read. Each iterator should advance one element per cycle. By cycle, I mean a process including these operations:
iteratorA reads one element from the beginning
iteratorB reads one element from the end
The iterators might need to read more than one index to read an element. In other words you should skip one element in one cycle, not one index.

Optimal Solution: Get a random sample of items from a data set

So I recently had this as an interview question and I was wondering what the optimal solution would be. Code is in Objective-C.
Say we have a very large data set, and we want to get a random sample
of items from it for testing a new tool. Rather than worry about the
specifics of accessing things, let's assume the system provides these
things:
// Return a random number from the set 0, 1, 2, ..., n-2, n-1.
int Rand(int n);
// Interface to implementations other people write.
@interface Dataset : NSObject
// YES when there is no more data.
- (BOOL)endOfData;
// Get the next element and move forward.
- (NSString*)getNext;
@end
// This function reads elements from |input| until the end, and
// returns an array of |k| randomly-selected elements.
- (NSArray*)getSamples:(unsigned)k from:(Dataset*)input
{
// Describe how this works.
}
Edit: So you are supposed to randomly select items from a given array. So if k = 5, then I would want to randomly select 5 elements from the dataset and return an array of those items. Each element in the dataset has to have an equal chance of getting selected.
This seems like a good time to use Reservoir Sampling. The following is an Objective-C adaptation for this use case:
NSMutableArray* result = [[NSMutableArray alloc] initWithCapacity:k];
int i,j;
for (i = 0; i < k; i++) {
[result setObject:[input getNext] atIndexedSubscript:i];
}
for (i = k; ![input endOfData]; i++) {
j = Rand(i + 1); // uniform over 0..i, so element i is kept with probability k/(i+1)
NSString* next = [input getNext];
if (j < k) {
[result setObject:next atIndexedSubscript:j];
}
}
return result;
The code above is not the most efficient reservoir sampling algorithm because it generates a random number for every entry of the data set past index k. Slightly more complex algorithms exist under the general category "reservoir sampling". An interesting read is the article on an algorithm named "Algorithm Z". I would be curious if people find newer literature on reservoir sampling, too, because this article was published in 1985.
Interesting question, but as there is no count or similar method in Dataset and you are not allowed to iterate more than once, I can only come up with this solution to get good random samples (no k > data size handling):
- (NSArray *)getSamples:(unsigned)k from:(Dataset*)input {
NSMutableArray *source = [[NSMutableArray alloc] init];
while(![input endOfData]) {
[source addObject:[input getNext]];
}
NSMutableArray *ret = [[NSMutableArray alloc] initWithCapacity:k];
int count = [source count];
while ([ret count] < k) {
int index = Rand(count);
[ret addObject:[source objectAtIndex:index]];
[source removeObjectAtIndex:index];
count--;
}
return ret;
}
This is not the answer I did in the interview but here is what I wish I had done:
Store pointer to first element in dataset
Loop over dataset to get count
Reset dataset to point at first element
Create NSMutableDictionary for storing random indexes
Do for loop from i=0 to i=k. Each iteration, generate a random value, check if value exists in dictionary. If it does, keep generating a random value until you get a fresh value.
Loop over dataset. If the current index is within the dictionary, add the element to the array of random subset values.
Return array of random subsets.
There are multiple ways to do this, the first way:
1. use input parameter k to dynamically allocate an array of numbers
unsigned * numsArray = (unsigned *)malloc(sizeof(unsigned) * k);
2. run a loop that gets k random numbers and stores them into the numsArray (must be careful here to check each new random to see if we have gotten it before, and if we have, get another random, etc...)
3. sort numsArray
4. run a loop beginning at the beginning of DataSet with your own incrementing counter dataCount and another counter numsCount both beginning at 0. whenever dataCount is equal to numsArray[numsCount], grab the current data object and add it to your newly created random list then increment numsCount.
5. The loop in step 4 can end when either numsCount reaches k or when dataCount reaches the end of the dataset.
6. The only other step that may need to be added here is before any of this to use the next command of the object type to count how large the dataset is to be able to bound your random numbers and check to make sure k is less than or equal to that.
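Steps 1 to 5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the dataset size `n` is already known (step 6), `sample_by_sorted_indexes` is a hypothetical name, and `random.sample` is used to draw distinct indexes in place of the manual "redraw on duplicate" check from step 2.

```python
import random


def sample_by_sorted_indexes(items, n, k, rng=random):
    """Pick k of the n items in one forward pass over `items`."""
    if k > n:
        raise ValueError("k must not exceed the dataset size")
    # Steps 1-3: draw k distinct indexes in 0..n-1 and sort them.
    chosen = sorted(rng.sample(range(n), k))
    result = []
    nums_count = 0
    # Step 4: walk the data once and grab the elements at the drawn positions.
    for data_count, item in enumerate(items):
        if nums_count == k:  # Step 5: every drawn index has been served.
            break
        if data_count == chosen[nums_count]:
            result.append(item)
            nums_count += 1
    return result


print(sample_by_sorted_indexes(iter("abcdefghij"), 10, 3))
```

Because the indexes are sorted, the selected elements come back in dataset order; shuffle the result afterwards if the order should be random as well.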
The 2nd way to do this would be to run through the actual list MULTIPLE times.
// one must assume that once we get to the end, we can start over within the set again
1. run a while loop that checks for endOfData
a. count up a count variable that is initialized to 0
2. run a loop from 0 through k-1
a. generate a random number that you constrain to the list size
b. run a loop that moves through the dataset until it hits the rand element
c. compare that element with all other elements in your new list to make sure it isn't already in your new list
d. store the element into your new list
There may be ways to speed up the 2nd method by storing the current list location. That way, if you generate a random that is past the current pointer, you don't have to move through the whole list again to get back to element 0 and then to the element you wish to retrieve.
A potential 3rd way to do this might be to:
1. run a loop from 0 through k-1
a. generate a random
b. use the generated random as a skip count, move skip count objects through the list
c. store the current item from the list into your new list
The problem with this 3rd method is that, without knowing how big the list is, you don't know how to constrain the random skip count. Further, even if you did, chances are that it wouldn't truly look like a randomly grabbed subset that could easily reach the last element in the list, as it would become statistically unlikely that you would ever reach the end element (i.e. not every element is given an equal shot at being selected).
Arguably the FASTEST way to do this is method 1, where you generate the random numbers first, then traverse the list only once (yes, it's actually twice: once to get the size of the dataset list, then again to grab the random elements).
We need a little probability theory. Like others, I will ignore the case n < k. The probability that the n'th item will be selected into the set of size k is just C(n-1, k-1) / C(n, k), where C is the binomial coefficient. A bit of math shows that this is just k/n. For the rest, note that the selection of the n'th item is independent of all other selections. In other words, "the past doesn't matter."
So an algorithm is:
S = set of up to k elements
n = 0
while not end of input
v = next value
n = n + 1
if |S| < k add v to S
else if random(0,1) < k/n replace a randomly chosen element of S with v
I will let the coders code this one! It's pretty trivial. All you need is an array of size k and one pass over the data.
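One way the pseudocode might be coded, as a sketch in Python: the n'th value replaces a randomly chosen reservoir entry with probability k/n, matching the k/n selection probability derived above. The name `reservoir_sample` and the `rng` parameter are assumptions made for illustration.

```python
import random


def reservoir_sample(stream, k, rng=random):
    """Return up to k items chosen uniformly from `stream` in a single pass."""
    sample = []
    for n, value in enumerate(stream, start=1):
        if len(sample) < k:
            # The first k values fill the reservoir directly.
            sample.append(value)
        elif rng.random() < k / n:
            # With probability k/n, the n'th value evicts a random entry.
            sample[rng.randrange(k)] = value
    return sample


print(reservoir_sample(range(1000), 5))
```

If the stream holds fewer than k items, the sketch simply returns all of them.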
If you care about efficiency (as your tags suggest) and the number of items in the population is known, don't use reservoir sampling. That would require you to loop through the entire data set and generate a random number for each item.
Instead, pick five values in the range 0 to n-1. In the unlikely case that there is a duplicate among the five indexes, replace the duplicate with another random value. Then use the five indexes to do a random-access lookup of the i-th element in the population.
This is simple. It uses a minimum number of calls to the random number generator. And it accesses memory only for the relevant selections.
If you don't know the number of data elements in advance, you can loop-over the data once to get the population size and proceed as above.
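A minimal sketch of the index-picking approach in Python, assuming the population supports random access by index (a list here); `sample_known_population` is a hypothetical name. Duplicate draws are absorbed by the set, so the loop simply draws again until k distinct indexes exist.

```python
import random


def sample_known_population(population, k, rng=random):
    """Pick k distinct elements by drawing random indexes into `population`."""
    n = len(population)
    if k > n:
        raise ValueError("k must not exceed the population size")
    indexes = set()
    while len(indexes) < k:
        # A repeated index leaves the set unchanged, so another draw follows.
        indexes.add(rng.randrange(n))
    return [population[i] for i in indexes]


print(sample_known_population(list("abcdefghijklmnopqrstuvwxyz"), 5))
```

When k is small relative to n, duplicates are rare and the number of random draws stays close to k.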
If you aren't allowed to iterate over the data more than once, use a chunked form of reservoir sampling: 1) Choose the first five elements as the initial sample, each having a probability of 1/5th. 2) Read in a large chunk of data and choose five new samples from the new set (using only five calls to Rand). 3) Pairwise, decide whether to keep the new sample item or the old sample element (with odds proportional to the probabilities for each of the two sample groups). 4) Repeat until all the data has been read.
For example, assume there are 1000 data elements (but we don't know this in advance).
Choose the first five as the initial sample: current_sample = read(5); population=5.
Read a chunk of n datapoints (perhaps n=200 in this example):
subpop = read(200);
m = len(subpop);
new_sample = choose(5, subpop);
loop-over the two samples pairwise:
for (a, b) in (current_sample and new_sample): if random(0 to population + m) < population, then keep a, otherwise keep b
population += m
repeat