Constructing a recursive compare with SQL - sql

This is an ugly one. I wish I wasn't having to ask this question, but the project is already built such that we are handling heavy loads of validations in the database. Essentially, I'm trying to build a function that will take two stacks of data, weave them together with an unknown batch of operations or comparators, and produce a long string.
Yes, that was phrased very poorly, so I'm going to give an example. I have a form that can have multiple iterations of itself. For some reason, the system wants to know if the entered start date on any of these forms is equal to the entered end date on any of these forms. Unfortunately, due to the way the system is designed, everything is stored as a string, so I have to format it as a date first, before I can compare. Below is pseudo code, so please don't correct me on my syntax
Input data:
'logFormValidation("to_date(#) == to_date(^)"
, formname.control1name, formname.control2name)'
Now, as I mentioned, there are multiple iterations of this form, and I need to loop build a fully recursive comparison (note: it may not always be typical boolean comparisons, it could be internally called functions as well, so .In or anything like that won't work.) In the end, I need to get it into a format like below so the validation parser can read it.
OR(to_date(formname.control1name.1) == to_date(formname.control2name.1)
,to_date(formname.control1name.2) == to_date(formname.control2name.1)
,to_date(formname.control1name.3) == to_date(formname.control2name.1)
,to_date(formname.control1name.1) == to_date(formname.control2name.2)
:
:
,to_date(formname.control1name.n) == to_date(formname.control2name.n))
Yeah, it's ugly...but given the way our validation parser works, I don't have much of a choice. Any input on how this might be accomplished? I'm hoping for something more efficient than a double recursive loop, but don't have any ideas beyond that
Okay, seeing as my question is apparently terribly unclear, I'm going to add some more info. I don't know what comparison I will be performing on the items, I'm just trying to reformat the data into something useable for ANY given function. If I were to do this outside the database, it'd look something like this. Note: Pseudocode. '#' is the place marker in a function for vals1, '^' is a place marker for vals2.
function dynamicRecursiveValidation(string functionStr, strArray vals1, strArray vals2){
string finalFunction = "OR("
foreach(i in vals1){
foreach(j in vals2){
finalFunction += functionStr.replace('#', i).replace('^', j) + ",";
}
}
finalFunction.substring(0, finalFunction.length - 1); //to remove last comma
finalFunction += ")";
return finalFunction;
}
That is all I'm trying to accomplish. Take any given comparator and two arrays, and create a string that contains every possible combination. Given the substitution characters I listed above, below is a list of possible added operations
# > ^
to_date(#) == to_date(^)
someFunction(#, ^)
# * 2 - 3 <= ^ / 4
All I'm trying to do is produce the string that I will later execute, and I'm trying to do it without having to kill the server in a recursive loop

I don't have a solution code for this but you can algorithmically do the following
Create a temp table (start_date, end_date, formid) and populate it with every date from any existing form
Get the start_date from the form and simply:
SELECT end_date, form_id FROM temp_table WHERE end_date = <start date to check>
For the reverse
SELECT start_date, form_id FROM temp_table WHERE start_date = <end date to check>
If the database is available why not let it do all the heavy lifting.

I ended up performing a cross product of the data, and looping through the results. It wasn't the sort of solution I really wanted, but it worked.

Related

Referencing nested arrays in awk

I'm creating a bunch of mappings that can be indexed into using 3 keys such as below:
mappings["foo"]["bar"]["blah"][1]=0
split( "10,13,19,49", mappings["foo"]["bar"]["blah"] )
I can then index into the nested array using for example
mappings[product][format][version][i]
But this is a bit long-winded when I need to refer to the same nested array several times, so in other languages I'd create a reference to the inner array:
map=mappings[product][format][version]
map[i]
However, I can't seem to get this to work in awk (gawk 4.1.3).
I can only find one link over google, that suggests this is impossible in previous versions of awk, and a loop setting the keys and values one-by-one is the only solution. Is this still the case or does anyone have a suggestions for a better solution?
https://developer.apple.com/library/archive/documentation/OpenSource/Conceptual/ShellScripting/Howawk-ward/Howawk-ward.html
EDIT
In response to comments a bit more background on what I'm trying to do. If there is a better approach, I'm all for using it!
I have set of CSV files that I'm feeding into AWK. The idea is to calculate a checksum based on specific columns after applying filtering to the rows.
The columns to checksum on, and the filtering to apply, are derivived from runtime parameters sent into the script.
The runtime parameters are a triple of (product,format,version), hence my use of a 3-nested assoicative array.
Another approach would be to use triple as a single key, rather than nesting, but gawk doesn't seem to natively support this, so I'd end-up concatenating the values as string. This felt a bit less structured to me, but if I'm wrong, happy to change my mind on this apporach.
Anyway, it is these parameters that are used to index into the array to structure to retrieve the column numbers, etc.
You can then build-up a tree-like structure, for example, the below shows 2 formats for product foo on version blah, and so on...:
mappings["product-foo"]["format-bar"]["version-blah"][1]=0
split( "10,13,19,49", mappings["product-foo"]["format-bar"]["version-blah"] )
mappings["product-foo"]["format-moo"]["version-blah"][1]=0
split( "55,23,14,6", mappings["product-foo"]["format-moo"]["version-blah"] )
The magic happens like this, you can see how long-winded the mappings indexing becomes without referencing:
(FNR>1 && (format!="some-format" ||
(version=="some-version" && $1=="some-filter") ||
(version=="some-other-version" && $8=="some-other-filter"))) {
# Loop over each supplied field summing an absolute tally for each
for (i=1; i <= length(mappings[product][format][version]); i++) {
sumarr[i] += ( $mappings[product][format][version][i] < 0 ? -$mappings[product][format][version][i]:$mappings[product][format][version][i] )
}
}
The comment from #ed-morton simplifies this as originally requested, but interested if their is a simpler approach.
The right answer is from #ed-morton above (thanks!).
Ed - if you write it out as an answer I'll accept it, otherwise I'll accept this quote in a few days for good housekeeping.
Right, there is no array copy functionality in awk and there are no pointers/references so you can't create a pointer to an array. You can of course create function map(i) { return mappings[product][format][version][i]}

AS400 RPGLE/free dynamic variables in operations

I'm fairly certain after years of searching that this is not possible, but I'll ask anyway.
The question is whether it's possible to use a dynamic variable in an operation when you don't know the field name. For example, I have a data structure that contains a few hundred fields. The operator selects one of those fields and the program needs to know what data resides in the field from the data structure passed. So we'll say that there are 100 fields, and field50 is what the operator chose to operate on. The program would be passed in the field name (i.e. field50) in the FLDNAM variable. The program would read something like this the normal way:
/free
if field50 = 'XXX'
// do something
endif;
/end-free
The problem is that I would have to code this 100 times for every operation. For example:
/free
if fldnam = 'field1';
// do something
elseif fldnam = 'field2';
// do something
..
elseif fldnam = 'field50';
// do something
endif;
Is there any possible way of performing an operation on a field not yet known? (i.e. IF FLDNAM(pointer data) = 'XXX' then do something)
If the data structure is externally-described and you know what file it comes from, you could use the QUSLFLD API to find out the offset, length, and type of the field in the data structure, and then use substring to get the data and then use other calculations to get the value, depending on the data type.
Simple answer, no.
RPG's simply not designed for that. Few languages are.
You may want to look at scripting languages. Perl for instance, can evaluate on the fly. REXX, which comes installed on the IBM i, has an INTERPRET keyword.
REXX Reference manual

Use String for IF statement conditions

I'm hoping someone can help answer my question, perhaps with an idea of where to go or whether what I'm trying to do is not possible with the way I want to do it.
I've been asked to write a set of rules based on the data held by our ERP form components or variables.
Unfortunately, these components and variables cannot be accessed or used outside of the ERP, so I can't use SQL to query the values and then build some kind of SQL query.
They'd like the ability to put statements like these:
C(MyComponentName) = C(MyOtherComponentName)
V(MyVariableName) > 16
(C(MyComponentName) = "") AND V(MyVariableName) <> "")
((C(MyComponentName) = "") OR C(MyOtherComponentName) = "") AND V(MyVariableName) <> "")
This should be turned into some kind of query which gets the value of MyComponentName and MyOtherComponentName and (in this case) compares them for equality.
They don't necessarily want to just compare for equality, but to be able to determine whether a component / variable value is greaterthan or lessthan etc.
Basically it's a free-form statement that gets converted into something similar to an IF statement.
I've tried this:
Sub TestCondition()
Dim Condition as string = String.Format("{0} = {1}", _
Component("MyComponent").Value, Component("MyOtherComponent").Value)
If (Condition) Then
' Do Something
Else
' Do Something Else
End If
End Sub
Obviously, this does not work and I honestly didn't think it would be so simple.
Ignoring the fact that I'd have to parse the line, extract the required operators, the values from components or variables (denoted by a C or V) - how can I do this?
I've looked at Expression Trees but these were confusing, especially as I'd never heard of them, let alone used them. (Is it possible to create an expression tree for dynamic if statements? - This link provided some detail on expression trees in C#)
I know an easier way to solve this might be to simply populate the form with a multitude of drop-down lists, so users pick what they want from lists or fill in a text box for a specific search criteria.
This wouldn't be a simple matter as the ERP doesn't allow you to dynamically create controls on its forms. You have to drag each component manually and would be next to useless as we'd potentially want at least 1 rule for every form we have (100+).
I'm either looking for someone to say you cannot do this the way you want to do it (with a suitable reason or suggestion as to how I could do it) that I can take to my manager or some hints, perhaps a link or 2 pointing me in the right direction.
If (Condition) Then
This is not possible. There is no way to treat data stored in a string as code. While the above statement is valid, it won't and can't function the way you want it to. Instead, Condition will be evaluated as what it is: a string. (Anything that doesn't boil down to 0 is treated as True; see this question.)
What you are attempting borders on allowing the user to type code dynamically to get a result. I won't say this is impossible per se in VB.Net, but it is incredibly ambitious.
Instead, I would suggest clearly defining what your application can and can't do. Enumerate the operators your code will allow and build code to support each directly. For example:
Public Function TestCondition(value1 As Object, value2 As Object, op as string) As Boolean
Select Case op
Case "="
Return value1 = value2
Case "<"
Return value1 < value2
Case ">"
Return value1 > value2
Case Else
'Error handling
End Select
End Function
Obviously you would need to tailor the above to the types of variables you will be handling and your other specific needs, but this approach should give you a workable solution.
For my particular requirements, using the NCalc library has enabled me to do most of what I was looking to do. Easy to work with and the documentation is quite extensive - lots of examples too.

Regex match SQL values string with multiple rows and same number of columns

I tried to match the sql values string (0),(5),(12),... or (0,11),(122,33),(4,51),... or (0,121,12),(31,4,5),(26,227,38),... and so on with the regular expression
\(\s*\d+\s*(\s*,\s*\d+\s*)*\)(\s*,\s*\(\s*\d+\s*(\s*,\s*\d+\s*)*\))*
and it works. But...
How can I ensure that the regex does not match a values string like (0,12),(1,2,3),(56,7) with different number of columns?
Thanks in advance...
As i mentioned in comment to the question, the best way to check if input string is valid: contains the same count of numbers between brackets, is to use client side programm, but not clear SQL.
Implementation:
List<string> s = new List<string>(){
"(0),(5),(12)", "(0,11),(122,33),(4,51)",
"(0,121,12),(31,4,5),(26,227,38)","(0,12),(1,2,3),(56,7)"};
var qry = s.Select(a=>new
{
orig = a,
newst = a.Split(new string[]{"),(", "(", ")"},
StringSplitOptions.RemoveEmptyEntries)
})
.Select(a=>new
{
orig = a.orig,
isValid = (a.newst
.Sum(b=>b.Split(new char[]{','},
StringSplitOptions.RemoveEmptyEntries).Count()) %
a.newst.Count()) ==0
});
Result:
orig isValid
(0),(5),(12) True
(0,11),(122,33),(4,51) True
(0,121,12),(31,4,5),(26,227,38) True
(0,12),(1,2,3),(56,7) False
Note: The second Select statement gets the modulo of sum of comma instances and the count of items in string array returned by Split function. If the result isn't equal to zero, it means that input string is invalid.
I strongly believe there's a simplest way to achieve that, but - at this moment - i don't know how ;)
:(
Unless you add some more constraints, I don't think you can solve this problem only with regular expressions.
It isn't able to solve all of your string problems, just as it cannot be used to check that the opening and closing of brackets (like "((())()(()(())))") is invalid. That's a more complicated issue.
That's what I learnt in class :P If someone knows a way then that'd be sweet!
I'm sorry, I spent a bit of time looking into how we could turn this string into an array and do more work to it with SQL but built in functionality is lacking and the solution would end up being very hacky.
I'd recommend trying to handle this situation differently as large scale string computation isn't the best way to go if your database is to gradually fill up.
A combination of client and serverside validation can be used to help prevent bad data (like the ones with more numbers) from getting into the database.
If you need to keep those numbers then you could rework your schema to include some metadata which you can use in your queries, like how many numbers there are and whether it all matches nicely. This information can be computed inexpensively from your server and provided to the database.
Good luck!

Enter date into function without quotes, return date

I'm trying to write a function of this form:
Function cont(requestdate As Date)
cont = requestdate
End Function
Unfortunately, when I enter =cont(12/12/2012) into a cell, I do not get my date back. I get a very small number, which I think equals 12 divided by 12 divided by 2012. How can I get this to give me back the date? I do not want the user to have to enter =cont("12/12/2012").
I've attempted to google for an answer, unfortunately, I have not found anything helpful. Please let me know if my vocabulary is correct.
Let's say my user pulled a report with 3 columns, a, b and c. a has beginning of quarter balances, b has end of quarter balances and c has a first and last name. I want my user to put in column d: =cont(a1,b1,c1,12/12/2012) and make it create something like:
BOQ IS 1200, EOQ IS 1300, NAME IS EDDARD STARK, DATE IS 12/12/2012
So we could load this into a database. I apologize for the lack of info the first time around. To be honest, this function wouldn't save me a ton of time. I'm just trying to learn VBA, and thought this would be a good exercise... Then I got stuck.
Hard to tell what you are really trying to accomplish.
Function cont(requestdate As String) As String
cont = Format(Replace(requestdate, ".", "/"), "'mm_dd_YYYY")
End Function
This code will take a string that Excel does not recognize as a number e.g. 12.12.12 and formats it (about the only useful thing I can think of for this UDF) and return it as a string (that is not a number or date) to a cell that is formatted as text.
You can get as fancy as you like in processing the string entered and formatting the string returned - just that BOTH can never be a number or a date (or anything else Excel recognizes.)
There is no way to do exactly what you're trying to do. I will try to explain why.
You might think that because your function requires a Date argument, that this somehow forces or should force that 12/12/2012 to be treated as a Date. And it is treated as a Date — but only after it's evaluated (only if the evaluated expression cannot be interpreted as a Date, then you will get an error).
Why does Excel evaluate this before the function receives it?
Without requiring string qualifiers, how could the application possibly know what type of data you intended, or whether you intended for that to be evaluated? It could not possibly know, so there would be chaos.
Perhaps this is best illustrated by example. Using your function:
=Cont(1/1/0000) should raise an error.
Or consider a very simple formula:
=1/2
Should this formula return .5 (double) or January 2 (date) or should it return "1/2" (string literal)? Ultimately, it has to do one of these, and do that one thing consistently, and the one thing that Excel will do in this case is to evaluate the expression.
TL;DR
Your problem is that unqualified expression will be evaluated before being passed, and this is done to avoid confusion or ambiguity (per examples).
Here is my method for allowing quick date entry into a User Defined Function without wrapping the date in quotes:
Function cont(requestdate As Double) As Date
cont = CDate((Mid(Application.Caller.Formula, 7, 10)))
End Function
The UDF call lines up with the OP's initial request:
=cont(12/12/2012)
I believe that this method would adapt just fine for the OP's more complex ask, but suggest moving the date to the beginning of the call:
=cont(12/12/2012,a1,b1,c1)
I fully expect that this method can be optimized for both speed and flexibility. Working on a project now that might require me to further dig into the speed piece, but it suits my needs in the meantime. Will update if anything useful turns up.
Brief Explanation
Application.Caller returns a Range containing the cell that called the UDF. (See Caveat #2)
Mid returns part of a string (the formula from the range that called the UDF in this case) starting at the specified character count (7) of the specified length (10).
CDate may not actually be necessary, but forces the value into date format if possible.
Caveats
This does require use of the full dd/mm/yyyy (1/1/2012 would fail) but pleasantly still works with my preferred yyyy/mm/dd format as well as covering some other delimiters. dd-mm-yyyy or dd+mm+yyyy would work, but dd.mm.yyyy will not because excel does not recognize it as a valid number.
Additional work would be necessary for this to function as part of a multi-cell array formula because Application.Caller returns a range containing all of the associated cells in that case.
There is no error handling, and =cont(123) or =cont(derp) (basically anything not dd/mm/yyy) will naturally fail.
Disclaimers
A quick note to the folks who are questioning the wisdom of a UDF here: I've got a big grid of items and their associated tasks. With no arguments, my UDF calculates due dates based on a number of item and task parameters. When the optional date is included, the UDF returns a delta between the actual date and what was calculated. I use this delta to monitor and calibrate my calculated due dates.
All of this can absolutely be performed without the UDF, but bulk entry would be considerably more challenging to say the least.
Removing the need for quotes sets my data entry up such that loading =cont( into the clipboard allows my left hand to F2/ctrl-v/tab while my right hand furiously enters dates on the numpad without need to frequently (and awkwardly) shift left-hand position for a shift+'.