How to eval a string containing column names? - vba

As I cannot attach conditional formatting to a table, I need a generic function to check whether a set of records (or all records) contains errors, and to show these errors in forms and/or reports.
In the 'standard' approach, I have to define the rule for a table field every time I use that field in a control or report. That means repeating the same thing an annoying number of times, which invites errors and becomes a maintenance nightmare.
So my idea is to define all the checks for all tables and their fields in a CheckError table, like the following fragment for the table 'Persone':
TableName | FieldName | TestNumber | TestCode | TestMessage | ErrorType
Persone | CAP | 4 | len([CAP]) = 0 or isnull([cap]) | CAP mancante | warning
Persone | Codice Fiscale | 1 | len([Codice Fiscale]) < 16 | Codice fiscale nullo o mancante | error
Persone | Data di nascita | 2 | (now() - [Data di nascita]) < 18 * 365 | Minorenne | info
Persone | mail | 5 | len([mail]) = 0 or isnull([mail]) | email mancante | warning
Persone | mail | 6 | (len([mail]) = 0 or isnull([mail])) and [modalità ritiro referti] = "e-mail" | richiesto l'invio dei referti via e-mail, ma l'indirizzo e-mail è mancante | error
Persone | Via | 3 | len([Via]) = 0 or isnull([Via]) | Indirizzo mancante | warning
Now, in each form or report that uses the table Persone, I want to set an 'onload' property to a function:
' to validate all fields in all rows and set the appropriate bg and fg color
Private Sub Form_Open(Cancel As Integer)
    Call validazione.validazione(Form, "Persone", 0)
End Sub
' to validate all fields in the row identified by ID and set the appropriate bg and fg color
Private Sub Codice_Fiscale_LostFocus()
    Call validazione.validazione(Form, "Persone", ID)
End Sub
So the function validazione, at a certain point, has exactly one row of the table Persone and the set of expressions listed in the [TestCode] column above.
Now I need to evaluate each TestCode string against that row to obtain True or False.
If true, I'll set the fg and bg color of the field as normal;
if false, I'll set the fg and bg color as per error, info or warning, as defined by the [ErrorType] column above.
All of the above is easy, ready, and running, except for the evaluation step:
How can I evaluate the test string against the table row to obtain a result?
Thank you
Paolo
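One common way to approach this kind of problem is to substitute the row's values into the expression text first and only then hand the resulting string to an expression evaluator (in Access, the built-in Eval function is a candidate). The substitution step is sketched below in Python. The helper names `to_literal` and `substitute_fields`, the literal formats, and the sample row are assumptions made for illustration, not part of the original code.

```python
import re
from datetime import date

# Matches a bracketed field reference such as [CAP] or [Codice Fiscale]
FIELD = re.compile(r"\[([^\]]+)\]")


def to_literal(value):
    """Render a value as an expression literal (formats are assumed)."""
    if value is None:
        return "Null"
    if isinstance(value, date):
        return value.strftime("#%m/%d/%Y#")
    if isinstance(value, (int, float)):
        return str(value)
    # Strings: wrap in double quotes and double any embedded quote
    return '"' + str(value).replace('"', '""') + '"'


def substitute_fields(test_code, row):
    """Replace every [Field] in test_code with the literal taken from row."""
    # Field names are matched case-insensitively, so [cap] finds "CAP"
    lookup = {name.lower(): value for name, value in row.items()}

    def replace(match):
        name = match.group(1).lower()
        if name not in lookup:
            raise KeyError("unknown field: " + match.group(1))
        return to_literal(lookup[name])

    return FIELD.sub(replace, test_code)


# Hypothetical row of the table Persone
row = {"CAP": None, "Codice Fiscale": "ABC123", "Via": ""}
print(substitute_fields("len([CAP]) = 0 or isnull([cap])", row))
print(substitute_fields("len([Codice Fiscale]) < 16", row))
```

The output strings contain no field references any more, so whatever evaluates them needs no knowledge of the table. Quoting string values (and doubling embedded quotes) matters here, because the values end up inside an expression that will be executed.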

Related

Excel/VBA/Conditional Formatting: Dictionary of Dictionaries

I've got an Excel workbook that obtains data from an MS SQL database. One of the sheets is used to check the data against requirements and to highlight faults. In order to do that, I've got a requirements sheet where the requirement is in a named range; after updating the data I copy the conditional formatting of the table header to all data rows. That works pretty nicely so far. The problem comes when I have more than one set of requirements:
An (admittedly silly) example could be car racing, where requirements may exist for driver's license and min/max horsepower. When looking at the example, please imagine there are a few thousand rows and 71 columns presently...
+-----+--------+----------------+------------+---------+
| Car | Race | RequirementSet | Horsepower | License |
+-----+--------+----------------+------------+---------+
| 1 | Monaco | 2 | 200 | A |
+-----+--------+----------------+------------+---------+
| 2 | Monaco | 2 | 400 | B |
+-----+--------+----------------+------------+---------+
| 3 | Japan | 3 | 200 | C |
+-----+--------+----------------+------------+---------+
| 4 | Japan | 3 | 300 | A |
+-----+--------+----------------+------------+---------+
| 5 | Japan | 3 | 350 | B |
+-----+--------+----------------+------------+---------+
| 6 | Mexico | 1 | 200 | A |
+-----+--------+----------------+------------+---------+
The individual data now needs to be checked against the requirements set in another sheet:
+-------------+---------------+---------------+---------+
| Requirement | MinHorsepower | MaxHorsepower | License |
+-------------+---------------+---------------+---------+
| 1 | 200 | 250 | A |
+-------------+---------------+---------------+---------+
| 2 | 250 | 500 | B |
+-------------+---------------+---------------+---------+
| 3 | 250 | 400 | A |
+-------------+---------------+---------------+---------+
In order to relate back to my present situation, I am only looking at either the Monaco, Japan or Mexico Race, and there is only 1 record in the requirements sheet, where the value in e.g. Cell B2 is always the MinHorsepower and the value in C2 is always the MaxHorsepower. So these cells are a named range that I can access in my data sheet.
Now however I would like to obtain all races at once, and refer conditional formatting formulas to the particular requirement.
Focussing on "Horsepower" in Monaco (requirement set 2), I can now find out that the min Horsepower is 250 and the max is 500 - so I will colour that column for car 1 as red and for car 2 as green.
The formula is programmatically copied from the header row (the first conditional format rule is: if ROW(D1) = 1 then do nothing).
I can't decide what the best approach to the problem is. Ideally, the formula is readable, something like `AND(D2 >= MinHorsepower; D2 <= MaxHorsepower)`. I cannot imagine it being maintainable if I had to use VLOOKUP combined with INDIRECT and MATCH to match a column header in the requirements sheet for that particular requirement, especially when it comes to combining criteria as in the horsepower example with min and max above.
I am wondering if I should read the requirements table into a dictionary or something in VBA, and then use a function like
Public Function check(requirementId As Integer, requirement$)
which then in Excel I could use like =D2 >= check(c2, "MinHorsepower")
Playing around with this a little bit it appears to be pretty slow as opposed to the previous system where I could only have one requirement. It would be fantastic if you could help me out with a fresh approach to this problem. I'll update this question as I go along; I'm not sure if I managed to illustrate the example really well but the actual data wouldn't mean anything to you.
In any case, thanks for hanging in until here!
Edit 29 October 2016
I have found a solution as basis for mine. Using the following code I can add my whole requirements table to a dictionary, and access the requirement.
Using a class clsRangeToDictionary (based on Tim Williams' clsMatrix):
Option Explicit

Private m_array As Variant
Private dictRows As Object
Private dictColumns As Object

Public Sub Init(vArray As Variant)
    Dim i As Long
    Set dictRows = CreateObject("Scripting.Dictionary")
    Set dictColumns = CreateObject("Scripting.Dictionary")
    'add the row keys and positions. Skip the first row as it contains the column key
    For i = LBound(vArray, 1) + 1 To UBound(vArray, 1)
        dictRows.Add vArray(i, 1), i
    Next i
    'add the column keys and positions, skipping the first column
    For i = LBound(vArray, 2) + 1 To UBound(vArray, 2)
        dictColumns.Add vArray(1, i), i
    Next i
    ' store the array for future use
    m_array = vArray
End Sub

Public Function GetValue(rowKey, colKey) As Variant
    If dictRows.Exists(rowKey) And dictColumns.Exists(colKey) Then
        GetValue = m_array(dictRows(rowKey), dictColumns(colKey))
    Else
        Err.Raise 1000, "clsRangeToDictionary:GetValue", "The requested row key " & CStr(rowKey) & " or column Key " & CStr(colKey) & " does not exist"
    End If
End Function

' return a zero-based array of RowKeys
Public Function RowKeys() As Variant
    RowKeys = dictRows.Keys
End Function

' return a zero-based array of ColumnKeys
Public Function ColumnKeys() As Variant
    ColumnKeys = dictColumns.Keys
End Function
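I can now read the whole requirements table into a dictionary and write a helper to obtain a particular requirement, roughly like this (the column key must match the header exactly, since a Scripting.Dictionary compares keys case-sensitively by default):
myDictionaryObject.GetValue(table1's RequirementSet, "MinHorsepower")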
If someone could help me figure out how to put this into an answer giving the credit due to Tim Williams that'd be great.
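The two-dictionary lookup above (one dictionary mapping row keys to row positions, one mapping column headers to column positions, both pointing into a stored 2D array) can be sketched in Python as follows. This is only an illustration of the data structure, using the requirements table from the question; the names `RangeLookup` and `horsepower_ok` are hypothetical.

```python
class RangeLookup:
    """Look up cells of a 2D grid by row key and column header."""

    def __init__(self, grid):
        self._grid = grid
        # Row keys come from the first column, skipping the header row
        self._rows = {grid[i][0]: i for i in range(1, len(grid))}
        # Column keys come from the header row, skipping the first column
        self._cols = {grid[0][j]: j for j in range(1, len(grid[0]))}

    def get_value(self, row_key, col_key):
        if row_key not in self._rows or col_key not in self._cols:
            raise KeyError("row key %r or column key %r does not exist"
                           % (row_key, col_key))
        return self._grid[self._rows[row_key]][self._cols[col_key]]


requirements = RangeLookup([
    ["Requirement", "MinHorsepower", "MaxHorsepower", "License"],
    [1, 200, 250, "A"],
    [2, 250, 500, "B"],
    [3, 250, 400, "A"],
])


def horsepower_ok(requirement_set, horsepower):
    """True if horsepower lies within the min/max of the requirement set."""
    low = requirements.get_value(requirement_set, "MinHorsepower")
    high = requirements.get_value(requirement_set, "MaxHorsepower")
    return low <= horsepower <= high


print(horsepower_ok(2, 200))  # car 1 in Monaco: below the minimum of 250
print(horsepower_ok(2, 400))  # car 2 in Monaco: within 250..500
```

Both lookups are constant-time dictionary hits, so the cost per cell stays flat as the requirements table grows; the grid itself is read only once.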

How to write a select query or server-side function that will generate a neat time-flow graph from many data points?

NOTE: I am using a graph database (OrientDB, to be specific). This gives me the freedom to write a server-side function in JavaScript or Groovy rather than limit myself to SQL for this issue.
NOTE 2: Since this is a graph database, the arrows below simply describe the flow of data. I do not literally need the arrows to be returned in the query. The arrows represent relationships.
I have data that is represented in a time-flow manner; i.e. EventC occurs after EventB, which occurs after EventA, etc. This data is coming from multiple sources, so it is not completely linear. It needs to be merged, which is where I'm having the issue.
Currently the data looks something like this:
# | event | next
--------------------------
12:0 | EventA | 12:1
12:1 | EventB | 12:2
12:2 | EventC |
12:3 | EventA | 12:4
12:4 | EventD |
Where "next" is the out() edge to the event that comes next in the time-flow. On a graph this comes out to look like:
EventA-->EventB-->EventC
EventA-->EventD
Since this data needs to be merged, I need to combine duplicate events but preserve their edges. In other words, I need a select query that will result in:
        -->EventB-->EventC
EventA--|
        -->EventD
In this example, since EventB and EventD both occurred after EventA (just at different times), the select query will show two branches off EventA as opposed to two separate time-flows.
EDIT #2
If an additional set of data were to be added to the data above, with EventB->EventE, the resulting data/graph would look like:
# | event | next
--------------------------
12:0 | EventA | 12:1
12:1 | EventB | 12:2
12:2 | EventC |
12:3 | EventA | 12:4
12:4 | EventD |
12:5 | EventB | 12:6
12:6 | EventE |
EventA-->EventB-->EventC
EventA-->EventD
EventB-->EventE
I need a query to produce a tree like:
                   -->EventC
        -->EventB--|
        |          -->EventE
EventA--|
        -->EventD
EDIT #3 and #4
Here is the data with edges shown as opposed to the "next" column above. I also added a couple additional columns here to hopefully clear up any confusion about the data:
# | event | ip_address | timestamp | in | out |
----------------------------------------------------------------------------
12:0 | EventA | 123.156.189.18 | 2015-04-17 12:48:01 | | 13:0 |
12:1 | EventB | 123.156.189.18 | 2015-04-17 12:48:32 | 13:0 | 13:1 |
12:2 | EventC | 123.156.189.18 | 2015-04-17 12:48:49 | 13:1 | |
12:3 | EventA | 103.145.187.22 | 2015-04-17 14:03:08 | | 13:2 |
12:4 | EventD | 103.145.187.22 | 2015-04-17 14:05:23 | 13:2 | |
12:5 | EventB | 96.109.199.184 | 2015-04-17 21:53:00 | | 13:3 |
12:6 | EventE | 96.109.199.184 | 2015-04-17 21:53:07 | 13:3 | |
The data is saved like this to preserve each individual event and the flow of a session (labeled by the ip address).
TL;DR
Got lots of events, some duplicates, and need them all organized into one neat time-flow graph.
Holy cow.
After wrestling with this for over a week I think I FINALLY have a working function. It isn't optimized for performance (oh the loops!), but it gets the job done for the time being while I work on performance. The result is an OrientDB server-side function, written in JavaScript.
The function:
// Clear previous runs
db.command("truncate class tmp_Then");
db.command("truncate class tmp_Events");

// Get all distinct events
var distinctEvents = db.query("select from Events group by event");

// Send 404 if null, otherwise proceed
if (distinctEvents == null) {
    response.send(404, "Events not found", "text/plain", "Error: events not found");
} else {
    var edges = [];

    // Loop through all distinct events
    distinctEvents.forEach(function(distinctEvent) {
        var rid = distinctEvent.field("@rid");
        var eventType = distinctEvent.field("event");

        // The main query that finds all *direct* descendants of the distinct event
        var result = db.query("select from (traverse * from (select from Events where event = ?) where $depth <= 2) where @class = 'Events' and $depth > 1 and @rid in (select from Events group by event)", [eventType]);

        // Save the distinct event in a temp table to create temp edges
        db.command("create vertex tmp_Events set rid = ?, event = ?", [rid, eventType]);
        edges.push(result);
    });

    // The edges array defines which edges should exist for a given event
    edges.forEach(function(edge, index) {
        edge.forEach(function(e) {
            // Create the temp edge that corresponds to its distinct event
            db.command("create edge tmp_Then from (select from tmp_Events where rid = " + distinctEvents[index].field("@rid") + ") to (select from tmp_Events where rid = " + e.field("@rid") + ")");
        });
    });

    // Return the merged events; the tmp_Then edges connect them
    return db.query("select from tmp_Events");
}
Takeaways:
Temp tables appeared to be necessary. I tried to do this without temp tables (classes), but I'm not sure it could be done. I needed to mock edges that didn't exist in the raw data.
Traverse was very helpful in writing the main query. Traversing through an event to find its direct, unique descendants was fairly simple.
Having the ability to write stored procs in JavaScript is freaking awesome. This would have been a nightmare in SQL.
omfg loops. I plan to optimize this and continue to make it better so hopefully other people can find some use for it.
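Independently of OrientDB, the merge itself (collapse records with the same event name into one node, and keep every distinct "next" edge) can be sketched in a few lines of plain Python. The sketch below uses the sample records from the question; the function name `merge_events` and the dictionary layout are assumptions for illustration, not a replacement for the server-side function.

```python
from collections import defaultdict

# Sample data from the question: record id -> (event, id of the next record)
records = {
    "12:0": ("EventA", "12:1"),
    "12:1": ("EventB", "12:2"),
    "12:2": ("EventC", None),
    "12:3": ("EventA", "12:4"),
    "12:4": ("EventD", None),
    "12:5": ("EventB", "12:6"),
    "12:6": ("EventE", None),
}


def merge_events(records):
    """Merge duplicate events and collect each event's distinct successors."""
    merged = defaultdict(set)
    for event, next_rid in records.values():
        merged[event]  # make sure leaf events appear in the result too
        if next_rid is not None:
            merged[event].add(records[next_rid][0])
    return {event: sorted(children) for event, children in merged.items()}


for event, children in merge_events(records).items():
    print(event, "->", children)
```

Each key in the result is one merged node and each list holds its outgoing edges, which is the same information the temporary vertex and edge classes carry.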

SQLAlchemy getting label names out from columns

I want to use the same labels from a SQLAlchemy table, to re-aggregate some data (e.g. I want to iterate through mytable.c to get the column names exactly).
I have some spending data that looks like the following:
| name | region | date | spending |
| John | A | .... | 123 |
| Jack | A | .... | 20 |
| Jill | B | .... | 240 |
I'm then passing it to an existing function we have, that aggregates spending over 2 periods (using a case statement) and groups by region:
grouped table:
| Region | Total (this period) | Total (last period) |
| A | 3048 | 1034 |
| B | 2058 | 900 |
The function returns a SQLAlchemy query object that I can then use subquery() on to re-query e.g.:
subquery = get_aggregated_data(original_table).subquery()
region_A_results = session.query(subquery).filter(subquery.c.region == 'A')
I then want to re-aggregate this subquery (summing every column that can be summed and replacing the region column with the string 'other').
The problem is, if I iterate through subquery.c, I get labels that look like:
anon_1.region
anon_1.sum_this_period
anon_1.sum_last_period
Is there a way to get the textual label from a set of column objects, without the anon_1. prefix? Especially since I feel that the prefix may change depending on how SQLAlchemy decides to generate the query.
Split the name string and take the second part, and if you want to prepare for the chance that the name is not prefixed by the table name, put the code in a try/except block:
for col in subquery.c:
    try:
        print(col.name.split('.')[1])
    except IndexError:
        print(col.name)
Also, the result proxy (region_A_results) has a keys method, which returns a list of column names. Again, if you don't need the table names, you can easily strip them.
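The prefix-stripping step can also be written without exception handling. The sketch below works on plain label strings like the ones shown in the question; the helper name `strip_prefix` is hypothetical, and it makes no SQLAlchemy calls.

```python
def strip_prefix(label):
    """Return the part after the last dot, or the whole label if no dot."""
    return label.rsplit(".", 1)[-1]


labels = ["anon_1.region", "anon_1.sum_this_period", "sum_last_period"]
print([strip_prefix(label) for label in labels])
```

Splitting from the right with a limit of one means a label without any prefix passes through unchanged, so no IndexError can occur.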

Is there a way to perform a cross join or Cartesian product in excel?

At the moment, I cannot use a typical database so am using excel temporarily. Any ideas?
You have 3 dimensions here: dim1 (ABC), dim2 (123), dim3 (XYZ).
Here is how you make a cartesian product of 2 dimensions using standard Excel and no VBA:
1) Plot dim1 vertically and dim2 horizontally. Concatenate dimension members on the intersections:
2) Unpivoting data. Launch pivot table wizard using ALT-D-P (don't hold ALT, press it once). Pick "Multiple consolidation ranges" --> create a single page.. --> Select all cells (including headers!) and add it to the list, press next.
3) Plot the resulting values vertically and disassemble the concatenated strings
Voila, you've got the cross join. If you need another dimension added, repeat this algorithm again.
Cheers,
Constantine.
Here is a very easy way to generate the Cartesian product of an arbitrary number of lists using Pivot tables:
https://chandoo.org/wp/generate-all-combinations-from-two-lists-excel/
The example is for two lists, but it works for any number of tables and/or columns.
Before creating the Pivot table, you need to convert your value lists to tables.
Using VBA, you can. Here is a small example:
Sub SqlSelectExample()
    'list elements in col C not present in col B
    'requires a reference to the Microsoft ActiveX Data Objects library
    Dim con As ADODB.Connection
    Dim rs As ADODB.Recordset
    Set con = New ADODB.Connection
    con.Open "Driver={Microsoft Excel Driver (*.xls)};" & _
        "DriverId=790;" & _
        "Dbq=" & ThisWorkbook.FullName & ";" & _
        "DefaultDir=" & ThisWorkbook.Path & ";ReadOnly=False;"
    Set rs = New ADODB.Recordset
    rs.Open "select ccc.test3 from [Sheet1$] ccc left join [Sheet1$] bbb on ccc.test3 = bbb.test2 where bbb.test2 is null ", _
        con, adOpenStatic, adLockOptimistic
    Range("g10").CopyFromRecordset rs '-> returns values without match
    rs.MoveLast
    Debug.Print rs.RecordCount 'get the # records
    rs.Close
    Set rs = Nothing
    Set con = Nothing
End Sub
Here's a way using Excel formulas:
|    | A              | B              | C              |
| -- | -------------- | -------------- | -------------- |
| 1  |                |                |                |
| 2  | Table1_Column1 | Table2_Column1 | Table2_Column2 |
| 3  | A              | 1              | X              |
| 4  | B              | 2              | Y              |
| 5  | C              | 3              | Z              |
| 6  |                |                |                |
| 7  | Col1           | Col2           | Col3           |
| 8  | = Formula1     | = Formula2     | = Formula3     |
| 9  | = Formula1     | = Formula2     | = Formula3     |
| 10 | = Formula1     | = Formula2     | = Formula3     |
| 11 | ...            | ...            | ...            |
Formula1: IF(ROW() >= 8 + (3*3*3), "", INDIRECT(ADDRESS(3 + MOD(FLOOR((ROW() - 8)/(3*3), 1), 3), 1)))
Formula2: IF(ROW() >= 8 + (3*3*3), "", INDIRECT(ADDRESS(3 + MOD(FLOOR((ROW() - 8)/(3), 1), 3), 2)))
Formula3: IF(ROW() >= 8 + (3*3*3), "", INDIRECT(ADDRESS(3 + MOD(FLOOR((ROW() - 8)/(1), 1), 3), 3)))
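The arithmetic behind these three formulas is a floor-divide followed by a modulo: the zero-based output row is divided by the number of rows each value must repeat for, and the result wraps around the list length. A minimal Python sketch of that indexing, using the three lists from the table above and checked against `itertools.product`, is shown here; the function name `cartesian_row` is hypothetical.

```python
import itertools


def cartesian_row(k, sets):
    """Return the k-th combination (zero-based); the last set cycles fastest."""
    combo = []
    block = 1  # how many consecutive rows share one value of the current set
    for values in reversed(sets):
        combo.append(values[(k // block) % len(values)])
        block *= len(values)
    return tuple(reversed(combo))


sets = [["A", "B", "C"], [1, 2, 3], ["X", "Y", "Z"]]
rows = [cartesian_row(k, sets) for k in range(3 * 3 * 3)]

# Same order as the standard library's Cartesian product
print(rows == list(itertools.product(*sets)))
print(rows[0], rows[1], rows[3], rows[9])
```

Here `k` plays the role of `ROW() - 8`, and `block` takes the values 1, 3 and 9, matching the divisors in Formula3, Formula2 and Formula1.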
One* general formula to rule them all!
The result
The formula
MOD(CEILING.MATH([index]/PRODUCT([size of set 0]:[size of previous set]))-1,[size of current set])+1
This formula gives the index (ordered position) of each element in the set, where set i has a size of n_i. Thus if we have four sets the sizes would be [n_1,n_2,n_3,n_4].
Using that index, one can use the INDEX function to pick any attribute from the set. For example, if each set is a table with several columns, one could use index([table of the set],[this result],[column number of attribute]).
Explanation
The formula has two main components: the cycling component and the partitioning component.
Cycling component
=MOD([partitioning component]-1, [size of current set])+1
Cycles through all the possible values of the set.
The modulo function is required so the result will "go around" the size of the set, and never "out of bounds" of the possible values.
The -1 and +1 help us go from one-based numbering (our set indexes) to zero-based numbering (for the modulo operation).
Partitioning component
CEILING.MATH([index]/PRODUCT([size of set 0]:[size of previous set]))
Partitions the "cartesian index" into chunks, giving each chunk a "name".
The "cartesian index" is just a numbering from 1 to the number of elements in the Cartesian product (given by the product of the sizes of each set).
The "name" is just an increasing-by-chunk enumeration of the "cartesian index".
To give the same "name" to all indexes belonging to a chunk, we divide the "cartesian index" by the partition size and round the result up ("ceil" it).
The partition size is the total size of the previous result, since each previous result has to be repeated for each of this set's elements.
It so happens that the size of the previous result is the product of all the previous sets' sizes (including the size of a set before the first, which we call "set 0" and give a constant size of 1, so that the formula generalizes).
With screenshots
Set sizes
Prepared set sizes including the "Set0" one and the size of the Cartesian Product.
Here, the sizes of sets are:
"Set0": 1 in cell B2
"Set1": 2 in cell C2
"Set2": 5 in cell D2
"Set3": 3 in cell E2
Thus the size of the Cartesian product is 30 (2*5*3) in cell A2.
Results
Table structure _tbl_CartesianProduct with the following columns and their formulas:
Results:
Cartesian Index: =IF(ROW()-ROW(_tbl_CartesianProduct[[#Headers];[Cartesian Index]])<=$A$2;ROW()-ROW(_tbl_CartesianProduct[[#Headers];[Cartesian Index]]);NA())
concatenation: =TEXTJOIN("-";TRUE;_tbl_CartesianProduct[@[Index S1]:[Index S3]])
Index S1: =MOD(CEILING.MATH([@[Cartesian Index]]/PRODUCT($B$2:B$2))-1;C$2)+1
Index S2: =MOD(CEILING.MATH([@[Cartesian Index]]/PRODUCT($B$2:C$2))-1;D$2)+1
Index S3: =MOD(CEILING.MATH([@[Cartesian Index]]/PRODUCT($B$2:D$2))-1;E$2)+1
step "size of previous partition":
Size prev part S1: =PRODUCT($B$2:B$2)
Size prev part S2: =PRODUCT($B$2:C$2)
Size prev part S3: =PRODUCT($B$2:D$2)
step "Chunk name":
Chunk S1: =CEILING.MATH([@[Cartesian Index]]/[@[Size prev part S1]])
Chunk S2: =CEILING.MATH([@[Cartesian Index]]/[@[Size prev part S2]])
Chunk S3: =CEILING.MATH([@[Cartesian Index]]/[@[Size prev part S3]])
final step "Cycle through the set":
Cycle chunk in S1: =MOD([@[Chunk S1]]-1;C$2)+1
Cycle chunk in S2: =MOD([@[Chunk S2]]-1;D$2)+1
Cycle chunk in S3: =MOD([@[Chunk S3]]-1;E$2)+1
*: for the actual job of producing the Cartesian enumerations
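As a cross-check of the general formula, the same ceiling-and-modulo computation can be run in Python for the set sizes used above (2, 5 and 3) to confirm that it enumerates all 30 combinations exactly once. This is only a sketch of the arithmetic; the function name `element_index` is hypothetical.

```python
import math


def element_index(cartesian_index, prev_size, set_size):
    """One-based position within the current set for a one-based index."""
    # Integer ceiling of cartesian_index / prev_size: the "chunk name"
    chunk = (cartesian_index + prev_size - 1) // prev_size
    # Cycle the chunk name through the set, back in one-based numbering
    return (chunk - 1) % set_size + 1


sizes = [2, 5, 3]          # sizes of Set1, Set2, Set3
total = math.prod(sizes)   # size of the Cartesian product

rows = []
for index in range(1, total + 1):
    prev_size = 1          # "set 0" has a constant size of 1
    row = []
    for size in sizes:
        row.append(element_index(index, prev_size, size))
        prev_size *= size  # product of all previous set sizes
    rows.append(tuple(row))

print(rows[0], rows[1], rows[2], rows[-1])
print(len(set(rows)) == total)  # every combination appears exactly once
```

With this formula the first set cycles fastest, because its partition size is 1; each later set changes only once all earlier sets have completed a full cycle.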
A little bit of code in Power Query could solve the problem:
let
    Quelle = Excel.CurrentWorkbook(){[Name="tbl_Data"]}[Content],
    AddColDim2 = Table.AddColumn(Quelle, "Dim2", each Quelle[Second_col]),
    ExpandDim2 = Table.ExpandListColumn(AddColDim2, "Dim2"),
    AddColDim3 = Table.AddColumn(ExpandDim2, "Dim3", each Quelle[Third_col]),
    ExpandDim3 = Table.ExpandListColumn(AddColDim3, "Dim3"),
    RemoveColumns = Table.SelectColumns(ExpandDim3,{"Dim1", "Dim2", "Dim3"})
in
    RemoveColumns
Try using a DAX CROSS JOIN. Read more at MSDN
You can use the expression CROSSJOIN(table1, table2) to create a cartesian product.

Dynamically select a column from a generic list

I have a table that is 200 columns wide and need to return the data of a specific row and column but I won't know the column until runtime. I can easily get the row I want into either a list, an individual strongly typed object, or an Array through LINQ but I can't for the life of me figure out how to find the column I need.
So For instance (on a smaller scale) my table looks like this
GrowerKey | day1 | day2 | day3 | day4 |
-----------------------------------------
3 | 1 | 3 | 2 | 2 |
4 | 6 | 1 | 9 | 1 |
5 | 8 | 8 | 2 | 4 |
and I can get the row I want with something simple like this
Dim CleanRecord As List(Of Grower_Clean_Schedule) = (From key In eng.Grower_Clean_Schedules
Where key.Grower_Key = Grower_Key).ToList
How do I then return only the value of a specific column of that row (say, the value stored in "day2") when I won't know which column until runtime?
Something like this (starting with CleanRecord which you defined in your question):
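' Requires: Imports System.Reflection (for BindingFlags)
Dim matchingRow = CleanRecord.First()
' Collect the public instance properties of the row type
Dim props = matchingRow.GetType().GetProperties( _
    BindingFlags.Instance Or BindingFlags.Public)
' Pick the property whose name matches the column and read its value
Dim myReturnVal = (From prop In props _
    Where prop.Name = "day2" _
    Select prop.GetValue(matchingRow, Nothing)).FirstOrDefault()
Return myReturnVal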
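The same idea, looking a property up by a name that is only known at runtime, can be sketched in Python with `getattr`. The class `GrowerCleanSchedule`, the function `column_value`, and the rows below mirror the small table from the question and are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class GrowerCleanSchedule:
    grower_key: int
    day1: int
    day2: int
    day3: int
    day4: int


rows = [
    GrowerCleanSchedule(3, 1, 3, 2, 2),
    GrowerCleanSchedule(4, 6, 1, 9, 1),
    GrowerCleanSchedule(5, 8, 8, 2, 4),
]


def column_value(rows, grower_key, column):
    """Return the named column of the first row matching grower_key."""
    row = next((r for r in rows if r.grower_key == grower_key), None)
    if row is None:
        return None
    # The column name is resolved at runtime, like reflection in .NET
    return getattr(row, column)


print(column_value(rows, 4, "day2"))
print(column_value(rows, 5, "day4"))
```

As with reflection, a misspelled column name only fails at runtime (here with an AttributeError), so validating the name against a known list of columns first may be worthwhile.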