In my projects I often take advantage of tables and the underlying ListObjects and ListColumns. I like them because they're easier to reference and update than bare Range objects. Yet I still haven't found a sane, maintainable way to handle multiple ListObjects with many ListColumns that are referenced across all the Worksheets in a project.
Let's say I have a Worksheet (with its (Name) property set to "WorksheetA") that contains a table (called TableA) with a few columns (called Column1, Column2, ..., Column10).
Now I want to reference one of the columns from the code of another Worksheet. I could do it as follows:
WorksheetA.ListObjects("TableA").ListColumns("Column7")
Now, it's bad practice to use string literals directly, as they're difficult to maintain and prone to errors.
So what now?
I could create a dedicated module to store my strings as constants. For example, a module called "Constants":
Public Const TABLE_A As String = "TableA"
Public Const COLUMN7 As String = "Column7"
Then my reference could be converted to:
WorksheetA.ListObjects(Constants.TABLE_A).ListColumns(Constants.COLUMN7)
However, this solution has some disadvantages:
The Constants module would grow ridiculously fast with each table and column added.
The reference itself grows and becomes less readable.
All constants related to tables from across all workbooks are thrown into one giant pit.
I could store constants inside WorksheetA, and make them available through Public Functions like:
Private Const TABLE_A As String = "TableA"
Private Const COLUMN7 As String = "Column7"
Public Function GetTableAName() As String
GetTableAName = TABLE_A
End Function
Public Function GetTableA() As ListObject
Set GetTableA = WorksheetA.ListObjects(TABLE_A)
End Function
Public Function GetTableAColumn7() As ListColumn
Set GetTableAColumn7 = GetTableA().ListColumns(COLUMN7)
End Function
This solution actually solves all three problems mentioned above, yet it's still a bit "dirty" and time-consuming, as adding a new table means writing a function for each of its columns.
Do you have better idea how to deal with this problem?
EDIT1 (for clarity): Let's assume that the user must not change any names (neither table names nor column names). If the user does so, he or she is to blame.
EDIT2 (for clarity): I've used Column7 as a column name only as an example. Let's assume that the columns have more meaningful names.
Here's my two cents. I'm not an educated programmer, but I do get paid to do it, so I guess that makes me a professional.
The first line of defense is that I create a class to model a table. I fill the class from the table and no other code even knows where the data lives. When I initialize, I'll run code like
clsEmployees.FillFromListObject wshEmployees.ListObjects(1)
Then in the class, the code looks like
vaData = lo.DataBodyRange.Value
...
clsEmployee.EeName = vaData(i,1)
clsEmployee.Ssn = vaData(i,2)
etc
Only one ListObject per worksheet. That's my rule and I never break it. Anyone with access to the worksheet could rearrange the columns and break my code. If I want to use Excel as a database, and sometimes I do, then that is the risk I take. If it's so critical that I can't take that risk, then I store my data in SQL Server, SQLite, or JET.
Instead of putting the range in an array, I could actually call out the ListColumns names. That way, if someone rearranged the columns, my code would still work. But it introduces the risk that they could rename the columns, so I'm just trading one risk for another. It would make the code more readable, so it may be the trade you want to make. I like the speed of filling from an array, so that's the trade I make.
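For what it's worth, a by-name version of that fill loop might look like the sketch below (the column names "EeName" and "SSN" are illustrative, not taken from the original project):

```vba
' Hypothetical sketch: filling the class by column name instead of by
' array position, so rearranged columns no longer break the load.
Dim lo As ListObject
Set lo = wshEmployees.ListObjects(1)

Dim i As Long
For i = 1 To lo.ListRows.Count
    clsEmployee.EeName = lo.ListColumns("EeName").DataBodyRange.Cells(i, 1).Value
    clsEmployee.Ssn = lo.ListColumns("SSN").DataBodyRange.Cells(i, 1).Value
Next i
```

The trade-off is exactly as described: each `.Cells` read is far slower than one bulk read into a Variant array, but the loop survives column reordering.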
If my project is sufficiently small or is supposed to work directly with ListObjects, then I follow the same rules as I do for any Strings.
I use Strings in code exactly once.
If I use it more than once, I make a procedure-level constant
If I use it in more than one procedure, I try to pass it as an argument
If I can't pass it as an argument, I make a module-level constant
If the two procedures are in different modules, I first ask myself why two procedures that use the same constant are in different modules. Shouldn't related procedures be in the same module?
If the two procedures really belong in different modules, then I try to pass it as an argument
If none of that works, then it truly is a global constant and I set it up in my MGlobals module.
If MGlobals takes up more than about half a screen, I'm doing something wrong and I need to step back and think through what I'm trying to accomplish. And then I make a custom class.
Related
Only my second Stack question. I'm pretty new to SSRS, but I'd rather do it the right way and learn now than have to come back and fix everything later.
I'm going to create a semi-flexible reporting (SSRS) standard for my company, and getting global consensus ahead of time is impossible. My proposed solution is to create a formatting table that I can update to alter the look and feel of all my reports of one type (tables, charts, etc.). I have thought of two ways to do this, and I am looking for advice on which is better and whether it will actually work the way I think it will. I am totally willing to research all the specifics of how-to, but I'd really appreciate a hint on how to start.
Is there a way to reference which attribute (e.g. name, background color) your expression is in? It would be awesome if I could use the same code for all the attributes' expressions and just have that code find its spot in the table.
something like:
attribute.value = lookup(ReportFormataDataset as RFD, attribute.name =
RFD.name and left(me.name, 3) = RFD.prefix)
Alternatively, I could run a loop in VBA code to change the attributes based on what's in the table. I plan to create a report template with naming conventions to help (e.g. hdr, ttl, bdy prefixes), so it could look like:
for each reportItem in report
    for each el in FormatTable
        'make sure the report item is what I think it is, like header
        if left(reportItem.name, 3) = el.prefix then
            'e.g. backgroundcolor = Blue
            reportItem.value = el.value
        end if
    next el
next reportItem
But then, when would I run it? I imagine this would slow my report down a lot if I did it in the expressions. Maybe with variables instead?
I found this:
tips-tricks-ensure-consistency-sql-server-reporting-services-reports
but it seems very cumbersome to maintain: if I add a formatting requirement later, I'll have to add that parameter to all the reports and the attribute.
I know this seems a little fishing-y, but I am not sure how either of these would work, and I know I could throw days of effort at either when an expert could point me in the right direction in 5 minutes. So... sorry if this is not in the 'Stack spirit', and thank you.
We have implemented something similar, and have used the method of setting up a shared dataset that all the reports call. This returns a single record, that includes all the images, background colours and date formats we might use across the reports (including company logo and branding colours).
If we had a date in a cell for example, we would then set the number format to
=First(Fields!FullDateFormat.Value, "ReportConstants")
Where ReportConstants is the name of the shared dataset.
This also allows us to have the same report deployed both in the UK and the US, yet both use their native date format, as this is set by the database instance, instead of within the report.
Now, whether this is the right approach to take is another question, but it works for us across the ~50 reports we deploy in multiple locations, without having to configure each one individually all the time.
Is this helpful?
I am inheriting an application which has to read data from various types of files and use the OCI interface to move the data into an Oracle database. Most of the tables in question have about 40-50 columns, so the SQL insert statements become pretty lengthy.
When I inherited this code, it basically built up the insert statements via a series of strcats as a C string, then passed it to the appropriate OCI functions to set up and execute the statement. However, since much of the data is read directly from files into the column values, this leaves the application open to easy SQL injection. So I am trying to use bind variables to solve this problem.
In every example OCI application I can find, each variable is statically allocated and bound individually. This would lead to quite a bit of boilerplate, however, and I'd like to reduce it with some sort of looping construct. So my solution is, for each table, to create a static array of strings containing the names of the table columns:
const char *const TABLE_NAME[N_COLS] = {
"COL_1",
"COL_2",
"COL_3",
...
"COL_N"
};
along with a short function that makes a placeholder out of a column name:
void makePlaceholder(char *buf, const char *col);
// "COLUMN_NAME" -> ":column_name"
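In case it helps, here is one way makePlaceholder could be implemented (a minimal sketch; the original implementation was not shown):

```c
#include <ctype.h>

/* Hypothetical sketch of makePlaceholder: writes ":" followed by the
   lower-cased column name into buf, e.g. "COLUMN_NAME" -> ":column_name".
   The caller must provide a buffer of at least strlen(col) + 2 bytes. */
void makePlaceholder(char *buf, const char *col)
{
    *buf++ = ':';
    while (*col != '\0') {
        *buf++ = (char)tolower((unsigned char)*col++);
    }
    *buf = '\0';
}
```

The generated placeholder names can then be passed to OCIBindByName inside the loop over the column-name array.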
So I then loop through each array and bind my values to each column, generating the placeholders as I go. One potential problem here is that, because the types of each column vary, I bind everything as SQLT_STR (strings) and thus expect Oracle to convert to the proper datatype on insertion.
So, my question(s) are:
What is the proper/idiomatic way (if such a thing exists for SQL/OCI) to use bind variables for SQL insert statements with a large number of columns/params? More generally, what is the best way to use OCI to build this type of large insert statement?
Do large numbers of bind calls have a significant cost in efficiency compared to building and using vanilla C strings?
Is there any risk in binding all variables as strings and allowing Oracle to make the proper type conversion?
Thanks in advance!
Not sure about the C aspects of this. My answer will be from a DBA perspective.
Question 2:
Always use bind variables. They prevent SQL injection and enhance performance.
The performance aspect is often overlooked by programmers. When Oracle receives a SQL statement, it hashes the entire SQL text and looks in its internal repository of execution plans to see if it already has one. If bind variables were used, the SQL text is the same each time you run the query, no matter what the values of the variables are. However, if you concatenated the string yourself, Oracle hashes the SQL text including the contents of (what you ought to have put in) variables, getting a unique hash every time. So if you run a query one million times with bind variables, Oracle makes one execution plan; without them, it makes one million execution plans and wastes loads of resources doing so.
I inherited a database where user input fields are stored as a comma-delimited string. I know. Lame. I want a way to parse these fields in a SELECT query where there are three segments of varying numbers of characters. Counter to all the recommendations that I insert the fields into a new table or create a stored procedure to do this, this is what I came up with. I'm wondering if anyone sees any flaw in doing this in a select query (where I can easily convert from string to parsed and back again as need be).
Field_A
5,25,89
So to get the left segment, which is the most straightforward:
Field_1: Left$([Field_A],InStr([Field_A],",")-1)
To get the right-most segment:
Field_3: Right$([Field_A],Len([Field_A])-InStrRev([Field_A],","))
Middle segment was the trickiest:
Field_2: Mid([Field_A],InStr([Field_A],",")+1,InStrRev([Field_A],",")-InStr([Field_A],",")-1)
So the result is:
Field_1 Field_2 Field_3
5 25 89
Any consenting opinions?
Well, if you insist on going down this road......
This might be easier and more adaptable. Create a function in a module:
Public Function GetValueFromDelimString(sPackedValue As String, nPos As Long, _
        Optional sDelim As String = ",") As String
Dim sElements() As String
sElements() = Split(sPackedValue, sDelim)
If UBound(sElements) < nPos Then
GetValueFromDelimString = ""
Else
GetValueFromDelimString = sElements(nPos)
End If
End Function
Now in your query you can get any field in the string like this:
GetValueFromDelimString([MultiValueField],0) AS FirstElement, GetValueFromDelimString([MultiValueField],1) AS SecondElement, etc.
I feel like I am buying beer for a minor, encouraging this type of behavior :)
It sounds like you're not asking for information on how to parse a comma-delimited field into different fields, but rather looking for people to support you in your decision to do so, yes?
The fact, as you've already discovered, is that you can indeed do this with skillful application of functions in your SQL field definitions. But that doesn't mean that you should.
In the short run, it's an easy way to achieve your goals as data manager, I'll grant you that. But as a long-term solution it's just adding another layer of complexity to what seems like a poorly-designed database (I know that the latter is not your fault -- I too have inherited my share of "lame" databases).
So I applaud you on "getting the job done" for now, but would advise you to listen to "all the recommendations that you insert fields into a new table" -- they're good recommendations. It'll take more planning and effort, but in the long run you'll have a better database. And that will make everything you do with it easier, faster, and more reliable.
This is an old thread, but someone might find it in a search. You can also apply the same strategy in an update query. That way you can keep the original CSV field and have three new destination fields that can be calculated and recalculated depending on your application's purposes.
I found this question, which is similar to a problem that I would like to solve:
How to manage multiple tables with the same structure
However, due to the craptastical nature of VB, the solution doesn't really work. Specifically, it doesn't work because VB.NET requires the implementation of each method/property in an interface to be explicitly declared.
As for the problem that I'm really trying to solve, here it is:
I have many lookup/domain tables in the database that all have the same structure
The items in these tables are typically used for drop downs in the interface
I would like to avoid a bunch of boilerplate repository methods to retrieve the contents of these tables (one method per table really sucks when you have 40 tables)
I am not using the One True Lookup Table anti-pattern and this is not an option
Does anyone have another solution for this that would work in VB?
A generic repository should work in this case. There are many available online, or you can write a simpler one for just the lookup tables.
Here is the code that we ended up using:
Public Function GetDomainTableList(tableName As String) As IEnumerable(Of Object)
Dim table = CType(GetType(FECEntities).GetProperty(tableName).GetValue(DB, Nothing), IEnumerable(Of Object))
Dim dt = From r In table
Select r
Return dt.ToList()
End Function
I had originally thought that this wouldn't work for us since I was trying to project each object returned into a DomainTableItem class that I had written. But then I realized that the SelectList constructor didn't really care about the type of object that it takes in. You just pass in a String containing the property name and it uses reflection to pull out the value.
So everything works out peachy this way and I avoided writing one method per domain/lookup table.
I am trying to figure out the best way to model a spreadsheet (from the database point of view), taking into account :
The spreadsheet can contain a variable number of rows.
The spreadsheet can contain a variable number of columns.
Each column holds values of a single but unknown type (integer, date, string).
It has to be easy (and performant) to generate a CSV file containing the data.
I am thinking about something like :
class Spreadsheet(models.Model):
    name = models.CharField(max_length=100)
    creation_date = models.DateField()

class Column(models.Model):
    spreadsheet = models.ForeignKey(Spreadsheet)
    name = models.CharField(max_length=100)
    type = models.CharField(max_length=100)

class Cell(models.Model):
    column = models.ForeignKey(Column)
    row_number = models.IntegerField()
    value = models.CharField(max_length=100)
Can you think of a better way to model a spreadsheet? My approach stores every value as a string, and I am worried that will make generating the CSV file too slow.
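On the CSV worry: once the cells are fetched, pivoting (row_number, column_index, value) triples into CSV rows is cheap in plain Python. The helper below is a sketch under the assumption that cells arrive as such triples (the function and parameter names are illustrative, not part of the models above):

```python
import csv
import io

def cells_to_csv(column_names, cells):
    """Pivot (row_number, column_index, value) triples into CSV text.

    column_names: ordered list of column headers
    cells: iterable of (row_number, column_index, value) tuples
    """
    rows = {}
    for row_number, col_index, value in cells:
        # Create a blank row on first sight, then fill the cell in place.
        rows.setdefault(row_number, [""] * len(column_names))[col_index] = value

    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(column_names)
    for row_number in sorted(rows):
        writer.writerow(rows[row_number])
    return out.getvalue()

# Example: two columns, two rows
text = cells_to_csv(["Name", "Age"],
                    [(1, 0, "Alice"), (1, 1, "30"),
                     (2, 0, "Bob"), (2, 1, "41")])
```

Since everything is already stored as strings, no type conversion is needed on export; the cost is dominated by fetching the Cell rows, not by building the CSV.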
from a relational viewpoint:
Spreadsheet <-->> Cell : RowId, ColumnId, ValueType, Contents
there is no requirement for row and column to be entities, but you can make them so if you like
Databases aren't designed for this. But you can try a couple of different ways.
The naive way to do it is a version of One Table To Rule Them All. That is, create a giant generic table, with all columns typed as (n)varchar, that has enough columns to cover any foreseeable spreadsheet. Then you'll need a second table to store metadata about the first, such as what Column1's spreadsheet column name is, what type it stores (so you can cast in and out), etc. Then you'll need triggers that run against inserts, checking the incoming data against the metadata to make sure nothing is corrupt, etc., etc., etc. As you can see, this way is a complete and utter cluster. I'd run screaming from it.
The second option is to store your data as XML. Most modern databases have XML data types and some support for XPath within queries. You can also use XSDs to provide some kind of data validation, and XSLTs to transform that data into CSVs. I'm currently doing something similar with configuration files, and it's working out okay so far. No word on performance issues yet, but I'm trusting Knuth on that one.
The first option is probably much easier to search and faster to retrieve data from, but the second is probably more stable and definitely easier to program against.
It's times like this I wish Celko had a SO account.
You may want to study EAV (Entity-attribute-value) data models, as they are trying to solve a similar problem.
Entity-Attribute-Value - Wikipedia
The best solution greatly depends on the way the database will be used. Try to find a couple of the top use cases you expect and then decide on the design. For example, if there is no use case for fetching the value of a single cell from the database (the data is always loaded at row level, or even in groups of rows), then there is no need to store a 'cell' as such.
That is a good question that calls for many answers depending on how you approach it; I'd love to share an opinion with you.
This topic is one of several we've researched at Zenkit. We even wrote an article about it, and we'd love your opinion on it: https://zenkit.com/en/blog/spreadsheets-vs-databases/