Automated Comparison of Data Model and Actual Database Tables - VBA

I have my data models in Excel sheets and my actual database is PostgreSQL 9.5. I would like to build an automated process that compares the tables in the database against the data models in Excel, and either makes the changes in the DB automatically or at least lists the differences between them. How can I do this? Can it be done using VBA macros, or is there another alternative? Please give your suggestions on this.

Comparison is one of Excel's bigger weaknesses. My approach would be something like this:
1. Make use of PostgreSQL's built-in functionality to describe its data model and copy that to Excel (or pull it in via ODBC if you want to over-engineer it)
2. Reshape the output of step 1 into the same format as your Excel-based data model
3. Do the comparison (either in Excel or in an external diff tool)
Steps 1 and 2 can be done in VBA with a lot of string manipulation, but can also be a copy/paste operation, depending on what tools you have available. The transformation in step 2 can also be handled with Get & Transform (in newer Excel) or Power Query (in older Excel).
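For step 1, here is a minimal VBA sketch using ADO over ODBC; the DSN, credentials, and target sheet name are placeholders, and it assumes the PostgreSQL ODBC driver is installed:

Sub PullPostgresSchema()
    ' Late-bound ADO; alternatively add a reference to Microsoft ActiveX Data Objects
    Dim cn As Object, rs As Object, ws As Worksheet
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "DSN=PgDSN;UID=youruser;PWD=yourpassword;"   ' hypothetical DSN
    Set rs = cn.Execute( _
        "SELECT table_name, column_name, data_type, is_nullable " & _
        "FROM information_schema.columns " & _
        "WHERE table_schema = 'public' " & _
        "ORDER BY table_name, ordinal_position;")
    Set ws = ThisWorkbook.Worksheets("DbModel")          ' hypothetical target sheet
    ws.Cells.Clear
    ws.Range("A1:D1").Value = Array("table", "column", "type", "nullable")
    ws.Range("A2").CopyFromRecordset rs                  ' dump the full result set
    rs.Close
    cn.Close
End Sub

From there, step 2 is a matter of reshaping this sheet to mirror your data model layout, and step 3 can be as simple as conditional formatting over the two ranges or an external diff of two exported CSVs.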

Related

Database to Excel charts (or PDF)

Hi guys, I have a simple DB with two fields in it (a time and a number 1-3). The data needs to be exported and shown in simple charts (horizontal bars from 0 to the max time in my DB). What is the best way to do that?
The easiest way is to establish direct data access from Excel to SQL Server and use Excel's own charting abilities.
If you need the data "exported", it is quite easy to get a table out as a CSV list. Again, this can be opened directly in Excel to do the graphical work there.
Depending on your environment you might also consider a reporting tool; the obvious first choice would be SSRS or Power BI, which comes with SQL Server.
You might even use a SELECT * FROM YourTable and just copy-and-paste the full result into Excel.
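If you go the direct-access route, here is a minimal sketch of pulling a query result into a sheet with a VBA QueryTable; the server, database, and table names are placeholders, and it assumes Windows authentication:

Sub ImportTable()
    Dim qt As QueryTable
    Set qt = ActiveSheet.QueryTables.Add( _
        Connection:="OLEDB;Provider=SQLOLEDB;Data Source=YourServer;" & _
                    "Initial Catalog=YourDb;Integrated Security=SSPI;", _
        Destination:=ActiveSheet.Range("A1"))
    qt.CommandText = "SELECT * FROM YourTable"   ' hypothetical table
    qt.Refresh BackgroundQuery:=False            ' pull the data synchronously
End Sub

Once the data is on a sheet, the horizontal bar chart is ordinary Excel charting work, and the QueryTable can simply be refreshed whenever the data changes.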
The main things to think about:
One-time action or recurring task
Degree of automation
Size of data / row count
Location / access rights / linkability of your systems
Existing tools

Power Query + VBA/Macro vs. VBA/Macro only

I will briefly explain what I have and need here, and later if I can, I will edit this post and add a reproducible example.
My project:
Query data from Oracle databases into one worksheet in Excel, then use a LOOKUP procedure to copy data into an editable table in a second worksheet. The second worksheet needs to be in a table format for filtering, and have a drop-down option to filter the data by date ranges. The data needs to be refreshed 1-2 times a week, by only 1-2 approved staff members.
Concerns:
Following a suggestion, I installed Power Query for Excel 2010, which required dependencies before it would work. The convenience factor is great, and it means the SQL queries can be edited without messing around in VBA code. However, the dependency setup (an Oracle client for the data connections) limits casually deploying this as a solution.
The data connections, the queries, and the data lookup could all be done in VBA and assigned to macros.
Questions:
Should I use Power Query to query the data and then VBA for the second sheet's LOOKUP and date-range filtering, or should this all be written in Excel VBA macros?
Which is more future-proof? Are there any advantages to using Power Query that would make this task more edit-friendly for non-coders?
Thanks!
This can probably be solved with Power Query only, without VBA. I wouldn't recommend storing the queries in an Excel table; the best option is to move them onto the server. A view or a function would be suitable. Querying the database and editing this view/function will then work only for approved users.
This is more secure and requires only one Excel workbook. In Power Query you can refer to the old copy of a table at the moment you refresh it, so you can keep the data already entered and append what is new.
Your project seems to me like an ad-hoc solution.
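That said, if you do keep a thin VBA layer for the refresh, here is a minimal sketch of gating it to approved users; the connection name and user list are hypothetical, and note this is a convenience only - real access control belongs on the server, as above:

Sub RefreshIfApproved()
    Const APPROVED As String = "jsmith,mjones"   ' hypothetical list of approved logins
    If InStr(1, APPROVED, LCase$(Environ$("USERNAME")), vbTextCompare) = 0 Then
        MsgBox "You are not authorised to refresh this data.", vbExclamation
        Exit Sub
    End If
    ' "Query - OracleData" is a hypothetical Power Query connection name
    ThisWorkbook.Connections("Query - OracleData").Refresh
End Sub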

Programmatically control/intercept a Data Table refresh

Background
I have an extremely large data table that takes up to 12 hours to run for around 1 million input scenarios on a high-end 64-bit machine. The scenarios are based on a number of discrete Excel models, which are then fed into a financial model for detailed calculations.
To improve the process, I am looking to test and compare the speeds of:
1. The current manual process
2. Using VBA to refresh the Data Table (with Calculation, ScreenUpdating etc. off)
3. Running a VBS to refresh the Data Table in an invisible Excel instance
So, I am looking for the best approach to programmatically manage a Data Table.
Update: using the code in (2) and (3) did not provide a benefit when testing a simple example workbook with a single large data table.
Rather surprisingly, there seems to be very little - possibly no - direct support in VBA for Data Tables.
My current knowledge and literature search
QueryTable BeforeRefresh and AfterRefresh events can be added with class module code (sketched below); IntelliSense doesn't offer them as an option for Data Tables.
Individual PivotTables and QueryTables can be accessed like so: ActiveWorkbook.Sheets(1).QueryTables(1). Not so Data Tables.
Eliminating all other Data Tables and then running a RefreshAll was suggested in a MrExcel thread as a workaround.
The workaround is certainly doable as I only have a single Data Table, but I'd prefer a direct approach if one exists.
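For reference, a minimal sketch of such a class module for QueryTables (the class name is arbitrary); it works as described, but again, nothing equivalent exists for Data Tables:

' Class module "clsQtEvents" (hypothetical name)
Public WithEvents qt As QueryTable

Private Sub qt_BeforeRefresh(Cancel As Boolean)
    Debug.Print "About to refresh "; qt.Name
End Sub

Private Sub qt_AfterRefresh(ByVal Success As Boolean)
    Debug.Print "Refreshed "; qt.Name; " - success: "; Success
End Sub

' Hooked up from a standard module with something like:
'   Dim watcher As New clsQtEvents
'   Set watcher.qt = ActiveWorkbook.Sheets(1).QueryTables(1)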
Yes, I'm sticking to Excel :)
Please do not suggest other tools for this task; both the input models and the overarching model that uses the data table
are part of a well-established ongoing process that will stay Excel-based,
have been professionally audited,
and have been streamlined and optimised by some experienced Excel designers.
I was simply curious whether there is a way to tweak the process by refreshing a specific data table with code, and my initial test results above suggest the answer is no.
So, you are looking for the best approach to programmatically manage a Data Table.
Well, Excel 2013 does record a macro for me when I manually create a data table; it records
Selection.Table ColumnInput:=Range("G4")
The signature is
Range.Table(RowInput As Range, ColumnInput As Range) As Boolean
which is documented under the Range.Table method. The Range.Table() function seems to always return True.
This is the only way to create data tables using VBA. But that's all there is to data tables anyway.
AFAIK there is no class or object for data tables, so there is no dt.refresh() or similar method. And there is no collection of data tables you could query. You have to refresh the sheet or recreate the table with Range.Table().
There is a DataTable Interface, but it is related to charts and has nothing to do with Range.Table().
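So the closest thing to a targeted refresh appears to be recalculating the sheet that holds the table, or recreating the table over its existing range; a minimal sketch, where the sheet name, table range, and input cell are placeholders:

Sub RefreshDataTable()
    With ThisWorkbook.Worksheets("Sheet1")
        ' Option 1: recalculate just this sheet - its data tables recalc with it
        .Calculate
        ' Option 2: recreate the table in place over its existing range
        .Range("B2:C100").Table ColumnInput:=.Range("G4")
    End With
End Sub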
As you mention, you should turn off the usual suspects, i.e.
Application.ScreenUpdating = False
Application.DisplayStatusBar = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False
Try to have as few formulas in your workbook as possible. Remove all formulas not related to the cells you base the data table on. Remove any intermediate results. It is best to have one cell with one, possibly big, formula.
Example: G4 is your ColumnInput and it contains =2*G3, with G3 containing =G1+G2;
then it is better to put =2*(G1+G2) into G4.
You may have 6 cores in your high-end machine. Divide your scenarios into 6 chunks and have 6 Excel instances calculate them in parallel.
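A minimal sketch of that idea, assuming you have pre-split the scenarios into per-chunk workbooks whose Workbook_Open code recalculates and saves; the paths are hypothetical:

Sub RunChunksInParallel()
    Dim i As Long
    For i = 1 To 6
        ' Each Shell call starts an independent Excel process, so the six
        ' chunks genuinely calculate in parallel (assumes excel.exe is on PATH)
        Shell "excel.exe ""C:\Scenarios\chunk" & i & ".xlsm""", vbMinimizedNoFocus
    Next i
End Sub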

SSRS Report Builder 2.0 - store static data to use against query results

Does anyone know if there is a way to import a spreadsheet into Report Builder 2.0 and then make calculations against my dataset with it?
This might seem like a novice question, as my limited experience with Report Builder does not help.
The reason I want to do this is so that my main dataset doesn't have to run the query that works out averages over hundreds of thousands of records, as it takes ages to run. By keeping the benchmark average data static, I could run my query and do the calculations in Report Builder, which would make it 100 times faster.
Thank you for your time in advance
You may be able to overcome this by adding Calculated Fields to your dataset (DS). I am assuming that your static data can be related to your dataset by using at least one existing field. Using the Switch function, you can populate your calculated fields. Switch “evaluates a list of expressions and returns an Object value corresponding to the first expression in the list that is True.”
You can use the function like this:
=Switch(Fields!DsField1.Value = 2, "Your Value1", Fields!DsField1.Value = 5, "Your Value2", Fields!DsField1.Value = 10, "Your Value3", ...)
If you have any condition that needs to be checked, you can add it before the Switch statement like this:
=IIF(Fields!DsField20.Value <> 1000, Switch(Fields!DsField1.Value = 2, "Your Value1", Fields!DsField1.Value = 5, "Your Value2", Fields!DsField1.Value = 10, "Your Value3", ...), Nothing)
You can have your values in an Excel sheet to make the creation of the formula easier. Simply create your formula in the first row, copy the row down to extend your formula, and cover all your values. Then from Excel simply copy and paste the column of data into your calculated field(s).
Here's an example of my Excel formulas. This is the best I could do, as I could not paste the sample here. You can copy these and replace the values with your own.
In Cell-A2 2
In Cell-B2 YourValue1
In Cell-C2 YourOtherValue1
In Cell-D2 YourOtherOtherValue1
In Cell-E2 YourOtherOtherOtherValue1
In Cell-F2 ="Fields!DsField1.Value ="&A2&","&""""&B2&""""&","
In Cell-G2 ="Fields!DsField1.Value ="&$A2&","&""""&C2&""""&","
In Cell-H2 ="Fields!DsField1.Value ="&$A2&","&""""&D2&""""&","
In Cell-I2 ="Fields!DsField1.Value ="&$A2&","&""""&E2&""""&","
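With those inputs, cell F2 evaluates to the text fragment

Fields!DsField1.Value =2,"YourValue1",

and each column to the right produces the matching fragment for its own value column; pasted together down the rows, the fragments form the body of the Switch expression.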
Sorry if there is anything I have missed; I did this in a rush.
Report Builder doesn't have anywhere you can 'import' the spreadsheet to, except one of the databases you are querying from. And Excel isn't a supported data source for SSRS; however, it might be possible to add a report data source that uses an ODBC DSN pointing to the appropriate Excel file (I haven't tried it).
But I can foresee some problems with this approach - it may get upset under multiple users, and I expect you may find the file gets locked so you can't update it very easily.
An approach that might work could be to upload the static data into an Access database (as that is supported via the OLE DB Jet provider) and reference that as a data source; but the best approach is always going to be importing the static data into a table in your main database and using that.
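If you take that last route, here is a minimal VBA sketch of pushing the spreadsheet rows into a table in the main database via ADO; the connection string, sheet, and table/column names are placeholders, and real code should use a parameterised command rather than string concatenation:

Sub UploadBenchmarks()
    Dim cn As Object, ws As Worksheet, r As Long
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=SQLOLEDB;Data Source=YourServer;" & _
            "Initial Catalog=YourDb;Integrated Security=SSPI;"
    Set ws = ThisWorkbook.Worksheets("Benchmarks")   ' hypothetical sheet
    For r = 2 To ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
        ' Assumes two numeric columns: a group id and its benchmark average
        cn.Execute "INSERT INTO BenchmarkAverages (GroupId, AvgValue) " & _
            "VALUES (" & ws.Cells(r, 1).Value & ", " & ws.Cells(r, 2).Value & ")"
    Next r
    cn.Close
End Sub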

How can you parse an Excel (.xls) file stored in a varbinary in MS SQL 2005?

Problem
How can we best parse/access/extract "Excel file" data stored as binary data in a SQL Server 2005 field
(so that all the data can ultimately be stored in fields of other tables)?
Background
Basically, our customer requires a large volume of verbose data from their users. Unfortunately, our customer cannot require any kind of DB export from their users, so the customer must supply some sort of UI for the users to enter the data. The UI our customer decided would be acceptable to all of their users was Excel, as it has a reasonably robust UI. Given all that, our customer needs this data parsed and stored in their DB automatically.
We've tried to convince our customer that the users will do this exactly once and then insist on a DB export! But the customer cannot require a DB export of their users.
our customer is requiring us to parse an Excel file
the customer's users are using Excel as the "best" user interface to enter all the required data
the users are given blank Excel templates that they must fill out
these templates have a fixed number of uniquely named tabs
these templates have a number of fixed areas (cells) that must be completed
these templates also have areas where the user will insert up to thousands of identically formatted rows
when complete, the Excel file is submitted by the user via a standard HTML file upload
our customer stores this file raw in their SQL database
Given
a standard Excel (".xls") file (native format, not comma- or tab-separated)
the file is stored raw in a varbinary(max) SQL 2005 field
the Excel file data may not necessarily be "uniform" between rows - i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, different "formats", ...)
Requirements
code completely within SQL 2005 (stored procedures, SSIS?)
be able to access values on any worksheet (tab)
be able to access values in any cell (no formula data or dereferencing needed)
cell values must not be assumed to be "uniform" between rows - i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, formulas, different "formats", ...)
Preferences
no filesystem access (no writing temporary .xls files)
retrieve values in a defined format (e.g., an actual date value instead of a raw number like 39876)
My thought is that anything can be done, but there is a price to pay, and in this particular case the price seems to be too high.
I don't have a tested solution for you, but I can share how I would take a first stab at a problem like this.
My first approach would be to install Excel on the SQL Server machine and write assemblies that consume the file from your rows using the Excel API, then load them into SQL Server as CLR assembly procedures.
As I said, this is just an idea; I don't have the details, but I'm sure others here can complement or criticise it.
But my real advice is to rethink the whole project. It makes little sense to read tabular data out of binary files stored in a cell of a row of a database table.
This looks like an "I wouldn't start from here" kind of a question.
The "install Excel on the server and start coding" answer looks like the only route, but it simply has to be worth exploring alternatives first: it's going to be painful, expensive and time-consuming.
I strongly feel that we're looking at a "requirement" that is the answer to the wrong problem.
What business problem is creating this need? What's driving that? Try the Five Whys as a possible way to explore the history.
It sounds like you're trying to store an entire database table inside a spreadsheet and then inside a single table's field. Wouldn't it be simpler to store the data in a database table to begin with and then export it as an XLS when required?
Without opening an instance of Excel and having Excel resolve worksheet references, I'm not sure it's doable at all.
Could you write the varbinary to a Raw File Destination, and then use an Excel Source as the input to whatever step is next in your precedence constraints?
I haven't tried it, but that's what I would try.
Well, the whole setup seems a bit twisted :-), as others have already pointed out.
If you really cannot change the requirements and the whole setup: why don't you explore components such as Aspose.Cells or Syncfusion XlsIO, native .NET components that allow you to read and interpret native Excel (XLS) files? I'm pretty sure that, with either of the two, you should be able to read your binary Excel into a MemoryStream, feed that into one of those Excel-reading components, and off you go.
So with a bit of .NET development and SQL CLR, I guess this should be doable - not sure if it's the best way to do it, but it should work.