I will briefly explain what I have and need here, and later if I can, I will edit this post and add a reproducible example.
My project:
Query data from Oracle databases into one worksheet in Excel, then use a LOOKUP procedure to copy data into an editable table in a second worksheet. The second worksheet needs to be in a table format for filtering, and have a drop-down option to filter the data by date ranges. The data needs to be refreshed 1-2 times a week, and only by 1-2 approved staff members.
Concerns:
Per a suggestion I installed Power Query for Excel 2010, which required dependencies before it could work. The convenience factor is great, and it means the SQL queries can be edited without messing around in VBA code. However, the dependency setup (an Oracle client for the data connections) limits casually deploying this as a solution.
The data connections, the queries, and the data lookup could all be done in VBA and assigned to macros.
Questions:
Should I use Power Query to query the data and then VBA for the second sheet's LOOKUP and date-range filtering -- or should this all be written as Excel VBA macros?
Which is more future-proof? Are there any advantages to Power Query that would make this task easier for non-coders to edit?
Thanks!
This can probably be solved with Power Query only, without VBA. I wouldn't recommend storing the queries in an Excel table; it is best to move them to the server. A view or a function would be suitable. Querying the database and editing this view/function would then work only for approved users.
This is more secure and requires only one Excel workbook. In Power Query, you can refer to the old copy of a table at the moment you refresh it, so you can keep the manually entered data and still pull in the new rows.
Your project seems to me like an ad-hoc solution.
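If you also want a small client-side guard in the workbook (on top of the database permissions, not instead of them), a minimal VBA sketch could look like the following. The connection name and the user list are assumptions; this is a convenience check, not real security, which should stay on the server as described above.

Sub RefreshIfApproved()
    ' Hypothetical connection name and approved Windows usernames.
    Const CONN_NAME As String = "OracleQuery"
    Dim approved As Variant, u As Variant, ok As Boolean
    approved = Array("jsmith", "mjones")
    For Each u In approved
        If StrComp(CStr(u), Environ$("USERNAME"), vbTextCompare) = 0 Then ok = True
    Next u
    If ok Then
        ThisWorkbook.Connections(CONN_NAME).Refresh
    Else
        MsgBox "Refresh is restricted to approved staff."
    End If
End Sub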
Related
I'm using the dropdown menus to make macros in Microsoft Access, but I'm confused as to how to save a query into a table, overwriting it. The actions are as follows:
CopyObject([Example1],Table,[Example2]) #for the purpose of the query
OpenQuery([Query1])
But from there I don't know how to get the query saved into a table. SaveObject isn't what I'm looking for; I want it in a table, not saved as a query. I couldn't find any questions on here answering the problem, so I thought I'd ask. Basically, I want to copy-paste a query into a table, as I can do in regular Access, but given the strict definitions of Access macros I don't think it'll work. I am familiar with SQL, but I'd rather not have to write SQL code, as I am doing this multiple times with different names for each table.
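For reference, if the macro-only constraint can be bent, the overwrite is only a few lines of VBA plus one line of Access SQL. This is a sketch using the placeholder names from above:

Sub QueryIntoTable()
    ' Drop the old copy of the target table if it exists.
    On Error Resume Next
    DoCmd.DeleteObject acTable, "Example2"
    On Error GoTo 0
    ' Materialize the saved query's results into a fresh table.
    DoCmd.SetWarnings False
    DoCmd.RunSQL "SELECT * INTO [Example2] FROM [Query1]"
    DoCmd.SetWarnings True
End Sub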
We work with a lot of data at my job, and I want to try and find a way to limit the amount of copying from SSMS to an Excel sheet that goes to the client.
What I want to be able to do, using SSIS if possible or any other way (maybe Power Query?), is to copy the data pulled via a SQL query into an Excel workbook sheet.
For example, if I want a count of members by state, I'd have the query run and the results copied to the sheet called "State" in the Excel workbook.
Example code:
SELECT C.State, COUNT(*) AS Count
FROM [dbo].Input I
JOIN Cassresults C ON C.ID = I.ID
GROUP BY C.State
ORDER BY Count DESC
The Excel workbook will never change for the client. The only thing that may change are the queries, but those are easily updated.
Is there a way to actually do this or am I nuts for thinking so? I hope I explained it well enough.
SSAS, SSIS, Power Query, Power BI, Excel PowerPivot, SSRS, and Excel data queries are all geared for this type of use. I would definitely NOT recommend VBA, as your users will constantly get a security warning and it is more complex than needed.
For Excel, a good starting location is probably the Data tab: click "From Other Sources" and check out the different source types. "From Microsoft Query" gives you the ability to write a query or paste one copied from SSMS.
The only question is: will the data sources change? If so, every workbook you create and distribute will become obsolete and need to be changed. SSRS is a good choice for allowing users to grab the report they need (and export it to Excel).
SSAS is great as well, but start with PowerPivot in Excel. Again, data connections can move; a SharePoint data connection library is one way to combat that.
This is really a BI and reporting design question, and you will get a plethora of answers.
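That said, if you do end up scripting it despite the security-prompt caveat above, the core idea is just "run the query, dump the recordset to the named sheet". A rough VBA/ADO sketch, where the connection string and sheet name are placeholders:

Sub QueryToSheet()
    ' Placeholder connection string -- point it at your server/database.
    Const CONN As String = "Provider=SQLOLEDB;Data Source=MyServer;" & _
        "Initial Catalog=MyDb;Integrated Security=SSPI;"
    Dim cn As Object, rs As Object, ws As Worksheet, i As Long
    Set cn = CreateObject("ADODB.Connection")
    cn.Open CONN
    Set rs = cn.Execute("SELECT C.State, COUNT(*) AS [Count] " & _
        "FROM dbo.Input I JOIN Cassresults C ON C.ID = I.ID " & _
        "GROUP BY C.State ORDER BY COUNT(*) DESC")
    Set ws = ThisWorkbook.Worksheets("State")
    ws.Cells.Clear
    For i = 0 To rs.Fields.Count - 1            ' header row
        ws.Cells(1, i + 1).Value = rs.Fields(i).Name
    Next i
    ws.Range("A2").CopyFromRecordset rs          ' data rows
    rs.Close
    cn.Close
End Sub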
Right, so I have been given a LOT of "consumer data" to sort: 3 Excel files, each containing multiple worksheets (up to 7), each with up to 1M rows (the max worksheet size in Excel 2013 is just over 1M rows).
I need to pull out of these all the people within a region, so I have a list of postcodes for this region (say 30 postcode areas).
How can I achieve this most easily?
If the data were in SQL Server, I'd just write a long SQL statement selecting everything where postcode LIKE 'B75%' OR postcode LIKE 'B74%', etc.
But in Excel I can only run a "filter" on one worksheet at a time... (I think)
Is it going to be easiest to throw all the data into SQL Server, or have I overlooked a method?
The first solution is to let Excel do the task for you. You need to add filters to the columns and set the filter criteria on each worksheet.
The other solution is to import the data into SQL tables. To do this, open SQL Server Management Studio, right-click the database you want the data in, and choose Tasks > Import Data with an Excel source. Do this for each Excel file you have. After importing all the data into the database, filter it with a SQL query.
The second solution is more reliable, but the first is faster to set up. You need to decide which one suits you.
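If you'd rather stay in Excel but still handle every worksheet in one pass, a brute-force VBA sketch like the one below could also work. It assumes the postcodes sit in column A of each sheet (an assumption; adjust to your layout) and copies matching rows to a new "Matches" sheet. Row-by-row scanning will be slow on millions of rows, so at this volume the SQL Server route is probably still the better call.

Sub PullRegionRows()
    Dim prefixes As Variant, ws As Worksheet, outWs As Worksheet
    Dim r As Long, lastRow As Long, outRow As Long, p As Variant
    prefixes = Array("B74", "B75")              ' extend to all ~30 prefixes
    Set outWs = ThisWorkbook.Worksheets.Add
    outWs.Name = "Matches"
    outRow = 1
    For Each ws In ThisWorkbook.Worksheets
        If ws.Name <> outWs.Name Then
            lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
            For r = 1 To lastRow
                For Each p In prefixes
                    ' Postcode assumed to be in column A of every sheet.
                    If UCase$(CStr(ws.Cells(r, 1).Value)) Like p & "*" Then
                        ws.Rows(r).Copy outWs.Rows(outRow)
                        outRow = outRow + 1
                        Exit For
                    End If
                Next p
            Next r
        End If
    Next ws
End Sub

Run it once per workbook (or open all three files and loop over Workbooks instead of just ThisWorkbook).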
Background
I have an extremely large data table that takes up to 12 hours to run for around 1 million input scenarios on a high-end 64-bit machine. The scenarios are based on a number of discrete Excel models that are then fed into a financial model for detailed calculations.
To improve the process, I am looking to test and compare the speeds of:
The current manual process
Using VBA to refresh the Data Table (with Calculation, ScreenUpdating etc off)
Running a VBS to refresh the Data Table in an invisible Excel instance
So, I am looking for the best approach to programmatically manage a Data Table
Update: using the code in (2) and (3) did not provide a benefit when testing a simple example (a workbook with a single large data table)
Rather surprisingly there seems to be very little - possibly no - direct support in VBA for Data Tables
My current knowledge and literature search
QueryTable BeforeRefresh and AfterRefresh Events can be added with this class module code. Intellisense doesn't provide this as an option for Data Tables
Individual PivotTables and QueryTables can be accessed like so: ActiveWorkbook.Sheets(1).QueryTables(1). Not so for Data Tables.
Eliminating all other Data Tables and then running a RefreshAll was suggested in this MrExcel thread as a workaround.
The workaround is certainly do-able as I only have a single Data Table, but I'd prefer a direct approach if one exists.
Yes, I'm sticking to Excel :)
Please do not suggest other tools for this approach; both the input models and the overarching model that uses the data table are
part of a well-established ongoing process that will stay Excel based,
have been professionally audited, and
have been streamlined and optimised by some experienced Excel designers.
I was simply curious whether there was a way to tweak the process by refreshing a specific data table with code, and my initial test results above suggest the answer is no.
So, you are looking for the best approach to programmatically manage a Data Table.
Well, Excel 2013 does record a macro for me when I manually create a data table; it goes
Selection.Table ColumnInput:=Range("G4")
The signature is
Range.Table(RowInput As Range, ColumnInput As Range) As Boolean
which is documented in Range.Table Method. The Range.Table() function seems to always return true.
This is the only way to create data tables using VBA. But that's all there is to data tables anyway.
AFAIK there is no class or object for data tables, so there is no dt.refresh() or similar method. And there is no collection of data tables you could query. You have to refresh the sheet or recreate the table with Range.Table().
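To make that concrete, here is a minimal sketch of the recalculate-in-place option. The sheet name and range are assumptions, and I have not verified that a range-level Calculate hits the TABLE() array formulas in every Excel version:

Sub RecalcDataTable()
    ' Keep normal formulas automatic but hold data tables back...
    Application.Calculation = xlCalculationSemiautomatic
    ' ...then explicitly recalculate just the range holding the data table.
    ThisWorkbook.Worksheets("Scenarios").Range("B4:H1000").Calculate
End Sub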
There is a DataTable Interface, but it is related to charts and has nothing to do with Range.Table().
As you mention, you should turn off the usual suspects, i.e.
Application.ScreenUpdating = False
Application.DisplayStatusBar = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False
Try to have as few formulas in your workbook as possible. Remove all formulas not related to the cells you base the data table on. Remove any intermediate results. It is best to have one cell with one, possibly big, formula.
Example: G4 is your ColumnInput, and it contains =2*G3, with G3 containing =G1+G2,
then better put =2*(G1+G2) into G4.
You may have 6 cores in your high-end machine. Divide your scenarios into 6 chunks and have 6 Excel instances calculate them in parallel.
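A sketch of that fan-out is below. Everything about it is an assumption to verify on your setup: the install path, the chunk file names, and the /x switch (which asks Windows to start a separate Excel process). Each chunk workbook is assumed to calculate and save itself from its own Workbook_Open handler.

Sub LaunchChunkInstances()
    ' Hypothetical install path -- adjust for your machine.
    Const EXE As String = "C:\Program Files\Microsoft Office\Office15\EXCEL.EXE"
    Dim i As Long
    For i = 1 To 6
        ' One separate Excel process per chunk, so the six runs overlap.
        Shell """" & EXE & """ /x ""C:\Models\Chunk" & i & ".xlsm""", _
              vbMinimizedNoFocus
    Next i
End Sub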
problem
how to best parse/access/extract "excel file" data stored as binary data in an SQL 2005 field?
(so all the data can ultimately be stored in other fields of other tables.)
background
basically, our customer is requiring a large volume of verbose data from their users. unfortunately, our customer cannot require any kind of db export from their users. so our customer must supply some sort of UI for their users to enter the data. the UI our customer decided would be acceptable to all of their users was excel, as it has a reasonably robust UI. so given all that, our customer needs this data parsed and stored in their db automatically.
we've tried to convince our customer that the users will do this exactly once and then insist on db export! but the customer can not require db export of their users.
our customer is requiring us to parse an excel file
the customer's users are using excel as the "best" user interface to enter all the required data
the users are given blank excel templates that they must fill out
these templates have a fixed number of uniquely named tabs
these templates have a number of fixed areas (cells) that must be completed
these templates also have areas where the user will insert up to thousands of identically formatted rows
when complete, the excel file is submitted from the user by standard html file upload
our customer stores this file raw into their SQL database
given
a standard excel (".xls") file (native format, not comma or tab separated)
file is stored raw in a varbinary(max) SQL 2005 field
excel file data may not necessarily be "uniform" between rows -- i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, different "formats", ...)
requirements
code completely within SQL 2005 (stored procedures, SSIS?)
be able to access values on any worksheet (tab)
be able to access values in any cell (no formula data or dereferencing needed)
cell values must not be assumed to be "uniform" between rows -- i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, formulas, different "formats", ...)
preferences
no filesystem access (no writing temporary .xls files)
retrieve values in defined format (e.g., actual date value instead of a raw number like 39876)
My thought is that anything can be done, but there is a price to pay. In this particular case, the price seems to be too high.
I don't have a tested solution for you, but I can share how I would take a first shot at a problem like this.
My first approach would be to install Excel on the SQL Server machine, code some assemblies that use the Excel API to consume the files stored in your rows, and then load them into SQL Server as CLR procedures.
As I said, this is just an idea; I don't have the details, but I'm sure others here can complement or criticize it.
But my real advice is to rethink the whole project. It makes no sense to read tabular data from binary files stored in a cell of a row of a table in a database.
This looks like an "I wouldn't start from here" kind of a question.
The "install Excel on the server and start coding" answer looks like the only route, but it simply has to be worth exploring alternatives first: it's going to be painful, expensive and time-consuming.
I strongly feel that we're looking at a "requirement" that is the answer to the wrong problem.
What business problem is creating this need? What's driving that? Try the Five Whys as a possible way to explore the history.
It sounds like you're trying to store an entire database table inside a spreadsheet and then inside a single table's field. Wouldn't it be simpler to store the data in a database table to begin with and then export it as an XLS when required?
Without opening up an instance of Excel and having Excel resolve worksheet references, I'm not sure it's doable at all.
Could you write the varbinary to a Raw File Destination? And then use an Excel Source as your input to whatever step is next in your precedence constraints.
I haven't tried it, but that's what I would try.
Well, the whole setup seems a bit twisted :-) as others have already pointed out.
If you really cannot change the requirements and the whole setup: why don't you explore components such as Aspose.Cells or Syncfusion XlsIO, native .NET components that allow you to read and interpret native Excel (XLS) files? I'm pretty sure that with either of the two you should be able to read your binary Excel into a MemoryStream, feed that into one of those Excel-reading components, and off you go.
So with a bit of .NET development and SQL CLR, I guess this should be doable - not sure if it's the best way to do it, but it should work.