How to query data in Excel from Visual Studio 2013?

Background:
I am transferring data from one Excel document, doc0, to a templated Excel document, doc1, to speed up processes at work. My only real restriction is that I cannot modify the document's formatting, so regular VBA is not an option. I can only pull data out of doc0, modify it, and place it in doc1. I am using Visual Studio 2013 to do so.
What I need to do is:
Organize doc0 numerically by Col 1 first, then Col 3 second. Then place the top 10 results in a specific cell range in doc1.
Get a count for jobs assigned to each worker and return that result to Visual Studio. Worker names are listed in Col 4.
I know how to query using SQL, but am open to using other functions/languages that can perform the same task.
Question:
How can I query the data to perform the actions above?
A simple example can be seen with the link below. The blue represents doc0, the red the results to be displayed in doc1 and the green is the results that I need to have returned to corresponding textboxes in Visual Studio.

There are a few options. ADO.NET can connect to your Excel sheet through OLE DB and read data with simple query capability. Examples can be found in KB316934.
Connect to an Excel file to read and write data:
Connection String
To access an Excel workbook by using the Jet OLE DB Provider, use a connection string that has the following syntax:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Book1.xls;Extended Properties="Excel 8.0;HDR=YES;"
Depending on your Excel version, the connection string may differ slightly. Look them up here. E.g. 2013 would look like:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myExcel2007file.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES";
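As a rough illustration of how the two connection strings above differ, here is a minimal Python sketch that picks the provider from the file extension. The function name excel_connection_string and its header parameter are hypothetical, and only the two combinations quoted above (.xls with Jet, .xlsx with ACE) are covered; other formats would need their own lookup.

```python
import os


def excel_connection_string(path, header=True):
    """Build an OLE DB connection string for an Excel workbook.

    Hypothetical helper: it only knows the two provider/format pairs
    quoted in the answer above.
    """
    ext = os.path.splitext(path)[1].lower()
    hdr = "YES" if header else "NO"
    if ext == ".xls":
        provider, excel_type = "Microsoft.Jet.OLEDB.4.0", "Excel 8.0"
    elif ext == ".xlsx":
        provider, excel_type = "Microsoft.ACE.OLEDB.12.0", "Excel 12.0 Xml"
    else:
        raise ValueError("no connection string known for " + ext)
    return (
        f"Provider={provider};Data Source={path};"
        f'Extended Properties="{excel_type};HDR={hdr}";'
    )


xls_string = excel_connection_string(r"C:\Book1.xls")
xlsx_string = excel_connection_string(r"c:\myFolder\myExcel2007file.xlsx")
```

The resulting string would then be handed to whatever OLE DB client you use; the sketch stops at building the text.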
Read and write data
Use the sheet name followed by a dollar sign (for example, [Sheet1$] or [My Worksheet$]). A workbook table that is referenced in this manner includes the whole used range of the worksheet.
select * from [Sheet1$]
Use a range with a defined name (for example, [MyNamedRange]):
Select * from [MyNamedRange]
Use a range with a specific address (for example, [Sheet1$A1:B10]):
Select * from [Sheet1$A1:B10]
Writing is done in a similar manner if you're using OLE DB:
INSERT INTO [Sheet1$] (F1, F2) values ('111', 'ABC')
UPDATE [Sheet1$] SET F2 = 'XYZ' WHERE F1 = '111'
You may need to create a temporary copy from which you can query data, as you may be reading and writing to the documents using different techniques.
Full example at the link (unfortunately in VB.NET).
Alternative solutions
If you really want full fidelity access to the Excel file, without depending on Excel being present or running, you could also investigate:
EPPlus (Excel Package Plus)
NPOI
Aspose Cells.NET (commercial)
These packages do not support querying, so you'll need to extract the data into objects and use LINQ to Objects to query and sort the data before writing it back to the files.
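The extract-then-query approach can be sketched independently of any Excel library. The Python snippet below is only an illustration under assumptions: the rows list is made-up data standing in for doc0, with the worker name in the fourth position, and the function names are hypothetical. The same shape would carry over to OrderBy/ThenBy/Take and GroupBy in LINQ.

```python
from collections import Counter
from operator import itemgetter

# Hypothetical rows standing in for doc0: (Col 1, Col 2, Col 3, Col 4 = worker)
rows = [
    (2, "job-b", 5, "Ann"),
    (1, "job-a", 9, "Bob"),
    (1, "job-c", 3, "Ann"),
    (3, "job-d", 1, "Cid"),
]


def top_rows(data, limit=10):
    """Sort by Col 1 first, then Col 3, and keep the first `limit` rows."""
    return sorted(data, key=itemgetter(0, 2))[:limit]


def jobs_per_worker(data):
    """Count how many rows (jobs) name each worker in Col 4."""
    return Counter(row[3] for row in data)


top = top_rows(rows)
counts = jobs_per_worker(rows)
```

Here top would be written to the cell range in doc1, and counts would feed the textboxes.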

Related

How to use a SQL Select statement with Power Query against an Access database?

I've got a query that joins 4 tables that I need to run against 4 different Access .mdb files (all have the same schema) so I can compare the results in Excel. Instead of creating 16 Power Queries and joining them into 4 queries (20 total query objects) I want to write a SQL statement that joins the tables and run it against each of the 4 different data sources. There's a chance that the SQL statement may need to be updated, so having it stored in one place will make future maintenance easier.
I could not find examples of this online and the way that Power Query writes M for an Access connection is based on one table at a time. I did not want a solution that used VBA.
Poking around with the various Power Query connectors I found that I can use the ODBC connector to connect to an Access database. I was able to adjust the parameters and pass it a standard SQL statement.
I put the SQL statement in a cell (C16 in the image) and named that range Package_SQL. I also have 4 cells where I put the path and filename of the 4 Access .mdb files I want to query. I name those ranges Database1 through Database4.
This is the configuration screen to set the database paths and set the SQL statement
let
    // Get the Access database to work with.
    dbPath = Excel.CurrentWorkbook(){[Name="Database1"]}[Content]{0}[Column1],
    // Get the SQL statement from the named range.
    SQL = Excel.CurrentWorkbook(){[Name="Package_SQL"]}[Content]{0}[Column1],
    Source = Odbc.Query("dbq=" & dbPath & "; defaultdir=C:\Temp;driverid=25;fil=MS Access;maxbuffersize=2048;pagetimeout=5;dsn=MS Access Database", SQL),
    #"Changed Type" = Table.TransformColumnTypes(Source,
        {{"Issue_Date", type date}, {"Revision_Issue_Date", type date}})
in
    #"Changed Type"
As you can see, the magic is done in the following line. I didn't want defaultdir hard-coded to a folder that not everyone has, so I set it to C:\Temp. You may need to change it, or even remove it and see whether it makes a difference.
Source = Odbc.Query("dbq=" & dbPath & "; defaultdir=C:\Temp; driverid=25;fil=MS Access;maxbuffersize=2048; pagetimeout=5; dsn=MS Access Database", SQL),
I made 4 instances of that query and created another query to combine the results. The query runs as fast as most any other Access query. I am very satisfied with this solution. The query can be altered and/or repurposed from the Excel sheet without digging through the Power Query scripts.
Note that this solution does not use any VBA.
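The overall pattern (one SQL statement kept in a single place, run against several databases with the same schema, results combined) can be sketched outside Power Query as well. The Python example below is only an illustration: in-memory SQLite databases stand in for the Access .mdb files, and the package table, its columns, and the function names are hypothetical.

```python
import sqlite3

# Hypothetical shared statement, stored once (the role of Package_SQL above).
SQL = "SELECT id, name FROM package ORDER BY id"


def make_db(rows):
    """Create a stand-in database with the shared schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE package (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO package VALUES (?, ?)", rows)
    return conn


def run_everywhere(sql, sources):
    """Run the same SQL against each source and tag rows with the source label."""
    combined = []
    for label, conn in sources.items():
        for row in conn.execute(sql):
            combined.append((label,) + tuple(row))
    return combined


sources = {
    "Database1": make_db([(1, "a")]),
    "Database2": make_db([(1, "b"), (2, "c")]),
}
combined = run_everywhere(SQL, sources)
```

Tagging each row with its source label is what makes the combined result usable for comparison.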

Excel - Off Page Reference to Microsoft Query

I am utilizing Microsoft Query in Excel to tap into an ERP table structure like Crystal would do.
In writing the SQL, is there a way to have a filter pulled from the active Excel worksheet that is embedded in the SQL instead of prompting and editing the query?
My main problem is a Like [Prompt]% filter: I want users to be able to change values such as order numbers from the Excel GUI.
Is it possible to do an off page reference from MS Query to Excel?
If by "Microsoft Query" you mean the window that looks like it was coded for Windows 95, stop using it. It is provided only for backward compatibility.
Anyway, if you've displayed the criteria bar in MS Query, you can type a name between brackets, e.g. [Something], and MS Query will prompt you to fill in a value.
That is not what you want yet, but it is getting close. When you return to Excel and refresh the query, the prompt will now offer to use a cell instead of a value you need to type every time.
In the more modern connection utility, accessible via the menu Data > Connections (and available even if you created your table via MS Query), you can achieve that by using question marks in the WHERE clause.
For instance, instead of SomeField = 'SomeValue', write SomeField = ?
Then, click on the Parameters button and you'll see all the parameters you've set, each of them can be attached to a cell's value.
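The question-mark placeholder is the same positional-parameter idea found in most database APIs. As a hedged illustration only, the Python sketch below uses an in-memory SQLite database as a stand-in for the ERP source; the orders table, its columns, and orders_matching are hypothetical, and the prefix argument plays the role of the cell value bound to the parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("SO1001", 5), ("SO1002", 3), ("PO2001", 7)],
)


def orders_matching(prefix):
    """Return order numbers starting with `prefix`, bound through a ? placeholder."""
    cur = conn.execute(
        "SELECT order_no FROM orders WHERE order_no LIKE ? ORDER BY order_no",
        (prefix + "%",),  # the wildcard is appended to the value, not to the SQL text
    )
    return [row[0] for row in cur]


so_orders = orders_matching("SO")
```

Because the SQL text never changes, users only edit the bound value, which is the behaviour the Parameters button gives you in Excel.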

What could be the SQL query for a data source that is a spreadsheet, to be returned to a separate spreadsheet? Including UDFs in the query?

I currently have a data source of a large table, sitting in workbook1. From workbook2, which is currently empty, I wish to set up a DSN connection to workbook1, so that I can query it from workbook 2.
In the SQL query result, I wish to display extra columns which are calculated using User-Defined VBA functions, the arguments of which will be other fields from the data source.
Example:
Workbook1 is Field1, F2, F3 and F4. I wish to query this and display all records, but additionally I wish to have F5=UDF(F3,F4).
I have been advised already that the solution to this is:
SELECT UDF(F3,F4) as F5
FROM \SourceWorkBookLocation\SourceWorkBook
IN ACCESS:
The problem I am having in Access is not at the top of my list right now; it relates to data types and trying to determine whether a number in a string is <25. The main problem is in MS Query:
IN EXCEL/MS QUERY:
The function is just not recognized: "undefined function".
I am not sure how to get it to see the function. My end goal here is to build a front end in Excel and have VBA run the queries, passing user input variables to them. The querying will be done on a separately updated workbook.
Any ideas on how to get MS Query to see my UDF and accept what I am doing? Could it be a driver issue? There is a range of Excel drivers to choose from.
Thanks
Looking at the info you have provided, you have tried to use two Excel workbooks as tables and query them using an Excel VBA UDF. I assume you are now going to use these workbooks as your tables, but in MS Access.
Almost all databases can read standard SQL, but each database can only handle functions written in its own environment. In your case, write your UDF in Access VBA and then try to execute the same query.
This is a common issue people face when trying to access MS Access UDFs from Excel or vice versa. In a nutshell, when you're running MS Access, queries can call back into VBA. But when you're going through ODBC or ADO, the Jet engine doesn't have the whole VBA model to draw on because it's simply not running.
You could try to do something like this:
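Dim objExcl As Object
Dim wb As Object
Dim result As Variant
Set objExcl = CreateObject("Excel.Application")
' Open the workbook that contains the UDF, then call it by name
Set wb = objExcl.Workbooks.Open("ExcelFileName/Path")
result = objExcl.Run("UDFName")
wb.Close False
objExcl.Quit
Set wb = Nothing
Set objExcl = Nothing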
Frankly, I prefer moving the UDF into Access.
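The point that a query engine only sees functions registered in its own environment can be shown with a small analogy. This Python sketch uses SQLite rather than Jet, so it is an illustration of the principle and not of Access or MS Query behaviour; the udf body (a simple product) and the helper names are hypothetical.

```python
import sqlite3


def udf(f3, f4):
    """Hypothetical stand-in for the VBA UDF: combines two fields."""
    return f3 * f4


# One connection has the function registered, the other does not.
with_udf = sqlite3.connect(":memory:")
with_udf.create_function("UDF", 2, udf)
without_udf = sqlite3.connect(":memory:")


def try_query(conn):
    """Run the same SQL; report an error string if the engine lacks the function."""
    try:
        return conn.execute("SELECT UDF(3, 4) AS F5").fetchone()[0]
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


# Fallback when the engine cannot see the function: fetch F3 and F4,
# then compute F5 on the client side.
fetched = [(10, 20, 3, 4), (11, 21, 5, 6)]  # made-up rows: F1, F2, F3, F4
with_f5 = [row + (udf(row[2], row[3]),) for row in fetched]
```

The second connection fails in the same spirit as the "undefined function" message, and the client-side loop is the equivalent of calculating F5 in VBA after the query returns.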
References:
Create Access UDF
http://www.sqlexamples.info/SQL/inlineudf.htm

SQL Import from Excel using non-contiguous range?

I have some Excel spreadsheets that I cannot change, as they are used by another department and they will not change them in future. They are .xlsm files with over 500 columns (A:TH). I'm trying to import them into SQL Server 2008 on a 64-bit machine, but I'm having huge problems. All forms of Excel import appear to truncate the columns I select to the first 255.
Ultimately there will be 5 separate tables to store this data with 1 common key. I could write a short VBA script to sort the data in Excel into arranged columns of tables at source, but I wanted to ask if the following was possible first...
This works fine and selects the columns A:IV
SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0;Database=C:\NEW.xlsm',
'SELECT * FROM [Details Sheet$A:IV]')
Is there a clever way to do something similar with a non-contiguous range such as
SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0;Database=C:\NEW.xlsm',
'SELECT * FROM [Details Sheet$C:C,IW:LZ]')
i.e., pick up the key in column C and the additional columns IW:LZ? The problem for me is that using the full range C:LZ and SELECT [ID],[THIS],[THAT] FROM etc. won't work for fields beyond 255 columns in the range, which is very annoying!
Have you tried using SSIS to import the Excel files? It can be very picky about data types, but I've never run into a limitation that I couldn't work around with a bit of a Script Component.
It's designed to be a high-performance ETL tool for jobs like what you're trying to accomplish. If you're new to it, check out this article on importing the entirety of Wikipedia as XML into multiple tables.
A quick note: you may need to install additional Office drivers to read the Excel 2007 format, especially on a 64-bit machine.
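If you end up working around the 255-column limit by reading the sheet in several passes, a small helper can compute the column windows. The Python sketch below only does the column-letter arithmetic; the function names are hypothetical, and whether the provider returns each window as expected is an assumption that would need testing. Note that only the window containing column C carries the key, so rows from other windows would have to be matched by position.

```python
def col_to_num(col):
    """Convert an Excel column label such as 'IV' to its 1-based index."""
    n = 0
    for ch in col.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """Convert a 1-based column index back to its Excel label."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def column_windows(first, last, width=255):
    """Split first:last into consecutive ranges of at most `width` columns."""
    start, end = col_to_num(first), col_to_num(last)
    windows = []
    while start <= end:
        stop = min(start + width - 1, end)
        windows.append(f"{num_to_col(start)}:{num_to_col(stop)}")
        start = stop + 1
    return windows


windows = column_windows("A", "TH")
```

Each entry could then be substituted into a range reference of the form [Details Sheet$A:IU].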

Combining data from Excel with database

This is probably a simple question, but I really don't know what I'm doing in Excel, so hopefully someone can help me out.
I've been given an Excel spreadsheet that has two relevant columns to my task. The first column is an "External ID", and the second column is an "Internal ID". I need to select a bunch of data out of our databases (with various joins) using the Internal ID as the key, but then all of this data needs to be linked back to the External ID, and the only link between Internal/External is this spreadsheet.
For example, say a row of the spreadsheet looks like this:
ExtID IntID
AB1234 2
I need to select all the data relevant to the item with ID #2 in our database, but I have no way to get "AB1234" from the database, so I need to somehow relate this data back to "AB1234" using the spreadsheet.
What's the easiest way to accomplish this? The version of Excel is Excel 2007, and the database is Oracle, if that's relevant.
Note that I only have read permission to the production databases, so creating tables and importing the spreadsheet data to do a join is not an option.
Edited based on a comment
1 - Use MS Access to import the Excel sheet as a table.
2 - Link to your database table, also from within MS Access
External Data tab->other data sources->ODBC connection->choose yours->pick the table(s) you want
3 - Write an Access query to compare the values you want
Create->Query Design->Drop the tables you want, drag lines between them for relationships, click Run
Usually I use copy-paste and a good column-mode editor with macros to accomplish such tasks. It works fine if you only have a couple of Excel files.
A lot depends on how familiar you are with the tools you have available to you.
Do you have a tool you are familiar with that would make it easy to use the IntID to find those records? If so, can you run the query and paste the results back into the original spreadsheet in the column to the right of the column with the IntID?
If so, you will have what you want, a spreadsheet with the following columns:
ExtID (original)
IntID (original)
IntID (from Oracle)
Col1 (from Oracle)
Col2 (from Oracle) etc....
I'm not familiar with Oracle, but I know a lot of databases let you prepend a table name with # or something like that and create a temp table. Others have a temporary database where you can create things. Sometimes you can create a temp table even if you can't do anything else but select.
If you have access to do that, I would do the function as JosephStyons suggests (#2), insert your records into the temp table, and do a query based on that.
With Excel and VBA, you can use ActiveX Data Objects (ADO) as a high level way of using the OLE DB provider for a particular database. This lets you read the data from the database and you can then query that data and store the results in the spreadsheet.
Oracle OLE DB provider
ADO Guide
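A read-only way to relate the query results back to the External ID is to keep the spreadsheet mapping in memory, select by the Internal IDs, and attach the External ID on the client side. The Python sketch below illustrates that idea under assumptions: an in-memory SQLite database stands in for Oracle, the items table and its columns are made up, and the ? placeholders are SQLite's style (an Oracle client would use its own bind syntax).

```python
import sqlite3

# Hypothetical spreadsheet rows: (ExtID, IntID)
mapping = [("AB1234", 2), ("CD5678", 7)]

# Stand-in for the production database; only SELECT is used against it below.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, descr TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(2, "widget"), (7, "gadget"), (9, "unrelated")],
)


def fetch_with_ext_ids(conn, mapping):
    """Select rows by IntID and prepend the matching ExtID from the spreadsheet."""
    ext_by_int = {int_id: ext_id for ext_id, int_id in mapping}
    placeholders = ", ".join("?" for _ in ext_by_int)
    sql = f"SELECT id, descr FROM items WHERE id IN ({placeholders}) ORDER BY id"
    return [
        (ext_by_int[row[0]], row[0], row[1])
        for row in conn.execute(sql, list(ext_by_int))
    ]


linked = fetch_with_ext_ids(conn, mapping)
```

No table is created on the database side, which fits the read-only restriction in the question; for a large spreadsheet the IN list would need to be sent in batches.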