I'm trying to use MS Excel Power Query to get a value from a SQL database based on the item in each row.
My Excel Table has the following in A1, B1:
Date: =TODAY()
A2 and B2 hold the following headers, and the fruit list starts in A3. C2 to F2 contain other information, so the value needs to be populated in column B.
Fruit, Val
Apple
Orange
Banana
The SQL query looks like below:
select val
from MY_TABLE
WHERE fruit = ?
AND date = ?
Each ? is a parameter; they link to cell $A3 (the first item in the fruit list) and cell $B$1 (the date).
I am using the ODBC data connection where I input my query and insert the final parameter as ?
Then from editing the connections > properties, I change the parameters under the 'Definition' tab, selecting the appropriate cells.
But when I drag this to the next cell, it doesn't update. I tried changing $A3 to $A4, but once again the value is returned in cell B3 only.
Any idea how I can update this for each row?
I know I could use the MS SQL data connection where I can use a query like
SELECT val
FROM MY_TABLE
WHERE fruit IN (
'Apple',
'Orange',
'Banana'
)
But the excel sheet is used by many people and hence, the fruits list is updated at regular intervals. So using a static query is not ideal.
What I'm trying to achieve is that whenever the fruit list gets updated, the user can flash fill to the next cell, which updates column B by referencing the equivalent cell in column A.
I was not able to reproduce the exact problem you are facing with creating dynamic SQL query parameters in this way (for some reason the Parameters button under the Definition tab is greyed out; I am using Excel for Microsoft 365 on Windows 10). Anyway, if you were to succeed in doing this, wouldn't you end up with a unique query for each cell? I would imagine that would hurt performance when clicking on Data > Refresh All.
In any case, I believe one of the reasons for using Power Query is to have it write SQL queries for you: Power Query Editor > Query Settings pane on the right > Applied Steps, right-click on the last one and click on View Native Query to see the SQL query being sent to the server. As you further process the data, this underlying SQL query will be automatically edited depending on the statements supported by query folding. Of course, the connector needs to support this, so I suggest using the MS SQL Server connector. Note that sometimes the View Native Query option is greyed out even though query folding is still taking place; the only way to know for sure is to use a profiling tool on the database.
Here is a way to use Power Query so that the whole Val column gets updated in a single data refresh.
Click on cell B1 and name it cellDate by typing the name in the Name Box to the left of the formula bar, then right-click on cell B1 > Get Data from Table/Range... to open the Power Query Editor.
Replace the content of the Power Query Editor formula bar with this:
= Date.From(Excel.CurrentWorkbook(){[Name="cellDate"]}[Content][Column1]{0})
You now have a query that returns the date from cell B1. Now click on the query that contains the table you are importing from the database (named Fruits in this example). Filter the Date column using the drop-down list and select any random date.
In the formula bar, replace #date(2021, 9, 10) with cellDate. Now every time you change the date in cell B1 and refresh the data, this filter will be updated. If you are ignoring Privacy Level settings or using a Public Privacy Level for your workbook, this filter step should be folded to the data source.
Close and load these queries as connections only.
Select the range of cells containing the fruit names, create a Table, name it listFruits and right-click > Get Data from Table/Range... to open the Power Query Editor.
In the Query List on the left, right-click on listFruits > Duplicate. Rename it as listFruitsValues. On the Home tab > Merge Queries. Select Fruits as the second table and click on the Fruit column in each table. Select as Join Kind: Left Outer (all from first, matching from second), then click on OK. Note that from this step onwards, the query is not folded back to the data source.
Click on the expand button of the Fruits column, select only the Val column, uncheck Use original column name as prefix, then OK. Remove the Fruit column.
This is what the Power Query Editor window should look like at this stage.
Now you can load the listFruitsValues query in the worksheet next to the Fruit table. Here is what that looks like with the default table formatting.
Now if any edit is made to the date and/or the list of fruits, clicking on Data > Refresh All will update the Val column accordingly.
On a final note, I would suggest considering a different approach if the source table filtered for the date (i.e. Fruits in this example) is not too large. The issue with the approach presented above is that the users need to click on the Refresh All button after every edit of the fruit list. This can be avoided by simply loading the Fruits query in a separate worksheet and using the following formula to populate the Val column:
=XLOOKUP(A4,Fruits[Fruit],Fruits[Val])
By creating a single Table with the Fruit and Val columns, the values are instantaneously updated when changes are made to the list of fruits and the Fruits query only needs to be refreshed when the date is changed.
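For readers who prefer to see the logic as code: both the merge approach and the XLOOKUP alternative boil down to a date filter followed by a left outer lookup on the fruit name. Here is a minimal Python sketch of that logic. The sample rows and the names `source_rows` and `lookup_values` are made up for illustration, and it assumes at most one row per fruit per date.

```python
from datetime import date

# Hypothetical stand-in for the database table: (Fruit, Date, Val).
source_rows = [
    ("Apple", date(2021, 9, 10), 5),
    ("Orange", date(2021, 9, 10), 7),
    ("Apple", date(2021, 9, 11), 6),
]


def lookup_values(fruit_list, rows, cell_date):
    """Left outer lookup: keep every listed fruit, None when nothing matches."""
    # Filter step: keep only the rows for the date held in the worksheet cell.
    by_fruit = {fruit: val for fruit, row_date, val in rows if row_date == cell_date}
    # Merge step: one output row per listed fruit, in list order.
    return [(fruit, by_fruit.get(fruit)) for fruit in fruit_list]


result = lookup_values(["Apple", "Orange", "Banana"], source_rows, date(2021, 9, 10))
print(result)
```

Banana has no row for that date, so it comes back with an empty value, just as the Left Outer join leaves a null in the Val column.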
How do I connect the Excel cell with the date to the SQL query? I use Power Query. The database is SQL Server. Please help me.
Example of Query:
Select Account, Date
From Accountdate
Where Date = "Value in Excel cell"
1. Type a value into an Excel cell.
2. Keep that cell selected. From the Power Query ribbon choose From Table.
3. Uncheck My Table Has Headers before clicking OK.
4. In the Power Query Query Editor, right-click on the single value you entered before and choose Drill Down.
5. Click Apply & Close.
6. In the Power Query ribbon, choose From SQL.
7. Connect to your data source (don't worry about using the parameter yet).
8. Once you're in the Query Editor with the correct SQL table being shown, choose the column you want to filter by your parameter. Go ahead and do a filter using the filter dropdown. Now change the formula bar and replace the number that you filtered by with the name of the query that you created in Step 5. That query by default will be called Table1, so your query in this step might look something like this: = Table.SelectRows(dbo_MySqlTable, each [ID] > Table1)
9. You will probably get a prompt asking you to classify the permissions to apply to each data source to make sure there are no security leaks. Once you've done that, click OK.
10. Click Apply & Close.
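The same idea, keeping the filter value outside the query text, can be sketched in Python with the standard `sqlite3` module. This is only an illustration against an in-memory database, not SQL Server; the table contents and the `cell_value` variable are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accountdate (Account TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO Accountdate VALUES (?, ?)",
    [("A-100", "2021-09-10"), ("A-200", "2021-09-11"), ("A-300", "2021-09-10")],
)

# Hypothetical value that would be read from the worksheet cell.
cell_value = "2021-09-10"

# The ? placeholder keeps the value separate from the SQL text.
rows = conn.execute(
    "SELECT Account, Date FROM Accountdate WHERE Date = ? ORDER BY Account",
    (cell_value,),
).fetchall()
print(rows)
conn.close()
```

Changing `cell_value` and re-running the query plays the role of editing the cell and refreshing the data.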
I'm trying to create a sub-table from another table of all the last name fields sorted A-Z which have a phone number field that isn't null. I could do this pretty easy with SQL, but I have no clue how to go about running a SQL query within Excel. I'm tempted to import the data into postgresql and just query it there, but that seems a little excessive.
For what I'm trying to do, the SQL query SELECT lastname, firstname, phonenumber WHERE phonenumber IS NOT NULL ORDER BY lastname would do the trick. It seems too simple for it to be something that Excel can't do natively. How can I run a SQL query like this from within Excel?
There are many fine ways to get this done, which others have already suggested. Following along the "get Excel data via SQL" track, here are some pointers.
Excel has the "Data Connection Wizard" which allows you to import or link from another data source or even within the very same Excel file.
Two providers of interest ship with Microsoft Office (and the operating system): the older "Microsoft.Jet.OLEDB" and the newer "Microsoft.ACE.OLEDB". Look for them when setting up a connection (such as with the Data Connection Wizard).
Once connected to an Excel workbook, a worksheet or range is the equivalent of a table or view. The table name of a worksheet is the name of the worksheet with a dollar sign ("$") appended to it, and surrounded with square brackets ("[" and "]"); of a range, it is simply the name of the range. To specify an unnamed range of cells as your recordsource, append standard Excel row/column notation to the end of the sheet name in the square brackets.
The native SQL will (more or less) be the SQL of Microsoft Access. (In the past, it was called JET SQL; however, Access SQL has evolved, and I believe JET is deprecated old tech.)
Example, reading a worksheet: SELECT * FROM [Sheet1$]
Example, reading a range: SELECT * FROM MyRange
Example, reading an unnamed range of cells: SELECT * FROM [Sheet1$A1:B10]
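To make the naming rule concrete, here is a small Python sketch that builds the table reference for each of the three cases. The helper `excel_table_ref` is a made-up name for illustration; it only assembles strings and does not talk to Excel.

```python
def excel_table_ref(sheet=None, cell_range=None, named_range=None):
    """Build the FROM-clause name for a worksheet, a cell range, or a named range."""
    if named_range is not None:
        # A named range is referenced by its bare name.
        return named_range
    if sheet is None:
        raise ValueError("either sheet or named_range is required")
    # A worksheet gets a trailing $ and square brackets; an unnamed
    # range is appended after the $ inside the brackets.
    return f"[{sheet}${cell_range or ''}]"


print("SELECT * FROM " + excel_table_ref(sheet="Sheet1"))
print("SELECT * FROM " + excel_table_ref(named_range="MyRange"))
print("SELECT * FROM " + excel_table_ref(sheet="Sheet1", cell_range="A1:B10"))
```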
There are many many many books and web sites available to help you work through the particulars.
Further notes
By default, it is assumed that the first row of your Excel data source contains column headings that can be used as field names. If this is not the case, you must turn this setting off, or your first row of data "disappears" to be used as field names. This is done by adding the optional HDR= setting to the Extended Properties of the connection string. The default, which does not need to be specified, is HDR=Yes. If you do not have column headings, you need to specify HDR=No; the provider names your fields F1, F2, etc.
A caution about specifying worksheets: The provider assumes that your table of data begins with the upper-most, left-most, non-blank cell on the specified worksheet. In other words, your table of data can begin in Row 3, Column C without a problem. However, you cannot, for example, type a worksheet title above and to the left of the data in cell A1.
A caution about specifying ranges: When you specify a worksheet as your recordsource, the provider adds new records below existing records in the worksheet as space allows. When you specify a range (named or unnamed), Jet also adds new records below the existing records in the range as space allows. However, if you requery on the original range, the resulting recordset does not include the newly added records outside the range.
Data types (worth trying) for CREATE TABLE: Short, Long, Single, Double, Currency, DateTime, Bit, Byte, GUID, BigBinary, LongBinary, VarBinary, LongText, VarChar, Decimal.
Connecting to "old tech" Excel (files with the xls extension): Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyFolder\MyWorkbook.xls;Extended Properties=Excel 8.0;. Use the Excel 5.0 source database type for Microsoft Excel 5.0 and 7.0 (95) workbooks and use the Excel 8.0 source database type for Microsoft Excel 8.0 (97), 9.0 (2000) and 10.0 (2002) workbooks.
Connecting to "latest" Excel (files with the xlsx file extension): Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Excel2007file.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES;"
Treating data as text: the IMEX=1 setting tells the driver to read columns with intermixed data types (numbers, dates, strings) as text. Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Excel2007file.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";
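Since the HDR and IMEX settings both live inside the quoted Extended Properties part, it can help to assemble the string in one place. Below is a minimal Python sketch; `excel_connection_string` is a made-up helper name, and it only builds the ACE string shown above without opening any connection.

```python
def excel_connection_string(path, has_header=True, imex=None):
    """Assemble an ACE OLEDB connection string for an .xlsx workbook."""
    ext_props = ["Excel 12.0 Xml", "HDR=YES" if has_header else "HDR=NO"]
    if imex is not None:
        # IMEX is optional; IMEX=1 is the "read mixed columns as text" mode.
        ext_props.append(f"IMEX={imex}")
    joined = ";".join(ext_props)
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f"Data Source={path};"
        f'Extended Properties="{joined}";'
    )


print(excel_connection_string("Excel2007file.xlsx", has_header=True, imex=1))
print(excel_connection_string("Excel2007file.xlsx", has_header=False))
```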
(More details at http://www.connectionstrings.com/excel)
More information at http://msdn.microsoft.com/en-US/library/ms141683(v=sql.90).aspx, and at http://support.microsoft.com/kb/316934
Connecting to Excel via ADODB via VBA detailed at http://support.microsoft.com/kb/257819
Microsoft JET 4 details at http://support.microsoft.com/kb/275561
tl;dr: Excel does all of this natively. Use filters and/or tables.
(http://office.microsoft.com/en-gb/excel-help/filter-data-in-an-excel-table-HA102840028.aspx)
You can open Excel programmatically through an OLE DB connection and execute SQL on the tables within the worksheet.
But you can do everything you are asking for with no formulas, just filters.
click anywhere within the data you are looking at
go to data on the ribbon bar
select "Filter"; it's about in the middle and looks like a funnel
you will now have arrows on the right-hand side of each cell in the first row of your table
click the arrow on phone number and de-select blanks (last option)
click the arrow on last name and select a-z ordering (top option)
have a play around.. some things to note:
you can select the filtered rows and paste them somewhere else
in the status bar on the left you will see how many rows meet your filter criteria out of the total number of rows (e.g. 308 of 313 records found)
you can filter by color in Excel 2010 onwards
Sometimes I create calculated columns that give statuses or cleaned versions of the data; you can then filter or sort by these too (e.g. like the formulae in the other answers)
Do it with filters unless you are going to do it a lot or you want to automate importing the data somewhere. But for completeness:
A c# option:
OleDbConnection ExcelFile = new OleDbConnection( String.Format( "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0;HDR=YES\"", filename));
ExcelFile.Open();
a handy place to start is to take a look at the schema as there may be more there than you think:
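// Ask the provider for the schema table that lists the workbook's tables.
DataTable dt = ExcelFile.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
List<String> excelSheets = new List<string>();
// Add each worksheet name (names ending in '$') to the list.
foreach (DataRow row in dt.Rows) {
    string temp = row["TABLE_NAME"].ToString();
    if (temp[temp.Length - 1] == '$') {
        excelSheets.Add(temp);
    }
}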
then when you want to query a sheet:
OleDbDataAdapter da = new OleDbDataAdapter("select * from [" + sheet + "]", ExcelFile);
dt = new DataTable();
da.Fill(dt);
NOTE - Use Tables in Excel!
Excel has "tables" functionality that makes data behave more like a table. This gives you some great benefits but is not going to let you do every type of query.
http://office.microsoft.com/en-gb/excel-help/overview-of-excel-tables-HA010048546.aspx
For tabular data in Excel this is my default. The first thing I do is click into the data, then select "Format as Table" from the Home section of the ribbon. This gives you filtering and sorting by default and lets you access the table and its fields by name (e.g. table[fieldname]). It also allows aggregate functions on columns, e.g. max and average.
Might I suggest giving QueryStorm a try - it's a plugin for Excel that makes it quite convenient to use SQL in Excel.
In the SQL scripts Excel tables are visible as if they were regular database tables.
All four SQL data operations are supported: select/update/insert/delete.
The engine that executes the queries is SQLite so you can use joins, common table expressions, window functions, etc... And you get the fancy stuff like code completion, auto-formatting, symbol tooltips etc...
It has a completely free community edition for use by individuals and small companies. If you're in a company that has more than 5 employees or more than $1M in yearly revenue, you'll need a paid license but you can use a free trial key for evaluation purposes.
This blog post describes the SQL functionality of the plugin in much more detail.
Disclaimer: I'm the author.
You can do this natively as follows:
Select the table and use Excel to sort it on Last Name
Create a 2-row by 1-column advanced filter criteria range, say in
E1 and E2, where E1 is empty and E2 contains the formula =C6<>""
(to keep only the rows that have a phone number), where C6 is the first data cell of the phone number column.
Select the table and use advanced filter, copy to a range, using
the criteria range in E1:E2 and specify where you want to copy the
output to
If you want to do this programmatically I suggest you use the Macro Recorder to record the above steps and look at the code.
The accepted answers here are old technology and shouldn't be attempted.
Back when this question was written, Power Query wasn't a well known option and wasn't available unless you were on the latest version of Office and installed it as a separate Add-in.
Now, Power Query is included in Excel and used by default to get data. It is the right way to do this. It is simple, fast and effective.
Here is the answer to the question in Power Query. Search on "getting started with Power Query" if you need help replicating this. Once you get started with Power Query, you'll see this is very basic and easy to do with the Advanced Editor:
let
    Source = Excel.CurrentWorkbook(){[Name="Names"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"lastname", type text}, {"firstname", type text}, {"phonenumber", type text}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([phonenumber] <> null)),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"lastname", "firstname", "phonenumber"}),
    #"Sorted Rows" = Table.Sort(#"Removed Other Columns",{{"lastname", Order.Ascending}})
in
    #"Sorted Rows"
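If it helps to see those steps outside of M, here is a rough Python sketch of the same pipeline (filter, keep three columns, sort). The sample rows and the function name `contacts_with_phone` are made up; this mirrors the logic only and is not how Power Query runs it.

```python
# Made-up sample rows standing in for the "Names" table.
names = [
    {"lastname": "Smith", "firstname": "Ann", "phonenumber": "555-0101", "city": "Leeds"},
    {"lastname": "Adams", "firstname": "Bob", "phonenumber": None, "city": "York"},
    {"lastname": "Brown", "firstname": "Cy", "phonenumber": "555-0102", "city": "Bath"},
]


def contacts_with_phone(rows):
    """Filter out rows without a phone number, keep three columns, sort A-Z."""
    keep = ("lastname", "firstname", "phonenumber")
    # Filtered Rows: drop rows where the phone number is missing.
    filtered = [r for r in rows if r["phonenumber"] is not None]
    # Removed Other Columns: keep only the columns of interest.
    selected = [{k: r[k] for k in keep} for r in filtered]
    # Sorted Rows: ascending by last name.
    return sorted(selected, key=lambda r: r["lastname"])


for row in contacts_with_phone(names):
    print(row)
```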
You can use SQL in Excel. It is only well hidden.
See this tutorial:
http://smallbusiness.chron.com/use-sql-statements-ms-excel-41193.html
If you need to do this once just follow Charles' descriptions, but it is also possible to do this with Excel formulas and helper columns in case you want to make the filter dynamic.
Let's assume your data is on the sheet DataSheet and starts in row 2 of the following columns:
A: lastname
B: firstname
C: phonenumber
You need two helper columns on this sheet.
D2: =if(C2 <> "", 1, 0), this is the filter column, corresponding to your where condition (phone number not blank)
E2: =if(D2 <> 1, "", sumifs(D$2:D$1048576, A$2:A$1048576, "<"&A2) + sumifs(D$2:D2, A$2:A2, A2)), this corresponds to the order by
Copy down these formulas as far as your data goes.
On the sheet which should display your result create the following columns.
A: A sequence of numbers starting with 1 in row 2, this limits the total number of rows you can get (kind of like a LIMIT in SQL)
B2: =match(A2, DataSheet!$E$2:$E$1048576, 0), this is the row of the corresponding data
C2: =iferror(index(DataSheet!A$2:A$1048576, $B2), ""), this is the actual data or empty if no data exists
Copy down the formulas in B2 and C2, then copy-paste column C to D and E.
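To show what the helper columns compute, here is a small Python sketch of the same logic. It assumes the filter column flags rows whose phone number is not blank (the WHERE condition from the question). The function names are made up, and Python string comparison is case-sensitive, which SUMIFS is not.

```python
def helper_columns(lastnames, phones):
    """Return (D, E): D flags rows passing the filter, E is their sort rank."""
    d = [1 if phone != "" else 0 for phone in phones]
    e = []
    for i, name in enumerate(lastnames):
        if d[i] != 1:
            e.append("")
            continue
        # First SUMIFS: kept rows whose last name sorts before this one.
        before = sum(flag for flag, other in zip(d, lastnames) if other < name)
        # Second SUMIFS: kept rows with the same last name, up to this row.
        ties = sum(
            flag
            for flag, other in zip(d[: i + 1], lastnames[: i + 1])
            if other == name
        )
        e.append(before + ties)
    return d, e


def sorted_lastnames(lastnames, phones):
    """Result sheet: look up rank 1, 2, 3, ... like MATCH and INDEX do."""
    _, e = helper_columns(lastnames, phones)
    out = []
    n = 1
    while n in e:
        out.append(lastnames[e.index(n)])
        n += 1
    return out


print(helper_columns(["Smith", "Adams", "Brown", "Smith"], ["1", "", "2", "3"]))
print(sorted_lastnames(["Smith", "Adams", "Brown", "Smith"], ["1", "", "2", "3"]))
```

The second SUMIFS term is what breaks ties: two rows with the same last name get consecutive ranks instead of the same one, so MATCH can find each of them.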
If you have GDAL/OGR compiled against the Expat library, you can use the XLSX driver to read .xlsx files and run SQL expressions from a command prompt. For example, from an OSGeo4W shell in the same directory as the spreadsheet, use the ogrinfo utility:
ogrinfo -dialect sqlite -sql "SELECT name, count(*) FROM sheet1 GROUP BY name" Book1.xlsx
will run a SQLite query on sheet1, and output the query result in an unusual form:
INFO: Open of `Book1.xlsx'
using driver `XLSX' successful.
Layer name: SELECT
Geometry: None
Feature Count: 36
Layer SRS WKT:
(unknown)
name: String (0.0)
count(*): Integer (0.0)
OGRFeature(SELECT):0
name (String) = Red
count(*) (Integer) = 849
OGRFeature(SELECT):1
name (String) = Green
count(*) (Integer) = 265
...
Or run the same query using ogr2ogr to make a simple CSV file:
$ ogr2ogr -f CSV out.csv -dialect sqlite \
-sql "SELECT name, count(*) FROM sheet1 GROUP BY name" Book1.xlsx
$ cat out.csv
name,count(*)
Red,849
Green,265
...
To do similar with older .xls files, you would need the XLS driver, built against the FreeXL library, which is not really common (e.g. not from OSGeo4w).
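If you just want to check what the GROUP BY query itself returns without installing GDAL, the same SQLite dialect is available through Python's standard `sqlite3` module. The sketch below uses a few made-up rows in an in-memory table named sheet1; it does not read the .xlsx file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sheet1 (name TEXT)")
# Made-up rows standing in for the worksheet contents.
conn.executemany(
    "INSERT INTO sheet1 VALUES (?)",
    [("Red",), ("Green",), ("Red",)],
)

# Same query as above, with ORDER BY added so the output order is stable.
counts = conn.execute(
    "SELECT name, count(*) FROM sheet1 GROUP BY name ORDER BY name"
).fetchall()
print(counts)
conn.close()
```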
You can experiment with the native DB driver for Excel in language/platform of your choice. In Java world, you can try with http://code.google.com/p/sqlsheet/ which provides a JDBC driver for working with Excel sheets directly. Similarly, you can get drivers for the DB technology for other platforms.
However, I can guarantee that you will soon hit a wall with the number of features these wrapper libraries provide. A better way would be to use Apache HSSF/POI or a library at a similar level, but it will need more coding effort.
Microsoft Access and LibreOffice Base can open a spreadsheet as a source and run sql queries on it. That would be the easiest way to run all kinds of queries, and avoid the mess of running macros or writing code.
Excel also has autofilters and data sorting that will accomplish a lot of simple queries like your example. If you need help with those features, Google would be a better source for tutorials than me.
I might be misunderstanding, but isn't this exactly what a pivot table does? Do you have the data in a table or just a filtered list? If it's not a table, make it one (Ctrl+L); if it is, simply activate any cell in the table and insert a pivot table on another sheet. Then add the columns lastname, firstname and phonenumber to the Rows section. Then add phonenumber to the Filters section and filter out the null values. Now sort like normal.