sorting BIG excel data - sql

Right so, I have been given a LOT of "consumer data" to sort: 3 Excel files, each containing up to 7 worksheets, each with up to 1M rows (the max worksheet size in Excel 2013 is just over 1M rows).
I need to pull out of these all people within a region, so I have a list of post codes in this region (say 30 post code areas).
How can I achieve this most easily?
If the data was in SQL Server, I'd just write a long SQL statement selecting all where postcode LIKE 'B75%' OR postcode LIKE 'B74%' etc etc.
But in Excel I can only run a "filter" on one worksheet at a time... (I think)
Is it going to be easiest to throw all the data into SQL Server, or have I overlooked a method?

The first solution is to let Excel do the sorting for you. Add filters to the columns on each worksheet and use the sorting and filtering options.
The other solution is to import the data into SQL table(s). To do this, open SQL Server Management Studio, right-click the database you want to import the data into, choose Tasks > Import Data, and pick Microsoft Excel as the data source. Do this for each Excel file you have. After importing all the data into the database, filter and sort it with a SQL query.
The second solution is more reliable, but the first is faster to set up. You need to decide which one suits you.
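Once the worksheets are in a database, the "long SQL statement" with 30 LIKE clauses can be replaced by a small table of postcode prefixes and a join. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 as a stand-in for SQL Server, and the table names (people, region_prefix), column names and sample rows are all made up for illustration. In SQL Server the string concatenation would be written with + rather than ||.

```python
import sqlite3

# Hypothetical sample rows standing in for the imported worksheets.
PEOPLE = [
    ("Alice", "B75 5AA"),
    ("Bob", "B74 2ZZ"),
    ("Carol", "M1 1AE"),
    ("Dave", "B7 4XY"),
]
# Postcode areas of interest. Each prefix ends with a space so that a
# shorter area such as "B7 " would not also match "B75 ..." postcodes.
PREFIXES = ["B75 ", "B74 "]


def people_in_region(people, prefixes):
    """Return (name, postcode) rows whose postcode starts with any prefix."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, postcode TEXT)")
    conn.execute("CREATE TABLE region_prefix (prefix TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", people)
    conn.executemany(
        "INSERT INTO region_prefix VALUES (?)", [(p,) for p in prefixes]
    )
    # One join replaces the chain of "postcode LIKE '...%' OR ..." clauses.
    rows = conn.execute(
        """
        SELECT DISTINCT p.name, p.postcode
        FROM people AS p
        JOIN region_prefix AS r ON p.postcode LIKE r.prefix || '%'
        ORDER BY p.name
        """
    ).fetchall()
    conn.close()
    return rows


matches = people_in_region(PEOPLE, PREFIXES)
```

Adding or removing a postcode area then means editing one row in the prefix table rather than rewriting the query.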

Related

Power Query + VBA/Macro Vs. VBA/Macro only

I will briefly explain what I have and need here, and later if I can, I will edit this post and add a reproducible example.
My project:
Query data from Oracle databases into one worksheet in Excel, then use a LOOKUP procedure to copy data into an editable table in a second worksheet. The second worksheet needs to be in a table format for filtering, and have a drop down option to filter the data by date ranges. The data needs to be refreshed 1-2 times a week only by 1-2 approved staff members.
Concerns:
Per suggestion I installed Power Query for Excel 2010, which required dependencies before it could work. The convenience factor is great and it makes it so that SQL queries can be edited without messing around in VBA code. However, the dependencies setup (Oracle client for data connections) limits casually deploying this as a solution.
The data connections and queries and the data lookup could all be done in VBA and assigned macros.
Questions:
Should I use Power Query to query the data and then a VBA for the second sheet LOOKUP and date range filtering -- or should this all be written in VBA Excel Macros?
Which is more future proof friendly? Are there any advantages for using Power Query that would make this task more edit friendly for non-coders?
Thanks!
This can probably be solved with Power Query only, without VBA. I wouldn't recommend storing the queries in an Excel table; it is best to move them to the server. A view or a function would be suitable. Querying the database and editing this view/function will then work only for approved users.
This is more secure and will require only 1 Excel workbook. In Power Query, you can refer to the old copy of the table at the moment you refresh it, so you can keep the entered data and still get the new data.
Your project seems to me like an ad-hoc solution.

Exporting SQL Server data to Excel

We work with a lot of data at my job, and I want to try and find a way to limit the amount of copying from SSMS to an Excel sheet that goes to the client.
What I want to be able to do, using SSIS if possible or any other possible way (Maybe power query?), is to copy the data pulled via a SQL query to an Excel workbook sheet.
For example, I want to do a count on the amount of members by state, I'd have the query run and the results copied to the sheet called "State" in the Excel work book.
Example code:
SELECT C.State, COUNT(*) as Count
FROM [dbo].Input I
Join Cassresults C on C.ID = I.ID
group by C.State
order by Count desc
The Excel workbook will never change for the client. The only thing that may change are the queries, but those are easily updated.
Is there a way to actually do this or am I nuts for thinking so? I hope I explained it well enough.
SSAS, SSIS, PowerQuery, PowerBI, Excel PowerPivot, SSRS, and Excel data queries are all geared for this type of use. I would definitely NOT recommend VBA, as your users will constantly get a security warning and it is more complex than needed.
For Excel, a good starting point is probably to go to the Data tab, click "From Other Sources" and check out the different source types. "From Microsoft Query" gives you the ability to write a query or paste one copied from SSMS.
The only question is whether the data sources will change. If so, every workbook you create and distribute will then become obsolete and need to be changed. SSRS is a good choice to allow users to grab the report they need (and export it to Excel).
SSAS is great as well, but start with PowerPivot in Excel. Again, data connections can move; a SharePoint data connection library is one way to combat that.
This is like a BI and reporting design question and you will get a plethora of answers.
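As a rough illustration of the "run the query, drop the result where Excel can read it" idea, here is a sketch that is not one of the tools named above: it runs the question's count-by-state query and writes the result as CSV, which Excel opens directly. Everything here is an assumption for demonstration purposes: sqlite3 stands in for SQL Server, the sample rows are invented, and the helper names state_counts and write_csv are hypothetical.

```python
import csv
import io
import sqlite3


def state_counts(conn):
    """Run the member-count-by-state query from the question."""
    return conn.execute(
        """
        SELECT c.State, COUNT(*) AS MemberCount
        FROM Input AS i
        JOIN Cassresults AS c ON c.ID = i.ID
        GROUP BY c.State
        ORDER BY MemberCount DESC, c.State
        """
    ).fetchall()


def write_csv(rows, out):
    """Write a header plus the result rows to any file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["State", "Count"])
    writer.writerows(rows)


# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Input (ID INTEGER)")
conn.execute("CREATE TABLE Cassresults (ID INTEGER, State TEXT)")
conn.executemany("INSERT INTO Input VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO Cassresults VALUES (?, ?)", [(1, "TX"), (2, "TX"), (3, "OH")]
)

buffer = io.StringIO()  # swap for open("State.csv", "w", newline="") to write a file
write_csv(state_counts(conn), buffer)
```

For a workbook whose layout never changes, a data connection inside the workbook (as described above) avoids the intermediate file entirely; the CSV route is only the simplest thing to script.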

How to perform analysis on a huge Excel sheet

I have an Excel sheet with around 50k rows of data and 10 columns. The sheet is about wholesalers and their products. In the current version there are around 30 unique wholesalers, each of them with between 1000 and 3000 different products (I have queried this information from the database). What I want to do is extract the distinct wholesalers, put them in another sheet, and then for each wholesaler find the total count of products that they offer. I was able to get a distinct list of the wholesalers (via a macro), but now I am confused about how to use it to get the total count of their products: something like, for each wholesaler, do:
Select Count(*)
From worksheet s
Where s.wholesaler = 'one of the values from the list'
And in general my question is: what is the best way to query a worksheet with loads of data? (e.g. use macros, pivot tables or some other Excel magic)
If you have a SQL query then use it :). Excel allows you to run SQL queries. See the Data ribbon, External Data -> From Other Sources -> Microsoft Query. Or check out my SQL extension for Excel: http://blog.tkacprow.pl/?page_id=130
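If SQL is available, the "for each wholesaler do a count" loop collapses into a single GROUP BY, which also produces the distinct wholesaler list for free. A minimal sketch, assuming the sheet has been loaded into a table; the table name products, the column names and the sample rows are invented, and sqlite3 is used only so the example runs anywhere.

```python
import sqlite3

# Hypothetical (wholesaler, product) rows standing in for the 50k-row sheet.
ROWS = [
    ("Acme", "Widget"),
    ("Acme", "Gadget"),
    ("Birch", "Bolt"),
    ("Acme", "Sprocket"),
    ("Birch", "Nut"),
]


def product_counts(rows):
    """Return one (wholesaler, product_count) row per distinct wholesaler."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (wholesaler TEXT, product TEXT)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", rows)
    counts = conn.execute(
        """
        SELECT wholesaler, COUNT(*) AS product_count
        FROM products
        GROUP BY wholesaler
        ORDER BY wholesaler
        """
    ).fetchall()
    conn.close()
    return counts


counts = product_counts(ROWS)
```

A pivot table with the wholesaler as the row field and a count of products as the value gives the same result without any SQL.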

Oracle SQL - Is there a more efficient way to organise a massive case statement

Currently I have a report which looks at different types of documents. Each document has an assigned timescale it should be completed by (i.e. 2 days, 4 days, etc). There are more than 100 types of document. Currently, this assigned timescale for each document is held in an excel spreadsheet and matched to the data in excel using a vlookup formula (based on assessment ID). Unfortunately there is no place in our database to put this assigned timescale, but I would like to be able to run a report from the database and just send it to users without having to do this extra manipulation in excel. I know that I could achieve this by writing a massive case statement (below is just an example)
i.e.
SELECT
ID,
CASE WHEN ID = 1 then '1 day'
WHEN ID = 2 then '42 days'
WHEN ID = 3 then '16 days'
ELSE 'CHECK' end as "Timescale"
FROM TABLE1
But I did wonder if there was a more efficient way of doing this in the SQL (besides requesting an additional field in the database to record this!)? It might be that there isn't, but thought it was worth asking! Thanks.
If you have 100 different time scales it would be reasonable to add a TIMESCALE table to your database and get away from storing information which is important to your business in a spreadsheet. Nothing against Excel, fine product, some of my best friends are Excel spreadsheets - but I don't store business-critical information in them.
Share and enjoy.
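To make the lookup-table suggestion concrete, here is a sketch of how a TIMESCALE table could replace the CASE expression. It is an illustration under assumptions, not the asker's schema: the column names and sample rows are invented, the tables are lowercased for the demo, and sqlite3 stands in for Oracle. The LEFT JOIN plus COALESCE pattern itself is standard SQL and reproduces the ELSE 'CHECK' branch for IDs with no entry.

```python
import sqlite3


def timescales(doc_ids, lookup):
    """Return (id, timescale) per document, with 'CHECK' for unmapped IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER)")
    conn.execute("CREATE TABLE timescale (id INTEGER PRIMARY KEY, timescale TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in doc_ids])
    conn.executemany("INSERT INTO timescale VALUES (?, ?)", lookup)
    rows = conn.execute(
        """
        SELECT t.id, COALESCE(ts.timescale, 'CHECK') AS timescale
        FROM table1 AS t
        LEFT JOIN timescale AS ts ON ts.id = t.id
        ORDER BY t.id
        """
    ).fetchall()
    conn.close()
    return rows


# Values taken from the example CASE statement; ID 4 has no mapping.
LOOKUP = [(1, "1 day"), (2, "42 days"), (3, "16 days")]
result = timescales([1, 2, 3, 4], LOOKUP)
```

With 100+ document types, a new type then becomes one inserted row instead of another WHEN branch in every report.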
So you want to join between an Oracle table and an excel sheet...
I think this is not entirely impossible. There are 2 ways.
Way 1. You can do the join in Oracle. That means that you have to write a Java Stored Procedure that can read the excel sheet. The next step is to create PL/SQL wrappers for wrapping this Java Stored Procedure. After that you can write an SQL statement that calls the Java Stored Procedure via the PL/SQL wrappers, this SQL statement can make a join with your Oracle-table.
Yes indeed, this is very complex.
Way 2. I think you can connect from an excel sheet to Oracle via ODBC. It should be possible to fetch data from Oracle within excel. So excel can do the join for you.
Yes indeed, this is very complex.
You can also put this extra data in a new Timescale table (like Bob Jarvis suggested) but you will have to synchronize between the excel sheet and the Oracle table.
You can also move all data to Oracle. Or maybe you can move all data to excel (probably not) ?

Using an Excel macro to query a spreadsheet

So I have some data in some spreadsheets and I've found that for all the macros and filtering and formulas I've written to simplify it and narrow it down to what I want, it would have been much easier to just write some SQL against a few tables.
I guess I'm wondering: is it possible to have a macro in a workbook that queries data in some sheets and then populates another sheet with the result set? If so, how would I do it?
(It is Excel 2003)
No need for a macro for this.
Go to DATA-> Import External Data -> Import Data then basically follow the prompts. You may need to make a new data connection, (New Source at the bottom) but once connected you can write queries natively in Excel.
I'm guessing someone familiar with DBs would be able to figure it out pretty quickly. If not, here's a tutorial.
Why do you need to use a macro when you can simply query the excel file like this:
SELECT Column1, Column2, Column3
FROM [SheetName$Range]
WHERE Condition
Example:
SELECT ProductID, Qty, Price
FROM [SheetName$A10:C21]
WHERE ProductID = 545