Outgrew Google Sheets but do not have expertise in SQL. Is there an interim solution?

Our nonprofit uses Google Sheets to transform data. The first file has the raw data, which comes to us as a CSV. Data gets passed from one file to another with =importrange. Intermediate files transform various parts of it with a lot of Google Sheets formulas such as =split, =vlookup, =if, =textjoin, =concatenate, etc. The final file has the data in the form that we can use to create pages on our website.
The first file has about 150 columns. The new 10M cell limit should let us get about 60k rows, but even that number freezes up, and we need to get up into the millions of rows. All of the transformer files, together, add up to about 3k columns.
We assume that the ultimate solution is to re-create it all in a SQL database, but we do not have any expertise of that type, nor the funding to hire someone.
Is there an easy way to transform a Google Sheet (with formulas) into a SQL file?
Is there an easy interim solution, which we can use for a while?
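To make the question more concrete, here is a rough sketch of what an interim step could look like, using only Python's built-in csv and sqlite3 modules. It is not a tested migration path: the table names, column names and sample data are all hypothetical. A LEFT JOIN plays the role of =vlookup, the || operator the role of =concatenate, and COALESCE the role of a simple =if fallback.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for the raw CSV; a real file would be read with
# open(path, newline="") instead of io.StringIO.
RAW_CSV = """id,first_name,last_name,region_code
1,Ada,Lovelace,N
2,Alan,Turing,S
3,Grace,Hopper,X
"""


def build_pages(raw_csv):
    """Load the CSV into SQLite and return one transformed row per record."""
    conn = sqlite3.connect(":memory:")  # pass a file path to keep the data on disk
    conn.execute(
        "CREATE TABLE raw (id INTEGER, first_name TEXT, last_name TEXT, region_code TEXT)"
    )
    conn.execute("CREATE TABLE regions (code TEXT PRIMARY KEY, name TEXT)")
    # Hypothetical lookup table, the equivalent of a =vlookup range.
    conn.executemany(
        "INSERT INTO regions VALUES (?, ?)", [("N", "North"), ("S", "South")]
    )
    # Each CSV row is a dict, so named placeholders map columns by header name.
    conn.executemany(
        "INSERT INTO raw VALUES (:id, :first_name, :last_name, :region_code)",
        csv.DictReader(io.StringIO(raw_csv)),
    )
    rows = conn.execute(
        """
        SELECT raw.id,
               raw.first_name || ' ' || raw.last_name AS full_name,
               COALESCE(regions.name, 'Unknown') AS region
        FROM raw
        LEFT JOIN regions ON regions.code = raw.region_code
        ORDER BY raw.id
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in build_pages(RAW_CSV):
        print(row)
```

SQLite keeps everything in a single file and needs no server, which is why it is used for this sketch. Whether it copes with millions of rows and 150 columns in our case would have to be tested.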

Related

Power Query + VBA/Macro Vs. VBA/Macro only

I will briefly explain what I have and need here, and later if I can, I will edit this post and add a reproducible example.
My project:
Query data from Oracle databases into one worksheet in Excel, then use a LOOKUP procedure to copy data into an editable table in a second worksheet. The second worksheet needs to be in a table format for filtering, and have a drop down option to filter the data by date ranges. The data needs to be refreshed 1-2 times a week only by 1-2 approved staff members.
Concerns:
Per suggestion I installed Power Query for Excel 2010, which required dependencies before it could work. The convenience factor is great and it makes it so that SQL queries can be edited without messing around in VBA code. However, the dependencies setup (Oracle client for data connections) limits casually deploying this as a solution.
The data connections and queries and the data lookup could all be done in VBA and assigned macros.
Questions:
Should I use Power Query to query the data and then a VBA for the second sheet LOOKUP and date range filtering -- or should this all be written in VBA Excel Macros?
Which is more future proof friendly? Are there any advantages for using Power Query that would make this task more edit friendly for non-coders?
Thanks!
This can probably be solved with Power Query alone, without VBA. I wouldn't recommend storing the queries in an Excel table; it is better to move them to the server, where a view or a function would be suitable. Querying the database and editing this view/function would then work only for approved users.
This is more secure and will require only 1 Excel workbook. In Power Query, you can refer to the old copy of the table at the moment you refresh it, so you can keep the entered data and still get the new data.
Your project seems to me like an ad hoc solution.

Filter certain SQL data formatted in one column into a new column

Before I begin I found this to be most relevant with the research I have done.
How to split the data from one column into separate columns using the contents of another column in SQL
Attached are pictures of my progress so far. How can I display this information as it is shown in the Excel file without disrupting the GROUP BY filter in my query?
It's a Fishbowl database, newest version. I am running the queries through FlameRobin, which you see in the picture. I am trying to organize the query to display correctly so I can format it in 'iReports' and export it to an Excel spreadsheet like the one shown. Maybe some part of this would be better done in Excel?
Notice the numbers for Qty are different; that's OK for now.
My reputation is too low to post pictures, sorry. Here are the two JPGs in my Dropbox. I really appreciate the help.
https://www.dropbox.com/sh/r2rw5r2awsyvzs9/AAAXXg27CMPOYtZFqPX3Dx6la?dl=0

SQL Server 2008 - TSQL Read CSV file

I am working on a project that basically entails importing a CSV file into a SQL Server 2008 R2 database. The CSV file is generated from an Excel file that is populated by a "manager" with PR hours for his employees. It also includes some additional information, such as which job and phase the employees were working on, and the number of hours for equipment (if used).
Once you generate a CSV file from that, it's not exactly the usual straightforward "column" based CSV file. It's more like a "row" based CSV file, with each row being more or less unique. Because of this, I cannot do a straight dump (using BULK INSERT or OPENROWSET) into SQL, which would essentially create a (temp) table with the appropriate columns filled with data.
I am looking to use the fields within the CSV file based on the "location" of each field in the row.
So, basically, the positions of the data will remain the same, since every CSV is based on a TEMPLATE file. All I have to do is navigate through the CSV file using SQL code to find the right field based on its position in the ROW. I hope that gives you guys a better understanding of what I am trying to achieve here. Sorry for the long wall of text.
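To illustrate what "based on its position" means, here is a minimal sketch written in Python rather than T-SQL (so it is not the SQL solution I am after, only a description of the logic). The template layout, field names and sample data are hypothetical; the real template has different positions.

```python
import csv
import io

# Hypothetical template layout: zero-based (row, column) of each fixed field.
FIELD_POSITIONS = {
    "manager": (0, 1),
    "job": (1, 1),
    "phase": (1, 3),
}
# Assumption: employee detail rows start on the fourth line of the template.
FIRST_DETAIL_ROW = 3

# Hypothetical sample of a "row based" CSV generated from the template.
SAMPLE = """Manager,J. Smith,,
Job,1001,Phase,02
Employee,Hours,Equipment,Equip Hours
Alice,8,Backhoe,3
Bob,6,,
"""


def parse_template(text):
    """Return (fixed header fields, list of employee detail records)."""
    rows = list(csv.reader(io.StringIO(text)))
    # Fixed fields are picked purely by their position in the file.
    header = {name: rows[r][c] for name, (r, c) in FIELD_POSITIONS.items()}
    details = []
    for row in rows[FIRST_DETAIL_ROW:]:
        if not row or not row[0]:
            continue  # skip blank lines
        details.append(
            {
                "employee": row[0],
                "hours": float(row[1]),
                "equipment": row[2] or None,
                "equipment_hours": float(row[3]) if row[3] else None,
            }
        )
    return header, details


if __name__ == "__main__":
    header, details = parse_template(SAMPLE)
    print(header)
    print(details)
```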
I researched a bit and here's what I have come up with so far:
Reads CSV files into a temp table through a custom SQL function (Reading lines from a file)
https://www.simple-talk.com/sql/t-sql-programming/reading-and-writing-files-in-sql-server-using-t-sql/
This one is interesting. Dumps the whole file as a BLOB and then you can sift through the data.
http://www.mssqltips.com/sqlservertip/1643/using-openrowset-to-read-large-files-into-sql-server/
Finally, this one essentially splits out the rows and creates a separate record per row. Interesting...
http://ask.sqlservercentral.com/questions/17408/how-to-read-a-text-file.html
If anyone has any suggestions or steps that I could follow to get through this, I would greatly appreciate it.
To the Mods: If I have posted something (especially the links) that shouldn't be here, please feel free to remove it. I apologize if I did.
Thanks much.. Hope to hear some positive responses! :)
Warm Regards,
Pranav
If the file is not too large, another option is to post-process the file in Excel using a VBA macro. Of course, you'd need to come up to speed using the Excel object model and VBA, but the recording function makes it fairly simple. One advantage of the VBA approach is that it seems you really do want to do row by row processing, and VBA is better for that, whereas SQL is better for set-based operations.

Creating PowerPoint slides with tables in Excel

I have a large workbook that has several connections and queries to an Oracle database to gather data.
I have roughly 6 sheets in this workbook that contain my final data.
I would like to move all of this data to a PowerPoint presentation. I have seen many examples of how to move charts and graphs, but I have none of these in my workbook nor do I need them.
3 of my sheets display data generated by a pivot table on a separate sheet. I have done this because I am trying to avoid showing the pivot filter arrows.
The other three sheets are a table created from the Oracle query. Each sheet has a separate query to display data specific to a certain customer.
I would like to take the data I have in my spreadsheets and build tables in PowerPoint containing that data. I have tried importing the objects to PowerPoint, but since the data can change from minute to minute having to update the links and then refresh the data is rather clumsy. Also, I never know how many rows of data I will have. This is also due to the fact that the data can change minute by minute.
In short I am trying to look at Sheet one. Take all of the data there and build a table in PowerPoint to match. When building the table in PowerPoint only place a max number of 6 rows per PowerPoint slide. Continue to add slides until all of the data is moved.
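The paging logic described above can be sketched independently of Office. This is only an illustration in Python of how the rows would be split across slides; the function name and sample data are hypothetical, and building the actual PowerPoint tables (for example from VBA) is a separate step.

```python
def paginate(rows, max_rows=6):
    """Split rows into consecutive chunks of at most max_rows, one chunk per slide."""
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]


if __name__ == "__main__":
    # Hypothetical sheet data: 14 rows of (customer, value).
    sheet_rows = [("Customer %d" % n, n * 10) for n in range(1, 15)]
    for slide_number, chunk in enumerate(paginate(sheet_rows), start=1):
        print("Slide", slide_number, "has", len(chunk), "rows")
```

Because the number of slides is derived from the number of rows at run time, it does not matter that the row count changes from minute to minute.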
You'll probably need to cobble together bits of the following, which my friends Brian and Naresh have allowed me to post on the PowerPoint FAQ site I maintain, but between the two, it should get you there:
Controlling Office Applications from PowerPoint (by Naresh Nichani and Brian Reilly)
http://www.pptfaq.com/FAQ00795_Controlling_Office_Applications_from_PowerPoint_-by_Naresh_Nichani_and_Brian_Reilly-.htm
Where it starts: the DisplayData project by Naresh Nichani and Brian Reilly
http://www.pptfaq.com/FAQ00784_Where_it_starts-_the_DisplayData_project_by_Naresh_Nichani_and_Brian_Reilly.htm

how can you parse an excel (.xls) file stored in a varbinary in MS SQL 2005?

problem
how to best parse/access/extract "excel file" data stored as binary data in an SQL 2005 field?
(so all the data can ultimately be stored in other fields of other tables.)
background
basically, our customer is requiring a large volume of verbose data from their users. unfortunately, our customer cannot require any kind of db export from their user. so our customer must supply some sort of UI for their user to enter the data. the UI our customer decided would be acceptable to all of their users was excel as it has a reasonably robust UI. so given all that, and our customer needs this data parsed and stored in their db automatically.
we've tried to convince our customer that the users will do this exactly once and then insist on db export! but the customer can not require db export of their users.
our customer is requiring us to parse an excel file
the customer's users are using excel as the "best" user interface to enter all the required data
the users are given blank excel templates that they must fill out
these templates have a fixed number of uniquely named tabs
these templates have a number of fixed areas (cells) that must be completed
these templates also have areas where the user will insert up to thousands of identically formatted rows
when complete, the excel file is submitted from the user by standard html file upload
our customer stores this file raw into their SQL database
given
a standard excel (".xls") file (native format, not comma or tab separated)
file is stored raw in a varbinary(max) SQL 2005 field
excel file data may not necessarily be "uniform" between rows -- i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, different "formats", ...)
requirements
code completely within SQL 2005 (stored procedures, SSIS?)
be able to access values on any worksheet (tab)
be able to access values in any cell (no formula data or dereferencing needed)
cell values must not be assumed to be "uniform" between rows -- i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, formulas, different "formats", ...)
preferences
no filesystem access (no writing temporary .xls files)
retrieve values in defined format (e.g., actual date value instead of a raw number like 39876)
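On the last preference: the raw number is an Excel date serial. As a minimal sketch in Python (not the SQL-only solution asked for), and assuming the workbook uses Excel's default 1900 date system, the conversion is a day offset from 1899-12-30. The function name is hypothetical, and the offset is only valid for serials of 61 and above, because Excel treats 1900 as a leap year.

```python
from datetime import date, timedelta

# Day zero of Excel's 1900 date system, adjusted for Excel's phantom 1900-02-29.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial):
    """Convert an Excel 1900-system date serial (>= 61) to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=int(serial))


if __name__ == "__main__":
    print(excel_serial_to_date(39876))
```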
My thought is that anything can be done, but there is a price to pay. In this particular case, the price seems to be too high.
I don't have a tested solution for you, but I can share how I would give my first try on a problem like that.
My first approach would be to install excel on the SqlServer machine and code some assemblies to consume the file on your rows using excel API and then load them on Sql server as assembly procedures.
As I said, this is just an idea. I don't have details, but I'm sure others here can complement or criticize it.
But my real advice is to rethink the whole project. It makes no sense to read tabular data on binary files stored on a cell of a row of a table on database.
This looks like an "I wouldn't start from here" kind of a question.
The "install Excel on the server and start coding" answer looks like the only route, but it simply has to be worth exploring alternatives first: it's going to be painful, expensive and time-consuming.
I strongly feel that we're looking at a "requirement" that is the answer to the wrong problem.
What business problem is creating this need? What's driving that? Try the Five Whys as a possible way to explore the history.
It sounds like you're trying to store an entire database table inside a spreadsheet and then inside a single table's field. Wouldn't it be simpler to store the data in a database table to begin with and then export it as an XLS when required?
Without opening up an instance of Excel and having Excel resolve worksheet references, I'm not sure it's doable at all.
Could you write the varbinary to a Raw File Destination? And then use an Excel Source as your input to whatever step is next in your precedence constraints.
I haven't tried it, but that's what I would try.
Well, the whole setup seems a bit twisted :-) as others have already pointed out.
If you really cannot change the requirements and the whole setup: why don't you explore components such as Aspose.Cells or Syncfusion XlsIO, native .NET components that allow you to read and interpret native Excel (XLS) files? I'm pretty sure that with either of the two, you should be able to read your binary Excel into a MemoryStream and then feed that into one of those Excel-reading components, and off you go.
So with a bit of .NET development and SQL CLR, I guess this should be doable - not sure if it's the best way to do it, but it should work.