I have T-SQL queries stored on a hard drive: I:\queries\query1.sql and I:\queries\query2.sql.
I usually work in a way that I execute a query from a drive, and then I copy results into Excel, and then I work on it.
My problem here is that query1.sql is already long, and now I would like to extend it by getting a result of query2.sql, and join it with a result of query1.sql.
What I could do is append the code from query2.sql to query1.sql, but then the query gets really long and hard to maintain.
I would like to do something like this:
SELECT * FROM ("Result of I:\queries\query1.sql") q1
LEFT JOIN ("Result of I:\queries\query2.sql") q2 ON q1.ID=q2.ID
Is there any way to write a query or stored procedure, again stored on a drive, that does this?
Basically, you need to ask your DBA for a database where you are able to store things. This can be on the same system where the data is stored. Or, it could be on a linked system. Gosh, you could run SQL Server locally and store the information and data there.
Then, the queries that you are storing in files should be views in the database. You can then run the queries and store and combine the results locally.
You are essentially recreating database functionality using text files and data files -- going through a lot of effort when SQL Server already supports this functionality.
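For example, a minimal sketch of that approach (the view names, tables, and columns here are placeholders, not the actual contents of query1.sql and query2.sql):

-- Save the body of query1.sql as a view (placeholder SELECT shown)
CREATE VIEW dbo.vQuery1 AS
SELECT o.ID, o.OrderDate, o.Amount
FROM dbo.Orders AS o;
GO

-- Save the body of query2.sql as a second view (placeholder SELECT shown)
CREATE VIEW dbo.vQuery2 AS
SELECT c.ID, c.Region
FROM dbo.Customers AS c;
GO

-- The combined result the question asks for
SELECT *
FROM dbo.vQuery1 AS q1
LEFT JOIN dbo.vQuery2 AS q2
    ON q1.ID = q2.ID;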
To expand on Gordon's comment (+1), why are you running scripts off of a drive? Most DBAs I've known would threaten bodily harm over this, as executing code that they can't control, troubleshoot, or track in source control brings a whole host of security and supportability issues.
Far better to store this code in a stored procedure, which will have a saved query execution plan, can be tracked using various DMVs, and can have permissions assigned to it. Your outside Excel doc can then just set up a connection and execute the SP.
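As a rough sketch, the wrapper could look like this (the procedure name, the placeholder views from the sketch above, and the role name are all assumptions):

-- Hypothetical procedure wrapping the combined query;
-- Excel opens a connection and runs: EXEC dbo.GetCombinedResults;
CREATE PROCEDURE dbo.GetCombinedResults
AS
BEGIN
    SET NOCOUNT ON;

    SELECT *
    FROM dbo.vQuery1 AS q1
    LEFT JOIN dbo.vQuery2 AS q2
        ON q1.ID = q2.ID;
END;
GO

-- Permissions can then be assigned to the procedure itself
GRANT EXECUTE ON dbo.GetCombinedResults TO ReportingRole;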
Good morning,
I just received a new assignment and I am struggling with finding an appropriate solution. I have searched through the SO Forums, and through Google, but have not found a workable solution. Below is my scenario:
We are working out of Microsoft Access to connect to an SQL Server Database via an ODBC Connection.
I was given an incredibly large pass-through SQL query, larger than MS Access is able to process. In this pass-through query, there is a subquery in a WITH...AS clause.
I am hoping to be able to split this one, singularly large, SQL pass-through query into two: Query One (the subquery) and Query Two (which references the results of the subquery).
I know that by using general Access queries, I can write a macro like the following...
Sub myQuery()
    ' Edited from http://www.dbforums.com/showthread.php?1667831-Run-multiple-queries-in-sequence-on-click
    ' On Error GoTo ErrHandler

    ' Run the first query
    MsgBox "Starting first query"
    DoCmd.OpenQuery "first_Query"
    DoEvents

    ' Run the second query
    MsgBox "Done. Now starting second query"
    DoCmd.OpenQuery "second_Query"
    DoEvents

    MsgBox "Done!"
End Sub
However, these need to be pass-through queries. I believe that the enormously large SQL string is created via a number of user inputs. Regardless, I don't have the ability to change the pass-through SQL that I was given.
Is there any way I can write a macro that calls the first pass-through query, and then calls the second pass-through query that REFERENCES the result of the first?
Here is an example of what I am working with...
WITH queryOne AS
(
    SELECT fooID
    FROM tblFoo
    WHERE foodate > ...
)
SELECT foo, fooone, footwo, foothree
FROM tblOtherFoo
WHERE OtherFooID IN (SELECT fooID FROM queryOne)
However, the query is 50,000+ characters, exceeding the ~37k limit.
Please feel free to ask any questions. I am stumped by this and would appreciate any feedback or alternative resources.
Thank you!
It's not clear what you mean by something that references the first or previous query. Why break up something you've been given that supposedly works just fine?
So just place that existing T-SQL you've been given into a stored procedure. In T-SQL you can easily have some SQL operate on the results of previous SQL, but why break up such a HUGE massive monster slew of code and introduce bugs? It will take you YEARS AND YEARS to break up a KNOWN working huge piece of T-SQL that has been built and developed for you (something that long likely took a few years and a team of developers to create).
A conservative estimate is that such a routine cost $50,000, or even $100,000, to develop.
No question the working T-SQL you've been given might reference previous data, or even do selects into #Temp tables that additional T-SQL can work on.
But if you ALREADY have working PT code and a query given to you?
Simply take that T-SQL query and paste it into a stored procedure. You will do this in SQL Server and NOT even touch or bother with Access.
So don't create some macro in Access that calls multiple separate queries, but place all of the T-SQL in a stored procedure, and simply call that huge mess one time from Access.
It's possible that the T-SQL you've been given is incorrect, but assuming it is correct, simply place all that long mess into a working stored procedure. You do not place this SQL in MS Access, and you don't need to have that mess inside of Access.
So get that T-SQL working in SQL Server – don’t bother with MS Access until this long query mess is working in SQL Server. ONLY THEN do you fire up Access.
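A minimal sketch of the SQL Server side (the procedure name matches the example further below; the two parameters are placeholders for whatever values the long query actually needs, if any):

-- Hypothetical wrapper: paste the long pass-through T-SQL inside the body.
CREATE PROCEDURE dbo.my_StupidLongSQLProc
    @p1 int,
    @p2 int
AS
BEGIN
    SET NOCOUNT ON;

    -- The huge WITH ... AS (...) SELECT ... you were given goes here,
    -- unchanged, using @p1 and @p2 wherever the user-supplied values belong.
    SELECT @p1 AS p1, @p2 AS p2;   -- stand-in line so this sketch compiles
END;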
So you THEN create a simple PT query in Access that calls the huge long T-SQL mess you've been given. But that “mess” is to be placed in SQL Server – not in Access.
So create a PT query in Access that calls the supposedly working T-SQL you've been given. The SQL you save in the Access PT query will be this:
Exec my_StupidLongSQLProc
Save the above as a PT query. Then in VBA code go:
CurrentDb.QueryDefs("MyPTQuery").Execute
If you need to pass some values from Access, then go:
With CurrentDb.QueryDefs("MyPTQuery")
    .SQL = "exec my_StupidLongSQLProc " & p1 & "," & p2
    .Execute
End With
In the above we pass two VBA values from Access to the big mess of SQL you have – the stored procedure above is just an example that accepts two parameters passed from Access VBA. If the T-SQL you've been given does not require values from Access, then the first single .Execute will do the job.
And if you REALLY did get such a long routine of correct T-SQL, then it likely already has parameters in the working T-SQL (and again you don't want to mess with or change such a huge long working piece of T-SQL that you've been given).
So you only need one line of code in Access, and your existing long T-SQL, if written correctly, can be placed in a stored procedure (assuming that you actually have been given a correctly working PT query).
So if you REALLY did get a huge massive working T-SQL statement, then simply place that KNOWN AND WORKING T-SQL in SQL Server as a stored procedure and call it with one line of code as per above.
So trying to split this up from Access will only serve to cause world-wide poverty and starving children: ANY tiny misstep in breaking up that huge long routine will break it as you try and “fix” this great working T-SQL that you've been given. As noted, something that long would take a team of developers HUGE resources to create. If you touch or break up one line of code and mess it up, then you need that team of developers to spend several months trying to fix what you broke.
So the INSTANT you start breaking up such a huge long mess is the instant you lose this battle, and you will waste several years of your life trying to fix this crazy long T-SQL that you've been given and that is already claimed to be proper working code.
I'm not sure if a stored procedure is the correct path here so first, I'll explain what I need to accomplish.
I need to query one table with a variable as such:
SELECT *
FROM db.partList
WHERE column1 = 'order_no';
Then, I need to query another table as such:
SELECT [serial_no]
FROM db.resultsList;
Finally I need to iterate through the results of the above, and return the first [serial_no] from db.partList that is not in the list produced.
The original programmer was doing this in a way that was blowing up the customer's network unnecessarily. There shouldn't be any reason this can't be done locally. Now I'm here to clean it up.
Thanks in advance.
So my questions are, would this be correct use of a stored procedure? If so, could someone perhaps give me some sample code to start working with? I don't often have to dive that deep into SQL Server.
This is SQL Server 2012.
I've got some example code of how I would do it in other languages if needed. I'm just not familiar enough with stored procedures to do this quickly.
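For what it's worth, here is a minimal sketch of how that logic could be expressed as a single set-based statement (which could then be wrapped in a stored procedure). It uses the tables named above; the @order_no parameter and the ORDER BY interpretation of "first" are assumptions:

-- Hypothetical sketch: first serial_no for the given order in partList
-- that does not appear in resultsList.
DECLARE @order_no varchar(50) = '12345';   -- placeholder value

SELECT TOP (1) p.[serial_no]
FROM db.partList AS p
WHERE p.column1 = @order_no
  AND NOT EXISTS (SELECT 1
                  FROM db.resultsList AS r
                  WHERE r.[serial_no] = p.[serial_no])
ORDER BY p.[serial_no];   -- "first" assumed to mean lowest serial_no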
In an SSIS ETL, I have a query that I need to run against a server/db that does not allow us to create stored procedures.
I would normally use the stored procedure in my variable as the source for my OLE DB Source.
However, since we can't put the stored procedure on this server, I was going to store the code for the stored procedure in a variable by executing a SQL statement that retrieves the text from our home database, and then use the text stored in this variable as the SQL command for the source.
This way, I can still remotely change the SSIS OLE DB Source object WHERE clause (as long as I don't change the SELECT portion).
I can't imagine that this is very common, so I wanted to get some opinions - is there a better way to do this? I don't want to put all of the code for this SP into the OLE DB Source editor directly because we can't afford to redeploy in case of a WHERE clause update.
You've got the part down that many folks don't do and that's using Variables to drive your package execution. You are further correct in that you can't exactly swap out your columns. To be pedantic, which I am, you can completely change out the query as long as the same metadata is presented.
So, then this question becomes how best to accomplish allowing a package to have a query's filter driven by an external force. Factoring in maintainability, ease of debugging, etc.
My gut reaction is 3 Variables:
QueryBase: String. Hardcoded. SELECT * FROM MyTable (except of course I'd enumerate my columns)
Query: String. EvaluateAsExpression = True. Expression: @[User::QueryBase] + @[User::QueryFilter]
QueryFilter: String
So, we use Query in the OLE DB Source much as you have your longer variable name in there. The only downside to this approach, pre-2012 SSIS, is the limitation on string length in an expression. It was ... 4k, I believe. If you assign a value of 5k characters to a variable, that's fine; it's just that in the expression language, the result of adding two strings together can't exceed 4k.
I didn't specify what QueryFilter is going to have in it or the magic to get it there. That, I would base on the bigger picture of your environment, usage, etc. but the general concept is that it will eventually turn into WHERE Condition1 IS NOT NULL but maybe in a full reload situation, it becomes an empty string.
So, what are our options for changing the value of QueryFilter?
/SET is an optional parameter passed to the invoking process (dtexec.exe) that makes SSIS packages go. If you have a very limited set of choices and aren't interested in building out additional infrastructure to support the parameters, just hardcode some examples. Approximately: dtexec /file p1.dtsx /set \Package.Variables[User::QueryFilter].Properties[Value];" WHERE Condition1 IS NOT NULL" Save it into .bat files, different SQL Agent jobs, whatever. Click and run and you're done.
Configuration approach. SSIS offers native ability to use configurations from a SQL Server table, XML, Registry, Parent Package and Environment Variable for 2005 to current edition. The only downside to this approach is that it would not support concurrent execution with different parameters like the first would.
Environment approach. 2012 and 2014, with their new Project Deployment Model, give us the concept of Environments within the SSISDB catalog which is similar to configuration with a SQL Server table but it is done after development is complete and the packages are deployed. It's rather nice as it builds out a history of values used so if someone asks why is the data all wrong, you can write a query to pull back the parameters used and Oh look someone used the initial load filter instead of the daily. Whoopsidaisy. Same concern over concurrent execution and changing values.
Table driven approach. Instead of using the Configuration with SQL Server table backing, you roll your own table and then add an Execute SQL Task to your package to retrieve the filter, Single Row, into our QueryFilter Variable (a sketch follows this list).
Script Task. Use whatever floats your boat to determine what the filter should be.
Message Queue. They have built in a Message Queue Task and might be of use here if you're already doing it. Otherwise, too much effort to manage
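For the table-driven approach, a minimal sketch of what the backing table and the Execute SQL Task's query might look like (the table and column names are assumptions):

-- Hypothetical filter table: one row per package
CREATE TABLE dbo.PackageQueryFilter
(
    PackageName nvarchar(128)  NOT NULL PRIMARY KEY,
    QueryFilter nvarchar(4000) NOT NULL
);

INSERT INTO dbo.PackageQueryFilter (PackageName, QueryFilter)
VALUES (N'p1.dtsx', N' WHERE Condition1 IS NOT NULL');

-- Query for the Execute SQL Task (Single Row result set),
-- with the single column mapped to the QueryFilter variable
SELECT QueryFilter
FROM dbo.PackageQueryFilter
WHERE PackageName = N'p1.dtsx';   -- or parameterized from a package variable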
I have an Excel file that will select roughly 1100 rows with 5 columns of data. Most columns are 5 digits long and are integers. I am using a macro to connect to a SQL Server database and insert these rows into one, maybe two, tables. This is all it's doing, and then it closes the connection. So the user opens an Excel file that has the rows, clicks a button, and it executes the macro.
My question is: should the query be written in Excel, since it's simple and merely inserts the data into a few tables? Or is it more efficient to call a stored procedure, pass all of the values to it, and have it allocate where the values go in the different tables? By efficient, I mean which is the quickest? I know this will probably take a few seconds to complete. I just feel that going to a stored procedure is an extra point along the path that the data has to travel before it reaches the tables. Am I wrong? Any thoughts?
There are some advantages to using stored procedures in SQL Server. One is that SQL Server precompiles and saves the query execution plan, which increases performance. With your current method, SQL Server will generally need to generate the execution plan each time. Stored procedures can also reduce client/server network traffic.
So, even though it may seem like an extra point along the path, it actually can be faster.
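As an illustration only, a minimal sketch of what such a procedure might look like (the table, column, and parameter names are placeholders, not your actual schema):

-- Hypothetical procedure the Excel macro would call once per row (or per batch)
CREATE PROCEDURE dbo.InsertPartRow
    @Col1 int,
    @Col2 int,
    @Col3 int,
    @Col4 int,
    @Col5 int
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.TableOne (Col1, Col2, Col3, Col4, Col5)
    VALUES (@Col1, @Col2, @Col3, @Col4, @Col5);

    -- A second INSERT into another table could go here as well.
END;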
In addition to #mark d.'s answer, another reason for using a stored procedure is security.
Your comment says that a customer is entering the data into Excel, so if you are putting direct SQL into your spreadsheet, then there is a risk that someone will open your spreadsheet and find out information about your database. But if you use a stored procedure then there is far less that can be learned.
Either way, make sure that you aren't hardcoding any connection string/account credentials into the spreadsheet.
I'm providing maintenance support for some SSIS packages. The packages have some data flow sources with complex embedded SQL scripts that need to be modified from time to time. I'm thinking about moving those SQL scripts into stored procedures and call them from SSIS, so that they are easier to modify, test, and deploy. I'm just wondering if there is any negative impact for the new approach. Can anyone give me a hint?
Yes, there are issues with using stored procs as data sources (though not with using them in Execute SQL Tasks in the control flow).
You might want to read this:
http://www.jasonstrate.com/2011/01/31-days-of-ssis-no-more-procedures-2031/
Basically, the problem is that SSIS cannot always figure out the result set, and thus the columns, from a stored proc. I personally have run into this when a stored proc uses a temp table.
I don't know that I would go as far as the author of the article and not use procs at all, but be careful that you are not trying to do too much with them, and if you have to do something complicated, do it in an Execute SQL Task before the data flow.
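To illustrate the kind of proc that commonly trips up SSIS's metadata detection (the names here are made up for the example):

-- Hypothetical proc: SSIS may fail to detect the output columns
-- because the final SELECT reads from a #temp table.
CREATE PROCEDURE dbo.GetStagedOrders
AS
BEGIN
    SET NOCOUNT ON;

    SELECT OrderID, OrderDate, Amount
    INTO #Staged
    FROM dbo.Orders
    WHERE OrderDate >= DATEADD(DAY, -7, GETDATE());

    SELECT OrderID, OrderDate, Amount
    FROM #Staged;
END;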
I can honestly see nothing but improvements. Stored procedures will offer better security, the possibility for better performance due to cached execution plans, and easier maintenance, like you pointed out.
Refactor away!
You will not face issues using only simple stored procedures as a data source. If a procedure uses temp tables or CTEs, there is no guarantee you will not face issues. Even when you can preview results at design time, you may get errors at run time.
My experience has been that trying to get a sproc to function as a data source is just not worth the headache. Maybe some simple sprocs are fine, and in some cases TVFs will work well instead, but if you need to do some complex operations there's no alternative to a sproc.
The best workaround I've found is to create an output table for each sproc you need to use in SSIS (a sketch of this pattern is at the end of this answer).
Modify the sproc to truncate the new output table at start, and to write its output to this instead of (or in addition to) ending with a SELECT statement.
Call the sproc with an Exec SQL task before your data flow.
Have your data flow read from the output table - a much simpler task.
If you want to save space, truncate the output table again with another Exec SQL. I prefer to leave it, as it lets me examine the data later and lets me rerun the output data flow if it fails without calling the sproc again.
This is certainly less elegant than reading directly from a sproc's output, but it works. FWIW, this pattern follows the philosophy (obligatory in Oracle) that a sproc should not try to be a parameterized view.
Of course, all this assumes that you have privs to adjust the sproc in question. If necessary, you could write a new wrapper sproc which truncates the output table, then calls the old sproc and redirects its output to the new table.
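A minimal sketch of that pattern (the sproc, table, and column names are invented for illustration):

-- Hypothetical output table for one sproc's results
CREATE TABLE dbo.MySprocOutput
(
    ID     int            NOT NULL,
    Name   varchar(100)   NOT NULL,
    Amount decimal(18, 2) NULL
);
GO

-- Modified (or wrapper) sproc: truncate, then write the results to the table
-- instead of (or in addition to) ending with a SELECT.
ALTER PROCEDURE dbo.MySproc
AS
BEGIN
    SET NOCOUNT ON;

    TRUNCATE TABLE dbo.MySprocOutput;

    INSERT INTO dbo.MySprocOutput (ID, Name, Amount)
    SELECT s.ID, s.Name, s.Amount
    FROM dbo.SomeSource AS s;   -- the original complex logic goes here
END;
GO

-- In SSIS: an Execute SQL Task runs "EXEC dbo.MySproc;" before the data flow,
-- and the data flow's OLE DB Source then simply reads:
-- SELECT ID, Name, Amount FROM dbo.MySprocOutput;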