Replacing Cursors in SQL Script? - sql

I have created an SQL script that runs really fast and does everything I want. It uses a cursor to loop through the parent records, does some calculations on each one, and outputs the results into a temporary table. Nested inside that cursor is another cursor that extracts all the child records of the current parent, does similar work, and puts the results into a temporary table.
My senior Dev is saying cursors are awful and I should do it another way, but also doesn't tell me what a better way is.
So how do I loop through records and do steps of calculations and create an output for each record without using a cursor?
I'm sorry, but because this is work product and the script is large, I can't post its code. Its structure is:
A cursor loops through the table that holds the parent records.
For each parent record, it takes field values and converts them from strings to times.
Those converted values are then used in BETWEEN conditions to figure out whether a time falls between the two field times.
An INSERT statement puts the output into a temp table, which is summed at the end.
Another cursor is created inside the parent cursor to pull the child records of the parent record from another table. The same process as for the parent happens.
I'm not actually upset with my script: it's working as intended and it's running very quickly so far, but I am open to better practices if possible.

First of all, I hope you're aware that SQL Server 2008 is out of support even with SP4 for a few months now.
Second, as others already said, your Senior DBA is right about cursors. And if your code is too big to post it here, it probably is too time consuming for him to go through it, understand your code and then change it for you. I would expect a senior to give some hints on what to search for, though.
About your question, I find it very hard to think of an answer because your description only gives me a vague idea what you're trying to accomplish. E.g., what are the "field values" that you convert to time?
In my experience, SQL Server does a pretty good job of interpreting datetime strings. You may also find CAST/CONVERT and DATEPART useful.
As far as I understood your parent/child table, you'd probably want to use a table join here. They are well explained here: https://www.w3schools.com/sql/sql_join.asp
You can then aggregate the result set with SUM(). But again, my understanding of your endeavour is too vague.
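To make the join-and-aggregate suggestion concrete, here is a minimal sketch of how the nested cursors might collapse into a single set-based query. Everything in it is an assumption for illustration: the table names parent and child, the columns start_time and end_time, and the probe time are invented, since the original script was not posted. It uses Python's built-in sqlite3 module only so that it can be run as-is; on SQL Server the string-to-time conversion would be written with CAST/CONVERT instead of time().

```python
import sqlite3


def build_demo():
    """Create hypothetical parent/child tables with times stored as strings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT);
        CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER,
                            start_time TEXT, end_time TEXT);
        INSERT INTO parent VALUES (1, '08:00', '12:00'), (2, '13:00', '17:00');
        INSERT INTO child VALUES (1, 1, '08:00', '09:00'),
                                 (2, 1, '09:30', '11:00'),
                                 (3, 2, '13:00', '14:00');
    """)
    return conn


def count_matches(conn, probe_time):
    """Return {parent_id: (parent_hit, child_hits)} without any cursor loop.

    parent_hit is 1 when probe_time falls between the parent's two times;
    child_hits is the number of that parent's children for which it does.
    """
    sql = """
        SELECT p.id,
               CASE WHEN time(?) BETWEEN time(p.start_time) AND time(p.end_time)
                    THEN 1 ELSE 0 END AS parent_hit,
               SUM(CASE WHEN time(?) BETWEEN time(c.start_time) AND time(c.end_time)
                        THEN 1 ELSE 0 END) AS child_hits
        FROM parent AS p
        LEFT JOIN child AS c ON c.parent_id = p.id  -- replaces the inner cursor
        GROUP BY p.id, p.start_time, p.end_time     -- replaces the outer cursor
        ORDER BY p.id
    """
    rows = conn.execute(sql, (probe_time, probe_time))
    return {pid: (parent_hit, child_hits) for pid, parent_hit, child_hits in rows}
```

Calling count_matches(build_demo(), "10:00") evaluates every parent and its children in one statement. The LEFT JOIN keeps parents that have no children, for which child_hits comes out as 0.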

Related

Determine if a SQL Insert/Update statement affects the result from a stored Select Statement

Thought this would be a good place to ask for some "brainstorming." Apologies if it's a little broad/off subject.
I was wondering if anyone here had any ideas on how to approach the following problem:
First assume that I have a select statement stored somewhere as an object (this can be the tree form of the query). For example (for simplicity):
SELECT A, B FROM table_A WHERE A > 10;
It's easy to determine the below would change the result of the above query:
INSERT INTO table_A (A,B) VALUES (12,15);
But, given any possible Insert/Update/Whatever statement, as well as any possible starting Select (but we know the Selects and can analyze them all day) I'd like to determine if it would affect the result of the Select Statement.
It's fine to assume that there won't be any "outside" queries, and that we know about all the queries being sent to the DB. It is also assumed we know the DB schema.
No, this isn't for homework. Just a brain teaser I've been thinking about and started to get stuck on (obviously, SQL can get very complicated.)
Based on the reply to the comment, I'd say that without additional criteria, this ranges between very hard and impossible.
Very hard (leastways, it would be for me) because you'd have to write something to parse and interpret your SQL statements into a workable frame of reference for your goals. Doable, but can it be worth the effort?
Impossible because some queries transcend phrases like "Byzantinely complex". (Think nested queries, correlated subqueries, views, common table expressions, triggers, outer joins, and who knows what all.) Without setting criteria such as "no subqueries, no views or triggers, no more than X joins" and so forth, the problem becomes open-ended enough to warrant an NP Complete answer.
My first thought would be to put a trigger on table_A, where if any of the columns you're affecting (col A in this case) changes to meet (or no longer meet) the condition (> 10 here), then the trigger records that an "affecting" change has taken place.
E.g. have another little table to record a "last update timestamp", which the trigger could pop a getdate() into when it detects such a change.
Then, you could check that table to see if the timestamp has changed since the last time you ran the select query - if it has, then you know you need to re-run it, if it hasn't, then you know the results would be the same.
The table could hold many such timestamps (one per row, perhaps with the table/trigger name as a key value in another column) to service many such triggers.
Advantage? Being done in a trigger on the table means no risk of a change that could affect the select statement being missed.
Disadvantage? I guess depending on how your select statements come into existence, you might have an undesirable/unmanageable overhead in creating the trigger(s).
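The trigger idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the bookkeeping table query_watch, its columns, and the trigger names are invented, and a change counter is added next to the timestamp so that two changes within the same second can still be told apart. It is written against SQLite through Python's sqlite3 module so it can be run directly; on SQL Server the triggers would use the inserted/deleted pseudo-tables and getdate() instead.

```python
import sqlite3

# One bookkeeping row per watched SELECT; the triggers bump it only when a
# change touches rows that meet (or used to meet) the condition A > 10.
SCHEMA = """
CREATE TABLE table_A (A INTEGER, B INTEGER);
CREATE TABLE query_watch (
    query_name   TEXT PRIMARY KEY,
    change_count INTEGER NOT NULL,
    last_change  TEXT
);
INSERT INTO query_watch VALUES ('a_gt_10', 0, NULL);

CREATE TRIGGER trg_a_ins AFTER INSERT ON table_A
WHEN NEW.A > 10
BEGIN
    UPDATE query_watch
    SET change_count = change_count + 1, last_change = datetime('now')
    WHERE query_name = 'a_gt_10';
END;

CREATE TRIGGER trg_a_upd AFTER UPDATE ON table_A
WHEN OLD.A > 10 OR NEW.A > 10
BEGIN
    UPDATE query_watch
    SET change_count = change_count + 1, last_change = datetime('now')
    WHERE query_name = 'a_gt_10';
END;

CREATE TRIGGER trg_a_del AFTER DELETE ON table_A
WHEN OLD.A > 10
BEGIN
    UPDATE query_watch
    SET change_count = change_count + 1, last_change = datetime('now')
    WHERE query_name = 'a_gt_10';
END;
"""


def build_demo():
    """Create table_A plus the hypothetical change-tracking table and triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def change_count(conn):
    """Number of recorded changes that could affect SELECT A, B ... WHERE A > 10."""
    row = conn.execute(
        "SELECT change_count FROM query_watch WHERE query_name = 'a_gt_10'"
    ).fetchone()
    return row[0]
```

The caller remembers the counter (or timestamp) from the last time it ran the SELECT and re-runs the query only when the stored value has moved. Note that the WHEN clauses are hand-derived from this one simple predicate; generating them for arbitrary SELECT statements is the hard part discussed above.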

How to simulate ifs in a sql query that is not database server dependent?

Given the below table:
|idAsPrimaryKey|Id - it has a diff name, but it is easier like this|column A|
How can I select in a single sql query, not database server specific, something similar to:
ListOfResults = null
for each different id:
    if there is a row for this id that has for column A the value V1
        ListOfResults add this found row
    else
        if there is a row for this id that has for column A the value V2
            ListOfResults add this found row
        else
            add to ListOfResults the first row found for this id
Quite easy. Since you don't seem to know much about SQL, here's a "teach a man how to fish..." answer.
You have an amount of data and "only" a language for retrieving data, nothing to really "program". (Of course there are functions and procedures and so on, but those are used in other circumstances, or the programmer makes things more complicated than necessary.)
Because of this, you have to find a way to combine the data, sometimes even with itself, to get what you want. This blog post explains the basics of joins (that's how you combine tables or data from subqueries): A Visual Explanation of SQL Joins (for critics of this post, please read on...)
With this basic knowledge you should now try to create a query where you join your table to itself two times. To choose the right value for your ListOfResults you then have to use the COALESCE() function. It returns the first of its parameters which isn't NULL.
Here comes the criticism of the link I posted above. The Venn diagrams used in that post don't represent how much data you get back from a join. To learn about this, read this answer here on SO: sql joins as venn diagram
Okay, now you have learned that you might get more data back than you expect. And here comes another problem with the wording of your question. There's no "first" row in relational databases; you have to describe exactly which row you want, otherwise the data you get back is actually worth nothing. You get random data. A solution for both problems is using GROUP BY and (important!) an appropriate aggregate function.
This should be enough info for you to solve the problem. Feel free to ask more questions if anything is unclear.
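For readers who want to check their own attempt, one possible shape of such a query is sketched below. The names are assumptions made for the example: the table is called t, the primary key pk, and the value column a, and "the first row" is defined here as the row with the smallest primary key, which is a choice the question leaves open. The SQL sticks to joins, GROUP BY, MIN() and COALESCE(), so it should port to most database servers; Python's sqlite3 module is used only to make it runnable.

```python
import sqlite3

# For each id: prefer a row with a = 'V1', else one with a = 'V2', else the
# row with the smallest primary key. MIN() makes each choice deterministic.
PICK_SQL = """
SELECT t.id, t.pk, t.a
FROM (
    SELECT ids.id,
           COALESCE(v1.pk, v2.pk, ids.first_pk) AS chosen_pk
    FROM (SELECT id, MIN(pk) AS first_pk FROM t GROUP BY id) AS ids
    LEFT JOIN (SELECT id, MIN(pk) AS pk FROM t WHERE a = 'V1' GROUP BY id) AS v1
           ON v1.id = ids.id
    LEFT JOIN (SELECT id, MIN(pk) AS pk FROM t WHERE a = 'V2' GROUP BY id) AS v2
           ON v2.id = ids.id
) AS pick
JOIN t ON t.pk = pick.chosen_pk
ORDER BY t.id
"""


def build_demo():
    """Create the hypothetical table t(pk, id, a) with three groups of rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t (pk INTEGER PRIMARY KEY, id INTEGER, a TEXT);
        INSERT INTO t VALUES (1, 10, 'V2'), (2, 10, 'V1'),
                             (3, 20, 'X'),  (4, 20, 'V2'),
                             (5, 30, 'Y'),  (6, 30, 'Z');
    """)
    return conn


def pick_rows(conn):
    """Return one (id, pk, a) row per id following the V1 / V2 / first rule."""
    return conn.execute(PICK_SQL).fetchall()
```

With the demo data, id 10 resolves to its V1 row, id 20 to its V2 row, and id 30 falls back to its lowest-pk row.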

Regarding the dividing of PL/SQL apps into several units

Here's my application workflow.
I have a ref cursor that is populated with all my employees' IDs. It's just an identification number really.
But now I want to fetch a lot of information for every employee (as fetched from the ref cursor). It's not simply data, but a lot of computed, derived data too. The sort of derivation that's more easily done via cursors and procedures and so on...
For example, the sum of all the time intervals during which an employee was stationed in Department 78...(that could be just one of the columns for each employee).
So I think I could accomplish this with a really large (by large, I mean really difficult to maintain, difficult to understand, difficult to optimize, difficult to reuse, refactor..etc etc) SQL query, but that really isn't something I'd do unless as a real last resort.
So I'm trying to find ways to use all of PL/SQL's might to split this into as many separate units (perhaps functions or procedures) as possible so as to be able to handle this in a simple and elegant way...
I think that some way to merge datasets (ref cursors probably) would solve my problems... I've looked at some stuff on the internet until now and some things looked promising, namely pipelining... Although I'm not really sure that's what I need..
To sum up, what I think I need is some way to compose the resulting ref cursor (a really big table, one column for the ID and about 40 other columns, each with a specific bit of information about that ID's owner) using many procedures, which I can then send back to my server-side app and deal with there. (Export to Excel in that case.)
I'm at a loss really.. Hope someone with more experience can help me on this.
FA
I'm not sure if this is what you want, or how often you need to run this thing.
But since it sounds very heavy, maybe you don't need the data to be up to date to the second.
If it's once a day or less, you can create a table with the employee ids and use separate MERGE updates to calculate the different fields.
Then the application can get the data from that table.
You can have a job that recalculates this every time you need updated data.
You can read about the MERGE command on Wikipedia and, specifically for Oracle, in the Oracle documentation. Since you use separate commands, you can of course put them in different procedures if that is convenient.
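for example:
begin
  -- start from an empty work table holding one row per employee
  execute immediate 'truncate table temp_table';
  insert into temp_table (emp_id) select emp_id from emps;
  -- one MERGE per derived column; this one fills in the name
  -- (assuming the name is taken from emps)
  merge into temp_table a
  using (select emp_id, name from emps) b
  on (a.emp_id = b.emp_id)
  when matched then
    update set a.name = b.name;
  ...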
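The same "one small step per derived column" idea can be tried out end to end with the rough sketch below. It is not Oracle code: SQLite has no MERGE, so each step is a plain UPDATE with a correlated subquery standing in for it, and the names emp_report, postings, days and days_in_dept are invented for the example. The point is only the structure, where every column is filled by its own small, separately testable unit.

```python
import sqlite3


def build_demo():
    """Create hypothetical source tables and an empty report table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emps (emp_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE postings (emp_id INTEGER, dept_id INTEGER, days INTEGER);
        CREATE TABLE emp_report (emp_id INTEGER PRIMARY KEY, name TEXT,
                                 days_in_dept INTEGER);
        INSERT INTO emps VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO postings VALUES (1, 78, 10), (1, 78, 5), (1, 12, 3), (2, 12, 7);
    """)
    return conn


def refresh_ids(conn):
    """Step 0: reset the report table to one row per employee."""
    conn.execute("DELETE FROM emp_report")
    conn.execute("INSERT INTO emp_report (emp_id) SELECT emp_id FROM emps")


def fill_name(conn):
    """Step 1: copy the name column (the role of the MERGE shown above)."""
    conn.execute("""
        UPDATE emp_report
        SET name = (SELECT e.name FROM emps AS e
                    WHERE e.emp_id = emp_report.emp_id)
    """)


def fill_days_in_dept(conn, dept_id):
    """Step 2: total days each employee was stationed in the given department."""
    conn.execute("""
        UPDATE emp_report
        SET days_in_dept = (SELECT COALESCE(SUM(p.days), 0) FROM postings AS p
                            WHERE p.emp_id = emp_report.emp_id
                              AND p.dept_id = ?)
    """, (dept_id,))


def rebuild_report(conn, dept_id=78):
    """Run every step in order and return the finished report rows."""
    refresh_ids(conn)
    fill_name(conn)
    fill_days_in_dept(conn, dept_id)
    return conn.execute(
        "SELECT emp_id, name, days_in_dept FROM emp_report ORDER BY emp_id"
    ).fetchall()
```

Adding a 41st column then means adding one more fill_ function and one more call in rebuild_report, rather than growing a single large query.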

Keeping dynamic out of SQL while using specifications with stored procedures

A specification essentially is a text string representing a "where" clause created by an end user.
I have stored procedures that copy a set of related tables and records to other places. The operation is always the same, but dependent on some crazy user requirements like "products that are frozen and blue and on sale on Tuesday".
What if we fed the user specification (or string parameter) to a scalar function that returned true/false, and which executed the specification as dynamic SQL or just exec (@variable)?
It could tell us whether those records exist. We could add the result of the function to our copy products where clause.
It would keep us from recompiling the copy script each time our where clauses changed. Plus it would isolate the product selection in to a single function.
Anyone ever do anything like this or have examples? What bad things could come of it?
EDIT:
This is the specification I simply added to the end of each insert/select statement:
and exists (
    select null as nothing
    from SameTableAsOutsideTable inside
    where inside.ID = outside.id and       -- Join operations to outside table
          inside.page in (6, 7) and        -- Criteria 1
          inside.dept in (7, 6, 2, 4)      -- Criteria 2
)
It would be great to feed a parameter into a function that produces records based on the user criteria, so all that above could be something like:
and dbo.UserCriteria(@page = '6,7', @dept = '7,6,2,4')
See the article "Dynamic Search Conditions in T-SQL".
When optimizing SQL the important thing is optimizing the access path to data (ie. index usage). This trumps code reuse, maintainability, nice formatting and just about every other development perk you can think of. This is because a bad access path will cause the query to perform hundreds of times slower than it should. The article linked sums up very well all the options you have, and your envisioned function is nowhere on the radar. Your options will gravitate around dynamic SQL or very complicated static queries. I'm afraid there is no free lunch on this topic.
It doesn't sound like a very good idea to me. Even supposing that you had proper defensive coding to avoid SQL injection attacks it's not going to really buy you anything. The code still needs to be "compiled" each time.
Also, it's pretty much always a bad idea to let users create free-form WHERE clauses. Users are pretty good at finding new and innovative ways to bring a server to a grinding halt.
If you or your users or someone else in the business can't come up with some concrete search requirements then it's likely that someone isn't thinking about it hard enough and doesn't really know what they want. You can have pretty versatile search capabilities without letting the users completely loose on the system. Alternatively, look at some of the BI tools out there and consider creating a data mart where they can do these kinds of ad hoc searches.
How about this:
You create another stored procedure (instead of a function) and pass the right condition to it.
Based on that condition, it dumps the record ids into a temp table.
Next, your move procedure reads the ids from that table and does the needful.
Or you could create a user-defined function that returns a table containing nothing but the ids of the records that match your (dynamic) criteria.
If I am totally off, then please clarify.
Hope this helps.
If you are forced to use dynamic queries and you don't have any solid, predefined search requirements, it is strongly recommended to use sp_executesql instead of EXEC. It supports parameterized queries, which helps prevent SQL injection attacks (to some extent), and it allows execution plans to be reused, which speeds up performance.
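The general principle behind that advice, namely that only the shape of the query is dynamic while every user-supplied value travels as a bound parameter, can be sketched as follows. This is not sp_executesql itself: it is an analogous illustration using Python's sqlite3 module, and the products table, its page and dept columns, and the whitelist are assumptions modelled on the criteria in the question.

```python
import sqlite3

# Hypothetical whitelist: the only column names a user criterion may refer to.
ALLOWED_COLUMNS = {"page", "dept"}


def build_filter(criteria):
    """Turn {"page": [6, 7], ...} into ("page IN (?, ?) AND ...", [6, 7, ...]).

    Column names are checked against the whitelist; values never enter the
    SQL text, they are returned separately to be bound as parameters.
    """
    clauses, params = [], []
    for column, values in criteria.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"column not allowed in criteria: {column!r}")
        if not values:
            raise ValueError(f"no values given for column: {column!r}")
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"{column} IN ({placeholders})")
        params.extend(values)
    return (" AND ".join(clauses) or "1 = 1"), params


def select_product_ids(conn, criteria):
    """Return the ids of the products that match every criterion."""
    where_sql, params = build_filter(criteria)
    sql = f"SELECT id FROM products WHERE {where_sql} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]
```

The resulting id list plays the role of the temp table or table-valued function suggested above: the copy procedures can join against it instead of each carrying their own copy of the user's WHERE clause.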

How would you give the user a preference for how a name from an SQL table is to be printed?

I've been given a task by a prospective employer which involves SQL tables. One requirement that they mentioned is that they want the name retrieved from a table called "Employees" to come in the format of either "<LastName>, <FirstName>" OR "<FirstName> <MiddleName> <LastName> <Suffix>".
This is confusing to me because it sounds like they're asking me to write a function or something. I could probably do this in a programming language and retrieve the information that way, but doing it exclusively in SQL seems odd to me, since I'm rather new to SQL and my familiarity with it doesn't exceed simple tasks such as creating databases, tables, and fields, inserting data into fields, updating fields in records, deleting records in tables which meet a specific condition, and selecting fields from tables.
I hope that this isn't considered cheating since I mentioned that this was for a prospective employer, but if I was still in school then I could just outright ask a professor where I can find a clue for this or he would've outright told me in class. But, for a prospective job, I'm not sure who I would ask about any confusion. Thanks in advance for anyone's help.
A SQL query has a fixed column output: you can't change it. To achieve this, you could concatenate the parts inside a CASE expression to produce one varchar column, but then you need something (a parameter) to switch the CASE.
So, this is presentation, not querying SQL.
I'd return all 4 columns mentioned and decide how I want them in the client.
Unless you have just been asked for 2 different queries on the same SQL table
You haven't specified the RDBMS, but in SQL Server you could accomplish this using Computed Columns.
Typically, you would use a View over the table.
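As an illustration of the CASE-plus-parameter approach mentioned above, here is a small sketch. The column names (EmployeeId, FirstName, MiddleName, LastName, Suffix) and the style values 'last_first' and 'full' are assumptions, since the task description only names the Employees table and the two formats. It runs on SQLite via Python's sqlite3 module and uses the standard || concatenation operator; SQL Server would use + or CONCAT() instead.

```python
import sqlite3

# The bound parameter selects the branch of the CASE. COALESCE(' ' || x, '')
# drops the optional parts cleanly, because NULL || anything is NULL.
NAME_SQL = """
SELECT CASE ?
         WHEN 'last_first' THEN LastName || ', ' || FirstName
         ELSE FirstName
              || COALESCE(' ' || MiddleName, '')
              || ' ' || LastName
              || COALESCE(' ' || Suffix, '')
       END AS DisplayName
FROM Employees
ORDER BY EmployeeId
"""


def build_demo():
    """Create a hypothetical Employees table with two sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employees (EmployeeId INTEGER PRIMARY KEY, FirstName TEXT,
                                MiddleName TEXT, LastName TEXT, Suffix TEXT);
        INSERT INTO Employees VALUES (1, 'Ada', NULL, 'Lovelace', NULL),
                                     (2, 'Martin', 'Luther', 'King', 'Jr.');
    """)
    return conn


def display_names(conn, style):
    """Return every employee name formatted as 'last_first' or 'full'."""
    if style not in ("last_first", "full"):
        raise ValueError(f"unknown name style: {style!r}")
    return [row[0] for row in conn.execute(NAME_SQL, (style,))]
```

The query still returns a single fixed column, DisplayName, and only its contents depend on the parameter. Returning the four raw columns and formatting them in the client remains the simpler option when the client is under your control.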