Conditionally using a SQL Temp Table throws errors

I have a (not normalized) legacy SQL database and am working on a task to gather data from one of several similar tables. I want to execute a lot of identical code against a #Temp table, but the tables it draws from are very dissimilar (with some columns being the same).
My code is:
IF @Variable = 'X'
BEGIN
    SELECT * INTO #Temp FROM TABLE1 WHERE Condition1 = @Condition
END
IF @Variable = 'Y'
BEGIN
    SELECT * INTO #Temp FROM TABLE2 WHERE Condition1 = @Condition
END
At this point, I execute some common code. There is quite a lot of it, and I want to just use #Temp rather than have another IF condition with the code copied in multiple times. I cannot really declare the table ahead of time (it is very wide, and the source tables are not the same) and I cannot really normalize the DB (the legacy system is far too 'mature' and my time frame is far too small). Also, at the end of the query, the #Temp table is used for creating new rows back in the original table (so again, I cannot just declare the common parts).
At this point, I cannot create my stored proc because
There is already an object named '#Temp' in the database.
This error highlights the 2nd IF block. Adding a DROP TABLE #Temp in the IF block does not help either. So I'm having to offload the work into additional SPROCs or repeat the code in conditional statements. For readability, I don't like either of these options.
Is there any way to use #Temp within multiple IF blocks as above? (I really have more IF conditions; only 2 are shown to give an idea of the issue.)
Example SqlFiddle
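One commonly suggested workaround: SQL Server resolves both SELECT ... INTO #Temp statements at compile time, so the branching has to move into dynamic SQL. A minimal sketch, assuming @Condition is an int (adjust to the real column type); note that a #Temp created inside the dynamic batch only exists there, so the common code would have to be appended to the same batch:
DECLARE @sql nvarchar(max) =
    N'SELECT * INTO #Temp FROM ' +
    CASE @Variable WHEN 'X' THEN N'TABLE1'
                   WHEN 'Y' THEN N'TABLE2' END +
    N' WHERE Condition1 = @Condition;
    -- the common code that reads #Temp must be appended here,
    -- because #Temp is dropped when this dynamic batch ends
    ';
EXEC sp_executesql @sql, N'@Condition int', @Condition = @Condition;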

Related

Selecting into temp table dynamically with conditional source table

I'm preparing three temp tables and then assigning one of them as the data to be processed further. The choice is made by commenting out all the irrelevant temps.
select * into #Daily from ...
select * into #Monthly from ...
select * into #Yearly from ...
-- select * into #Data from #Daily
select * into #Data from #Monthly
-- select * into #Data from #Yearly
Naturally, I'd like to control that with a parameter and make the selection dynamic. I've only found examples like the following, with conditions selecting a subset of a static source.
declare @Type as varchar(max) = 'Daily'
...
select * into #Data from case(...)
Trying different versions of the above gave me a lot of info in red. Due to ignorance and limited competency, it's totally useless to me. (Although I'm sure it's pretty obvious once one gets it right. It's definitely a PICNIC situation.)
What should I google for? I sense it's something like select into case source conditional but haven't gotten lucky (or didn't realize that I had). Quite a lot is about INSERT INTO and not SELECT INTO like here, which is irrelevant to me.
You could use a Cursor to loop through the tables and populate the Table Name from a cursor variable in some Dynamic SQL.
SSIS can also do this as everything can be parameterized including data destinations and it's perfect for this sort of workload.
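A minimal sketch of the dynamic SQL route, assuming the three temp tables above have already been populated; no cursor is needed when it's a single pick:
declare @Type varchar(max) = 'Daily'
declare @sql nvarchar(max) =
    N'select * into #Data from ' +
    case @Type when 'Daily'   then N'#Daily'
               when 'Monthly' then N'#Monthly'
               else                N'#Yearly' end
-- #Daily/#Monthly/#Yearly are visible inside the dynamic batch because they
-- were created in the outer scope; #Data, however, is dropped when EXEC
-- returns, so the processing that reads it must be appended to @sql too
exec (@sql)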

Optimization when merging from Oracle datalink

I am trying to write an Oracle procedure to merge data over a remote database link into a local table. Individually the pieces work quickly, but together they time out. Here is a simplified version of what I am trying.
What works:
Select distinct ProjectID from Project where LastUpdated < (sysdate - 6/24);
-- Works in a split second.
Merge into project
using (select /*+DRIVING_SITE(rd)*/
rd.projectID,
rd.otherdata
FROM Them.Remote_Data@DBLink rd
WHERE rd.projectID in (1,2,3)) sourceData -- hardcoded IDs
On (sourceData.projectID = project.projectID)
When matched...
-- Merge statement works quickly when the IDs are hard coded
What doesn't work: Combining the two statements above.
Merge into project
using (select /*+DRIVING_SITE(rd)*/ -- driving site helps when this piece is extracted from the larger statement
rd.projectID,
rd.otherdata
FROM Them.Remote_Data@DBLink rd
WHERE rd.projectID in -- in subquery that works quickly by itself
(Select distinct ProjectID from Project where LastUpdated < (sysdate - 6/24))
-- The select in the in clause returns 10 rows. It's a test database.
) sourceData
On (sourceData.projectID = project.projectID)
When matched...
-- When I run this statement in SQL Developer, this is all that I get without the data updating
Connecting to the database local.
Process exited.
Disconnecting from the database local.
I also tried pulling the IN subquery out into a WITH clause, hoping it would execute differently, but it had no effect.
Any direction for paths to pursue would be appreciated.
Thanks.
The /*+DRIVING_SITE(rd)*/ hint doesn't work with MERGE because the operation must run in the database where the merged table sits, which in this case is the local database. That means the whole result set from the remote table is pulled across the database link and then filtered against the data from the local table.
So, discard the hint. I also suggest you convert the IN clause into a join:
Merge into project p
using (select rd.projectID,
rd.otherdata
FROM Project ld
inner join Them.Remote_Data@DBLink rd
on rd.projectID = ld.projectID
where ld.LastUpdated < (sysdate - 6/24)) q
On (q.projectID = p.projectID)
Please bear in mind that answers to performance tuning questions without sufficient detail are just guesses.
I found your question while having the same problem. Yes, the hint in the query is ignored when the query is included in the USING clause of a MERGE command.
In my case I created a work table, say w_remote_data for your example, and split the MERGE into two commands: (1) fill the work table, (2) invoke the MERGE using the work table.
The pitfall is that we cannot simply use either create table w_remote_data as select /*+DRIVING_SITE(rd)*/ ... or insert into w_remote_data select /*+DRIVING_SITE(rd)*/ ... to fill the work table. Both commands are valid, but they are slow - the hint does not apply there either, so we would not get rid of the problem. The solution is in PL/SQL: collect the result of the query from the USING clause into an intermediate collection. See the example (for simplicity I assume w_remote_data has the same structure as remote_data; otherwise we would have to define a custom record type instead of %rowtype):
declare
  type ct is table of w_remote_data%rowtype;
  c ct;
begin
  execute immediate 'truncate table w_remote_data';
  select /*+DRIVING_SITE(rd)*/ *
    bulk collect into c
    from Them.Remote_Data@DBLink rd ...;
  if c.count > 0 then
    forall i in c.first..c.last
      insert into w_remote_data values c(i);
  end if;
  merge into project p using (select * from w_remote_data) ...;
  execute immediate 'truncate table w_remote_data';
end;
My case was an ETL script where I could rely on it not running in parallel. Otherwise we would have to cope with temporary (session-private) tables; I didn't try whether it works with them.
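If concurrent runs were a concern, one option (untested here, as noted above) would be to make the work table an Oracle global temporary table, so each session only sees and truncates its own rows. A minimal sketch, copying the table shape from the remote table:
-- session-private work table; the where 1 = 0 copies only the structure
create global temporary table w_remote_data
  on commit preserve rows
  as select * from Them.Remote_Data@DBLink where 1 = 0;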

INSERT FROM EXISTING SELECT without amending

With GDPR looming in the UK, and with a team of 15 users having already created spurious SELECT statements (in excess of 2,000) across 15 differing databases, I need a method to capture an already created SELECT statement and assign surrogate keys/data WITHOUT rewriting every procedure we already have.
There will be a need to run the original team members' scripts as normal, and there will be requirements to pseudo the values.
My current thinking is to create a stored procedure along the lines of:
CREATE PROC Pseudo (@query NVARCHAR(MAX))
INSERT INTO #TEMP FROM @query
Do something with the data via a mapping table of real and surrogate/pseudo data.
UPDATE #TEMP
SET FNAME = (SELECT Pseudo_FNAME FROM PseudoTable PT WHERE #TEMP.FNAME = PT.FNAME)
SELECT * FROM #TEMP
So that team members can run their normal SELECT statements and get pseudo data simply by using:
EXEC Pseudo ('SELECT FNAME FROM CUSTOMERS')
The problem I'm having is you can't use:
INSERT INTO #TEMP FROM @query
So I tried via CTE:
WITH TEMP AS (@query)
...but I can't use that either.
Surely there's a way of capturing the recordset from an existing SELECT so that I can pull it into a table to amend it, or of capturing the SELECT statement itself, without having to amend the original script. Please bear in mind that each SELECT statement will be unique, so I can't hard-code column or value lists, etc.
Does anyone have any ideas or a working example of how best to tackle this?
There are other lengthy methods I could externally do to carry this out but I'm trying to resolve this within SQL if possible.
So after a bit of deliberation I resolved it.
I passed the original SELECT SQL to an SP that used some SQL injection, which when executed INSERTed the data. I then UPDATEd from that dataset.
The end result was EXEC Pseudo('Original SQL;').
I will have to set some basic rules around certain columns for now as a short-term fix... but at least users can create non-pseudo and pseudo data as required without masses of reworking :)
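A minimal sketch of the approach described above; the table name ##PseudoData is illustrative, and the derived-table wrapper assumes @query is a plain SELECT with no trailing semicolon or ORDER BY:
CREATE PROC Pseudo (@query NVARCHAR(MAX))
AS
BEGIN
    -- build the capture statement around the caller's SELECT;
    -- SELECT ... INTO avoids declaring columns for each unique statement
    IF OBJECT_ID('tempdb..##PseudoData') IS NOT NULL DROP TABLE ##PseudoData;
    DECLARE @sql NVARCHAR(MAX) =
        N'SELECT q.* INTO ##PseudoData FROM (' + @query + N') AS q;';
    EXEC sp_executesql @sql;
    -- ...UPDATE ##PseudoData here from the mapping table of pseudo values...
    SELECT * FROM ##PseudoData;
END
A global ## table is used so the captured set survives the inner EXEC scope; the trade-off is that concurrent callers would collide on it.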

Executing formula from table

I have a process that builds reports based upon dynamic SQL queries stored in tables. When I originally wrote it as a proof-of-concept it worked using a cursor-style process - it was actually done as a script using Do/While. The proof was moved to T-SQL in the same format and was successful, other than the fact that it ran like crap because it was iterating one record at a time.
I rewrote the process to leverage the point of using SQL - mass select/manipulation of records - but I haven't been able to get the calculation grab to work in this manner and have just been using statically written CASE statements.
Tables:
Items list - just a friendly label for what each item is.
SourceQuery - nvarchar fields containing actual SQL SELECT statements.
Calculations - varchar fields containing data such as DateAdd(m,-1,CalcDate) and DATEADD(month, DATEDIFF(month, 0, CalcDate), 0) and a lot of other calculations based upon other values. (CalcDate is a selected value which currently goes into the temp table.)
The dynamic execution takes the SourceQuery, builds, then executes it into a temp table:
DECLARE @SourceQuery nvarchar(max)
create table #TempTable...
select distinct @SourceQuery=SourceQuery from vewTaskCombo where ....
Set @SourceQuery = 'Insert into #TempTable...' + @SourceQuery
Execute (@SourceQuery)
The above gets the SourceQuery into the temp table but doesn't currently do anything with calculations - as mentioned, that is currently done by an UPDATE statement using a CASE statement to decide which date calculation to use.
What I would like to do is eliminate the CASE statement and allow it to grab the calculation directly from the table. When doing this as a single item iteration it was fine because we could assign the calculation to a variable.
The above is just a snippet of the pieces - there are several other table elements that are all joined together to create the query and decide the calculations.
Edit response:
The issue I am having is how to get the calculation from the table to execute as a statement. For example, if I inner join the calculation table in, I can grab what type of calculation it should be (DateAdd...), but it comes through only as a varchar and can no longer be executed as a calculation. Before, because it was iterating one row at a time, the current calculation was grabbed into a variable and executed that way; now I am doing it all in bulk. I can insert the formula into the temp table as another value but can't figure out how to get it to execute as a calculation.
The goal is to execute the calculation that is stored in the table. I can select the calculation into the temp table but can't figure out how to execute it as a calculation without putting it into a separate variable - and since there can be more than one calculation I can't just assign it to a single variable (without putting in a cursor to go through each calculation one at a time, which I am trying to avoid).
Currently the statically written case statements look something like:
Update #TempTable
Set StartDate = CASE WHEN TaskThresholdID=2 then
DateAdd(m,-1,CalcDate)
WHEN TaskThresholdID=4 then DateAdd(m,-1,CalcDate)
.
.
.
DueDate = CASE WHEN TaskThresholdID=2 then DateAdd(d,4,CalcDate)
WHEN TaskThresholdID=4 then CalcDate
.
.
.
The goal is to grab that calculation from the table and not have it statically written into the procedure.
And thank you, LukStorms, for the code formatting edit.
I ended up finding a solution after trying a few other ideas. Initially I tried using a more formalized equation; while we did eventually get that to work, the problem was that translating a month as adding "30" or "31" days was too inaccurate.
What I ended up doing was building the dynamic queries into a table (#UpdateQuery_Temp), then using COALESCE to combine those into a single query, and finally EXECUTEing that query.
Create Table #UpdateQuery_Temp (TaskThresholdID int, UpdateQuery varchar(max))
insert into #UpdateQuery_Temp
Select Distinct ThresholdID, ('update #TempTable set StartDate=' + StartCalculation + ',DueDate=' + DueCalculation + ' where ThresholdID=' + cast(ThresholdID as varchar(2)) + ' ') FROM #TempTable
DECLARE @UpdateQuery varchar(max)
SELECT @UpdateQuery = COALESCE(@UpdateQuery + ' ',' ') + UpdateQuery + ';'
FROM #UpdateQuery_Temp
EXECUTE (@UpdateQuery)
Using this format, no matter how many combinations of Start/Due calculations there are, it can dynamically grab those from the table and execute them. The execution time on 13,000 records took a small hit of 0.01 seconds - production tables are a few million records, but the time difference is small enough that it is worth the hit to get the process back to being table driven.

Optimizing stored procedure with multiple "LIKE"s

I am passing in a comma-delimited list of values that I need to compare to the database
Here is an example of the values I'm passing in:
@orgList = "1123, 223%, 54%"
To use the wildcards I think I have to use LIKE, but the query runs a long time and only returns 14 rows (the results are correct, but it's just taking forever, probably because I'm using the join incorrectly).
Can I make it better?
This is what I do now:
declare @tempTable Table (SearchOrg nvarchar(max) )
insert into @tempTable
select * from dbo.udf_split(@orgList) as split
-- this splits the values at the comma and puts them in a temp table
-- then I do a join on the main table and the temp table to do a like on it....
-- but I think it's not right because it's too long.
select something
from maintable gt
join @tempTable tt on gt.org like tt.SearchOrg
where
AYEAR = ISNULL(@year, ayear)
and (AYEAR >= ISNULL(@yearR1, ayear) and ayear <= ISNULL(@yearr2, ayear))
and adate = ISNULL(@Date, adate)
and (adate >= ISNULL(@dateR1, adate) and adate <= ISNULL(@DateR2, adate))
The final result would be all rows where maintable.org is 1123, or starts with 223, or starts with 54.
The reason for my date craziness is that sometimes the stored procedure only checks for a year, sometimes for a year range, sometimes for a specific date, and sometimes for a date range... everything that's not used is passed in as null.
Maybe the problem is there?
Try something like this:
Declare @tempTable Table
(
-- Since the column is a varchar(10), you don't want to use nvarchar here.
SearchOrg varchar(20)
);
INSERT INTO @tempTable
SELECT * FROM dbo.udf_split(@orgList);
SELECT
something
FROM
maintable gt
WHERE
some where statements go here
And
Exists
(
SELECT 1
FROM #tempTable tt
WHERE gt.org Like tt.SearchOrg
)
Such a dynamic query, with optional filters and a LIKE join driven by a table (!), is very hard to optimize because almost nothing is statically known. The optimizer has to create a very general plan.
You can do two things to speed this up by orders of magnitude:
Play with OPTION (RECOMPILE). If the compile times are acceptable, this will at least deal with all the optional filters (but not with the LIKE table).
Do code generation and EXEC sp_executesql the code. Build a query with all LIKE clauses inlined into the SQL so that it looks like this: WHERE a LIKE @like0 OR a LIKE @like1 ... (not sure if you need OR or AND). This allows the optimizer to get rid of the join and just execute normal predicates.
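A rough sketch of the code-generation idea, reusing the COALESCE concatenation trick from the report-building answer above; for brevity the patterns are inlined as escaped literals rather than as @like0-style parameters, and the optional year/date filters are omitted:
DECLARE @where nvarchar(max);
-- one LIKE predicate per search pattern, doubling any embedded quotes
SELECT @where = COALESCE(@where + N' OR ', N'')
             + N'gt.org LIKE ''' + REPLACE(SearchOrg, '''', '''''') + N''''
FROM @tempTable;
DECLARE @sql nvarchar(max) =
    N'SELECT something FROM maintable gt WHERE (' + @where + N')'
  + N' OPTION (RECOMPILE);';
EXEC sp_executesql @sql;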
Your query may be difficult to optimize. Part of the question is what is in the WHERE clause. You probably want to filter on those conditions first, and then do the join using LIKE. Or, you can try to make the join faster and then do a full scan of the results.
SQL Server should optimize a LIKE pattern of the form 'abc%' -- that is, where the wildcard is at the end. (See here, for example.) So, you can start with an index on maintable.org. Fortunately, your examples meet this criterion. However, if you have '%abc' -- where the wildcard comes first -- then the optimization won't work.
For the index to work best, it might also need to take into account the conditions in the where clause. In other words, adding the index is suggestive, but the rest of the query may preclude the use of the index.
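For illustration, such an index might look like the following (the name, and whether to add the year/date columns, are assumptions about the schema):
-- plain index on the LIKE'd column; consider adding key or INCLUDE columns
-- for the year/date filters if the optimizer still scans
CREATE INDEX IX_maintable_org ON maintable (org);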
And, let me add, the best solution for these types of searches is to use the full text search capability in SQL Server (see here).
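A minimal sketch of the full-text alternative, assuming a full-text catalog and index already exist on maintable(org); note that CONTAINS prefix terms cover the "starts with" cases but not arbitrary LIKE patterns:
SELECT something
FROM maintable gt
WHERE CONTAINS(gt.org, '"223*"'); -- prefix term: values starting with 223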