Short-circuiting tables - sql

I'm upgrading several identical copies of a database which may already be upgraded partially, and for some reason bool values were stored in an nvarchar(5).
So in the below, (which exists inside an INSERT > SELECT block), I need to check if the column ShowCol exists, fill it with 0 if it does not, or fill it with the result of evaluating the string bool if it does:
CASE
WHEN COL_LENGTH('dbo.TableName', 'ShowCol') IS NULL THEN 0
ELSE IIF(LOWER(ShowCol) = 'false', 0, 1)
END
...but I'm getting an error "Invalid column name 'ShowCol'". I can't seem to short-circuit this, can you help?
It's worth noting that the column, if it does exist, contains a mix of "false", "False" and "FALSE", so that's the point of the LOWER(). (The true values also occasionally have trailing spaces to contend with, which is why I'm just dealing with false and treating everything else as true.)
I suspect that it's because of this wrap in LOWER() which is causing the server to always evaluate the expression.

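You can’t short-circuit the existence of a column: every column reference in a statement is resolved when the statement is compiled, before any CASE branch is evaluated (and it has nothing to do with LOWER(); if you remove it, nothing will change).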
You’ll need dynamic SQL, e.g.:
DECLARE @sql nvarchar(max) = N'UPDATE trg SET
trg.col1 = src.col1,
trg.col2 = src.col2';
IF COL_LENGTH('dbo.TableName', 'ShowCol') > 0
BEGIN
SET @sql += N', trg.ShowCol = IIF(LOWER(src.ShowCol) = ''false'', 0, 1)';
END
SET @sql += N' ...
FROM dbo.TableName AS trg
INNER JOIN dbo.Origin AS src
ON ...';
EXEC sys.sp_executesql @sql; -- ,N'params', @params;
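For the INSERT ... SELECT shape described in the question, the same technique might look like the sketch below. This is only an illustration: dbo.NewTable and the Id column are hypothetical names, while dbo.TableName and ShowCol come from the question. The point is that the column name only appears in the statement text when the column actually exists.

```sql
-- Hypothetical destination: dbo.NewTable(Id, ShowCol). Adjust names to your schema.
DECLARE @sql nvarchar(max) = N'INSERT dbo.NewTable (Id, ShowCol)
SELECT src.Id, ';

-- Pick the expression while building the string, not inside the query itself.
IF COL_LENGTH('dbo.TableName', 'ShowCol') IS NULL
BEGIN
    -- Column is missing: fill with a constant.
    SET @sql += N'0';
END
ELSE
BEGIN
    -- Column exists: anything other than 'false' (in any casing) counts as true.
    SET @sql += N'IIF(LOWER(src.ShowCol) = ''false'', 0, 1)';
END

SET @sql += N'
FROM dbo.TableName AS src;';

EXEC sys.sp_executesql @sql;
```

Because the inner statement is compiled only when sp_executesql runs, the outer batch never references ShowCol directly and compiles on both upgraded and non-upgraded copies.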
When you're selecting data, you can fool the parser a little bit by introducing constants to take the place of columns, taking advantage of SQL Server's desire to find a column reference even at a different scope than the syntax would suggest. I talk about this in Make SQL Server DMV Queries Backward Compatible. I don't know of any straightforward way to make that work with writes without dynamic SQL, as the parser does more strict checking there, so it's harder to fool.
Imagine you have these tables:
CREATE TABLE dbo.SourceTable(a int, b int, c int);
INSERT dbo.SourceTable(a,b,c) VALUES(1,2,3);
CREATE TABLE dbo.DestinationWithAllColumns(a int, b int, c int);
INSERT dbo.DestinationWithAllColumns(a,b,c) VALUES(1,2,3);
CREATE TABLE dbo.DestinationWithoutAllColumns(a int, b int);
INSERT dbo.DestinationWithoutAllColumns(a,b) VALUES(1,2);
You can write a SELECT against either of them that produces an int output column called c:
;WITH optional_columns AS
(
SELECT c = CONVERT(int, NULL)
)
SELECT trg.a, trg.b, trg.c
FROM optional_columns
CROSS APPLY
(SELECT a,b,c FROM dbo.DestinationWithAllColumns) AS trg
INNER JOIN dbo.SourceTable AS src ON src.a = trg.a;
Output:
a  b  c
1  2  3
;WITH optional_columns AS
(
SELECT c = CONVERT(int, NULL)
)
SELECT trg.a, trg.b, trg.c
FROM optional_columns
CROSS APPLY
(SELECT a,b,c FROM dbo.DestinationWithoutAllColumns) AS trg
INNER JOIN dbo.SourceTable AS src ON src.a = trg.a;
Output:
a  b  c
1  2  null
So far, so good. But as soon as you try and update:
;WITH optional_columns AS
(
SELECT c = CONVERT(int, NULL)
)
UPDATE trg SET trg.b = src.b, trg.c = src.c
FROM optional_columns
CROSS APPLY
(SELECT a,b,c FROM dbo.DestinationWithoutAllColumns) AS trg
INNER JOIN dbo.SourceTable AS src ON src.a = trg.a;
Msg 4421, Level 16, State 1
Derived table 'trg' is not updatable because a column of the derived table is derived or constant.
Example db<>fiddle

Related

Most performant way to filter on multiple values in multiple columns?

I have an application where the user can retrieve a list.
The user is allowed to add certain filters. For example:
Articles: 123, 456, 789
CustomerGroups: 1, 2, 3, 4, 5
Customers: null
ArticleGroups: null
...
When a filter is empty (or null), the query must ignore that filter.
What is the most performant way to build your query so it can handle a lot (10+) of different filters (and joins)?
My current approach is the following, but it doesn't scale very well:
CREATE PROCEDURE [dbo].[GetFilteredList]
@start datetime,
@stop datetime,
@ArticleList varchar(max), -- '123,456,789'
@ArticleGroupList varchar(max),
@CustomerList varchar(max),
@CustomerGroupList varchar(max) -- '1,2,3,4,5'
--More filters here...
AS
BEGIN
SET NOCOUNT ON
DECLARE @Articles TABLE (value VARCHAR(10));
INSERT INTO @Articles (value)
SELECT *
FROM [dko_db].[dbo].fnSplitString(@ArticleList, ',');
DECLARE @ArticleGroups TABLE (value VARCHAR(10));
INSERT INTO @ArticleGroups (value)
SELECT *
FROM [dko_db].[dbo].fnSplitString(@ArticleGroupList, ',');
DECLARE @Customers TABLE (value VARCHAR(10));
INSERT INTO @Customers (value)
SELECT *
FROM [dko_db].[dbo].fnSplitString(@CustomerList, ',');
DECLARE @CustomerGroups TABLE (value VARCHAR(10));
INSERT INTO @CustomerGroups (value)
SELECT *
FROM [dko_db].[dbo].fnSplitString(@CustomerGroupList, ',');
select * -- Some columns here
FROM [dbo].[Orders] o
LEFT OUTER JOIN [dbo].[Article] a on o.ArticleId = a.Id
LEFT OUTER JOIN [dbo].[ArticleGroup] ag on a.GroupId = ag.Id
LEFT OUTER JOIN [dbo].[Customer] c on o.CustomerId = c.Id
LEFT OUTER JOIN [dbo].[CustomerGroup] cg on c.GroupId = cg.Id
-- More joins here
WHERE o.OrderDate between @start and @stop and
(isnull(@ArticleList, '') = '' or a.ArticleCode in (select value from @Articles)) and
(isnull(@ArticleGroupList, '') = '' or ag.GroupCode in (select value from @ArticleGroups)) and
(isnull(@CustomerList, '') = '' or c.CustomerCode in (select value from @Customers)) and
(isnull(@CustomerGroupList, '') = '' or cg.GroupCode in (select value from @CustomerGroups))
ORDER BY c.Name, o.OrderDate
END
There are a lot of "low-hanging fruit" performance improvements here.
First, lose ORDER BY c.Name, o.OrderDate; that's just needless sorting.
Second, for your "list" variables (e.g. @ArticleList): if you don't need VARCHAR(MAX), change the data type(s) to VARCHAR(8000). VARCHAR(MAX) is much slower than VARCHAR(8000). I never use MAX data types unless I am certain they're required.
Third, you can skip dumping your split values into table variables. That's just needless overhead. You can lose all those declarations and inserts, then change this:
... a.ArticleCode in (select value from @Articles))
to:
... a.ArticleCode in (SELECT value FROM dbo.fnSplitString(@ArticleList, ',')))
Fourth, if fnSplitString is not an inline table-valued function (e.g. you see BEGIN and END in the DDL), it will be slow. An inline splitter will be much faster; consider DelimitedSplit8K or DelimitedSplit8K_LEAD.
Last, I would add OPTION (RECOMPILE), as this is a query highly unlikely to benefit from plan caching. A recompile lets the optimizer build the plan from the actual parameter values of each call.
Beyond that, when joining a bunch of tables, check the execution plan, see where most of the data is coming from and use that info to index accordingly.
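Put together, the suggestions above might lead to something like the following sketch. It is not a drop-in replacement: it assumes dbo.DelimitedSplit8K is installed and returns its values in a column named Item, it shows only two of the filters, and the sample parameter values are made up so the fragment can run outside the procedure.

```sql
-- Sample values standing in for the procedure parameters (hypothetical data).
DECLARE @start datetime = '20240101',
        @stop datetime = '20250101',
        @ArticleList varchar(8000) = '123,456,789',
        @CustomerList varchar(8000) = NULL;   -- NULL or '' means "ignore this filter"

SELECT o.*   -- replace with the columns you actually need
FROM dbo.Orders AS o
LEFT OUTER JOIN dbo.Article AS a ON o.ArticleId = a.Id
LEFT OUTER JOIN dbo.Customer AS c ON o.CustomerId = c.Id
WHERE o.OrderDate BETWEEN @start AND @stop
  -- No table variables: split each list inline, only when the filter is set.
  AND (ISNULL(@ArticleList, '') = ''
       OR a.ArticleCode IN (SELECT s.Item FROM dbo.DelimitedSplit8K(@ArticleList, ',') AS s))
  AND (ISNULL(@CustomerList, '') = ''
       OR c.CustomerCode IN (SELECT s.Item FROM dbo.DelimitedSplit8K(@CustomerList, ',') AS s))
OPTION (RECOMPILE);   -- plan is built for the parameter values of this call
```

With OPTION (RECOMPILE), an empty filter can be simplified away at compile time, so each combination of filters gets a plan suited to it instead of one generic plan for all of them.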

A nested INSERT, UPDATE, DELETE, or MERGE statement must have an OUTPUT clause in UPDATE

I'm trying to update some values based on every Id in the list. The logic I have seems to be what I want.
I want to populate a temporary table of Ids. Then for every ID I want to apply this query and output the deleted date and the ID into a new table I've created.
I keep getting the error:
Msg 10716, Level 15, State 1, Line 25
A nested INSERT, UPDATE, DELETE, or MERGE statement must have an OUTPUT clause.
What does this mean? I thought I am OUTPUTTING into the new table I've created.
USE datatemp
GO
DECLARE @idlist TABLE (id INT)
INSERT INTO @idlist (id) VALUES (3009099)
DECLARE @EndDate DATETIME
SET @EndDate = '2099-12-12'
IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'TEMP_TABLE')
BEGIN
CREATE TABLE [TEMP_TABLE] (
[id] INT,
[thedatetoend] DATETIME);
END
BEGIN TRY
SELECT *
FROM @idlist AS idlist
OUTER APPLY(
UPDATE [custprofile]
SET thedatetoend = @EndDate
OUTPUT idlist.id, DELETED.thedatetoend
INTO [TEMP_TABLE]
FROM [custprofile] as bc
INNER JOIN [custinformation] as cc
ON cc.custengageid = bc.custengageid
WHERE cc.id = idlist.id
AND bc.modifierid = 2
AND bc.thedatetoend > GETDATE()
AND cc.type = 1) o
I think you may have more success by using a CTE and avoiding the outer apply approach you are currently using. Updates made to the CTE cascade to the source table. It might look something like the following but as some columns don't reference the table aliases don't expect this to work "as is" (i.e. I'm not sure if you are outputting ccid or bcid and I don't know which table thedatetoend belongs to.)
WITH
CTE AS (
SELECT
cc.id AS ccid, bcid AS bcid, thedatetoend
FROM [custprofile] AS bc
INNER JOIN [custinformation] AS cc ON cc.custengageid = bc.custengageid
INNER JOIN @idlist AS idlist ON cc.id = idlist.id
WHERE bc.modifierid = 2
AND bc.thedatetoend > GETDATE()
AND cc.type = 1
)
UPDATE CTE
SET thedatetoend = @EndDate
OUTPUT ccid, DELETED.thedatetoend
INTO [TEMP_TABLE]

I need to optimize my first T-SQL update trigger

How do I rewrite this update trigger without using a lot of variables?
I wrote my first SQL Server trigger and it works fine, but I think there must be an easier solution.
If at least one of 5 columns is changed, I write two new rows to another table.
row 1 = old Fahrer (= driver), old dispodate and update time
row 2 = new Fahrer, new dispodate and update time
My solution is just a copy of the FoxPro trigger, but there must be an easier way in T-SQL to check whether a column has changed.
ALTER TRIGGER [dbo].[MyTrigger]
ON [dbo].[tbldisposaetze]
AFTER UPDATE
AS
SET NOCOUNT ON;
/*SET XACT_ABORT ON
SET ARITHABORT ON
*/
DECLARE @oldfahrer varchar(10)
DECLARE @oldbus varchar(10)
DECLARE @olddispodat date
DECLARE @oldvzeit decimal(4,0)
DECLARE @oldbzeit decimal(4,0)
DECLARE @oldbeschreibk varchar(255)
DECLARE @newfahrer varchar(10)
DECLARE @newbus varchar(10)
DECLARE @newdispodat date
DECLARE @newvzeit decimal(4,0)
DECLARE @newbzeit decimal(4,0)
DECLARE @newbeschreibk varchar(255)
SELECT @oldfahrer = fahrer,@oldbeschreibk=beschreibk,@oldbus=bus,@oldbzeit=bzeit,@olddispodat=dispodat,@oldvzeit=vzeit
FROM DELETED D
SELECT @newfahrer = fahrer,@newbeschreibk=beschreibk,@newbus=bus,@newbzeit=bzeit,@newdispodat=dispodat,@newvzeit=vzeit
FROM inserted I
if @oldbeschreibk <> @newbeschreibk or @oldbus <> @newbus or @oldbzeit <> @newbzeit or @oldfahrer <> @newfahrer or @oldvzeit <> @newvzeit
begin
IF (SELECT COUNT(*) FROM tbldispofahrer where fahrer=@oldfahrer and dispodat=@olddispodat) > 0
update tbldispofahrer set laenderung = GETDATE() where fahrer=@oldfahrer and dispodat=@olddispodat
else
INSERT into tbldispofahrer (fahrer,dispodat,laenderung) VALUES (@oldfahrer,@olddispodat,getdate())
IF (SELECT COUNT(*) FROM tbldispofahrer where fahrer=@newfahrer and dispodat=@newdispodat) > 0
update tbldispofahrer set laenderung = GETDATE() where fahrer=@newfahrer and dispodat=@newdispodat
else
INSERT into tbldispofahrer (fahrer,dispodat,laenderung) VALUES (@newfahrer,@newdispodat,getdate())
end
I'll assume you have SQL Server 2008 or greater. You can do this all in one statement without any variables.
Instead of doing all the work to first get the variables and see if they don't match, you can easily do that as part of the WHERE clause. As folks have said in the comments, inserted and deleted can contain multiple rows. In order to make sure you're working with the same updated row, you need to match by the primary key.
In order to insert or update the row, I'm using a MERGE statement. The source of the merge is a union with the filter above; the top query in the union has the old fahrer, and the bottom has the new fahrer. Just like your inner IFs, existing rows are matched on fahrer and dispodat, and inserted or updated appropriately.
One thing I noticed is that in your example newfahrer and oldfahrer could be exactly the same, so that only one insert or update should occur (i.e. if only bzeit was different). The union should prevent duplicate data from trying to get inserted. I do believe MERGE will raise an error if there were duplicates.
MERGE tbldispofahrer AS tgt
USING (
SELECT d.farher, d.dispodat, GETDATE() [laenderung]
INNER JOIN inserted i ON i.PrimaryKey = d.PrimaryKey
AND (i.fahrer <> d.fahrer OR i.beschreibk <> d.beschreik ... )
UNION
SELECT i.farher, i.dispodat, GETDATE() [laenderung]
INNER JOIN inserted i ON i.PrimaryKey = d.PrimaryKey
AND (i.fahrer <> d.fahrer OR i.beschreibk <> d.beschreik ... )
) AS src (farher, dispodat, laenderung)
ON tgt.farher = src.farher AND tgt.dispodat = src.dispodat
WHEN MATCHED THEN UPDATE SET
laenderung = GETDATE()
WHEN NOT MATCHED THEN
INSERT (fahrer,dispodat,laenderung)
VALUES (src.fahrer, src.dispodat, src.laenderung)
There were a few little syntax errors in the answer from Daniel.
The following code is running fine:
MERGE tbldispofahrer AS tgt
USING (
SELECT d.fahrer, d.dispodat, GETDATE() [laenderung] from deleted d
INNER JOIN inserted i ON i.satznr = d.satznr
AND (i.fahrer <> d.fahrer OR i.beschreibk <> d.beschreibk or i.bus <> d.bus or i.bzeit <> d.bzeit or i.vzeit <> d.vzeit)
UNION
SELECT i.fahrer, i.dispodat, GETDATE() [laenderung] from inserted i
INNER JOIN deleted d ON i.satznr = d.satznr
AND (i.fahrer <> d.fahrer OR i.beschreibk <> d.beschreibk or i.bus <> d.bus or i.bzeit <> d.bzeit or i.vzeit <> d.vzeit)
) AS src (fahrer, dispodat, laenderung)
ON tgt.fahrer = src.fahrer AND tgt.dispodat = src.dispodat
WHEN MATCHED THEN UPDATE SET
laenderung = GETDATE()
WHEN NOT MATCHED THEN
INSERT (fahrer,dispodat,laenderung)
VALUES (src.fahrer, src.dispodat, src.laenderung);

Querying up tree for a particular value

I'm a bit of a SQL novice, so I could definitely use some assistance hashing out the general design of a particular query. I'll be giving a SQL example of what I'm trying to do below. It may contain some syntax errors, and I do apologize for that- I'm just trying to get the design down before I go running and testing it!
Side note- I have 0 control over the design scheme, so redesign is not an option. My example tables may have an error due to oversight on my part, but the overall design scheme of bottom-up value searching will remain the same. I'm querying an existing database filled with tons of data already in it.
The scenario is this: There is a tree of elements. Each element has an ID and a parent ID (table layouts below). Parent ID is a recursive foreign key to itself. There is a second table that contains values. Each value has an elementID that is a foreign key to the element table. So to get the value of a particular variable for a particular element, you must join the two tables.
The variable hierarchy goes bottom-up by way of inheritance. If you have an element and want to get its variable value, you first look at that element. If it doesn't have a value, then check the element's parent. If that doesn't either, check the parent's parent, and so on all the way to the top. Every variable is guaranteed to have a value by the time you reach the top! (If I search for variableID 21, I know that 21 will exist: if not at the bottom, then definitely at the top.) The lowest element on the tree gets priority, though: if the bottom element has a value for that variable, don't go any farther up!
The tables would look roughly like this:
Element_Table
--------------
elementID (PK)
ParentID (FK to elementID)
Value_Table
--------------
valueID (PK)
variableID
value (the value that we're looking for)
elementID (FK to Element_Table.elementID)
So, what I'm looking to do is create a function that cleanly (key word here. Nice, clean and efficient code) search, bottom-up, across the tree looking for a variable value. Once I find it- return that value and move on!
Here is an example of what I'm thinking:
CREATE FUNCTION FindValueInTreeBottomUp
(@variableID int, @element varchar(50))
RETURNS varchar(50)
AS
BEGIN
DECLARE @result varchar(50)
DECLARE @ID int
DECLARE @parentID int
SET @result = NULL, @ID = @element
WHILE (@result IS NULL)
BEGIN
SELECT @result = vals.value, @parentID = eles.ParentID
FROM Value_Table vals
JOIN Element_Table eles
ON vals.elementID = eles.elementID
WHERE eles.elementID = @ID AND vals.variableID = @variableID
IF(@result IS NULL)
@ID = @parentID
CONTINUE
ELSE
BREAK
END
RETURN @result
END
Again, I apologize if there are any syntactical errors. Still a SQL novice and haven't run this yet! I'm especially a novice at functions- I can query all day, but functions/sprocs are still rather new to me.
So, SQL gurus out there- can you think of a better way to do this? The design of the tables won't be changing; I have NO control over that. All I can do is produce the query to check the already existing design.
I think you could do something like this (it's untested, have to try it in sql fiddle):
;with cte1 as (
select e.elementID, e.parentID, v.value
from Element_Table as e
left outer join Value_Table as v on v.elementID = e.elementID and v.variableID = @variableID
), cte2 as (
select v.value, v.parentID, 1 as aDepth
from cte1 as v
where v.elementID = @elementID
union all
select v.value, v.parentID, c.aDepth + 1
from cte2 as c
inner join cte1 as v on v.elementID = c.ParentID
where c.value is null
)
select top 1 value
from cte2
where value is not null
order by aDepth
test infrastructure:
declare @Elements table (ElementID int, ParentID int)
declare @Values table (VariableID int, ElementID int, Value nvarchar(128))
declare @variableID int, @elementID int
select @variableID = 1, @elementID = 2
insert into @Elements
select 1, null union all
select 2, 1
insert into @Values
select 1, 1, 'test'
;with cte1 as (
select e.elementID, e.parentID, v.value
from @Elements as e
left outer join @Values as v on v.elementID = e.elementID and v.variableID = @variableID
), cte2 as (
select v.value, v.parentID, 1 as aDepth
from cte1 as v
where v.elementID = @elementID
union all
select v.value, v.parentID, c.aDepth + 1
from cte2 as c
inner join cte1 as v on v.elementID = c.ParentID
where c.value is null
)
select top 1 value
from cte2
where value is not null
order by aDepth
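Since the question asks for a function, the recursive CTE approach could be wrapped roughly as follows. This is a sketch under assumptions: the varchar(50) return type is taken from the asker's draft, the table and column names are the ones from the question, and the join pairs each value with its own element (v.elementID = e.elementID).

```sql
CREATE FUNCTION dbo.FindValueInTreeBottomUp (@variableID int, @elementID int)
RETURNS varchar(50)   -- assumed type, taken from the question's draft function
AS
BEGIN
    DECLARE @result varchar(50);

    WITH cte1 AS (
        -- Every element, with its own value for the requested variable (or NULL).
        SELECT e.elementID, e.ParentID, v.value
        FROM Element_Table AS e
        LEFT OUTER JOIN Value_Table AS v
            ON v.elementID = e.elementID AND v.variableID = @variableID
    ), cte2 AS (
        -- Start at the requested element...
        SELECT v.value, v.ParentID, 1 AS aDepth
        FROM cte1 AS v
        WHERE v.elementID = @elementID
        UNION ALL
        -- ...and keep climbing to the parent while no value has been found.
        SELECT v.value, v.ParentID, c.aDepth + 1
        FROM cte2 AS c
        INNER JOIN cte1 AS v ON v.elementID = c.ParentID
        WHERE c.value IS NULL
    )
    SELECT TOP (1) @result = value
    FROM cte2
    WHERE value IS NOT NULL
    ORDER BY aDepth;   -- the lowest element that has a value wins

    RETURN @result;
END
```

It could then be called as SELECT dbo.FindValueInTreeBottomUp(21, 2);. Note that recursive CTEs stop with an error after 100 levels by default, which only matters for very deep trees.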

Update or insert data depending on whether row exists

I have a collection of rows that I get from a web service. Some of these rows are to be inserted, some are updates to existing rows. There is no way of telling unless I do a query for the ID in the table. If I find it, then update. If I don't, then insert.
Select @ID from tbl1 where ID = @ID
IF @@ROWCOUNT = 0
BEGIN
Insert into tbl1
values(1, 'AAAA', 'BBBB', 'CCCC', 'DDD')
END
ELSE
BEGIN
UPDATE tbl1
SET
A = @AAA,
B = @BBB,
C = @CCC,
D = @DDD
WHERE ID = @ID
END
I am trying to figure out the most efficient way to update/insert these rows into the table without passing them into a stored procedure one at a time.
UPDATE 1
I should have mentioned I am using SQL Server 2005. Also if I have 300 records I don't want to make 300 stored procedure calls.
The most efficient way is to try the update first, and do the insert only if the update reports 0 rows affected. For example:
UPDATE tbl1
SET
A = @AAA,
B = @BBB,
C = @CCC,
D = @DDD
WHERE ID = @ID
IF @@ROWCOUNT = 0
BEGIN
Insert into tbl1
values(1, 'AAAA', 'BBBB', 'CCCC', 'DDD')
END
Instead of paying for a seek first and then updating using another seek, just go ahead and try to update. If the update doesn't find any rows, you've still only paid for one seek, and didn't have to raise an exception, but you know that you can insert.
UPDATE dbo.tbl1 SET
A = @AAA,
B = @BBB,
C = @CCC,
D = @DDD
WHERE ID = @ID;
IF @@ROWCOUNT = 0
BEGIN
INSERT dbo.tbl1(ID,A,B,C,D)
VALUES(@ID,@AAA,@BBB,@CCC,@DDD);
END
You can also look at MERGE but I shy away from this because (a) the syntax is daunting and (b) there have been many bugs and several of them are still unresolved.
And of course instead of doing this one @ID at a time, you should use a table-valued parameter.
CREATE TYPE dbo.tbl1_type AS TABLE
(
ID INT UNIQUE,
A <datatype>,
B <datatype>,
C <datatype>,
D <datatype>
);
Now your stored procedure can look like this:
CREATE PROCEDURE dbo.tbl1_Update
@List AS dbo.tbl1_type READONLY
AS
BEGIN
SET NOCOUNT ON;
UPDATE t
SET A = i.A, B = i.B, C = i.C, D = i.D
FROM dbo.tbl1 AS t
INNER JOIN @List AS i
ON t.ID = i.ID;
INSERT dbo.tbl1
SELECT ID, A, B, C, D
FROM @List AS i
WHERE NOT EXISTS
(
SELECT 1
FROM dbo.tbl1 WHERE ID = i.ID
);
END
GO
Now you can just pass your DataTable or other collection from C# directly into the procedure as a single parameter.
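To try the procedure from T-SQL before wiring up the C# side, a call might look like the sketch below. It assumes dbo.tbl1_type and dbo.tbl1_Update have been created as shown, and that A through D are character columns (the original leaves the data types open); the sample rows are made up.

```sql
-- Assumption: A, B, C and D are character columns in dbo.tbl1_type.
DECLARE @rows dbo.tbl1_type;

-- Hypothetical sample rows; in practice these come from the web service.
INSERT @rows (ID, A, B, C, D)
VALUES (1, 'AAAA', 'BBBB', 'CCCC', 'DDD'),
       (2, 'EEEE', 'FFFF', 'GGGG', 'HHH');

-- One call handles both cases: existing IDs are updated, new IDs are inserted.
EXEC dbo.tbl1_Update @List = @rows;
```

All 300 rows from the question would travel in that single call rather than in 300 separate ones.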
From the collection of rows you get from the server, find out which ones are already there:
select @id from tbl1 where id in (....)
Then you have a list of IDs that are in the table and a list of IDs that are not in the table.
You will then have 2 batch operations: one for update, the other for insert.
What I understand is this:
At the front end you issue a single SQL statement:
ArrayOfIDsForUpdate = select ID from tbl1 where ID in (array of IDs at the front end)
ArrayOfIDsForInsert = (initial array of IDs at the front end) - (ArrayOfIDsForUpdate)
Then you need one insert into the table and one update of the table:
call the insert with ArrayOfIDsForInsert,
and call the update with ArrayOfIDsForUpdate.