SQL UPDATE query - value depends on other rows

There is a SQL Server database temporary table, let it be TableA. And the table structure is following:
CREATE TABLE #TableA
(
ID BIGINT IDENTITY (1, 1) PRIMARY KEY,
MapVal1 BIGINT NOT NULL,
MapVal2 BIGINT NOT NULL,
IsActual BIT NULL
)
The table is already filled with mappings of MapVal1 to MapVal2. The issue is that not all of the mappings should be flagged as actual; the IsActual column exists for this purpose. Currently IsActual is NULL in every row. The task is to create a query that updates the IsActual column according to the following conditions:
1) If MapVal1 is unique and MapVal2 is unique (a one-to-one mapping), the mapping should be flagged as actual, so IsActual = 1;
2) If MapVal1 is not unique, the actual mapping is the one from the current MapVal1 to the smallest MapVal2, and that MapVal2 must not be mapped to any other MapVal1 smaller than the current MapVal1;
3) If MapVal2 is not unique, the actual mapping is the one from the current MapVal2 to the smallest MapVal1, and that MapVal1 must not be mapped to any other MapVal2 smaller than the current MapVal2;
4) All rows that fulfill none of conditions 1), 2) or 3) should be flagged as non-actual, so IsActual = 0.
I believe there is a relation between conditions 2) and 3): for every row, either both are fulfilled or neither is.
To make it clear, the result I want to obtain is that every MapVal1 is mapped to just one MapVal2 and, vice versa, every MapVal2 is mapped to just one MapVal1.
I have created a SQL query to solve my task:
IF OBJECT_ID('tempdb..#TableA') IS NOT NULL
BEGIN
DROP TABLE #TableA
END
CREATE TABLE #TableA
(
ID BIGINT IDENTITY (1, 1) PRIMARY KEY,
MapVal1 BIGINT NOT NULL,
MapVal2 BIGINT NOT NULL,
IsActual BIT NULL
)
-- insert input data
INSERT INTO #TableA (MapVal1, MapVal2)
SELECT 1, 1
UNION ALL SELECT 1, 3
UNION ALL SELECT 1, 4
UNION ALL SELECT 2, 1
UNION ALL SELECT 2, 3
UNION ALL SELECT 2, 4
UNION ALL SELECT 3, 3
UNION ALL SELECT 3, 4
UNION ALL SELECT 4, 3
UNION ALL SELECT 4, 4
UNION ALL SELECT 6, 7
UNION ALL SELECT 7, 8
UNION ALL SELECT 7, 9
UNION ALL SELECT 8, 8
UNION ALL SELECT 8, 9
UNION ALL SELECT 9, 8
UNION ALL SELECT 9, 9
CREATE NONCLUSTERED INDEX IX_Mapping_MapVal1 ON #TableA (MapVal1);
CREATE NONCLUSTERED INDEX IX_Mapping_MapVal2 ON #TableA (MapVal2);
-- UPDATE of #TableA is starting here
-- every one-to-one mapping should be actual
UPDATE m1 SET
m1.IsActual = 1
FROM #TableA m1
LEFT JOIN #TableA m2
ON m1.MapVal1 = m2.MapVal1 AND m1.ID <> m2.ID
LEFT JOIN #TableA m3
ON m1.MapVal2 = m3.MapVal2 AND m1.ID <> m3.ID
WHERE m2.ID IS NULL AND m3.ID IS NULL
-- updating one-to-many and many-to-many mappings is more complicated
-- it would be great to rewrite this part of the query without any LOOP
DECLARE @MapVal1 BIGINT
DECLARE @MapVal2 BIGINT
DECLARE @i BIGINT
DECLARE @iMax BIGINT
DECLARE @LoopCount INT = 0
SELECT
@iMax = MAX (m.ID)
FROM #TableA m
SELECT
@i = MIN (m.ID)
FROM #TableA m
WHERE m.IsActual IS NULL
WHILE @i <= @iMax
BEGIN
SELECT @LoopCount = @LoopCount + 1
SELECT
@MapVal1 = m.MapVal1,
@MapVal2 = m.MapVal2
FROM #TableA m
WHERE m.ID = @i
IF EXISTS
(
SELECT *
FROM #TableA m
WHERE
m.ID < @i
AND
(m.MapVal1 = @MapVal1
OR m.MapVal2 = @MapVal2)
AND m.IsActual IS NULL
)
BEGIN
UPDATE m SET
m.IsActual = 0
FROM #TableA m
WHERE m.ID = @i
END
SELECT @i = MIN (m.ID)
FROM #TableA m
WHERE
m.ID > @i
AND m.IsActual IS NULL
END
UPDATE m SET
m.IsActual = 1
FROM #TableA m
WHERE m.IsActual IS NULL
SELECT * FROM #TableA
but, as expected, the performance of the query with the LOOP is very bad, especially when the input table holds millions of rows. I spent a lot of time trying to produce a query without a LOOP to reduce the execution time, but without success.
Could anybody advise me how to improve the performance of my query? It would be great to get a query without a LOOP.

Using a loop does not imply you need to update the table one record at a time.
It may help if each individual UPDATE statement updates multiple records.
Consider all possible combinations of MapVal1 and MapVal2 as a matrix.
Every time you flag a cell as 'actual', you can flag an entire row and an entire column as 'not actual'.
The simplest way to do this is by following these steps:
1) Of all mappings with IsActual = NULL, take the first one (the smallest MapVal1, together with the smallest MapVal2 it is mapped to).
2) Flag this mapping as actual (IsActual = 1).
3) Flag all other mappings with the same MapVal1 as non-actual (IsActual = 0).
4) Flag all other mappings with the same MapVal2 as non-actual (IsActual = 0).
5) Repeat from step 1 until no more records with IsActual = NULL exist.
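To illustrate with the question's sample data (a worked first pass, derived from the steps above): the first iteration flags (1, 1) as actual and (1, 3), (1, 4) and (2, 1) as non-actual; the second iteration picks (2, 3) and rules out (2, 4), (3, 3) and (4, 3); and so on until no rows with IsActual = NULL remain.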
Here's an implementation:
SELECT 0 -- force @@ROWCOUNT to be 1 initially
WHILE @@ROWCOUNT > 0
BEGIN
WITH MakeActual AS (
SELECT TOP 1 MapVal1, MapVal2
FROM #TableA
WHERE IsActual IS NULL
ORDER BY MapVal1, MapVal2
)
UPDATE a
SET IsActual = CASE WHEN a.MapVal1 = m.MapVal1 AND a.MapVal2 = m.MapVal2 THEN 1 ELSE 0 END
FROM #TableA a
INNER JOIN MakeActual m ON a.MapVal1 = m.MapVal1 OR a.MapVal2 = m.MapVal2
END
The number of loop iterations equals the number of 'actual' mappings.
The actual performance gain depends a lot on the data.
If the majority of mappings is one-to-one (i.e. hardly any non-actual mappings), then my algorithm will make little difference.
Therefore, it may be wise to keep the initial UPDATE statement from your own code sample (the one with the comment "every one-to-one mapping should be actual").
It may also help to play around with the indexes.
This one should help to further optimize the clause ORDER BY MapVal1, MapVal2:
CREATE NONCLUSTERED INDEX IX_MapVals ON #TableA (MapVal1, MapVal2)
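A further idea (my addition, not from the original answer, and assuming SQL Server 2008 or later): since every iteration only reads rows that still have IsActual = NULL, a filtered index on exactly those rows might help, at the price of extra index maintenance during the updates:
-- hypothetical filtered index covering only the still-unflagged rows
CREATE NONCLUSTERED INDEX IX_MapVals_Pending
ON #TableA (MapVal1, MapVal2)
WHERE IsActual IS NULL;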

Related

Why Optimizer Does Not Use Index Seek on Join

I wonder why the following SELECT statement does not use an Index Seek, but an Index Scan. Is it just because the number of rows is too small, or am I missing something?
Test data:
-- Init Tables
IF OBJECT_ID ( 'tempdb..#wat' ) IS NOT NULL
DROP TABLE #wat;
IF OBJECT_ID ( 'tempdb..#jam' ) IS NOT NULL
DROP TABLE #jam;
CREATE TABLE #wat (
ID INT IDENTITY(1,1) NOT NULL,
Name VARCHAR(15) NOT NULL,
Den DATETIME NOT NULL
)
CREATE TABLE #jam (
ID INT IDENTITY(1,1) NOT NULL,
Name VARCHAR(15) NOT NULL
)
-- Populate Temp Tables with Random Data
DECLARE @length INT
,@charpool VARCHAR(255)
,@poolLength INT
,@RandomString VARCHAR(255)
,@LoopCount INT
SET @Length = RAND() * 5 + 8
SET @CharPool = 'abcdefghijkmnopqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ23456789'
SET @PoolLength = LEN(@CharPool)
SET @LoopCount = 0
SET @RandomString = ''
WHILE (@LoopCount < 500)
BEGIN
INSERT INTO #jam (Name)
SELECT SUBSTRING(@Charpool, CONVERT(int, RAND() * @PoolLength), 5)
SET @LoopCount = @LoopCount + 1
END
-- Insert Rows into Second Temp Table
INSERT INTO #wat( Name, Den )
SELECT TOP 50 Name, GETDATE()
FROM #jam
-- Create Indexes
--DROP INDEX IX_jedna ON #jam
--DROP INDEX IX_dva ON #wat
CREATE INDEX IX_jedna ON #jam (Name) INCLUDE (ID);
CREATE INDEX IX_dva ON #wat (Name) INCLUDE (ID, Den);
-- Select
SELECT *
FROM #jam j
JOIN #wat w
ON w.Name = j.Name
Execution Plan:
There are several ways for the optimiser to perform joins: nested loops, hash match, merge join (your case), and maybe others.
Depending on your data (the number of rows, the existing indexes and statistics) it decides which one is best.
In your example the optimiser assumes a many-to-many relation, and you have both tables sorted (indexed) on these fields.
Why a merge join? It is logical: move through both tables in parallel, and the server only has to do it once.
To make it seek as you want, the server would have to pass through the first table once and perform a seek into the second table many times, since every record has a match in the other table. The server would read all records anyway, so there is no profit in seeking (1000 seeks are even more expensive than one simple pass through 1000 records).
If you want a seek, add some records with no matches, and a WHERE clause to your query.
UPD
Even adding a simple
where j.ID = 1
gives you a seek.
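For instance, a minimal sketch using the temp tables from above (my addition, not part of the original answer):
SELECT *
FROM #jam j
JOIN #wat w
ON w.Name = j.Name
WHERE j.ID = 1 -- the selective predicate makes a nested-loops plan with a seek attractive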

SQL return bitwise flag value for non null columns indicating data significance

I have a case where I need to create a unique weight value based on how much data is contained with a distinct row in a dataset. Each column is assigned a bit value indicating its significance. For instance Col1 = 1, Col2 = 2 would signify that Col2 carries more weight than Col1. Data in both Col1 and Col2 (Col1 | Col2) = 3 would be more significant than a row with data in either column.
Columns at a later time can be reclassified by significance. Therefore I am looking for a solution which would be more versatile than hard coding the bitwise operation into the SQL query (see Option 1).
I have devised Option 2 as a way to overcome the reclassification requirement but I am not sure if this is the best way to write the query. My production dataset will range from 4M-25M records.
Can anyone provide another example, or improve this example, of how to write this query (Option 2) to improve performance and preserve the user's ability to change a column's significance?
Test Code Below:
Environment: SQL Server 2014
--EXAMPLE SETUP
IF OBJECT_ID('tempdb..#DataImage') IS NOT NULL
BEGIN
DROP TABLE #DataImage
DROP TABLE #Source
END
CREATE TABLE #DataImage (ID INT PRIMARY KEY CLUSTERED IDENTITY(1,1), ColumnOne BIT, ColumnTwo BIT, Flag INT)
CREATE TABLE #Source (ID INT PRIMARY KEY CLUSTERED IDENTITY(1,1), DataOne NVARCHAR(50), DataTwo INT)
--END SETUP
--CREATE ROW COMBINATION FLAGS (Real Flag Size = 10 combinations)
INSERT INTO #DataImage (ColumnOne, ColumnTwo, Flag) SELECT 1, 0, 1
INSERT INTO #DataImage (ColumnOne, ColumnTwo, Flag) SELECT 0, 1, 2
INSERT INTO #DataImage (ColumnOne, ColumnTwo, Flag) SELECT 1, 1, 3
--CREATE TEST DATA (Real Data Size = 10M records)
INSERT INTO #Source (DataOne, DataTwo) SELECT NULL, 2
INSERT INTO #Source (DataOne, DataTwo) SELECT N'Foo', NULL
INSERT INTO #Source (DataOne, DataTwo) SELECT N'Bar', 100
--QUERY SETUPS--
--OPTION 1: CALCULATE THE FLAG FOR EVERY ROW
SELECT SourceId = s.ID, Flag = IIF(s.DataOne IS NULL, 0, 1) + IIF(s.DataTwo IS NULL, 0, 2)
FROM #Source s
--OPTION 2: RETURN FLAG FOR SOURCE DATA
;WITH t0 (SourceId, Flag, C1, C2) AS (
SELECT s.ID, i.Flag, IIF(s.DataOne IS NULL, 0, 1), IIF(s.DataTwo IS NULL, 0, 1)
FROM #Source s CROSS JOIN #DataImage i
INTERSECT
SELECT s.ID, i.Flag, ColumnOne, ColumnTwo
FROM #Source s CROSS JOIN #DataImage i
)
SELECT t0.SourceId, t0.Flag
FROM t0
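One possible simplification (an untested sketch of mine, not part of the question): instead of the INTERSECT of two cross joins, join #Source to #DataImage directly on the computed presence bits, which keeps the flag assignment data-driven without scanning the source twice:
SELECT SourceId = s.ID, i.Flag
FROM #Source s
JOIN #DataImage i
ON i.ColumnOne = IIF(s.DataOne IS NULL, 0, 1)
AND i.ColumnTwo = IIF(s.DataTwo IS NULL, 0, 1)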

Left join with nearest value without duplicates

I want to achieve something like the below in MS SQL, using 2 tables and a join instead of iteration.
From table A, I want each row to identify its nearest value in table B, and once a value has been selected, that value cannot be re-used. Please help if you've done something like this before. Thank you in advance! #SOreadyToAsk
Below is a set-based solution using CTEs and windowing functions.
The ranked_matches CTE assigns a closest match rank for each row in TableA along with a closest match rank for each row in TableB, using the index value as a tie breaker.
The best_matches CTE returns rows from ranked_matches that have the best rank (rank value 1) for both rankings.
Finally, the outer query uses a LEFT JOIN from TableA to the best_matches CTE to include the TableA rows that were not assigned a best match because their closest match was already taken.
Note that this does not return a match for the index 3 TableA row indicated in your sample results. The closest match for this row is TableB index 3, a difference of 83. However, that TableB row is a closer match for the TableA index 2 row (a difference of 14), so it was already assigned. Please clarify your question if this isn't what you want; I think this technique can be tweaked accordingly.
CREATE TABLE dbo.TableA(
[index] int NOT NULL
CONSTRAINT PK_TableA PRIMARY KEY
, value int
);
CREATE TABLE dbo.TableB(
[index] int NOT NULL
CONSTRAINT PK_TableB PRIMARY KEY
, value int
);
INSERT INTO dbo.TableA
( [index], value )
VALUES ( 1, 123 ),
( 2, 245 ),
( 3, 342 ),
( 4, 456 ),
( 5, 608 );
INSERT INTO dbo.TableB
( [index], value )
VALUES ( 1, 152 ),
( 2, 159 ),
( 3, 259 );
WITH
ranked_matches AS (
SELECT
a.[index] AS a_index
, a.value AS a_value
, b.[index] b_index
, b.value AS b_value
, RANK() OVER(PARTITION BY a.[index] ORDER BY ABS(a.Value - b.value), b.[index]) AS a_match_rank
, RANK() OVER(PARTITION BY b.[index] ORDER BY ABS(a.Value - b.value), a.[index]) AS b_match_rank
FROM dbo.TableA AS a
CROSS JOIN dbo.TableB AS b
)
, best_matches AS (
SELECT
a_index
, a_value
, b_index
, b_value
FROM ranked_matches
WHERE
a_match_rank = 1
AND b_match_rank= 1
)
SELECT
TableA.[index] AS a_index
, TableA.value AS a_value
, best_matches.b_index
, best_matches.b_value
FROM dbo.TableA
LEFT JOIN best_matches ON
best_matches.a_index = TableA.[index]
ORDER BY
TableA.[index];
EDIT:
Although this method uses CTEs, recursion is not used and is therefore not limited to 32K recursions. There may be room for improvement here from a performance perspective, though.
I don't think it is possible without a cursor.
Even if it is possible to do it without a cursor, it would definitely require self-joins, maybe more than once. As a result, performance is likely to be poor, quite possibly worse than a straightforward cursor. It is also likely that the logic would be hard to understand and maintain later. Sometimes cursors are useful.
The main difficulty is this part of the question:
when value has been selected, that value cannot re-used.
There was a similar question just a few days ago.
The logic is straightforward. The cursor loops through all rows of table A and with each iteration adds one row to the temporary destination table. To determine the value to add, I use the EXCEPT operator, which takes all values from table B and removes those that have been used before. My solution assumes that there are no duplicate values in table B (EXCEPT removes duplicates). If values in table B are not unique, the temporary table would hold the unique indexB instead of valueB, but the main logic remains the same.
Here is SQL Fiddle.
Sample data
DECLARE @TA TABLE (idx int, value int);
INSERT INTO @TA (idx, value) VALUES
(1, 123),
(2, 245),
(3, 342),
(4, 456),
(5, 608);
DECLARE @TB TABLE (idx int, value int);
INSERT INTO @TB (idx, value) VALUES
(1, 152),
(2, 159),
(3, 259);
The main query inserts the result into the table variable @TDst. It is possible to write that INSERT without the explicit variable @CurrValueB, but it looks a bit cleaner with the variable.
DECLARE @TDst TABLE (idx int, valueA int, valueB int);
DECLARE @CurrIdx int;
DECLARE @CurrValueA int;
DECLARE @CurrValueB int;
DECLARE @iFS int;
DECLARE @VarCursor CURSOR;
SET @VarCursor = CURSOR FAST_FORWARD
FOR
SELECT idx, value
FROM @TA
ORDER BY idx;
OPEN @VarCursor;
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
WHILE @iFS = 0
BEGIN
SET @CurrValueB =
(
SELECT TOP(1) Diff.valueB
FROM
(
SELECT B.value AS valueB
FROM @TB AS B
EXCEPT -- remove values that have been selected before
SELECT Dst.valueB
FROM @TDst AS Dst
) AS Diff
ORDER BY ABS(Diff.valueB - @CurrValueA)
);
INSERT INTO @TDst (idx, valueA, valueB)
VALUES (@CurrIdx, @CurrValueA, @CurrValueB);
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
END;
CLOSE @VarCursor;
DEALLOCATE @VarCursor;
SELECT * FROM @TDst ORDER BY idx;
Result
idx valueA valueB
1 123 152
2 245 259
3 342 159
4 456 NULL
5 608 NULL
It would help to have the following indexes:
TableA - (idx) INCLUDE (value), because we SELECT idx, value ... ORDER BY idx;
TableB - (value) unique; temp destination table - (valueB) unique, filtered NOT NULL, to help the EXCEPT. For that reason it may be better to put the result into a temporary #table (or a permanent table) instead of a table variable, because table variables can't have such indexes.
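As a sketch (my addition; it assumes permanent tables dbo.TableA / dbo.TableB and a temp table #TDst holding the result, with illustrative index names):
CREATE NONCLUSTERED INDEX IX_TableA_idx ON dbo.TableA (idx) INCLUDE (value);
CREATE UNIQUE NONCLUSTERED INDEX IX_TableB_value ON dbo.TableB (value);
-- a filtered unique index on the destination helps the repeated EXCEPT
CREATE UNIQUE NONCLUSTERED INDEX IX_TDst_valueB ON #TDst (valueB) WHERE valueB IS NOT NULL;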
Another possible method would be to delete a row from table B (from the original or from a copy) as its value is inserted into the result. With this method we avoid performing the EXCEPT again and again, so it could be faster overall, especially if it is OK to leave table B empty in the end. Still, I don't see how to avoid a cursor and processing individual rows in sequence.
SQL Fiddle
DECLARE @TDst TABLE (idx int, valueA int, valueB int);
DECLARE @CurrIdx int;
DECLARE @CurrValueA int;
DECLARE @iFS int;
DECLARE @VarCursor CURSOR;
SET @VarCursor = CURSOR FAST_FORWARD
FOR
SELECT idx, value
FROM @TA
ORDER BY idx;
OPEN @VarCursor;
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
WHILE @iFS = 0
BEGIN
WITH
CTE
AS
(
SELECT TOP(1) B.idx, B.value
FROM @TB AS B
ORDER BY ABS(B.value - @CurrValueA)
)
DELETE FROM CTE
OUTPUT @CurrIdx, @CurrValueA, deleted.value INTO @TDst;
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
END;
CLOSE @VarCursor;
DEALLOCATE @VarCursor;
SELECT
A.idx
,A.value AS valueA
,Dst.valueB
FROM
@TA AS A
LEFT JOIN @TDst AS Dst ON Dst.idx = A.idx
ORDER BY idx;
I know THIS IS NOT GOOD PRACTICE, because it bypasses the rule SQL Server enforces on itself that functions with side effects (INSERT, UPDATE, DELETE) are a no-go. But since I wanted to solve this without resorting to iteration, I came up with the following, and it gave me a better view of things.
create table tablea
(
num INT,
val MONEY
)
create table tableb
(
num INT,
val MONEY
)
I created a hard temp table which I drop and re-create from time to time.
if((select 1 from sys.tables where name = 'temp_tableb') is not null) begin drop table temp_tableb end
select * into temp_tableb from tableb
I created a function that executes xp_cmdshell (this is where the side-effect bypassing happens):
CREATE FUNCTION [dbo].[GetNearestMatch]
(
@ParamValue MONEY
)
RETURNS MONEY
AS
BEGIN
DECLARE @ReturnNum MONEY
, @ID INT
SELECT TOP 1
@ID = num
, @ReturnNum = val
FROM temp_tableb ORDER BY ABS(val - @ParamValue)
DECLARE @SQL varchar(500)
SELECT @SQL = 'osql -S' + @@servername + ' -E -q "delete from test..temp_tableb where num = ' + CONVERT(NVARCHAR(150), @ID) + ' "'
EXEC master..xp_cmdshell @SQL
RETURN @ReturnNum
END
and my usage in my query simply looks like this.
-- initialize temp
if((select 1 from sys.tables where name = 'temp_tableb') is not null) begin drop table temp_tableb end
select * into temp_tableb from tableb
-- query nearest match
select
*
, dbo.GetNearestMatch(a.val) AS [NearestValue]
from tablea a
and it gave me the result I was after.

Insert missing values from table

I have a table with a PK that grows fairly quickly, but since rows are fairly consistently deleted, it quickly becomes very sparse, as such:
ID VALUE
----------------
1 'Test'
5 'Test 2'
24 'Test 3'
67 'Test 4'
Is there a way that I can automatically insert the next value in the missing IDs so that I don't grow that ID extremely large? For example, I'd like to insert 'Test 5' with ID 2.
I wouldn't do that.
As already explained by others in the comments, you gain nothing by re-filling gaps in the numbers.
Plus, you might even unintentionally mess up your data if you refer to these IDs anywhere else:
Let's say that there once was a row with ID 2 and you deleted it.
Then you insert a complete new row and re-use ID 2.
Now if you have any data anywhere that references ID 2, it suddenly links to the new value instead of the old one.
(Note to nit-pickers: Yes, this should not happen if referential integrity is set up properly. But this is not the case everywhere, so who knows...)
I'm not suggesting doing what you're trying to do, but if you want to do it, this is how. I am only answering the question, not solving the problem.
In your proc, you'd want to lock your table while doing this so that you don't get one that sneaks in, by using something like this:
EXEC @result = sp_getapplock @Resource = @LockResource,
@LockMode = 'Exclusive'
AND
EXEC sp_releaseapplock @Resource = @LockResource
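Put together, a minimal sketch might look like this (my addition; the resource name 'FillGapLock' is made up, and the explicit transaction is there because the default lock owner for sp_getapplock is the transaction):
DECLARE @result INT
BEGIN TRAN
-- serialize gap-filling inserts across sessions
EXEC @result = sp_getapplock @Resource = 'FillGapLock', @LockMode = 'Exclusive'
IF @result >= 0
BEGIN
-- ... find the first gap and INSERT into it here ...
EXEC sp_releaseapplock @Resource = 'FillGapLock'
END
COMMIT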
TABLE
DECLARE @table TABLE ( id INT, val VARCHAR(20) )
DATA
INSERT INTO @table
(
id,
val
)
SELECT 1,
'Test'
UNION ALL
SELECT 2,
'Test'
UNION ALL
SELECT 5,
'Test 2'
UNION ALL
SELECT 24,
'Test 3'
UNION ALL
SELECT 67,
'Test 4'
Queries
INSERT INTO @table
SELECT TOP 1
id + 1,
'TEST'
FROM @table t1
WHERE NOT EXISTS ( SELECT TOP 1
1
FROM @table
WHERE id = t1.id + 1 )
ORDER BY id
INSERT INTO @table
SELECT TOP 1
id + 1,
'TEST'
FROM @table t1
WHERE NOT EXISTS ( SELECT TOP 1
1
FROM @table
WHERE id = t1.id + 1 )
ORDER BY id
SELECT *
FROM @table
RESULT
id val
1 Test
2 Test
5 Test 2
24 Test 3
67 Test 4
3 TEST
4 TEST
I deleted my answer about identity since it is not involved. It would be interesting to know if you are using this as a clustered index key, since filling in gaps would violate the rule of thumb of strictly increasing values.
To just fill in gaps is relatively simple with a self-join and since you have a primary key, this query should run quickly to find the first gap (but of course, how are you handling simultaneous inserts and locks?):
SELECT lhs.ID + 1 AS firstgap
FROM tablename AS lhs
LEFT JOIN tablename AS rhs
ON rhs.ID = lhs.ID + 1
WHERE rhs.ID IS NULL
And inserting batches of records requires each insert to be done separately, while IDENTITY can handle that for you...
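As a sketch (assuming the table is named tablename with columns ID and VALUE, and that ID is not an IDENTITY column), the gap query can feed a single-row insert directly:
INSERT INTO tablename (ID, VALUE)
SELECT TOP 1 lhs.ID + 1, 'Test 5'
FROM tablename AS lhs
LEFT JOIN tablename AS rhs
ON rhs.ID = lhs.ID + 1
WHERE rhs.ID IS NULL
ORDER BY lhs.ID -- take the lowest gap first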
As said before: don't worry about the unused ID's.
It is, however, good practice to optimize the table when a lot of deletes happen.
In MySQL you can do this with:
optimize table tablename
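Since the question targets SQL Server, the rough equivalent there (my addition, not part of the original answer) is to reorganize or rebuild the table's indexes:
ALTER INDEX ALL ON dbo.tablename REORGANIZE;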

Add empty row to query results if no results found

I'm writing stored procs that are being called by a legacy system. One of the constraints of the legacy system is that there must be at least one row in the single result set returned from the stored proc. The standard is to return a zero in the first column (yes, I know!).
The obvious way to achieve this is to create a temp table, put the results into it, test for any rows in the temp table, and either return the results from the temp table or the single empty result.
Another way might be to run an EXISTS against the same WHERE clause as in the main query before the main query is executed.
Neither of these is very satisfying. Can anyone think of a better way? I was thinking along the lines of a UNION, kind of like this (I'm aware this doesn't work):
--create table #test
--(
-- id int identity,
-- category varchar(10)
--)
--go
--insert #test values ('A')
--insert #test values ('B')
--insert #test values ('C')
declare @category varchar(10)
set @category = 'D'
select
id, category
from #test
where category = @category
union
select
0, ''
from #test
where @@rowcount = 0
Very few options, I'm afraid.
You always have to touch the table twice, whether via COUNT, an EXISTS beforehand, an EXISTS in a UNION, a TOP clause, etc.
select
id, category
from mytable
where category = @category
union all --edit, of course it's quicker
select
0, ''
where NOT EXISTS (SELECT * FROM mytable where category = @category)
An EXISTS solution is better than COUNT because it can stop as soon as it finds a row, whereas COUNT has to traverse all rows to actually count them.
It's an old question, but I had the same problem.
The solution is really simple, WITHOUT a double select:
select top(1) WITH TIES * FROM (
select
id, category, 1 as orderdummy
from #test
where category = @category
union select 0, '', 2) x -- the derived table needs an alias
ORDER BY orderdummy
by the "WITH TIES" you get ALL rows (all have a 1 as "orderdummy", so all are ties), or if there is no result, you get your defaultrow.
You can use a full outer join. Something to the effect of ...
declare @category varchar(10)
set @category = 'D'
select t.id, ISNULL(t.category, @category) as category from (
select
id, category
from #test
where category = @category
) as t
FULL OUTER JOIN (Select @category as CategoryHelper ) as EmptyHelper on 1=1
I am currently performance testing this scenario myself, so I am not sure what kind of impact it would have, but it will give you a blank row with Category populated.
This is @swe's answer, just reformatted:
CREATE FUNCTION [mail].[f_GetRecipients]
(
@MailContentCode VARCHAR(50)
)
RETURNS TABLE
AS
RETURN
(
SELECT TOP 1 WITH TIES -- Returns either all Priority 1 rows or, if none exist, all Priority 2 rows
[To],
CC,
BCC
FROM (
SELECT
[To],
CC,
BCC,
1 AS Priority
FROM mail.Recipients
WHERE 1 = 1
AND IsActive = 1
AND MailContentCode = @MailContentCode
UNION ALL
SELECT
*,
2 AS Priority
FROM (VALUES
(N'system@company.com', NULL, NULL),
(N'author@company.com', NULL, NULL)
) defaults([To], CC, BCC)
) emails
ORDER BY Priority
)
I guess you could try:
Declare @count int
set @count = 0
Begin
Select @count = Count([Column])
From -- your query here
if (@count = 0)
select 0
else -- run your query
The downside is that you're effectively running your query twice; the upside is that you're skipping the temp table.
To avoid duplicating the selecting query, how about a temp table to store the query result first? Then, based on the temp table, return the default row if it is empty, or return the temp table's rows when it has results. A sketch of this idea follows.
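A minimal sketch of that temp-table approach (my addition, reusing the #test table and @category variable from the question):
SELECT id, category
INTO #results
FROM #test
WHERE category = @category
IF EXISTS (SELECT 1 FROM #results)
SELECT id, category FROM #results -- the real result rows
ELSE
SELECT 0 AS id, '' AS category -- the single default row
DROP TABLE #results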