Iterating through every row efficient? - sql

Suppose I have the following table T1:
| col1 | col2 |
|------|------|
| 0 | 0 | // A++
| 3 | 123 | // C++
| 0 | 5 | // B++
| 8 | 432 | // C++
| 0 | 4 | // B++
I now need to create a trigger (on INSERT) that analyses every row, increments a counter (see below), and populates table T2 with the counter values:
IF col1 = 0 AND col2 = 0
A++
ELSE IF col1 = 0 AND col2 > 0
B++
ELSE IF col1 > 0
C++
In this case, T2 would look like:
| id | A | B | C |
|----|---|---|---|
| 1 | 1 | 2 | 2 |
My question is more about the design: Should I really iterate through each row, as described HERE, or is there a more efficient way?

Try something like this in the trigger:
;with data as
(
SELECT Sum(CASE WHEN col1 = 0 AND col2 = 0 THEN 1 ELSE 0 END) AS a,
Sum(CASE WHEN col1 = 0 AND col2 > 0 THEN 1 ELSE 0 END) AS b,
Sum(CASE WHEN col1 > 0 THEN 1 ELSE 0 END) AS c
FROM (VALUES (0, 0 ),
(3, 123 ),
(0, 5 ),
(8, 432 ),
(0, 4 ) ) tc ( col1, col2 )
)
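UPDATE yt
SET a = dt.a,
b = dt.b,
c = dt.c
FROM yourtable yt
CROSS JOIN data dt -- data returns a single row and has no id column to join on
This does not require row-by-row iteration. Replace the table value constructor with the inserted table. Since inserted holds only the newly inserted rows, add its sums to the existing counters (for example a = yt.a + ISNULL(dt.a, 0)) rather than overwriting them.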
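To make the set-based idea concrete, here is a minimal sketch that runs the same conditional aggregation against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server here, and the function name count_categories and the table name t1 are only illustrative.

```python
import sqlite3

def count_categories(rows):
    """Count categories A, B and C for (col1, col2) pairs in one set-based query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (col1 INTEGER, col2 INTEGER)")
    conn.executemany("INSERT INTO t1 (col1, col2) VALUES (?, ?)", rows)
    # One pass over the table; each SUM counts the rows matching its condition.
    # COALESCE covers the empty-table case, where SUM would return NULL.
    counts = conn.execute(
        """
        SELECT COALESCE(SUM(CASE WHEN col1 = 0 AND col2 = 0 THEN 1 ELSE 0 END), 0) AS a,
               COALESCE(SUM(CASE WHEN col1 = 0 AND col2 > 0 THEN 1 ELSE 0 END), 0) AS b,
               COALESCE(SUM(CASE WHEN col1 > 0 THEN 1 ELSE 0 END), 0) AS c
        FROM t1
        """
    ).fetchone()
    conn.close()
    return counts

sample = [(0, 0), (3, 123), (0, 5), (8, 432), (0, 4)]
print(count_categories(sample))  # (1, 2, 2)
```

No loop over rows appears anywhere in the query; the database does the counting in a single scan.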

This is something you should not write into a table (unless there are millions of rows and you need this for performance). You should rather compute this information on the fly, like this:
DECLARE @T1 TABLE(col1 INT,col2 INT);
INSERT INTO @T1(col1,col2) VALUES
(0,0)
,(3,123)
,(0,5)
,(8,432)
,(0,4);
SELECT p.*
FROM
(
SELECT CASE WHEN col1=0 AND col2=0 THEN 'A'
WHEN col1=0 AND col2>0 THEN 'B'
WHEN col1>0 THEN 'C' END AS Category
FROM @T1 AS t
) AS tbl
PIVOT
(
COUNT(Category) FOR Category IN(A,B,C)
) AS p
The result
A B C
1 2 2
I would also suggest adding another branch (with ELSE) to catch invalid data (e.g. negative values).
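As a rough illustration of the ELSE suggestion, the sketch below groups rows by category and sends anything that matches no condition into an extra bucket. It uses SQLite via Python's sqlite3 rather than SQL Server; the bucket label 'invalid', the function name categorize, and the two extra sample rows with negative values are assumptions made for the example.

```python
import sqlite3

def categorize(rows):
    """Return a dict of category -> row count, with an 'invalid' catch-all bucket."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (col1 INTEGER, col2 INTEGER)")
    conn.executemany("INSERT INTO t1 (col1, col2) VALUES (?, ?)", rows)
    counts = dict(
        conn.execute(
            """
            SELECT CASE WHEN col1 = 0 AND col2 = 0 THEN 'A'
                        WHEN col1 = 0 AND col2 > 0 THEN 'B'
                        WHEN col1 > 0 THEN 'C'
                        ELSE 'invalid'   -- negative or NULL values end up here
                   END AS category,
                   COUNT(*)
            FROM t1
            GROUP BY category
            """
        ).fetchall()
    )
    conn.close()
    return counts

# The last two rows are hypothetical bad data, added only to exercise the ELSE branch.
rows = [(0, 0), (3, 123), (0, 5), (8, 432), (0, 4), (-1, 7), (0, -2)]
print(categorize(rows))
```

Without the ELSE branch those two rows would get a NULL category and silently drop out of the A/B/C counts.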

Related

Why and How do ORDER BY CASE Queries Work in SQL Server?

Let's look at the following table:
| col1 | col2 |
| -------- | -------------- |
| 1 | NULL |
| 23 | c |
| 73 | NULL |
| 43 | a |
| 3 | d |
Suppose you wanted to sort it like this:
| col1 | col2 |
| -------- | -------------- |
| 1 | NULL |
| 73 | NULL |
| 43 | a |
| 23 | c |
| 3 | d |
With the following code this would be almost trivial:
SELECT *
FROM dbo.table1
ORDER BY col2;
However, to sort it in the following, non-standard way isn't that easy:
| col1 | col2 |
| -------- | -------------- |
| 43 | a |
| 23 | c |
| 3 | d |
| 1 | NULL |
| 73 | NULL |
I did it with the following code:
SELECT *
FROM dbo.table1
ORDER BY CASE WHEN col2 IS NULL THEN 1 ELSE 0 END, col2;
Can you explain to me 1) why and 2) how this query works? What bugs me is that the CASE-statement returns either 1 or 0 which means that either ORDER BY 1, col2 or ORDER BY 0, col2 will be executed. But the following code gives me an error:
SELECT *
FROM dbo.table1
ORDER BY 0, col2;
Yet, the overall statement works. Why?
How does this work?
ORDER BY (CASE WHEN col2 IS NULL THEN 1 ELSE 0 END),
col2;
Well, it works exactly as the code specifies. The first key for the ORDER BY takes on the values of 1 and 0 based on col2. The 1 is only when the value is NULL. Because 1 > 0, these are sorted after the non-NULL values. So, all non-NULL values are first and then all NULL values.
How are the non-NULL values sorted? That is where the second key comes in. They are ordered by col2.
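To see the two sort keys side by side, here is a small sketch using SQLite through Python's sqlite3 (an assumption; the question is about SQL Server, but the CASE logic is the same). The computed key is returned as an extra column sort_key, and col1 is added as a third key only to make the order of the two NULL rows deterministic.

```python
import sqlite3

def sort_nulls_last(rows):
    """Return (col1, col2, sort_key) rows ordered with NULL col2 values last."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
    conn.executemany("INSERT INTO table1 (col1, col2) VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT col1, col2,
               CASE WHEN col2 IS NULL THEN 1 ELSE 0 END AS sort_key
        FROM table1
        ORDER BY sort_key, col2, col1  -- col1 is an added tie-breaker (assumption)
        """
    ).fetchall()
    conn.close()
    return result

rows = [(1, None), (23, "c"), (73, None), (43, "a"), (3, "d")]
for row in sort_nulls_last(rows):
    print(row)
# (43, 'a', 0)
# (23, 'c', 0)
# (3, 'd', 0)
# (1, None, 1)
# (73, None, 1)
```

Every row gets its own key value (0 or 1), which is why the expression is not the same thing as writing ORDER BY 0 or ORDER BY 1 once for the whole query.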
Starting with this sample data:
--==== Sample Data
DECLARE @t TABLE (col1 INT, col2 VARCHAR(10));
INSERT @t(col1,col2) VALUES (1,NULL),(23,'c'),(73,NULL),(43,'a'),(3 ,'d');
Now note these three queries, which all sort the rows the same way.
--==== QUERY1: Note the derived query
SELECT t.col1, t.col2
FROM
(
SELECT t.col1, t.col2, SortBy = CASE WHEN col2 IS NULL THEN 1 ELSE 0 END
FROM @t AS t
) AS t
ORDER BY t.SortBy;
--==== QUERY2: This does the same thing but with less code
SELECT t.col1, t.col2, SortBy = CASE WHEN col2 IS NULL THEN 1 ELSE 0 END
FROM @t AS t
ORDER BY SortBy;
--==== QUERY3: This is QUERY2 simplified
SELECT t.col1, t.col2
FROM @t AS t
ORDER BY CASE WHEN col2 IS NULL THEN 1 ELSE 0 END;
Note that you can shorten the CASE expression with IIF (SQL Server 2012 and later). The simple form CASE col2 WHEN NULL THEN 1 ELSE 0 END does not work here, because it tests col2 = NULL, which is never true:
--==== Simplified CASE expression example
SELECT t.col1, t.col2
FROM @t AS t
ORDER BY IIF(col2 IS NULL,1,0);
Try this:
DECLARE @Table TABLE (col1 int, col2 char(1))
INSERT INTO @Table
VALUES
( 1 , NULL)
, ( 23, 'c' )
, ( 73, NULL)
, ( 43, 'a' )
, ( 3 , 'd' )
;
SELECT *
FROM @Table
ORDER BY ISNULL(col2, CHAR(255))
Common table expressions can be a big help both as a way of clarifying an issue as well as solving it. If you move the CASE clause up into the CTE and then use it to sort, this answers both why and how it works.
WITH Qry1 AS (
SELECT col1,
col2,
CASE WHEN col2 IS NULL THEN 1 ELSE 0 END As SortKey
FROM dbo.table1
)
SELECT *
FROM Qry1
ORDER BY SortKey, col2;
This is the syntax of ORDER BY in Oracle Database SQL:
ORDER [ SIBLINGS ] BY
{ expr | position | c_alias }
[ ASC | DESC ]
[ NULLS FIRST | NULLS LAST ]
[, { expr | position | c_alias }
[ ASC | DESC ]
[ NULLS FIRST | NULLS LAST ]
]...
We can see that position and expr are listed as separate alternatives in the syntax. From this we can conclude that the 0 and 1 are not treated as a position: the CASE expression is an expr, not a position, even though it evaluates to a number that could be read as a position value.
I think this view can be applied to T-SQL too.
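The position-versus-expression distinction can be checked with a quick experiment. The sketch below uses SQLite via Python's sqlite3, which is neither Oracle nor SQL Server, so it only shows that this engine draws the same line: a bare integer literal is read as a column position, while a CASE expression that evaluates to 0 or 1 is an ordinary sort key. All function names are illustrative.

```python
import sqlite3

def make_table():
    """Build the sample table from the question in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
    conn.executemany(
        "INSERT INTO table1 (col1, col2) VALUES (?, ?)",
        [(1, None), (23, "c"), (73, None), (43, "a"), (3, "d")],
    )
    return conn

def order_by_zero_is_rejected(conn):
    """A literal 0 is parsed as a position, and position 0 does not exist."""
    try:
        conn.execute("SELECT col1, col2 FROM table1 ORDER BY 0, col2").fetchall()
    except sqlite3.Error:
        return True
    return False

def order_by_position(conn):
    """A literal 1 is parsed as 'the first select-list column', here col1."""
    return conn.execute("SELECT col1, col2 FROM table1 ORDER BY 1").fetchall()

def order_by_expression(conn):
    """A CASE expression is a per-row sort key, even though it yields 0 or 1."""
    return conn.execute(
        "SELECT col1, col2 FROM table1 "
        "ORDER BY CASE WHEN col2 IS NULL THEN 1 ELSE 0 END, col2, col1"
    ).fetchall()

conn = make_table()
print(order_by_zero_is_rejected(conn))
print(order_by_position(conn))
print(order_by_expression(conn))
```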

How to Update the Following Table with MERGE in Oracle?

I have the following data set (Oracle 12):
Table X
+---------+--------+---------------+--------+
| COLN | COLM | COLK | COLP |
+---------+--------+---------------+--------+
| 1 | 500 | K1 | 777 |
+---------+--------+---------------+--------+
Table A
+---------+--------+---------------+--------+
| COL1 | COL2 | COL3 | COL4 |
+---------+--------+---------------+--------+
| 1 | K1 | 500 | B |
| 1 | K2 | 500 | NULL |
+---------+--------+---------------+--------+
Table B
+---------+--------+---------+
| COLZ | COLX | COLW |
+---------+--------+---------+
| 1 | K1 | 777 |
| 1 | K2 | 678 |
+---------+--------+---------+
The three tables have the following commonality:
X.COLN = A.COL1 = B.COLZ
X.COLK = A.COL2 = B.COLX
X.COLM = A.COL3
I need to write a query which retrieves values for the following columns in one query:
X.COLK, X.COLP, B.COLX, B.COLW
The ultimate goal is to update Table X if the following conditions are met:
There is more than one record in Table A where A.COL1 and A.COL3 match (and there is a corresponding record in Table X),
and one of the rows is not null (e.g. A.COL4 = B) and another one is NULL.
In that case, my MERGE statement should replace X.COLK and X.COLP (K1 and 777) with the values from Table B (B.COLX and B.COLW, here K2 and 678).
Is this possible?
MERGE INTO X FX
USING (
SELECT COLX ONGOING_X, COLW ONGOING_W
FROM B
WHERE (COLZ, COLX) IN
(SELECT COL1, COL2
FROM A
WHERE COL3 = ?
AND COL1 = ?
AND COL4 IS NULL)
) NEW_B
ON (FX.COLK = ?
AND FX.COLP = ?)
WHEN MATCHED THEN
UPDATE SET
FX.COLK = NEW_B.ONGOING_X,
FX.COLP = NEW_B.ONGOING_W;
You may do a MERGE using ROWID.
MERGE INTO x tgt USING (
WITH c AS (
SELECT col1,
col3,
MAX(
CASE
WHEN col4 IS NULL THEN col2
END
) AS col2 --Ongoing col2 as indicated from col4
FROM a
GROUP BY col1,
col3
HAVING COUNT(
CASE
WHEN col4 IS NULL THEN 1
END
) = 1 AND COUNT(col4) = 1 --Contains one and exactly one NULL and one NON NULL
) SELECT x.rowid AS rid,
b.*
FROM x
JOIN c ON c.col1 = x.coln AND c.col3 = x.colm
JOIN b ON b.colz = c.col1 AND b.colx = c.col2 --Join with ongoing value from c( a.k.a table A )
)
src ON ( tgt.rowid = src.rid ) --ROWID match
WHEN MATCHED THEN UPDATE SET tgt.colk = src.colx,
tgt.colp = src.colw;
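The source subquery of that MERGE can be tried out on its own. The sketch below runs it on the sample data in SQLite via Python's sqlite3; SQLite has no MERGE and no Oracle ROWID semantics, so this only illustrates the GROUP BY ... HAVING selection and the joins, and returns the X row's key columns instead of a rowid. The function name find_replacements is made up for the example.

```python
import sqlite3

def find_replacements():
    """Return (coln, colm, new_colk, new_colp) for X rows that qualify for the update."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE x (coln INTEGER, colm INTEGER, colk TEXT, colp INTEGER);
        CREATE TABLE a (col1 INTEGER, col2 TEXT, col3 INTEGER, col4 TEXT);
        CREATE TABLE b (colz INTEGER, colx TEXT, colw INTEGER);
        INSERT INTO x VALUES (1, 500, 'K1', 777);
        INSERT INTO a VALUES (1, 'K1', 500, 'B'), (1, 'K2', 500, NULL);
        INSERT INTO b VALUES (1, 'K1', 777), (1, 'K2', 678);
        """
    )
    rows = conn.execute(
        """
        WITH c AS (
            SELECT col1, col3,
                   -- col2 of the row whose col4 is NULL (the "ongoing" one)
                   MAX(CASE WHEN col4 IS NULL THEN col2 END) AS col2
            FROM a
            GROUP BY col1, col3
            -- exactly one NULL col4 and exactly one non-NULL col4 in the group
            HAVING COUNT(CASE WHEN col4 IS NULL THEN 1 END) = 1
               AND COUNT(col4) = 1
        )
        SELECT x.coln, x.colm, b.colx, b.colw
        FROM x
        JOIN c ON c.col1 = x.coln AND c.col3 = x.colm
        JOIN b ON b.colz = c.col1 AND b.colx = c.col2
        """
    ).fetchall()
    conn.close()
    return rows

print(find_replacements())  # [(1, 500, 'K2', 678)]
```

The single result row carries K2 and 678, the values the MERGE would write into X.COLK and X.COLP.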

SQL help on count in certain ID

Do you know how to display only the lines for IDs where col3 is never 'X'?
e.g., in the following table, it should display only ID 2 (as all its col3 values are null)
ID | col1 | col2 | col3
---+------+------+-----
1 | 0 | 0 | X
1 | D | C | null
1 | D | C | null
2 | 0 | 0 | null
2 | D | C | null
2 | D | C | null
It should work for any number of lines per ID and return only the IDs whose lines all have null in col3.
If you are looking for the records whose ID has no row with 'X' in col3:
SELECT Y.*
FROM Your_Table Y
WHERE Y.ID NOT IN (SELECT X.ID FROM YOUR_TABLE X WHERE X.ID=Y.ID AND X.COL3='X')
Most DBMSs use three-valued logic: True, False, and Unknown. NULL <> 3 evaluates to Unknown, since NULL stands for an unknown value, and a WHERE clause keeps only rows where the condition is True. You need to handle NULLs explicitly.
SELECT *
FROM Your_Table
WHERE col3 <> 'X'
OR col3 IS NULL;
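A quick way to see the three-valued logic at work is to count rows under each predicate. This is a sketch against SQLite via Python's sqlite3 (an assumption; the thread does not name a DBMS), with the sample data from the question. The helper count_rows takes a hard-coded WHERE clause purely for demonstration and is not meant for untrusted input.

```python
import sqlite3

ROWS = [
    (1, "0", "0", "X"),
    (1, "D", "C", None),
    (1, "D", "C", None),
    (2, "0", "0", None),
    (2, "D", "C", None),
    (2, "D", "C", None),
]

def count_rows(where_clause):
    """Count sample rows matching a WHERE clause (demo only, trusted input)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", ROWS)
    n = conn.execute("SELECT COUNT(*) FROM your_table WHERE " + where_clause).fetchone()[0]
    conn.close()
    return n

print(count_rows("col3 = 'X'"))                     # 1
print(count_rows("col3 <> 'X'"))                    # 0: NULL <> 'X' is Unknown, not True
print(count_rows("col3 <> 'X' OR col3 IS NULL"))    # 5
```

Note that the first two counts do not add up to the six rows in the table; the five NULL rows satisfy neither predicate.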
select * from table
where (col1 = col2) and (col3 <> 'X')
Use window functions or not exists:
select t.*
from t
where not exists (select 1 from t t2 where t2.id = t.id and t2.col3 = 'X');
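The answer mentions window functions without showing one, so here is a hedged sketch of both variants side by side, run in SQLite (3.25 or later for window functions) via Python's sqlite3. The x_count column and the function name ids_without_x are assumptions for the example.

```python
import sqlite3

def ids_without_x():
    """Return the rows of IDs that have no 'X' in col3, computed two ways."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?)",
        [
            (1, "0", "0", "X"), (1, "D", "C", None), (1, "D", "C", None),
            (2, "0", "0", None), (2, "D", "C", None), (2, "D", "C", None),
        ],
    )
    # Variant 1: NOT EXISTS, as in the answer above.
    not_exists = conn.execute(
        """
        SELECT t.id, t.col1, t.col2, t.col3
        FROM t
        WHERE NOT EXISTS (SELECT 1 FROM t AS t2 WHERE t2.id = t.id AND t2.col3 = 'X')
        ORDER BY t.id, t.col1
        """
    ).fetchall()
    # Variant 2: count the 'X' rows per id with a window function, keep ids with none.
    windowed = conn.execute(
        """
        SELECT id, col1, col2, col3
        FROM (
            SELECT t.*,
                   SUM(CASE WHEN col3 = 'X' THEN 1 ELSE 0 END)
                       OVER (PARTITION BY id) AS x_count
            FROM t
        ) AS counted
        WHERE x_count = 0
        ORDER BY id, col1
        """
    ).fetchall()
    conn.close()
    return not_exists, windowed

a, b = ids_without_x()
print(a == b, a)
```

Both variants return only the three rows of ID 2, because ID 1 has one row with 'X'.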

Update unique rows in SQL

I have a table
id | col1 | col3| col4
1 | x | r |
2 | y | m |
3 | z | p |
4 | x | r |
I have to update all unique rows of this table,
i.e.
id | col1 | col3| col4
1 | x | r | 1
2 | y | m | 1
3 | z | p | 1
4 | x | r | 0
I can fetch the unique rows with
select distinct col1,col3 from table
But how can I identify these rows in order to update them? Please help.
You can use GROUP BY to pick one row per unique combination:
SELECT MIN(ID) AS ID FROM TABLE GROUP BY COL1, COL3;
id | col1 | col3
1 | x | r
2 | y | m
3 | z | p
Then
UPDATE TABLE SET col4 = 1 WHERE ID IN (SELECT MIN(ID) FROM TABLE GROUP BY COL1, COL3);
The restriction is that the id column must be unique.
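A minimal sketch of the MIN(id) approach, run in SQLite via Python's sqlite3 (the thread does not say which DBMS is in use). As a small variation, it sets col4 to 1 or 0 in a single statement with a CASE; the table name my_table and the function name flag_first_rows are illustrative.

```python
import sqlite3

def flag_first_rows():
    """Set col4 = 1 on the lowest id of each (col1, col3) group and 0 elsewhere."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER PRIMARY KEY, col1 TEXT, col3 TEXT, col4 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO my_table (id, col1, col3) VALUES (?, ?, ?)",
        [(1, "x", "r"), (2, "y", "m"), (3, "z", "p"), (4, "x", "r")],
    )
    conn.execute(
        """
        UPDATE my_table
        SET col4 = CASE
                       WHEN id IN (SELECT MIN(id) FROM my_table GROUP BY col1, col3)
                       THEN 1 ELSE 0
                   END
        """
    )
    result = conn.execute("SELECT id, col4 FROM my_table ORDER BY id").fetchall()
    conn.close()
    return result

print(flag_first_rows())  # [(1, 1), (2, 1), (3, 1), (4, 0)]
```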
If it is a small enough table, here is what you can do
Step 1: Update everything to 1
Update Table Set Col4 = 1
Step 2: Update all duplicates to 0 (off the top of my head)
Update Table
Set Col4 = 0
From
(
Select Col1, Min (Id) FirstId
From Table
Group By Col1
Having Count (*) > 1
) Duplicates
Where Table.Col1 = Duplicates.Col1
And Table.Id <> Duplicates.FirstId
You can also try:
UPDATE table_name
SET col4 = 1
WHERE id IN
(
SELECT t1.id
FROM table_name t1
LEFT JOIN table_name t2
ON t2.id < t1.id
AND t2.col1 = t1.col1
AND t2.col3 = t1.col3
WHERE t2.id IS NULL
)
One more slightly convoluted option, to set both 0 and 1 values in one hit:
update my_table mt
set col4 = (
select case when rn = 1 then 1 else 0 end
from (
select id,
row_number() over (partition by col1, col3 order by id) as rn
from my_table) tt
where tt.id = mt.id);
4 rows updated.
select * from my_table order by id;
ID COL1 COL3 COL4
---------- ---- ---- ----------
1 x r 1
2 y m 1
3 z p 1
4 x r 0
This is just using row_number() to decide which of the unique combinations is first, arbitrarily using the lowest id, assigning that the value of one, and everything else zero.
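To make the numbering visible, the sketch below prints the row_number() value for each row before any update happens. It runs in SQLite (3.25 or later) via Python's sqlite3, which is an assumption; the output above looks like it came from a different client. The function name number_rows is illustrative.

```python
import sqlite3

def number_rows():
    """Return (id, rn), where rn numbers the rows inside each (col1, col3) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER, col1 TEXT, col3 TEXT, col4 INTEGER)")
    conn.executemany(
        "INSERT INTO my_table (id, col1, col3) VALUES (?, ?, ?)",
        [(1, "x", "r"), (2, "y", "m"), (3, "z", "p"), (4, "x", "r")],
    )
    result = conn.execute(
        """
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY col1, col3 ORDER BY id) AS rn
        FROM my_table
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result

print(number_rows())  # [(1, 1), (2, 1), (3, 1), (4, 2)]
```

Only id 4 gets rn = 2, since it repeats the (x, r) combination first seen at id 1; the CASE in the update maps rn = 1 to 1 and everything else to 0.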

SELECT with calculated column that is dependent upon a correlation

I don't do a lot of SQL, and most of the time I'm doing CRUD operations. Occasionally I'll get something a bit more complicated. So, this question may be a newbie question, but I'm ready. I've just been trying to figure this out for hours, and it's been no use.
So, Imagine the following table structure:
> | ID | Col1 | Col2 | Col3 | .. | Col8 |
I want to select ID and a calculated column. The calculated column has a range of 0 - 8 and it contains the number of matches to the query. I also want to restrict the result set to only include rows that have a certain number of matches.
So, from this sample data:
> | 1 | 'a' | 'b' | 1 | 2 |
> | 2 | 'b' | 'c' | 1 | 2 |
> | 3 | 'b' | 'c' | 4 | 5 |
> | 4 | 'x' | 'x' | 9 | 9 |
I want to query on Col1 = 'a' OR Col2 = 'c' OR Col3 = 1 OR Col4 = 5 where the calculated result > 1 and have the result set look like:
> | ID | Cal |
> | 1 | 2 |
> | 2 | 2 |
> | 3 | 2 |
I'm using T-SQL and SQL Server 2005, if it matters, and I can't change the DB Schema.
I'd also prefer to keep it as one self-contained query and not have to create a stored procedure or temporary table.
This answer will work with SQL 2005, using a CTE to clean up the derived table a little.
WITH Matches AS
(
SELECT ID, CASE WHEN Col1 = 'a' THEN 1 ELSE 0 END +
CASE WHEN Col2 = 'c' THEN 1 ELSE 0 END +
CASE WHEN Col3 = 1 THEN 1 ELSE 0 END +
CASE WHEN Col4 = 5 THEN 1 ELSE 0 END AS Result
FROM Table1
WHERE Col1 = 'a' OR Col2 = 'c' OR Col3 = 1 OR Col4 = 5
)
SELECT ID, Result
FROM Matches
WHERE Result > 1
Here's a solution that leverages the fact that, in some databases (MySQL, for example, but not SQL Server), a boolean comparison returns the integer 1 or 0:
SELECT * FROM (
SELECT ID, (Col1='a') + (Col2='c') + (Col3=1) + (Col4=5) AS calculated
FROM MyTable
) q
WHERE calculated > 1;
Note that you have to parenthesize the boolean comparisons because + has higher precedence than =. Also, you have to put it all in a subquery because you normally can't use a column alias in a WHERE clause of the same query.
It might seem like you should also use a WHERE clause in the subquery to restrict its rows, but in all likelihood you're going to end up with a full table scan anyway so it's probably not a big win. On the other hand, if you expect that such a restriction would greatly reduce the number of rows in the subquery result, then it'd be worthwhile.
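SQLite is another engine where comparisons evaluate to 1 or 0, so the boolean-sum query can be tried there. This is a sketch via Python's sqlite3 with the sample data from the question; it does not apply to SQL Server 2005, and the function name count_matches is an assumption.

```python
import sqlite3

def count_matches():
    """Return (id, calculated) for rows matching more than one of the four conditions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER, col1 TEXT, col2 TEXT, col3 INTEGER, col4 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
        [(1, "a", "b", 1, 2), (2, "b", "c", 1, 2), (3, "b", "c", 4, 5), (4, "x", "x", 9, 9)],
    )
    result = conn.execute(
        """
        SELECT id, calculated
        FROM (
            -- each parenthesized comparison evaluates to 1 (true) or 0 (false)
            SELECT id, (col1 = 'a') + (col2 = 'c') + (col3 = 1) + (col4 = 5) AS calculated
            FROM mytable
        ) AS q
        WHERE calculated > 1
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result

print(count_matches())  # [(1, 2), (2, 2), (3, 2)]
```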
Re Quassnoi's comment, if you can't treat boolean expressions as integer values, there should be a way to map boolean conditions to integers, even if it's a bit verbose. For example:
SELECT * FROM (
SELECT ID,
CASE WHEN Col1='a' THEN 1 ELSE 0 END
+ CASE WHEN Col2='c' THEN 1 ELSE 0 END
+ CASE WHEN Col3=1 THEN 1 ELSE 0 END
+ CASE WHEN Col4=5 THEN 1 ELSE 0 END AS calculated
FROM MyTable
) q
WHERE calculated > 1;
This query is more index friendly:
SELECT id, SUM(match)
FROM (
SELECT id, 1 AS match
FROM mytable
WHERE col1 = 'a'
UNION ALL
SELECT id, 1 AS match
FROM mytable
WHERE col2 = 'c'
UNION ALL
SELECT id, 1 AS match
FROM mytable
WHERE col3 = 1
UNION ALL
SELECT id, 1 AS match
FROM mytable
WHERE col4 = 5
) q
GROUP BY
id
HAVING SUM(match) > 1
This will only be efficient if all the columns you are searching for are, first, indexed and, second, have high cardinality (many distinct values).
See this article in my blog for performance details:
Matching 3 of 4
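For comparison, here is a sketch of the UNION ALL form on the question's sample data, again in SQLite via Python's sqlite3 rather than SQL Server. The index names are hypothetical, the alias hit replaces match to avoid a keyword clash in SQLite, and nothing here measures performance; it only shows that the query shape returns the expected rows.

```python
import sqlite3

def count_matches_union():
    """Return (id, matches) using one indexed lookup per condition, then SUM per id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER, col1 TEXT, col2 TEXT, col3 INTEGER, col4 INTEGER)"
    )
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
        [(1, "a", "b", 1, 2), (2, "b", "c", 1, 2), (3, "b", "c", 4, 5), (4, "x", "x", 9, 9)],
    )
    # Hypothetical single-column indexes, one per searched column.
    for col in ("col1", "col2", "col3", "col4"):
        conn.execute(f"CREATE INDEX ix_mytable_{col} ON mytable ({col})")
    result = conn.execute(
        """
        SELECT id, SUM(hit) AS matches
        FROM (
            SELECT id, 1 AS hit FROM mytable WHERE col1 = 'a'
            UNION ALL
            SELECT id, 1 AS hit FROM mytable WHERE col2 = 'c'
            UNION ALL
            SELECT id, 1 AS hit FROM mytable WHERE col3 = 1
            UNION ALL
            SELECT id, 1 AS hit FROM mytable WHERE col4 = 5
        ) AS q
        GROUP BY id
        HAVING SUM(hit) > 1
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result

print(count_matches_union())  # [(1, 2), (2, 2), (3, 2)]
```

Each branch filters on a single column, so the optimizer can use that column's index instead of evaluating all four conditions on every row.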