How to detect change to update existing values with NULL [duplicate] - sql

I need to update a table if an existing record has changes. If the record exists (APK is found), then update it if ID_NUMBER has changed. The problem is when the existing TARGET value is NOT NULL and the SOURCE value IS NULL. How can I detect that condition as unequal?
Using ISNULL() can work, but the value supplied as the second parameter must never occur in the data. That requires profiling all of the NUMERIC data. Can it be done without that? In this case, zero (0) works only if it never occurs in the data.
UPDATE T SET
    T.f1 = S.f1
FROM TARGET_TABLE T
INNER JOIN SOURCE_TABLE S
    ON T.APK = S.APK
WHERE
    ISNULL(T.ID_NUMBER, 0) <> ISNULL(S.ID_NUMBER, 0);
Here are the possible combinations of ID_NUMBER values in the TARGET and SOURCE tables.
Target   Source
======   ======
NULL     NULL
NULL     7       -- should identify as unequal
7        NULL    -- should identify as unequal, BUT DOES NOT
7        7
The following script shows the results. Only the third statement, comparing an existing TARGET value of 7 with an incoming SOURCE value of NULL, fails to detect the inequality. Why is that? What code will work?
SET NOCOUNT ON;
SELECT 3 WHERE ISNULL(NULL, NULL+1) <> ISNULL(NULL,NULL+1);
GO
SELECT 3 WHERE ISNULL(NULL, 7+1) <> ISNULL(7,7+1);
GO
SELECT 3 WHERE ISNULL(7, NULL+1) <> ISNULL(NULL,NULL+1); -- WHY DOES THIS NOT SEE THE INEQUALITY?
GO
SELECT 3 WHERE ISNULL(7, 7+1) <> ISNULL(7,7+1);
GO
Example execution:
1> SET NOCOUNT ON;
2> SELECT 3 WHERE ISNULL(NULL, NULL+1) <> ISNULL(NULL,NULL+1);
3> GO
-----------
1> SELECT 3 WHERE ISNULL(NULL, 7+1) <> ISNULL(7,7+1);
2> GO
-----------
3
1> SELECT 3 WHERE ISNULL(7, NULL+1) <> ISNULL(NULL,NULL+1); -- WHY DOES THIS NOT SEE THE INEQUALITY?
2> GO
-----------
1> SELECT 3 WHERE ISNULL(7, 7+1) <> ISNULL(7,7+1);
2> GO
-----------
1>

I highly recommend doing some reading on NULL: it represents an unknown value, and as such cannot be compared with or added to another value. That is exactly what happens in your third statement: ISNULL(7, NULL+1) evaluates to 7, while ISNULL(NULL, NULL+1) evaluates to NULL (NULL+1 is itself NULL), and 7 <> NULL yields UNKNOWN, so no row is returned. You therefore have to treat NULL as a separate case using traditional AND/OR logic.
DECLARE @Table1 TABLE (APK int, ID_NUMBER int);
DECLARE @Table2 TABLE (APK int, ID_NUMBER int);
INSERT INTO @Table1 (APK, ID_NUMBER)
VALUES (1, null), (1, null), (1, 7), (1, 7), (1, 5);
INSERT INTO @Table2 (APK, ID_NUMBER)
VALUES (1, null), (1, 7), (1, null), (1, 7), (1, 4);
SELECT T.APK, T.ID_NUMBER, S.ID_NUMBER
FROM @Table1 T
INNER JOIN @Table2 S ON T.APK = S.APK
WHERE T.ID_NUMBER <> S.ID_NUMBER
   OR (T.ID_NUMBER IS NULL AND S.ID_NUMBER IS NOT NULL)
   OR (T.ID_NUMBER IS NOT NULL AND S.ID_NUMBER IS NULL);
Given I suspect you have simplified your actual use case, you might find that EXCEPT can be used in your situation, as EXCEPT (and INTERSECT) perform a different type of comparison when it comes to NULLs. See here for more.
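For example, a minimal sketch of that idea applied to the original UPDATE (assuming the same TARGET_TABLE/SOURCE_TABLE names from the question): INTERSECT treats two NULLs as equal, so the NOT EXISTS test below is true exactly when the two values differ, including the 7-versus-NULL case.
UPDATE T SET
    T.f1 = S.f1
FROM TARGET_TABLE T
INNER JOIN SOURCE_TABLE S
    ON T.APK = S.APK
WHERE NOT EXISTS (SELECT T.ID_NUMBER
                  INTERSECT
                  SELECT S.ID_NUMBER); -- NULL-safe "values differ" test
On SQL Server 2022 and later, WHERE T.ID_NUMBER IS DISTINCT FROM S.ID_NUMBER expresses the same predicate directly.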

Please try the following solution.
It is based on computing a checksum in CTEs via the HASHBYTES() function.
This method works with NULL values and with multiple columns in the tables.
I added an UpdatedOn column to show which rows were updated.
SQL
-- DDL and sample data population, start
DECLARE @Table1 TABLE (APK INT IDENTITY PRIMARY KEY, ID_NUMBER INT, UpdatedOn DATETIMEOFFSET(3));
DECLARE @Table2 TABLE (APK INT IDENTITY PRIMARY KEY, ID_NUMBER INT, UpdatedOn DATETIMEOFFSET(3));
INSERT INTO @Table1 (ID_NUMBER)
VALUES (null), (null), (7), (7), (5);
INSERT INTO @Table2 (ID_NUMBER)
VALUES (null), (7), (null), (7), (4);
-- DDL and sample data population, end
WITH source AS
(
    SELECT sp.*, HASHBYTES('sha2_256', xmlcol) AS [Checksum]
    FROM @Table1 sp
    CROSS APPLY (SELECT sp.* FOR XML RAW) x(xmlcol)
), target AS
(
    SELECT sp.*, HASHBYTES('sha2_256', xmlcol) AS [Checksum]
    FROM @Table2 sp
    CROSS APPLY (SELECT sp.* FOR XML RAW) x(xmlcol)
)
UPDATE T
SET T.ID_NUMBER = S.ID_NUMBER
  , T.UpdatedOn = SYSDATETIMEOFFSET()
FROM target AS T
INNER JOIN source AS S
    ON T.APK = S.APK
WHERE T.[Checksum] <> S.[Checksum];
-- test
SELECT * FROM @Table2;
Output
+-----+-----------+--------------------------------+
| APK | ID_NUMBER | UpdatedOn |
+-----+-----------+--------------------------------+
| 1 | NULL | NULL |
| 2 | NULL | 2022-02-09 18:58:10.336 -05:00 |
| 3 | 7 | 2022-02-09 18:58:10.336 -05:00 |
| 4 | 7 | NULL |
| 5 | 5 | 2022-02-09 18:58:10.336 -05:00 |
+-----+-----------+--------------------------------+


Find data by multiple Lookup table clauses

declare @Character table (id int, [name] varchar(12));
insert into @Character (id, [name])
values
(1, 'tom'),
(2, 'jerry'),
(3, 'dog');
declare @NameToCharacter table (id int, nameId int, characterId int);
insert into @NameToCharacter (id, nameId, characterId)
values
(1, 1, 1),
(2, 1, 3),
(3, 1, 2),
(4, 2, 1);
The Name table has more rows than just 1, 2, 3, and the list to match on is dynamic.
NameTable
id | name
---------
1  | foo
2  | bar
3  | steak
CharacterTable
id | name
---------
1  | tom
2  | jerry
3  | dog
NameToCharacterTable
id | nameId | characterId
-------------------------
1  | 1      | 1
2  | 1      | 3
3  | 1      | 2
4  | 2      | 1
I am looking for a query that will return a character that has two given names. For example, with the above data only "tom" will be returned.
SELECT *
FROM nameToCharacterTable
WHERE nameId in (1,2)
The in clause will return every row that has a 1 or a 2. I want to return only the rows that have both a 1 and a 2.
I am stumped; I have tried everything I know and do not want to resort to dynamic SQL. Any help would be great.
The 1, 2 in this example will be a dynamic list of integers; for example, it could be 1, 3, 4, 5, .....
Filter on a count of how many times the character appears in the NameToCharacter table matching the list you are providing (which I have assumed you can convert into a table variable or temp table), e.g.
declare @Character table (id int, [name] varchar(12));
insert into @Character (id, [name])
values
(1, 'tom'),
(2, 'jerry'),
(3, 'dog');
declare @NameToCharacter table (id int, nameId int, characterId int);
insert into @NameToCharacter (id, nameId, characterId)
values
(1, 1, 1),
(2, 1, 3),
(3, 1, 2),
(4, 2, 1);
declare @RequiredNames table (nameId int);
insert into @RequiredNames (nameId)
values
(1),
(2);
select *
from @Character C
where (
    select count(*)
    from @NameToCharacter NC
    where NC.characterId = C.id
      and NC.nameId in (select nameId from @RequiredNames)
) = 2;
Returns:
id | name
---------
1  | tom
Note: Providing DDL+DML as shown here makes it much easier for people to assist you.
This is classic Relational Division With Remainder.
There are a number of different solutions. @DaleK has given you an excellent one: inner-join everything, then check that each set has the right count. This is normally the fastest solution.
If you want to ensure it works with a dynamic number of rows, just change the last line to
) = (SELECT COUNT(*) FROM @RequiredNames);
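Putting that together, the generalized form would look like this (a sketch combining the query above with the dynamic count):
select *
from @Character C
where (
    select count(*)
    from @NameToCharacter NC
    where NC.characterId = C.id
      and NC.nameId in (select nameId from @RequiredNames)
) = (select count(*) from @RequiredNames);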
Two other common solutions exist.
Left-join and check that all rows were joined
SELECT *
FROM @Character c
WHERE EXISTS (SELECT 1
              FROM @RequiredNames rn
              LEFT JOIN @NameToCharacter nc ON nc.nameId = rn.nameId AND nc.characterId = c.id
              HAVING COUNT(*) = COUNT(nc.nameId) -- all rows are joined
             );
Double anti-join, in other words: there are no "required" that are "not in the set"
SELECT *
FROM @Character c
WHERE NOT EXISTS (SELECT 1
                  FROM @RequiredNames rn
                  WHERE NOT EXISTS (SELECT 1
                                    FROM @NameToCharacter nc
                                    WHERE nc.nameId = rn.nameId AND nc.characterId = c.id
                                   )
                 );
A variation on the one from the other answer uses a windowed aggregate instead of a subquery. I don't think this is performant, but it may have uses in certain cases.
SELECT *
FROM @Character c
WHERE EXISTS (SELECT 1
              FROM (
                    SELECT *, COUNT(*) OVER () AS cnt
                    FROM @RequiredNames
                   ) rn
              JOIN @NameToCharacter nc ON nc.nameId = rn.nameId AND nc.characterId = c.id
              HAVING COUNT(*) = MIN(rn.cnt)
             );
db<>fiddle

I need to do retrofit query using update or merge

I have two tables, A and B. In A, I have a column called fetch_year. I need to populate it from these two columns in table B:
primary_date
secondary_date
These columns hold JSON values like {"lock":"true","date":"01/01/1990"}.
From this I need to get the date, extract the year, and save it in table A's fetch_year column. primary_date is always considered first, then secondary_date (if primary_date is null).
The final result should be 1990 in the fetch_year column.
Table A is empty as of now (only the cal_id column is populated):
cal_id   fetch_year
1        null
n        null
Table B
|B_id | Cal_id | primary_date                        | secondary_date                      |
|-----|--------|-------------------------------------|-------------------------------------|
| 11  | 1      | {"lock":"true","date":"01/01/1990"} | Null                                |
| 12  | 2      | Null                                | {"lock":"true","date":"01/01/1980"} |
| 13  | 3      | Null                                | Null                                |
| 14  | 4      | {"lock":"true","date":"01/01/1995"} | {"lock":"true","date":"01/01/1997"} |
I have n records in both tables.
I need results like this in table A:
Cal_id   fetch_year
1        1990
2        1980
3        Null
4        1995
n        n-values
For cal_id = 4 we have a value in both columns, so primary_date is used, not secondary_date.
Please help me with this problem
You could make use of either JSON_VALUE or OPENJSON here to extract the date from your JSON blobs.
I tend to prefer OPENJSON because it allows you to extract multiple values simultaneously, and they don't have to be at the same level in a nested JSON structure. With the "squirrelly" dates in your example data, though, you may prefer the JSON_VALUE version with TRY_CONVERT so that you have more control over date deserialization.
--Data setup
create table dbo.A (
    Cal_id int,
    fetch_year int
);
create table dbo.B (
    B_id int not null identity(11,1),
    Cal_id int,
    primary_date nvarchar(max),
    secondary_date nvarchar(max)
);
insert dbo.A (Cal_id, fetch_year)
values
(1, null),
(2, null),
(3, null),
(4, null);
insert dbo.B (Cal_id, primary_date, secondary_date)
values
(1, N'{"lock":"true","date":"01/01/1990"}', null),
(2, null, N'{"lock":"true","date":"01/01/1980"}'),
(3, null, null),
(4, N'{"lock":"true","date":"01/01/1995"}', N'{"lock":"true","date":"01/01/1997"}');
--JSON_VALUE example
update Table_A
set fetch_year = year(coalesce(
        -- REF: CAST and CONVERT / Date and time styles
        -- https://learn.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql#date-and-time-styles
        try_convert(date, json_value(primary_date, '$.date'), 101),  --mm/dd/yyyy
        try_convert(date, json_value(secondary_date, '$.date'), 101) --mm/dd/yyyy
    ))
from dbo.A Table_A
join dbo.B Table_B on Table_B.Cal_id = Table_A.Cal_id;
--OPENJSON example
update Table_A
set fetch_year = year(coalesce(
        Primary_JSON.date,
        Secondary_JSON.date
    ))
from dbo.A Table_A
join dbo.B Table_B on Table_B.Cal_id = Table_A.Cal_id
outer apply openjson(Table_B.primary_date) with ([date] date) Primary_JSON
outer apply openjson(Table_B.secondary_date) with ([date] date) Secondary_JSON;
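Since the question title mentions MERGE: a minimal MERGE-based sketch of the same update, under the same assumptions as the JSON_VALUE example (mm/dd/yyyy dates deserialized with style 101):
--MERGE example (a sketch; same tables and date-style assumption as above)
merge dbo.A as Table_A
using dbo.B as Table_B
   on Table_B.Cal_id = Table_A.Cal_id
when matched then
    update set fetch_year = year(coalesce(
        try_convert(date, json_value(Table_B.primary_date, '$.date'), 101),
        try_convert(date, json_value(Table_B.secondary_date, '$.date'), 101)
    ));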

How to create column based on previous rows?

I have the following table:
id   type
1    NULL
2    A
3    NULL
4    NULL
5    A
6    NULL
7    B
8    A
9    B
10   NULL
I want to create a column where each row takes the current type if it exists; if not, it takes the value from the previous row.
Basically I want to get this:
id   type   new_type
1    NULL   NULL   -- first one: no previous row, so it can only be null
2    A      A      -- current type is A, so take it
3    NULL   A      -- no type, so take the new_type of the previous row
4    NULL   A
5    A      A
6    NULL   A
7    B      B
8    A      A
9    B      B
10   NULL   B
I know I need a window function here, but I don't know how a window function can reference a column that is "in progress": it would need to reference both type and new_type, but new_type doesn't exist yet; it's the output.
How can this be done in SQL / Presto?
Presto has comprehensive support for window functions. Here, you can use lag() with the ignore nulls option to replace null values in column type:
select
id,
type,
coalesce(
type,
lag(type ignore nulls) over(order by id)
) new_type
from mytable
This needs a cursor, especially if the id is not guaranteed to be sequential and without gaps.
This will run in MS-SQL:
-- stage sample data
drop table if exists oldTable
create table oldTable (id int, old_type nvarchar(1))
go
insert into oldTable values (1, null), (2, 'A'), (3, null), (4, null), (5, 'A'), (6, null), (7, 'B'), (8, 'A'), (9, 'B'), (10, null)
go
-- get new table ready
drop table if exists newTable
create table newTable (
    id int,
    old_type nvarchar(1),
    new_type nvarchar(1)
)
GO
-- prepare for lots of cursing
declare @the_id int
declare @the_type nvarchar(1)
declare @running_type nvarchar(1)
declare mycursor cursor for
select
    id, old_type
from
    oldTable
order by id -- make the processing order deterministic
-- do a barrel roll
open mycursor
fetch mycursor into @the_id, @the_type
while @@ERROR = 0 and @@FETCH_STATUS = 0
begin
    set @running_type = COALESCE(@the_type, @running_type)
    insert into newTable (id, old_type, new_type)
    values (@the_id, @the_type, @running_type)
    fetch mycursor into @the_id, @the_type
end
close mycursor
deallocate mycursor
go
-- verify results
select * from newTable
go
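For completeness, a set-based sketch that avoids the cursor (not from the answers above; it assumes SQL Server 2012+ for the windowed COUNT and uses the same oldTable): a running count of non-null values assigns every row to the group of the most recent non-null old_type, and MAX ignores NULLs within each group.
select id, old_type,
       max(old_type) over (partition by grp) as new_type
from (
    select id, old_type,
           -- increments at every non-null row, so trailing null rows
           -- share a grp value with the last non-null row before them
           count(old_type) over (order by id rows unbounded preceding) as grp
    from oldTable
) t
order by id;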
How can this be done in SQL
For example, it can be
SELECT t1.id,
t1.type,
( SELECT t2.type
FROM sourcetable t2
WHERE t2.id <= t1.id
AND t2.type IS NOT NULL
ORDER BY id DESC
LIMIT 1 ) new_type
FROM sourcetable t1

Filtering Nulls from multiple columns

Using SQL Server 2005.
Data is in 2 separate tables and I have only been given write permissions.
Data looks like:
DateTime1 | DateTime2
-----------------------
2012-06-01 | 2012-06-01
2012-06-02 | 2012-06-02
2012-06-04 | 2012-06-05
2012-06-02 | NULL
NULL | 2012-06-05
2012-06-04 | 2012-06-05
NULL | NULL
What I am trying to do is count rows in which both DateTime1 and DateTime2 contain values, rows where DateTime1 contains a date and DateTime2 is NULL, and rows where DateTime1 is NULL and DateTime2 contains a value.
Overall I'm trying to exclude rows where both DateTime1 and DateTime2 are NULL.
My where statement looks like this:
Where (DateTime1 is not null or DateTime2 is not null)
The only problem is it is still showing where both are null values. Anyone know why this might be happening or how to solve it?
Thanks
EDIT
Full query as requested by @Lamak
;With [CTE] As (
    Select
        TH.ID
        ,AMT
        ,Reason
        ,EffDate
        ,DateReq
        ,CS_ID
        ,ROW_NUMBER() Over (Partition By ID Order By [pthPrimeKey] Desc) as [RN]
    From
        DateTime1Table as [MC] (nolock)
        Left Join History as [TH] (nolock) on [TH].[ID] = [MC].[ID]
        Left Join Trans as [SUB] (nolock) on [SUB].TransactionReasonCode = [TH].Reason
        Left Join Renew as [RM] (nolock) on [MC].ID = [RM].ID
    Where
        ([MC].[DateTime1] is not null or [RM].[DateTime2] is not null)
        And [PostingDate] = DATEADD(dd, datediff(dd, 1, GetDate()), 0)
)
SELECT
    [ID]
    ,[AMT] as [Earned]
    ,[Reason] as [Reason]
    ,[EffDate] as [Eff]
    ,[DateReq] as [Date_Cancel_Req]
    ,[pthUserId_Number] as [CSR]
FROM [CTE]
Where RN <= 1
The following will allow rows to be included if
only DateTime1 has a value
only DateTime2 has a value
both have values
It will exclude rows where both values are NULL. Is that what you're after? (I tried to follow the conversations but got lost, and wish you'd have a simpler repro with sample data - I think the CTE and all the other joins and logic really take away from the actual problem you're having.)
WHERE COALESCE([MC].[DateTime1], [RM].[DateTime2]) IS NOT NULL
However, since you're performing a LEFT OUTER JOIN, this may belong in the ON clause for [RM] instead of WHERE. Otherwise you won't know if a row is excluded because the value in a matching row was NULL, or because there was no matching row. And maybe that's ok, just thought I would mention it.
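To illustrate that point, a hypothetical fragment (table and column names borrowed from the query above; a sketch, not a drop-in fix):
-- Filtering [RM] in the ON clause keeps unmatched [MC] rows (their [RM] columns come back NULL).
-- Moving [RM].[DateTime2] IS NOT NULL into WHERE would also discard every unmatched [MC] row,
-- because [RM].[DateTime2] is NULL for those rows as well.
SELECT [MC].[ID], [MC].[DateTime1], [RM].[DateTime2]
FROM DateTime1Table AS [MC]
LEFT JOIN Renew AS [RM]
    ON [MC].[ID] = [RM].[ID]
   AND [RM].[DateTime2] IS NOT NULL;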
EDIT
Of course, that clause provides the exact same results as ...
WHERE ([MC].[DateTime1] is not null or [RM].[DateTime2] is not null)
Want proof?
DECLARE @a TABLE(id INT, DateTime1 DATETIME);
DECLARE @b TABLE(id INT, DateTime2 DATETIME);
INSERT @a SELECT 1, '20120602'  ; INSERT @b SELECT 1, NULL;
INSERT @a SELECT 2, NULL        ; INSERT @b SELECT 2, '20120605';
INSERT @a SELECT 3, '20120604'  ; INSERT @b SELECT 3, '20120605';
INSERT @a SELECT 4, NULL        ; INSERT @b SELECT 4, NULL;
INSERT @a SELECT 5, '20120602'  ; INSERT @b SELECT 9, NULL;
INSERT @a SELECT 6, NULL        ; INSERT @b SELECT 10, '20120605';
INSERT @a SELECT 7, '20120604'  ; INSERT @b SELECT 11, '20120605';
INSERT @a SELECT 8, NULL        ; INSERT @b SELECT 12, NULL;
SELECT * FROM @a AS a LEFT OUTER JOIN @b AS b
    ON a.id = b.id
WHERE COALESCE(a.DateTime1, b.DateTime2) IS NOT NULL;
SELECT * FROM @a AS a LEFT OUTER JOIN @b AS b
    ON a.id = b.id
WHERE a.DateTime1 IS NOT NULL OR b.DateTime2 IS NOT NULL;
Both queries yield:
id DateTime1 id DateTime2
-- ---------- ---- ----------
1 2012-06-02 1 NULL -- because left is not null
2 NULL 2 2012-06-05 -- because right is not null
3 2012-06-04 3 2012-06-05 -- because neither is null
5 2012-06-02 NULL NULL -- because of no match
7 2012-06-04 NULL NULL -- because of no match
So as I suggested in the comment, if you're not seeing the rows you expect, you need to look at other parts of the query. If you provide sample data and desired results, we can try to help you narrow that down. As it is, I don't think we know enough about your schema and data to determine where the problem is.

How do I join an unknown number of rows to another row?

I have this scenario:
Table A:
---------------
ID| SOME_VALUE|
---------------
1 | 123223 |
2 | 1232ff |
---------------
Table B:
------------------
ID | KEY | VALUE |
------------------
23 | 1 | 435 |
24 | 1 | 436 |
------------------
KEY is a reference to Table A's ID. Can I somehow join these tables so that I get the following result:
Table C
-------------------------
ID| SOME_VALUE| | |
-------------------------
1 | 123223 |435 |436 |
2 | 1232ff | | |
-------------------------
Table C should be able to have any given number of columns, depending on how many matching values are found in Table B.
I hope this enough to explain what I'm after here.
Thanks.
You need to use a Dynamic PIVOT clause in order to do this.
EDIT:
Ok so I've done some playing around and based on the following sample data:
Create Table TableA
(
IDCol int,
SomeValue varchar(50)
)
Create Table TableB
(
IDCol int,
KEYCol int,
Value varchar(50)
)
Insert into TableA
Values (1, '123223')
Insert Into TableA
Values (2,'1232ff')
Insert into TableA
Values (3, '222222')
Insert Into TableB
Values( 23, 1, 435)
Insert Into TableB
Values( 24, 1, 436)
Insert Into TableB
Values( 25, 3, 45)
Insert Into TableB
Values( 26, 3, 46)
Insert Into TableB
Values( 27, 3, 435)
Insert Into TableB
Values( 28, 3, 437)
You can execute the following Dynamic SQL.
declare @sql varchar(max)
declare @pivot_list varchar(max)
declare @pivot_select varchar(max)
Select
    @pivot_list = Coalesce(@pivot_list + ', ', '') + '[' + Value + ']',
    @pivot_select = Coalesce(@pivot_select, ', ', '') + 'IsNull([' + Value + '],'''') as [' + Value + '],'
From
(
    Select distinct Value From dbo.TableB
) PivotCodes
Set @sql = '
;With p as (
    Select a.IdCol,
           a.SomeValue,
           b.Value
    From dbo.TableA a
    Left Join dbo.TableB b on a.IdCol = b.KeyCol
)
Select IdCol, SomeValue ' + Left(@pivot_select, Len(@pivot_select)-1) + '
From p
Pivot ( Max(Value) for Value in (' + @pivot_list + '
    )
) as pvt
'
exec (@sql)
This gives you the pivoted output: one column per distinct Value in TableB, with each row's matching values filled in.
Although this works, it would be a nightmare to maintain. I'd recommend trying to achieve these results somewhere else, i.e. not in SQL!
Good luck!
As Barry has amply illustrated, it's possible to get multiple columns using a dynamic pivot.
I've got a solution that might get you what you need, except that it puts all of the values into a single VARCHAR column. If you can split those results, then you can get what you need.
This method is a trick in SQL Server 2005 that you can use to form a string out of a column of values.
CREATE TABLE #TableA (
ID INT,
SomeValue VARCHAR(50)
);
CREATE TABLE #TableB (
ID INT,
TableAKEY INT,
BValue VARCHAR(50)
);
INSERT INTO #TableA VALUES (1, '123223');
INSERT INTO #TableA VALUES (2, '1232ff');
INSERT INTO #TableA VALUES (3, '222222');
INSERT INTO #TableB VALUES (23, 1, 435);
INSERT INTO #TableB VALUES (24, 1, 436);
INSERT INTO #TableB VALUES (25, 3, 45);
INSERT INTO #TableB VALUES (26, 3, 46);
INSERT INTO #TableB VALUES (27, 3, 435);
INSERT INTO #TableB VALUES (28, 3, 437);
SELECT
a.ID
,a.SomeValue
,RTRIM(bvals.BValues) AS ValueList
FROM #TableA AS a
OUTER APPLY (
-- This has the effect of concatenating all of
-- the BValues for the given value of a.ID.
SELECT b.BValue + ' ' AS [text()]
FROM #TableB AS b
WHERE a.ID = b.TableAKEY
ORDER BY b.ID
FOR XML PATH('')
) AS bvals (BValues)
ORDER BY a.ID
;
You'll get this as a result:
ID SomeValue ValueList
--- ---------- --------------
1 123223 435 436
2 1232ff NULL
3 222222 45 46 435 437
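For readers on SQL Server 2017 or later (the question targets 2005, where this isn't available), STRING_AGG produces the same single-column list without the FOR XML PATH trick; a sketch against the same temp tables:
SELECT a.ID,
       a.SomeValue,
       STRING_AGG(b.BValue, ' ') WITHIN GROUP (ORDER BY b.ID) AS ValueList
FROM #TableA AS a
LEFT JOIN #TableB AS b ON b.TableAKEY = a.ID
GROUP BY a.ID, a.SomeValue
ORDER BY a.ID;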
This looks like something a database shouldn't do. Firstly, a table cannot have an arbitrary number of columns depending on whatever you store, so you would have to fix a maximum number of value columns anyway. You can get around this by using comma-separated values in a single cell (or a similar pivot-like solution).
However, if you do have tables A and B, I recommend keeping those two tables, as they seem to be well normalised. Should you need a list of b.value for a given input a.some_value, the following SQL query gives that list:
select b.value from a, b where b.[key] = a.id and a.some_value = 'INPUT_VALUE';
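The same query with explicit join syntax, for readability ([key] is bracketed because KEY is a reserved word in some dialects):
select b.value
from a
inner join b on b.[key] = a.id
where a.some_value = 'INPUT_VALUE';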