SQL detect change in row - sql

I have data from sql server attached :
select * from log
What I want to do is I want to check if there any changes in code for the column name. So if you see the data from table log, the code change 2 times (B02,B03).
What I want to do is I want to retrieve the row which is the first changes everytime the code change. In this sample, the first changes is on the red box. So I want to have the result for row 5 and row 9.
I've tried to use partition like code below:
select a.name,a.code from(
select name,code,row_number() over(partition by code order by name) as rank from log)a
where a.rank=1
and get result like this.
However, I don't want the first row to be retrieved. Since it is the first value and I don't need that. So i just want to retrieve the changes indicates by column code. Please help if you know how to do it.
and please note, I can't write query using filter where code <> 'B01', because in this case, I don't know what is the first value.
Please assume the first value is the data that first inserted into the table.

Use lag to get the previous row's value (assuming id specifies ordering) and get the rows where it is different from the current row's value.
create table #log (id int identity(1,1) not null, name nvarchar(100), code nvarchar(100));
insert into #log(name,code) values ('SARUMA','B01'), ('SARUMA','B01'), ('SARUMA','B01'), ('SARUMA','B01');
insert into #log(name,code) values ('SARUMA','B02'), ('SARUMA','B02'), ('SARUMA','B02'), ('SARUMA','B02');
insert into #log(name,code) values ('SARUMA','B03'), ('SARUMA','B03');
select name
,code
from (
select l.*
,lag(code) over (
partition by name order by id
) as prev_code
from #log l
) l
where prev_code <> code

create table #log (name nvarchar(100), code nvarchar(100));
insert into #log values ('SARUMA','B01'), ('SARUMA','B01'), ('SARUMA','B01'), ('SARUMA','B01');
insert into #log values ('SARUMA','B02'), ('SARUMA','B02'), ('SARUMA','B02'), ('SARUMA','B02');
insert into #log values ('SARUMA','B03'), ('SARUMA','B03');
-- remove duplicates
with Singles (name, code)
AS (
select distinct name, code from #log
),
-- At first you need an order, in time? By alphanumerical code? Otherwise you cannot decide which is the first item you want to remove
-- So I added an identity ordering, but it is preferable to use a physical column
OrderedSingles (name, code, id)
AS (
select *, row_number() over(order by name)
from Singles
)
-- Now self-join to get the next one, if the index is sequential you can join id = id+1
-- and take the join columns
select distinct ii.name, ii.Code
from OrderedSingles i
inner join OrderedSingles ii
on i.Name = ii.Name and i.Code <> ii.Code
where i.id < ii.Id;

I think that your original post was pretty close, though you would want the windowing function to be on the [NAME] column, not the code. Please see my modifications, below. I've also changed the predicate to be >1, as 1 would be the original record.
SELECT
a.[name]
,a.[code]
FROM (
SELECT
[name]
,[code]
,ROW_NUMBER() OVER(PARTITION BY [name] order by [name], [code]) AS [rank]
FROM log)a
WHERE a.rank>1
NOTE: you may want to not use NAME as a field, since it is a reserved word. Additionally, RANK is a reserved word as well, and you've used it to alias the ROW_NUMBER in the nested query. You may want to use another non-reserved word for that - personally, I use RANKED for that purpose.

Related

How to copy the column id from another table?

I'm stuck with this since last week. I have two tables, where the id column of CustomerTbl correlates with CustomerID column of PurchaseTbl:
What I'm trying to achieve is I want to duplicate the data of the table from itself, but copy the newly generated id of CustomerTbl to PurchaseTbl's CustomerID
Just like from the screenshots above. Glad for any help :)
You may use OUTPUT clause to access to the new ID. But to access to both OLD ID and NEW ID, you will need to use MERGE statement. INSERT statement does not allow you to access to the source old id.
first you need somewhere to store the old and new id, a mapping table. You may use table variable or temp table
declare #out table
(
old_id int,
new_id int
)
then the merge statement with output clause
merge
#CustomerTbl as t
using
(
select id, name
from CustomerTbl
) as s
on 1 = 2 -- force it to `false`, not matched
when not matched then
insert (name)
values (name)
output -- the output clause
s.id, -- old_id
inserted.id -- new_id
into #out (old_id, new_id);
after that you just use the #out to join back using old_id to obtain the new_id for the PurchaseTbl
insert into PurchaseTbl (CustomerID, Item, Price)
select o.new_id, p.Item, p.Price
from #out o
inner join PurchaseTbl p on o.old_id = p.CustomerID
Not sure what your end game is, but one way you could solve this is this:
INSERT INTO purchaseTbl ( customerid ,
item ,
price )
SELECT customerid + 3 ,
item ,
price
FROM purchaseTbl;

Enter Specific data depending on row criteria

Afternoon, I have the following SQL command:
SELECT INVOICE_ID,
ITEM_ID,
ORDER_NO,
CLIENT_STATE
FROM CUSTOMER_ORDER_INV_JOIN
WHERE ORDER_NO = '*1007';
This pulls out the following information:
[enter image description here][1]
There is a specific Criteria that I want to reach and that is the following:
On Order No: *1007, If client state on all lines = PaidPosted then I need another column to show 'PaidPosted' on all lines.
However On Order No: *1007, If Client State on 4 Lines = 'PaidPosted' but 1 or more lines = 'PostedAuth' then I need another column where all lines to show 'PostedAuth'. However if all of the lines are NULL I need a column where all lines show 'No Invoice'.
Hopefully this makes more sense.
I think this will get you what you need.
You can create a Temporary Table that has your sort order:
CREATE TABLE #Sort_Order
(myOrder INT,
CLIENT_STATE NVARCHAR(20)
)
INSERT INTO #Sort_Order
VALUES(1, 'Preliminary')
INSERT INTO #Sort_Order
VALUES(2, 'PostedAuth')
INSERT INTO #Sort_Order
VALUES(3, 'PaidPosted')
Then you can just join it to your table and run a RANK function on it like so:
SELECT
A.*,
DENSE_RANK() OVER (PARTITION BY A.ID
ORDER BY B.myOrder ASC) AS OrderRank
FROM #Temp A
INNER JOIN #Sort_Order B On (A.CLIENT_STATE = B.CLIENT_STATE)
WHERE A.ID = 1
This will give you results by RANK and you can use a WHERE statement to filter on only RANK = 1
If your data has multiple rows of the same Client State, you will need to do some kind of DISTINCT or GROUP BY.

Stuck finding combination differences

I am really stuck on a problem regards to finding out if there is difference between two columns. Row value is a follows:
Serial code
D03L30225 A1
D03L30225 A1
D03L30225 A1
D03L30225 A1
D03L30225 A1
D03L30225 A1
D03L30225 A1
D03L30225 A1
D03L30225 A2
so say if there was another entry like A2 at the end , is there a way of knowing combination serial/code difference.
I have tried windows functions like partition and rank without success
This should work for you. One thing to note is that you have to order by something. Perhaps what I have ordered by is not correct for you situation, but you need something there.
IF OBJECT_ID('tempdb..#Test', 'U') IS NOT NULL DROP TABLE #Test;
create table #Test
(
Serial varchar(10),
code char(2)
)
insert into #Test values ('D03L30225', 'A1')
insert into #Test values ('D03L30225', 'A1')
insert into #Test values ('D03L30225', 'A1')
insert into #Test values ('D03L30225', 'A2')
;
with cte as
(
select rownum = row_number() over (order by Serial, code), Serial, code
from #Test
)
select curr.Serial, curr.code,
case
when curr.code <> prev.code then
1
else
0
end as 'DifferenceFlag'
from cte curr
left join cte prev on prev.rownum = curr.rownum - 1
If you are using SQL Server 2012 or higher you could use the LAG function. We are still on SQL Server 2008 R2. So I needed to do something similar recently I found the method I used above here.
According to your comment, I assume that you want to add another column third_column to your table, and set value for this column according to the change of pair Serial,Code
If that's true, you could use this:
ALTER TABLE
table_name
ADD
third_column numeric(18,0);
UPDATE t
SET t.third_column = t1.rwn
FROM table_name AS t
INNER JOIN
(select
serial, code
,row_number() over (order by serial, code) - 1 as rwn
from
table_name
group by
serial, code
order by
serial, code
) AS t1
ON
t.serial = t1.serial and t.code = t1.code;
I might write the code slightly different. The first query below will just list those codes having more than one serial number. And, the second will flag a whole group of codes where within that code are contained multiple serial numbers.
The other solutions provided will give you a proper row number. In any case, I don't know if this will help, but good luck!
select
code,
count(distinct serial) cnt_serial
from table
group by code
having
count(distinct serial) > 1
OR
select
code,
serial,
case when count(distinct serial) over (partition by code) > 1 then 'Y' end fl_code_has_dup
from table

I am looking for a way for a trigger to insert into a second table only where the value in table 1 changes

I am looking for a way for a trigger to insert into a second table only where the value in table 1 changes. It is essentially an audit tool to trap any changes made. The field in table 1 is price and we want to write additional fields.
This is what I have so far.
CREATE TRIGGER zmerps_Item_costprice__update_history_tr ON [ITEM]
FOR UPDATE
AS
insert into zmerps_Item_costprice_history
select NEWID(), -- unique id
GETDATE(), -- CURRENT_date
'PRICE_CHANGE', -- reason code
a.ima_itemid, -- item id
a.ima_price-- item price
FROM Inserted b inner join item a
on b.ima_recordid = a.IMA_RecordID
The table only contains a unique identifier, date, reference(item) and the field changed (price). It writes any change not just a price change
Is it as simple as this? I moved some of the code around because comments after the comma between columns is just painful to maintain. You also should ALWAYS specify the columns in an insert statement. If your table changes this code will still work.
CREATE TRIGGER zmerps_Item_costprice__update_history_tr ON [ITEM]
FOR UPDATE
AS
insert into zmerps_Item_costprice_history
(
UniqueID
, CURRENT_date
, ReasonCode
, ItemID
, ItemPrice
)
select NEWID()
, GETDATE()
, 'PRICE_CHANGE'
, d.ima_itemid
, d.ima_price
FROM Inserted i
inner join deleted d on d.ima_recordid = i.IMA_RecordID
AND d.ima_price <> i.ima_price
Since you haven't provided any other column names I Have used Column2 and Column3 and the "Other" column names in the below example.
You can expand adding more columns in the below code.
overview about the query below:
Joined the deleted and inserted table (only targeting the rows that has changed) joining with the table itself will result in unnessacary processing of the rows which hasnt changed at all.
I have used NULLIF function to yeild a null value if the value of the column hasnt changed.
converted all the columns to same data type (required for unpivot) .
used unpivot to eliminate all the nulls from the result set.
unpivot will also give you the column name its has unpivoted it.
CREATE TRIGGER zmerps_Item_costprice__update_history_tr
ON [ITEM]
FOR UPDATE
AS
BEGIN
SET NOCOUNT ON ;
WITH CTE AS (
SELECT CAST(NULLIF(i.Price , d.Price) AS NVARCHAR(100)) AS Price
,CAST(NULLIF(i.Column2 , d.Column2) AS NVARCHAR(100)) AS Column2
,CAST(NULLIF(i.Column3 , d.Column3) AS NVARCHAR(100)) AS Column3
FROM dbo.inserted i
INNER JOIN dbo.deleted d ON i.IMA_RecordID = d.IMA_RecordID
WHERE i.Price <> d.Price
OR i.Column2 <> d.Column2
OR i.Column3 <> d.Column3
)
INSERT INTO zmerps_Item_costprice_history
(unique_id, [CURRENT_date], [reason code], Item_Value)
SELECT NEWID()
,GETDATE()
,Value
,ColumnName + '_Change'
FROM CTE UNPIVOT (Value FOR ColumnName IN (Price , Column2, Column3) )up
END
As I understand your question correctly, You want to record change If and only if The column Price value is changes, you dont need any other column changes to be recorded
here is your code
CREATE TRIGGER zmerps_Item_costprice__update_history_tr ON [ITEM]
FOR UPDATE
AS
if update(ima_price)
insert into zmerps_Item_costprice_history
select NEWID(), -- unique id
GETDATE(), -- CURRENT_date
'PRICE_CHANGE', -- reason code
a.ima_itemid, -- item id
a.ima_price-- item price
FROM Inserted b inner join item a
on b.ima_recordid = a.IMA_RecordID

Help with SQL aggregate query with detecting duplicates

I have a table which has records that contain a persons information and a filename that the information originated from, so the table looks like so:
|Table|
|id, first-name, last-name, ssn, filename|
I also have a stored procedure that provides some analytics for the files in the system and i'm trying to add information to that stored procedure to shed light into the possibility of duplicates.
Here is the current stored procedure
SELECT [filename],
COUNT([filename]) as totalRecords,
COUNT(closedleads.id) as closedRecords,
ROUND(--calcs percent of records closed in a file)
FROM table
LEFT OUTER JOIN closedleads ON closedleads.leadid = table.id
GROUP BY [filename]
What I want to add is the ability to see maybe # of possible duplicates, defined as records with matching SSNs and I am at a loss as to how I could perform a count on a sub query or join and include it in the results set. Can anyone provide some pointers?
What I'm trying to do is add something like this to my procedure above
SELECT COUNT(
SELECT COUNT(*) FROM Table T1
INNER JOIN Table T2 on T1.SSN = T2.SSN
WHERE T1.id != T2.id
) as PossibleDuplicates
What I'm looking for is merging this code with my procedure above so I can get all of the same data in one and possible have this # of duplicates across each filename, so for each filename I get a result of # of records, # of records closed and # of possible duplicates
EDIT:
I'm very close to my desired goal but I'm failing on the last little bit--getting the number of possible duplicates BY filename, here is my query
select [q1].[filename], [q1].leads, [q1].closed, [q2].dups
FROM (
SELECT [filename], count([filename]) as leads,
count(closedleads.id) as closed
FROM Table
left join closedleads on closedleads.leadid = Table.id
group by [filename]
) as [q1]
INNER JOIN (
select count([ssn]) as dups, [filename] from Table
group by [ssn], [filename]
having count([ssn]) > 1
) as [q2] on [q1].[filename] = [q2].[filename]
This works but it showing multiple results for each filename with values of 2-5 instead of summing the total count of possible duplicates
Working Query
Hey everyone, thanks for all the help, this is eventually what I got to that worked exactly as I wanted
select [q1].[filename], [q1].leads, [q1].closed, [q2].dups,
round(([q1].closed / [q1].leads), 3) as percentClosed
FROM (
SELECT [filename], count([filename]) as leads,
count(closedleads.id) as closed
FROM Table
left join closedleads on closedleads.leadid = Table.id
and [filename] is not null
group by [filename]
) as [q1]
INNER JOIN (
select [filename], count(*) - count(distinct [ssn]) as dups
from Table
group by [filename]
) as [q2] on [q1].[filename] = [q2].[filename]
You'll probably want to make use of a HAVING clause somewhere, eg:
LEFT JOIN (
SELECT SSN, COUNT(SSN) - 1 DupeCount FROM Table T1
GROUP BY SSN
HAVING COUNT(SSN) > 1 ) AS PossibleDuplicates
ON table.ssn = PossibleDuplicates.SSN
If you want to include 0 possible duplicates (rather than null) you actually don't need the HAVING clause, just the left join.
Edit - Updated with a better example which matches your question better
Here's an example if I understand correctly.
create table #table (id int,ssn varchar(10))
insert into #table values(1,'10')
insert into #table values(2,'10')
insert into #table values(3,'11')
insert into #table values(4,'12')
insert into #table values(5,'11')
insert into #table values(6,'13')
select sum(cnt)
from (
select count(distinct ssn) as cnt
from #table
group by ssn
having count(*)>1
) dups
You shouldn't need to self join the table if you group by ssn and then pull back only ssn's where you have more then one.
I think the existing answers don't quite understand your question. I think I do but it's not completely specified yet. Is it a duplicate if the same SSN appears in two different files or only within the same file? Because you group by filename, that becomes the grain.
The Output of your query is like
StateFarm1, 500, 50, 10%, <your new value goes here>
AllState2, 100, 90, 90% <your new value goes here>
So if you have the same SSN in those two files, you have 1 duplicate, so on which row do you show 1, on the AllState row or the Statefarm row? If you say both, invariably someone will SUM that column and get a doubling of the results.
Now What if you have a Geico row with the same SSN, is that 1 duplicate or 2? and again which row?
I know this isn't a final answer but these questions do highlight the the question as it stands is unanswerable... you fix this and I'll change the answer,
please no downvotes in the meantime
Addendum
I believe the only thing you are missing is a DISTINCT.
select [q1].[filename], [q1].leads, [q1].closed, [q2].dups
FROM (
SELECT [filename], count([filename]) as leads,
count(closedleads.id) as closed
FROM tbldata
left join closedleads on closedleads.leadid = Table.id
group by [filename]
) as [q1]
INNER JOIN (
select count( DISTINCT [ssn]) as dups, [filename] from Table '<---- here'
group by [ssn], [filename]
having count([ssn]) > 1
) as [q2] on [q1].[filename] = [q2].[filename]
You don't need the outer COUNT - your inner SELECT COUNT(*)... will return you just one number, a count of records with duplicate SSN but different id.