SQL Server: return all rows in set if one row has value equal to target value

I'm currently working on a SQL query that searches an "archive" database and returns a row for each change that occurred on an order from the beginning of time to today.
What I would like to do with this query is return only the orders that are currently, or have ever been, associated with a specific order handler. The best way for me to explain it is that every order is currently grouped in a "set" with a row number for each change. If any row in a set ever holds the value I'm looking for in either of the "handler" columns, I want the query to return all the rows of that set, not just the one with the target value.
Here is what I have so far.
SELECT
ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY EventDateTime) AS RowNumber,
ace.[OrderId],
ace.[OrderHandler],
ace.[EventDateTime],
ace.[OrderStatus],
LAG(ace.[OrderHandler], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousOrderHandler,
LAG(ace.[EventDateTime], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousEventDateTime,
LAG(ace.[OrderStatus], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousOrderStatus
FROM
Archive AS ace
Here is the sample data I receive when running the above query:
So instead of just returning row number 9 where the OrderHandler = POOL, I want to query if the OrderId has an OrderHandler of POOL at ANY TIME in history, return all the rows.
I figured I could potentially use a WHERE EXISTS but I'm not sure how I could return the whole set of results instead of just the results that match.
Any help is extremely appreciated!

You can use exists like this:
select a.*
from Archive a
where exists (select 1
from Archive a2
where a2.orderid = a.orderid and
a2.orderhandler = @orderhandler
);
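A minimal sketch of the same EXISTS pattern, run from Python against an in-memory SQLite database instead of SQL Server. The Archive, OrderId, OrderHandler and EventDateTime names come from the question; the sample rows and handler names other than POOL are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the SQL Server archive database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Archive (OrderId INTEGER, OrderHandler TEXT, EventDateTime TEXT)"
)
conn.executemany(
    "INSERT INTO Archive VALUES (?, ?, ?)",
    [
        (1, "ALICE", "2020-01-01"),  # hypothetical history for order 1
        (1, "POOL", "2020-01-02"),
        (1, "BOB", "2020-01-03"),
        (2, "ALICE", "2020-01-01"),  # order 2 never touches POOL
        (2, "BOB", "2020-01-02"),
    ],
)

# Keep every row of an order if ANY row of that order has the target handler.
query = """
SELECT a.OrderId, a.OrderHandler, a.EventDateTime
FROM Archive AS a
WHERE EXISTS (SELECT 1
              FROM Archive AS a2
              WHERE a2.OrderId = a.OrderId
                AND a2.OrderHandler = ?)
ORDER BY a.OrderId, a.EventDateTime
"""
rows = conn.execute(query, ("POOL",)).fetchall()
for row in rows:
    print(row)
```

With this sample data the query returns all three rows of order 1, including the ones handled by ALICE and BOB, and none of order 2.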

Script for solution:
SELECT ROW_NUMBER() OVER (PARTITION BY ace.OrderId ORDER BY ace.EventDateTime) AS RowNumber
,ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
,LAG(ace.[OrderHandler], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousOrderHandler
,LAG(ace.[EventDateTime], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousEventDateTime
,LAG(ace.[OrderId], 1) OVER (PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousOrderId
,LAG(ace.[OrderStatus], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousOrderStatus
FROM Archive as ace
WHERE EXISTS
(SELECT * FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY EventDateTime) AS RowNumber
,ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
FROM Archive as ace
GROUP BY ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
HAVING ace.OrderHandler LIKE '%POOL%'
)x
WHERE ace.OrderId = x.OrderId)

Related

Avoid duplicate records from a particular column of a table

I have a table as shown in the image. In the Number column, some values appear more than once (for example, 63 appears twice). I would like to keep only one row per value. Please see my code:
delete from t1 where
(SELECT *,row_number() OVER (
PARTITION BY
Number
ORDER BY
Date) as rn from t1 where rn > 1)
It shows an error. Can anyone please assist?
The rn column created by row_number() cannot be referenced in the WHERE clause of the same query that defines it. To filter on it, wrap the numbering in a subquery and apply the filter in the outer query:
SELECT *
FROM
(
SELECT *,
row_number() OVER (PARTITION BY Number ORDER BY Date) as rn
FROM t1 ) T
where rn = 1;
The partition by determines how row numbers repeat. The row numbers are assigned per group of partition by keys. So, you can get duplicates.
If you want a unique row number over all rows, just leave out the partition by:
select t1.*
from (select t1.*,
row_number() over (order by date) as rn
from t1
) t1
where rn > 1
If you want to keep only one value, use rn = 1 instead of rn > 1.
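To make the difference between the two numberings concrete, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server. The table and column names t1, Number and Date come from the question; the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
conn.execute("CREATE TABLE t1 (Number INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [(63, "2021-01-01"), (63, "2021-02-01"), (70, "2021-01-15")],  # made-up rows
)

# rn_per_number restarts at 1 for each Number; rn_overall never repeats.
numbered = conn.execute(
    """
    SELECT Number, Date,
           ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Date) AS rn_per_number,
           ROW_NUMBER() OVER (ORDER BY Date) AS rn_overall
    FROM t1
    ORDER BY Date
    """
).fetchall()

# Filtering the partitioned numbering on rn = 1 keeps one row per Number.
kept = conn.execute(
    """
    SELECT Number, Date
    FROM (SELECT Number, Date,
                 ROW_NUMBER() OVER (PARTITION BY Number ORDER BY Date) AS rn
          FROM t1) AS T
    WHERE rn = 1
    ORDER BY Number
    """
).fetchall()

print(numbered)
print(kept)
```

The partitioned row number is the one that identifies duplicates within each Number, so it is the one to filter on when the goal is one row per value.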

Multiple records not repeated

I have a table called TABLE_SCREW where I want to get the latest records for each code.
For example, in the table below you should obtain the records with ids 3 and 7.
I am a newbie in sql and I hope you can help me.
You could use:
SELECT TOP 1 WITH TIES *
FROM TABLE_SCREW
ORDER BY ROW_NUMBER() OVER(PARTITION BY CODE ORDER BY Date DESC);
Another approach (may have better performance):
SELECT * -- here * should be replaced with actual column names
FROM (SELECT *,ROW_NUMBER() OVER(PARTITION BY CODE ORDER BY Date DESC) AS rn
FROM TABLE_SCREW) sub
WHERE sub.rn = 1;

How to use a for each loop type in a SQL Query

Is there any way to loop through an array of strings in SQL? I am appending data to #T_POLICY, and I want to run the queries against one database and then run the same queries against a different database.
Ex.
use Staging__4SBI_STG_BG
go
WITH cteEnumerate AS
(
SELECT *
,RN = ROW_NUMBER() OVER (PARTITION BY POLICY_ID ORDER BY LOADDATE DESC)
FROM dbo.STG_POLICY
)
INSERT INTO #T_POLICY
SELECT SOURCE, AGENCY_D, POL_SEQ, POLICY_ID, POL_ENDNUMBER, PRODUCT_ID,
COMPANY_ID
FROM cteEnumerate
WHERE RN = 1
ORDER BY POLICY_ID;
The next one I would like to use is Staging__4SBI_STG_TB instead of BG, and I have quite a few others to run through. Could I create a table with these names and loop through them? Any help would be great.
Thanks
How do you load table dbo.STG_POLICY? You can add a column IsLatestPolicy BIT to the table.
WITH cte AS (SELECT *
,RN = ROW_NUMBER() OVER (PARTITION BY POLICY_ID ORDER BY LOADDATE DESC)
FROM dbo.STG_POLICY
)
UPDATE cte
SET IsLatestPolicy = CASE WHEN RN = 1 THEN 1 ELSE 0 END;
After that, you can always query the original table directly:
SELECT SOURCE
, AGENCY_D
, POL_SEQ
, POLICY_ID
, POL_ENDNUMBER
, PRODUCT_ID
, COMPANY_ID
FROM dbo.STG_POLICY
WHERE IsLatestPolicy = 1
Therefore no need to create another table just for initiating a loop.

Delete field Duplicates from the Same table

I am writing this query to display a bunch of Names from a table filled automatically from an outside source:
select MAX(UN_ID) as [ID] , MAX(UN_Name) from UnavailableNames group by (UN_Name)
I have a lot of name duplicates, so I used "Group by"
I want to delete all the duplicates right after I do this select query..
(Delete where the field UN_Name is available twice, leave it once)
Any way to do this?
Something like this should work:
WITH CTE AS
(
SELECT rn = ROW_NUMBER()
OVER(
PARTITION BY UN_Name
ORDER BY UN_ID ASC), *
FROM dbo.UnavailableNames
)
DELETE FROM cte
WHERE rn > 1
You basically assign an increasing "row number" within each group that shares the same "un_name".
Then you just delete all rows which have a "row number" higher than 1 and keep all the ones that appeared first.
With CTE As
(
Select uid,ROW_NUMBER() OVER( PARTITION BY uname order by uid) as rownum
From yourTable
)
Delete
From yourTable
where uid in (select uid from CTE where rownum> 1 )
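A runnable sketch of this second form, using Python's sqlite3 module as a stand-in for SQL Server. SQLite cannot delete through a CTE the way the first answer does, but it does accept the "DELETE ... WHERE id IN (SELECT ... FROM CTE)" shape. The UnavailableNames, UN_ID and UN_Name names come from the question; the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
conn.execute(
    "CREATE TABLE UnavailableNames (UN_ID INTEGER PRIMARY KEY, UN_Name TEXT)"
)
conn.executemany(
    "INSERT INTO UnavailableNames VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Ann"), (5, "Bob")],  # sample data
)

# Number the rows within each name, lowest UN_ID first, then delete
# everything except the first row of each name.
conn.execute(
    """
    WITH CTE AS (
        SELECT UN_ID,
               ROW_NUMBER() OVER (PARTITION BY UN_Name ORDER BY UN_ID) AS rn
        FROM UnavailableNames
    )
    DELETE FROM UnavailableNames
    WHERE UN_ID IN (SELECT UN_ID FROM CTE WHERE rn > 1)
    """
)

remaining = conn.execute(
    "SELECT UN_ID, UN_Name FROM UnavailableNames ORDER BY UN_ID"
).fetchall()
print(remaining)
```

Ordering by UN_ID ascending keeps the oldest row of each name; switching to DESC would keep the newest instead.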

Select rows based on two columns in SQL Server

I have a table in which data has accidentally been stored multiple times because of case sensitivity for the username field in the server-side code. The username field should be regarded as case insensitive. The important columns and data for the table can be found below.
My requirement now is to delete all but the most recently saved data. I'm writing an SQL script for this, and started out by identifying all rows that are duplicates. This selection returns a table like the one below.
For each row, the most recent save is LASTUPDATEDDATE if it exists, otherwise CREATEDDATE. For this example, the most recent save for 'username' would be row 3.
ID  CREATEDDATE  LASTUPDATEDDATE  USERNAME
--  -----------  ---------------  --------
1   11-NOV-11                     USERNAME
2   01-NOV-11    02-NOV-11        username
3   8-JAN-12                      USERname
My script (which selects all rows where a duplicated username appears) looks like:
SELECT
id, createddate, lastupdateddate, username
FROM
table
WHERE
LOWER(username)
IN
(
SELECT
LOWER(username)
FROM
table
GROUP BY
LOWER(username)
HAVING
COUNT(*) > 1
)
ORDER BY
LOWER(username)
My question now is: How do I select everything but row 3? I have searched Stack Overflow for a good match to this question, but found no match good enough. I know I probably have to make a join of some kind, but can't really get my head around it. Would be really thankful for a push in the right direction.
We are using SQL Server, probably a quite new version.
To delete duplicates, you can use:
with todelete as (
select t.*,
row_number() over (partition by lower(username) order by createddate desc) as seqnum
from table t
)
delete from todelete
where seqnum > 1
This assigns a sequential number to each row, starting with 1 for the most recent. It then deletes all but the most recent.
For two dates, you can use:
with todelete as (
select t.*,
row_number() over (partition by lower(username) order by thedate desc) as seqnum
from (select t.*,
(case when createddate >= coalesce(lastupdateddate, createddate)
then createddate
else lastupdateddate
end) as thedate
from table t
) t
)
delete from todelete
where seqnum > 1
A couple of things to note: there is no reason to use LOWER in your query, because under SQL Server's default case-insensitive collation 'A' = 'a'.
Also, to get the correct date, you can use COALESCE to determine if LastUpdatedDate exists and if so, sort by it, else sort by CreatedDate.
Putting that together, this should work:
DELETE T
FROM YourTable T
JOIN (
SELECT *, ROW_NUMBER() OVER (PARTITION BY username
ORDER BY COALESCE(lastupdateddate, createddate) DESC) as RN
FROM YourTable
) T2 ON T.Id = T2.Id
WHERE T2.RN > 1
Here is a sample fiddle: http://www.sqlfiddle.com/#!3/51f7c/1
As @Gordon correctly suggests, you could also use a CTE depending on the version of SQL Server you use (2005+):
WITH CTE AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY username
ORDER BY COALESCE(lastupdateddate, createddate) DESC) as RN
FROM YourTable
)
DELETE FROM CTE WHERE RN > 1
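The COALESCE ordering can be checked against the three rows from the question with a short sketch in Python's sqlite3 module, used here as a stand-in for SQL Server. Two things are assumptions of the sketch: the dates are rewritten in ISO format so that text ordering matches date ordering, and LOWER() is kept because SQLite compares text case-sensitively by default. SQLite also cannot delete through a CTE, so the delete uses an IN subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
conn.execute(
    "CREATE TABLE YourTable ("
    "id INTEGER PRIMARY KEY, createddate TEXT, lastupdateddate TEXT, username TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (1, "2011-11-11", None, "USERNAME"),
        (2, "2011-11-01", "2011-11-02", "username"),
        (3, "2012-01-08", None, "USERname"),
    ],
)

# Rank rows per case-insensitive username, most recent save first, where
# "most recent save" is lastupdateddate if present, otherwise createddate.
conn.execute(
    """
    DELETE FROM YourTable
    WHERE id IN (
        SELECT id
        FROM (SELECT id,
                     ROW_NUMBER() OVER (
                         PARTITION BY LOWER(username)
                         ORDER BY COALESCE(lastupdateddate, createddate) DESC
                     ) AS rn
              FROM YourTable) AS ranked
        WHERE rn > 1
    )
    """
)

survivors = conn.execute("SELECT id, username FROM YourTable").fetchall()
print(survivors)
```

Only row 3 survives, which matches the expectation stated in the question.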