SQL Identify Cloud ID's - sql

I am analyzing a data set. Table has 3 columns:
-CLOUD_ID (ID Field) example: 121312
-CURRENT_Action (Textfield) example: Started
-MIN_STARTDATE (Date) example: 2016-04-20 17:03:58.633
I need to identify the Cloud_ID's which don't have the Current_Action "Deleted" as the minimum MIN_Startdate.

You can use window functions:
select distinct cloud_id
from (select t.*,
min(min_startdate) over (partition by cloud_id) as min_min_startdate
from t
) t
where min_startdate = min_min_startdate and
Current_Action <> 'Deleted';
Note: This assumes that Current_Action is not NULL, but that could easily be included in the logic.
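The NULL case mentioned in the note can be sketched as follows. This is a minimal, hedged example that uses Python's sqlite3 module as a stand-in for the real database; the table name t comes from the answer, while the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (cloud_id INTEGER, current_action TEXT, min_startdate TEXT)"
)
# Hypothetical sample rows (not from the original question).
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "Started", "2016-04-20 17:03:58.633"),
        (1, "Deleted", "2016-04-21 09:00:00.000"),
        (2, "Deleted", "2016-04-19 08:00:00.000"),
        (2, "Started", "2016-04-22 10:00:00.000"),
        (3, None, "2016-04-18 12:00:00.000"),
    ],
)

# Same window-function query as above, with an explicit NULL branch so that
# an earliest row whose action is NULL counts as "not Deleted".
query = """
SELECT DISTINCT cloud_id
FROM (SELECT t.*,
             MIN(min_startdate) OVER (PARTITION BY cloud_id) AS min_min_startdate
      FROM t) AS s
WHERE min_startdate = min_min_startdate
  AND (current_action <> 'Deleted' OR current_action IS NULL)
ORDER BY cloud_id
"""
cloud_ids = [row[0] for row in conn.execute(query)]
print(cloud_ids)
```

Without the IS NULL branch, cloud_id 3 would be dropped, because NULL <> 'Deleted' evaluates to unknown rather than true.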

A slightly different approach; please compare the performance:
with x as (select *, row_number() over(partition by cloud_id order by min_startdate) rn from t)
select cloud_id from x where rn = 1 and current_action <> 'Deleted'

Related

Get Earliest Date corresponding to the latest occurrence of a recurring name

I have a table with Name and Date columns. I want to get the earliest date when the current name appeared. For example:
Name  Date
X     30-Jan-2021
X     29-Jan-2021
X     28-Jan-2021
Y     27-Jan-2021
Y     26-Jan-2021
Y     25-Jan-2021
Y     24-Jan-2021
X     23-Jan-2021
X     22-Jan-2021
Now when I try to get the earliest date when current name (X) started to appear, I want 28-Jan, but the sql query would give 22-Jan-2021 because that's when X appeared originally for the first time.
Update: This was the query I was using:
Select min(Date) from myTable where Name='X'
I am using older SQL Server 2008 (in the process of upgrading), so do not have access to LEAD/LAG functions.
The solutions suggested below do work as intended. Thanks.
This is a type of gaps-and-islands problem.
There are many solutions. Here is one that is optimized for your case:
-Use LEAD/LAG to identify the first row in each grouping
-Filter to only those rows
-Number those rows and take the first one
WITH StartPoints AS (
SELECT *,
IsStart = CASE WHEN Name <> LEAD(Name, 1, '') OVER (ORDER BY Date DESC) THEN 1 END
FROM YourTable
),
Numbered AS (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
FROM StartPoints
WHERE IsStart = 1 AND Name = 'X'
)
SELECT
Name, Date
FROM Numbered
WHERE rn = 1;
For SQL Server 2008 or earlier (which I strongly suggest you upgrade from), you can use a self-join with row-numbering to simulate LEAD/LAG
WITH RowNumbered AS (
SELECT *,
AllRn = ROW_NUMBER() OVER (ORDER BY Date ASC)
FROM YourTable
),
StartPoints AS (
SELECT r1.*,
IsStart = CASE WHEN r1.Name <> ISNULL(r2.Name, '') THEN 1 END
FROM RowNumbered r1
LEFT JOIN RowNumbered r2 ON r2.AllRn = r1.AllRn - 1
),
Numbered AS (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
FROM StartPoints
WHERE IsStart = 1
)
SELECT
Name, Date
FROM Numbered
WHERE rn = 1;
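To check the self-join approach against the sample data from the question, here is a rough sketch that runs the same logic in SQLite through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here, so the T-SQL `alias = expression` syntax and ISNULL are swapped for `AS` and COALESCE, and the dates are stored as ISO strings so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Name TEXT, Date TEXT)")
# Sample data from the question, with dates rewritten in ISO format.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [
        ("X", "2021-01-30"), ("X", "2021-01-29"), ("X", "2021-01-28"),
        ("Y", "2021-01-27"), ("Y", "2021-01-26"), ("Y", "2021-01-25"),
        ("Y", "2021-01-24"), ("X", "2021-01-23"), ("X", "2021-01-22"),
    ],
)

query = """
WITH RowNumbered AS (
    SELECT Name, Date, ROW_NUMBER() OVER (ORDER BY Date ASC) AS AllRn
    FROM YourTable
),
StartPoints AS (
    -- A row starts an island when the previous row (by date) has another name.
    SELECT r1.Name, r1.Date,
           CASE WHEN r1.Name <> COALESCE(r2.Name, '') THEN 1 END AS IsStart
    FROM RowNumbered r1
    LEFT JOIN RowNumbered r2 ON r2.AllRn = r1.AllRn - 1
),
Numbered AS (
    SELECT Name, Date,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS rn
    FROM StartPoints
    WHERE IsStart = 1
)
SELECT Name, Date FROM Numbered WHERE rn = 1 ORDER BY Name
"""
latest_island_starts = conn.execute(query).fetchall()
print(latest_island_starts)
```

Because the final filter is not restricted to a single name, the result contains the start of the most recent run for every name; adding Name = 'X' to the last WHERE clause narrows it to the row the question asks for.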
This is a gaps and island problem. Based on the sample data, this will work:
WITH Groups AS(
SELECT YT.[Name],
YT.[Date],
ROW_NUMBER() OVER (ORDER BY YT.Date DESC) -
ROW_NUMBER() OVER (PARTITION BY YT.[Name] ORDER BY Date DESC) AS Grp
FROM dbo.YourTable YT),
FirstGroup AS(
SELECT TOP (1) WITH TIES
G.[Name],
G.[Date]
FROM Groups G
WHERE [Name] = 'X'
ORDER BY Grp ASC)
SELECT MIN(FG.[Date]) AS Mi
FROM FirstGroup FG;
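To see why the difference of the two row numbers identifies each run, it helps to print the Grp value per row. The following is a hedged sketch on the question's sample data, using SQLite via Python's sqlite3 module as a stand-in for SQL Server. SQLite has no TOP (1) WITH TIES, so a MIN(Grp) subquery is used in its place, and the CTE is named Grouped here as an illustrative choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Name TEXT, Date TEXT)")
# Sample data from the question, with dates rewritten in ISO format.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    [
        ("X", "2021-01-30"), ("X", "2021-01-29"), ("X", "2021-01-28"),
        ("Y", "2021-01-27"), ("Y", "2021-01-26"), ("Y", "2021-01-25"),
        ("Y", "2021-01-24"), ("X", "2021-01-23"), ("X", "2021-01-22"),
    ],
)

# Rows of one consecutive run share the same Grp value.
grp_sql = """
SELECT Name, Date,
       ROW_NUMBER() OVER (ORDER BY Date DESC)
     - ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS Grp
FROM YourTable
ORDER BY Date DESC
"""
grp_by_row = conn.execute(grp_sql).fetchall()
for row in grp_by_row:
    print(row)

# The most recent run of 'X' has the smallest Grp; take its earliest date.
first_date_sql = """
WITH Grouped AS (
    SELECT Name, Date,
           ROW_NUMBER() OVER (ORDER BY Date DESC)
         - ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS Grp
    FROM YourTable
)
SELECT MIN(Date)
FROM Grouped
WHERE Name = 'X'
  AND Grp = (SELECT MIN(Grp) FROM Grouped WHERE Name = 'X')
"""
first_date = conn.execute(first_date_sql).fetchone()[0]
print(first_date)
```

On this data the three newest X rows get Grp 0, the Y rows get 3, and the two oldest X rows get 4, so filtering X on the smallest Grp isolates the current run.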
If I understand correctly, you want to know when X disappeared and then reappeared. In that case you can search for gaps in the dates within each name.
Here is an example of how to detect that:
SELECT name
,DATE
FROM (
SELECT *
,DATEDIFF(day, lead(DATE) OVER (
PARTITION BY name ORDER BY DATE DESC
), DATE) DIF
FROM YourTable
) a
WHERE DIF > 1

Selecting the latest order

I need to select the data of all my customers with the records displayed in the image. But I need to get the most recent record only, for example I need to get the order # E987 for John and E888 for Adam. As you can see from the example, when I do the select statement, I get all the order records.
You don't mention the specific database, so I'll answer with a generic solution.
You can do:
select *
from (
select t.*,
row_number() over(partition by name order by order_date desc) as rn
from t
) x
where rn = 1
You can use analytical function row_number.
Select * from
(Select t.*,
Row_number() over (partition by customer_id order by order_date desc) as rn
From your_table t) t
Where rn = 1
Or you can use not exists as follows:
Select *
From your_table t
Where not exists
(Select 1 from your_table tt
Where t.customer_id = tt.customer_id
And tt.order_date > t.order_date)
You can do it with a subquery that finds the last order date.
SELECT t.*
FROM your_table t
JOIN (SELECT tt.customer_id,
MAX(tt.order_date) MaxOrderDate
FROM your_table tt
GROUP BY tt.customer_id) AS tt
ON t.customer_id = tt.customer_id
AND t.order_date = tt.MaxOrderDate
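As a quick sanity check that the NOT EXISTS form and the MAX-date join return the same rows, here is a small sketch using Python's sqlite3 module. The order numbers E987 and E888 come from the question; the older orders, the dates, and the column names order_no and name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table "
    "(customer_id INTEGER, name TEXT, order_no TEXT, order_date TEXT)"
)
# Hypothetical rows; only E987 and E888 are taken from the question.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "John", "E123", "2020-01-10"),
        (1, "John", "E987", "2020-03-05"),
        (2, "Adam", "E555", "2020-02-01"),
        (2, "Adam", "E888", "2020-02-20"),
    ],
)

# Keep a row only if no later order exists for the same customer.
not_exists_sql = """
SELECT t.name, t.order_no
FROM your_table t
WHERE NOT EXISTS (SELECT 1 FROM your_table tt
                  WHERE tt.customer_id = t.customer_id
                    AND tt.order_date > t.order_date)
ORDER BY t.name
"""

# Join each row to its customer's maximum order date.
max_join_sql = """
SELECT t.name, t.order_no
FROM your_table t
JOIN (SELECT customer_id, MAX(order_date) AS max_order_date
      FROM your_table
      GROUP BY customer_id) m
  ON m.customer_id = t.customer_id
 AND m.max_order_date = t.order_date
ORDER BY t.name
"""
latest_not_exists = conn.execute(not_exists_sql).fetchall()
latest_max_join = conn.execute(max_join_sql).fetchall()
print(latest_not_exists)
print(latest_max_join)
```

Both forms return every tied row when a customer has two orders on the same latest date, whereas the row_number() versions return exactly one row per customer.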

SQL Server : return all rows in set if one row has value equal to target value

I'm currently working on a SQL query that searches an "archive" database and returns a row for each change that occurred on an order from the beginning of time to today.
What I would like to do with this query is return only the orders that are currently, or have ever been, associated with a specific order handler. The best way for me to explain it is that every order is currently grouped in a "set" with a row number for each change; if any row in that set ever holds the value I'm looking for in either of the "handler" columns, I want the query to return all the rows for that order, not just the one with the target value.
Here is what I have so far.
SELECT
ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY EventDateTime) AS RowNumber,
ace.[OrderId],
ace.[OrderHandler],
ace.[EventDateTime],
ace.[OrderStatus],
LAG(ace.[OrderHandler], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousOrderHandler,
LAG(ace.[EventDateTime], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousEventDateTime,
LAG(ace.[OrderStatus], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousOrderStatus
FROM
Archive AS ace
Here is the sample data I receive when running the above query:
So instead of just returning row number 9 where the OrderHandler = POOL, I want to query if the OrderId has an OrderHandler of POOL at ANY TIME in history, return all the rows.
I figured I could potentially use a WHERE EXISTS but I'm not sure how I could return the whole set of results instead of just the results that match.
Any help is extremely appreciated!
You can use exists like this:
select a.*
from Archive a
where exists (select 1
from Archive a2
where a2.orderid = a.orderid and
a2.orderhandler = @orderhandler
);
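The behaviour of the EXISTS filter can be illustrated with a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server. The Archive columns are taken from the question; the rows and handler names other than POOL are hypothetical, and the handler is passed as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Archive "
    "(OrderId INTEGER, OrderHandler TEXT, EventDateTime TEXT, OrderStatus TEXT)"
)
# Hypothetical history: order 100 passed through POOL once, order 200 never did.
conn.executemany(
    "INSERT INTO Archive VALUES (?, ?, ?, ?)",
    [
        (100, "ALICE", "2020-01-01 09:00", "Open"),
        (100, "POOL", "2020-01-02 09:00", "Open"),
        (100, "BOB", "2020-01-03 09:00", "Closed"),
        (200, "CAROL", "2020-01-01 10:00", "Open"),
        (200, "CAROL", "2020-01-04 10:00", "Closed"),
    ],
)

# The correlated subquery only tests whether the order ever had the handler;
# the outer query then returns every row of each qualifying order.
query = """
SELECT a.OrderId, a.OrderHandler, a.EventDateTime
FROM Archive a
WHERE EXISTS (SELECT 1
              FROM Archive a2
              WHERE a2.OrderId = a.OrderId
                AND a2.OrderHandler = ?)
ORDER BY a.OrderId, a.EventDateTime
"""
rows = conn.execute(query, ("POOL",)).fetchall()
for row in rows:
    print(row)
```

All three rows of order 100 come back, including those handled by someone else, while order 200 is excluded entirely.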
Script for solution:
SELECT ROW_NUMBER() OVER (PARTITION BY ace.OrderId ORDER BY ace.EventDateTime) AS RowNumber
,ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
,LAG(ace.[OrderHandler], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousOrderHandler
,LAG(ace.[EventDateTime], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousEventDateTime
,LAG(ace.[OrderId], 1) OVER (PARTITION BY ace.[OrderId] ORDER BY ace.[OrderId] ) as PreviousOrderId
,LAG(ace.[OrderStatus], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousOrderStatus
FROM Archive as ace
WHERE EXISTS
(SELECT * FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY EventDateTime) AS RowNumber
,ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
FROM Archive as ace
GROUP BY ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
HAVING ace.OrderHandler LIKE '%POOL%'
)x
WHERE ace.OrderId = x.OrderId)

SQL Server Group By with Max on Date field

I hope I can explain the issue I'm having, and hopefully someone can point me in the right direction.
I'm trying to do a group by (Email Address) on a subset of data, then using max() on a date field, but because of different values in the other fields it brings back more rows than required.
I would just like to return the row with the max date per email address, along with the other fields from that same row.
I'm not sure how to write this query.
This is a task for ROW_NUMBER:
select *
from
(
select t.*,
-- assign sequential number starting with 1 for the maximum date
row_number() over (partition by email_address order by datecol desc) as rn
from tab
) as dt
where rn = 1 -- only return the latest row
You can write this query using row_number():
select t.*
from (select t.*,
row_number() over (partition by emailaddress order by date desc) as seqnum
from t
) t
where seqnum = 1;
How about something like this?
select a.*
from baseTable as a
inner join
(select Email,
Max(EmailDate) as EmailDate
from baseTable
group by Email) as b
on a.Email = b.Email
and a.EmailDate = b.EmailDate

Oracle SQL query to find all projects where specific user has newest entry

I have a table that, in simplified form, looks like this, with date values, name, and user:
I would like to have a query that gives me the projects for which a specific user has the newest date. For instance, if I look for user U1 it would return A. If I look for user U2 it would return B. It will usually return several projects, as the table is very long and a user can have the newest date for n projects.
I have been trying for a while now without success. How can I do this?
Try using a subquery, as below:
select *
from projects P
where P.DateTime = (select max(P1.DateTime)
from projects P1
where P1.Project = P.Project
)
and P.User = 'U1'
You can use analytic functions for this:
select t.*
from (select t.*,
row_number() over (partition by project order by datetime desc) as seqnum
from table t
) t
where seqnum = 1 and user = SELECTEDUSER;
If a project can have multiple rows on the maximum date, then use rank() or dense_rank() instead of row_number().
SELECT Project FROM
(
SELECT Project, ROW_NUMBER() OVER (PARTITION BY user ORDER BY [DateTime] DESC) AS Rank
FROM table
)
WHERE Rank = 1
If multiple (say 2) rows have the same [DateTime], row_number() will assign them 1 and 2 as Rank; if you use Rank() instead, both rows will get 1.
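The difference between row_number() and rank() on tied dates can be shown with a short sketch. It uses Python's sqlite3 module rather than Oracle, and the table name projects, the column names usr and dat, and all rows are hypothetical. Project A has two users tied on its newest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (project TEXT, usr TEXT, dat TEXT)")
# Hypothetical rows: U1 and U2 are tied on the newest date of project A.
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?)",
    [
        ("A", "U1", "2021-01-05"),
        ("A", "U2", "2021-01-05"),
        ("A", "U2", "2021-01-02"),
        ("B", "U1", "2021-01-01"),
        ("B", "U2", "2021-01-03"),
    ],
)

# 'ranking' is only ever one of the two literals below, never user input.
INNER = """
SELECT project, usr,
       {ranking}() OVER (PARTITION BY project ORDER BY dat DESC) AS seqnum
FROM projects
"""

def count_top_rows(ranking):
    """Number of rows that receive seqnum = 1 across all projects."""
    sql = f"SELECT COUNT(*) FROM ({INNER.format(ranking=ranking)}) AS s WHERE seqnum = 1"
    return conn.execute(sql).fetchone()[0]

def newest_projects(user):
    """Projects where the user holds (or shares) the newest date, via RANK."""
    sql = (
        f"SELECT project FROM ({INNER.format(ranking='RANK')}) AS s "
        "WHERE seqnum = 1 AND usr = ? ORDER BY project"
    )
    return [row[0] for row in conn.execute(sql, (user,))]

top_with_row_number = count_top_rows("ROW_NUMBER")
top_with_rank = count_top_rows("RANK")
projects_u1 = newest_projects("U1")
projects_u2 = newest_projects("U2")
print(top_with_row_number, top_with_rank, projects_u1, projects_u2)
```

With row_number() exactly one of the tied rows of project A is kept, and which one is arbitrary, so one of the two users would silently lose project A; with rank() both keep it.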
I was able to get this done with this sql:
SELECT * FROM TABLE WHERE USER = 'USERNAME' AND (PROJECT, DAT) in (SELECT PROJECT, MAX(DAT) as MAXDAT FROM TABLE GROUP BY PROJECT)