SQL Identify Cloud ID's

I am analyzing a data set. The table has 3 columns:
- CLOUD_ID (ID field), example: 121312
- CURRENT_Action (text field), example: Started
- MIN_STARTDATE (date), example: 2016-04-20 17:03:58.633
I need to identify the Cloud_IDs whose earliest row (the one with the minimum MIN_STARTDATE) does not have the Current_Action "Deleted".

You can use window functions:
select distinct cloud_id
from (select t.*,
min(min_startdate) over (partition by cloud_id) as min_min_startdate
from t
) t
where min_startdate = min_min_startdate and
Current_Action <> 'Deleted';
Note: This assumes that Current_Action is not NULL, but that could easily be included in the logic.
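For anyone who wants to try this without a SQL Server instance, here is a minimal sketch that runs the same window-function query through Python's sqlite3 module (assuming the bundled SQLite is 3.25 or newer, which is when window functions were added). The sample rows and the function name ids_not_deleted_first are made up for illustration, and the extra IS NULL check is one possible way to cover the NULL case mentioned in the note.

```python
import sqlite3

def ids_not_deleted_first(rows):
    """Return cloud_ids whose earliest row is not 'Deleted' (NULL counts as not deleted)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (cloud_id INTEGER, current_action TEXT, min_startdate TEXT)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT cloud_id
        FROM (SELECT t.*,
                     MIN(min_startdate) OVER (PARTITION BY cloud_id) AS min_min_startdate
              FROM t) AS x
        WHERE min_startdate = min_min_startdate
          AND (current_action <> 'Deleted' OR current_action IS NULL)
        ORDER BY cloud_id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

# Hypothetical sample data; ISO-formatted text dates sort chronologically.
sample = [
    (1, "Started", "2016-04-20 17:03:58.633"),
    (1, "Deleted", "2016-05-01 09:00:00.000"),
    (2, "Deleted", "2016-03-01 08:00:00.000"),
    (2, "Started", "2016-03-02 08:00:00.000"),
    (3, None,      "2016-01-01 00:00:00.000"),
]
print(ids_not_deleted_first(sample))
```

Cloud ID 2 is dropped because its earliest row is "Deleted"; IDs 1 and 3 are kept.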

A slightly different approach; please compare the performance:
with x as (select *, row_number() over(partition by cloud_id order by min_startdate) rn from #t)
select cloud_id from x where rn = 1 and current_action <> 'deleted'

Related

Get Earliest Date corresponding to the latest occurrence of a recurring name

I have a table with Name and Date columns. I want to get the earliest date when the current name appeared. For example:
Name  Date
X     30-Jan-2021
X     29-Jan-2021
X     28-Jan-2021
Y     27-Jan-2021
Y     26-Jan-2021
Y     25-Jan-2021
Y     24-Jan-2021
X     23-Jan-2021
X     22-Jan-2021
Now when I try to get the earliest date when current name (X) started to appear, I want 28-Jan, but the sql query would give 22-Jan-2021 because that's when X appeared originally for the first time.
Update: This was the query I was using:
Select min(Date) from myTable where Name='X'
I am using older SQL Server 2008 (in the process of upgrading), so do not have access to LEAD/LAG functions.
The solutions suggested below do work as intended. Thanks.

This is a type of gaps-and-islands problem.
There are many solutions. Here is one that is optimized for your case:
- Use LEAD/LAG to identify the first row in each grouping
- Filter to only those rows
- Number those rows and take the first one
WITH StartPoints AS (
SELECT *,
IsStart = CASE WHEN Name <> LEAD(Name, 1, '') OVER (ORDER BY Date DESC) THEN 1 END
FROM YourTable
),
Numbered AS (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
FROM StartPoints
WHERE IsStart = 1 AND Name = 'X'
)
SELECT
Name, Date
FROM Numbered
WHERE rn = 1;
For SQL Server 2008 or earlier (which I strongly suggest you upgrade from), you can use a self-join with row-numbering to simulate LEAD/LAG
WITH RowNumbered AS (
SELECT *,
AllRn = ROW_NUMBER() OVER (ORDER BY Date ASC)
FROM YourTable
),
StartPoints AS (
SELECT r1.*,
IsStart = CASE WHEN r1.Name <> ISNULL(r2.Name, '') THEN 1 END
FROM RowNumbered r1
LEFT JOIN RowNumbered r2 ON r2.AllRn = r1.AllRn - 1
),
Numbered AS (
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC)
FROM StartPoints
WHERE IsStart = 1 AND Name = 'X'
)
SELECT
Name, Date
FROM Numbered
WHERE rn = 1;
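As a rough check of the self-join idea, the sketch below runs an adapted version against the sample data from the question using Python's sqlite3 module (assuming SQLite 3.25+ for ROW_NUMBER). The SQL is adjusted for SQLite: COALESCE replaces ISNULL, the start-point test moves into a WHERE clause, and the final row-numbering step is simplified to MAX(Date). Dates are stored as ISO text so they sort correctly; none of this is tested on SQL Server 2008 itself.

```python
import sqlite3

# Sample data from the question, with dates rewritten in ISO format.
rows = [
    ("X", "2021-01-30"), ("X", "2021-01-29"), ("X", "2021-01-28"),
    ("Y", "2021-01-27"), ("Y", "2021-01-26"), ("Y", "2021-01-25"),
    ("Y", "2021-01-24"), ("X", "2021-01-23"), ("X", "2021-01-22"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Name TEXT, Date TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)

query = """
    WITH RowNumbered AS (
        SELECT Name, Date,
               ROW_NUMBER() OVER (ORDER BY Date ASC) AS AllRn
        FROM YourTable
    ),
    StartPoints AS (
        -- a row starts an island when the previous row (by date) has a different name
        SELECT r1.Name, r1.Date
        FROM RowNumbered r1
        LEFT JOIN RowNumbered r2 ON r2.AllRn = r1.AllRn - 1
        WHERE r1.Name <> COALESCE(r2.Name, '')
    )
    SELECT MAX(Date) FROM StartPoints WHERE Name = ?
"""
latest_start = conn.execute(query, ("X",)).fetchone()[0]
print(latest_start)
```

For the sample data the start points for X are 22-Jan and 28-Jan, and the latest of those is the date the current run of X began.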

This is a gaps-and-islands problem. Based on the sample data, this will work:
WITH Groups AS(
SELECT YT.[Name],
YT.[Date],
ROW_NUMBER() OVER (ORDER BY YT.Date DESC) -
ROW_NUMBER() OVER (PARTITION BY YT.[Name] ORDER BY Date DESC) AS Grp
FROM dbo.YourTable YT),
FirstGroup AS(
SELECT TOP (1) WITH TIES
G.[Name],
G.[Date]
FROM Groups G
WHERE [Name] = 'X'
ORDER BY Grp ASC)
SELECT MIN(FG.[Date]) AS Mi
FROM FirstGroup FG;
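To see why the difference of the two row numbers works as a group key, here is a small sketch in Python's sqlite3 (assuming SQLite 3.25+) that prints the Grp value for each sample row. SQLite has no TOP (1) WITH TIES, so the last step is swapped for a MIN(Grp) subquery, and the CTE is named Grouped here only to avoid clashing with SQLite's GROUPS keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Name TEXT, Date TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", [
    ("X", "2021-01-30"), ("X", "2021-01-29"), ("X", "2021-01-28"),
    ("Y", "2021-01-27"), ("Y", "2021-01-26"), ("Y", "2021-01-25"),
    ("Y", "2021-01-24"), ("X", "2021-01-23"), ("X", "2021-01-22"),
])

grouped_cte = """
    WITH Grouped AS (
        SELECT Name, Date,
               ROW_NUMBER() OVER (ORDER BY Date DESC)
             - ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) AS Grp
        FROM YourTable
    )
"""

# Rows in one consecutive run of the same name share a Grp value.
grp_rows = conn.execute(
    grouped_cte + "SELECT Name, Date, Grp FROM Grouped ORDER BY Date DESC"
).fetchall()
for name, date, grp in grp_rows:
    print(name, date, grp)

# Earliest date in the most recent run of X (the run with the smallest Grp).
current_start = conn.execute(
    grouped_cte + """
    SELECT MIN(Date) FROM Grouped
    WHERE Name = ? AND Grp = (SELECT MIN(Grp) FROM Grouped WHERE Name = ?)
    """,
    ("X", "X"),
).fetchone()[0]
print(current_start)
```

The newest X rows all get Grp 0, the Y rows get 3, and the older X rows get 4, so picking the smallest Grp for X isolates the current run.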

If I understood correctly, you want to know when X disappeared and then reappeared. In that case you can search for gaps in the dates within each name.
Here is an example of how to detect that:
SELECT name
,DATE
FROM (
SELECT *
,DATEDIFF(day, lead(DATE) OVER (
PARTITION BY name ORDER BY DATE DESC
), DATE) DIF
FROM YourTable
) a
WHERE DIF > 1

Selecting the latest order

I need to select the data of all my customers with the records displayed in the image. But I need to get the most recent record only, for example I need to get the order # E987 for John and E888 for Adam. As you can see from the example, when I do the select statement, I get all the order records.

You don't mention the specific database, so I'll answer with a generic solution.
You can do:
select *
from (
select t.*,
row_number() over(partition by name order by order_date desc) as rn
from t
) x
where rn = 1

You can use analytical function row_number.
Select * from
(Select t.*,
Row_number() over (partition by customer_id order by order_date desc) as rn
From your_table t) t
Where rn = 1
Or you can use not exists as follows:
Select *
From your_table t
Where not exists
(Select 1 from your_table tt
Where t.customer_id = tt.customer_id
And tt.order_date > t.order_date)
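A quick way to convince yourself that the two forms agree is to run both on a tiny table. The sketch below does that with Python's sqlite3 (assuming SQLite 3.25+ for the window function). The order numbers E987 and E888 come from the question; the other orders, the dates, and the column names order_no and name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table "
    "(customer_id INTEGER, name TEXT, order_no TEXT, order_date TEXT)"
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?, ?)", [
    (1, "John", "E123", "2020-01-05"),  # hypothetical older order
    (1, "John", "E987", "2020-03-10"),
    (2, "Adam", "E555", "2020-02-01"),  # hypothetical older order
    (2, "Adam", "E888", "2020-04-15"),
])

window_sql = """
    SELECT name, order_no
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY customer_id
                                    ORDER BY order_date DESC) AS rn
          FROM your_table t) AS x
    WHERE rn = 1
    ORDER BY name
"""

not_exists_sql = """
    SELECT name, order_no
    FROM your_table t
    WHERE NOT EXISTS (SELECT 1
                      FROM your_table tt
                      WHERE tt.customer_id = t.customer_id
                        AND tt.order_date > t.order_date)
    ORDER BY name
"""

latest_window = conn.execute(window_sql).fetchall()
latest_not_exists = conn.execute(not_exists_sql).fetchall()
print(latest_window)
print(latest_not_exists)
```

One difference to keep in mind: if a customer has two orders on the same latest date, NOT EXISTS returns both of them, while ROW_NUMBER keeps only one.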

You can do it with a subquery that finds the last order date.
SELECT t.*
FROM your_table t
JOIN (SELECT tt.customer_id,
MAX(tt.order_date) MaxOrderDate
FROM your_table tt
GROUP BY tt.customer_id) AS tt
ON t.customer_id = tt.customer_id
AND t.order_date = tt.MaxOrderDate

SQL Server : return all rows in set if one row has value equal to target value

I'm currently working on a SQL query that searches an "archive" database and returns a row for each change that occurred on an order from the beginning of time to today.
What I would like to do with this query is return only the orders that are currently, or have ever been, associated with a specific order handler. The best way for me to explain it is that every order is currently grouped in a "set" with a row number for each change, but if any row in the set ever holds the value I'm looking for in either of the "handler" columns, I want it to return all the rows of that set, not just the one with that target value.
Here is what I have so far.
SELECT
ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY EventDateTime) AS RowNumber,
ace.[OrderId],
ace.[OrderHandler],
ace.[EventDateTime],
ace.[OrderStatus],
LAG(ace.[OrderHandler], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousOrderHandler,
LAG(ace.[EventDateTime], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousEventDateTime,
LAG(ace.[OrderStatus], 1) OVER (PARTITION BY [OrderId] ORDER BY ace.[EventDateTime]) AS PreviousOrderStatus
FROM
Archive AS ace
Here is the sample data I receive when running the above query:
So instead of just returning row number 9 where the OrderHandler = POOL, I want to query if the OrderId has an OrderHandler of POOL at ANY TIME in history, return all the rows.
I figured I could potentially use a WHERE EXISTS but I'm not sure how I could return the whole set of results instead of just the results that match.
Any help is extremely appreciated!

You can use exists like this:
select a.*
from Archive a
where exists (select 1
from Archive a2
where a2.orderid = a.orderid and
a2.orderhandler = @orderhandler
);
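Here is a small sketch of that EXISTS pattern, run through Python's sqlite3 so it can be tried without the real archive. The sample rows and handler names other than POOL are made up, and the function name orders_ever_handled_by is just for illustration. The handler is passed as a bound parameter in place of the SQL Server variable.

```python
import sqlite3

def orders_ever_handled_by(conn, handler):
    """Return every row of each order that had the given handler at any point."""
    return conn.execute(
        """
        SELECT a.OrderId, a.OrderHandler, a.EventDateTime
        FROM Archive a
        WHERE EXISTS (SELECT 1
                      FROM Archive a2
                      WHERE a2.OrderId = a.OrderId
                        AND a2.OrderHandler = ?)
        ORDER BY a.OrderId, a.EventDateTime
        """,
        (handler,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Archive (OrderId INTEGER, OrderHandler TEXT, EventDateTime TEXT)"
)
# Hypothetical history: only order 100 was ever handled by POOL.
conn.executemany("INSERT INTO Archive VALUES (?, ?, ?)", [
    (100, "ALICE", "2020-01-01 09:00"),
    (100, "POOL",  "2020-01-02 09:00"),
    (100, "BOB",   "2020-01-03 09:00"),
    (200, "ALICE", "2020-01-01 10:00"),
    (200, "BOB",   "2020-01-04 10:00"),
])

pool_rows = orders_ever_handled_by(conn, "POOL")
for row in pool_rows:
    print(row)
```

All three rows of order 100 come back, including the ones where the handler was not POOL, and order 200 is left out entirely.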

Script for solution:
SELECT ROW_NUMBER() OVER (PARTITION BY ace.OrderId ORDER BY ace.EventDateTime) AS RowNumber
,ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
,LAG(ace.[OrderHandler], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousOrderHandler
,LAG(ace.[EventDateTime], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousEventDateTime
,LAG(ace.[OrderId], 1) OVER (PARTITION BY ace.[OrderId] ORDER BY ace.[OrderId] ) as PreviousOrderId
,LAG(ace.[OrderStatus], 1) OVER ( PARTITION BY ace.[OrderId] ORDER BY ace.[EventDateTime] ) as PreviousOrderStatus
FROM Archive as ace
WHERE EXISTS
(SELECT * FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY EventDateTime) AS RowNumber
,ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
FROM Archive as ace
GROUP BY ace.[OrderId]
,ace.[OrderHandler]
,ace.[EventDateTime]
,ace.[OrderStatus]
HAVING ace.OrderHandler LIKE '%POOL%'
)x
WHERE ace.OrderId = x.OrderId)

SQL Server Group By with Max on Date field

I hope I can explain the issue I'm having, and hopefully someone can point me in the right direction.
I'm trying to do a GROUP BY (email address) on a subset of data, then I'm using MAX() on a date field, but because of different values in other fields it brings back more rows than required.
I would just like to return the row with the maximum date per email address, along with the other fields on that same row.
I'm not sure how to write this query.

This is a task for ROW_NUMBER:
select *
from
(
select t.*,
-- assign sequential number starting with 1 for the maximum date
row_number() over (partition by email_address order by datecol desc) as rn
from tab
) as dt
where rn = 1 -- only return the latest row

You can write this query using row_number():
select t.*
from (select t.*,
row_number() over (partition by emailaddress order by date desc) as seqnum
from t
) t
where seqnum = 1;

How about something like this?
select a.*
from baseTable as a
inner join
(select Email,
Max(EmailDate) as EmailDate
from baseTable
group by Email) as b
on a.Email = b.Email
and a.EmailDate = b.EmailDate

Oracle SQL query to find all projects where specific user has newest entry

I have a table that looks in simplified version like this with date values, name, user:
I would like to have a query that gives me the projects for which a specific user has the newest date. For instance, if I look for user U1 it would return A. If I look for user U2 it would return B. It will usually return several projects, as the table is very long and a user can have the newest date for n projects.
I have been trying for a while now without success. How can I do this?

Try using a subquery, as below:
select *
from projects P
where P.DateTime = (select max(P1.DateTime)
from projects P1
where P.Project = P1.Project
)
and P.User = 'U1'

You can use analytic functions for this:
select t.*
from (select t.*,
row_number() over (partition by project order by datetime desc) as seqnum
from table t
) t
where seqnum = 1 and user = SELECTEDUSER;
If a project can have multiple rows on the maximum date, then use rank() or dense_rank() instead of row_number().
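To illustrate the tie case, here is a rough sketch in Python's sqlite3 (assuming SQLite 3.25+). The table name projects, the column names usr and dt, and the sample rows are all hypothetical stand-ins for the simplified table in the question. Project A has two rows on its newest date, one for each user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects (project TEXT, usr TEXT, dt TEXT)")
conn.executemany("INSERT INTO projects VALUES (?, ?, ?)", [
    ("A", "U1", "2021-03-01"),
    ("A", "U2", "2021-03-01"),  # same newest date as U1: a tie
    ("A", "U2", "2021-01-01"),
    ("B", "U2", "2021-02-01"),
    ("B", "U1", "2021-01-15"),
])

def newest_projects(ranking_fn, user):
    """Projects where `user` owns a row ranked first by date within the project."""
    if ranking_fn not in ("ROW_NUMBER", "RANK", "DENSE_RANK"):
        raise ValueError("unsupported ranking function")
    sql = f"""
        SELECT project
        FROM (SELECT project, usr,
                     {ranking_fn}() OVER (PARTITION BY project
                                          ORDER BY dt DESC) AS seqnum
              FROM projects) AS t
        WHERE seqnum = 1 AND usr = ?
        ORDER BY project
    """
    return [row[0] for row in conn.execute(sql, (user,))]

# RANK gives both tied rows seqnum 1, so both users "have" project A.
print(newest_projects("RANK", "U1"))
print(newest_projects("RANK", "U2"))
# ROW_NUMBER breaks the tie arbitrarily, so only one of the users gets A.
print(newest_projects("ROW_NUMBER", "U1"))
print(newest_projects("ROW_NUMBER", "U2"))
```

With RANK, U1 gets A and U2 gets A and B. With ROW_NUMBER, which of the two users is credited with A is not defined unless a tie-breaker column is added to the ORDER BY.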

SELECT Project FROM
(
SELECT Project, user, ROW_NUMBER() OVER (PARTITION BY Project ORDER BY DateTime DESC) AS Rank
FROM table
)
WHERE Rank = 1 AND user = 'U1'
If multiple (say 2) rows share the same DateTime, ROW_NUMBER() numbers them 1 and 2, whereas RANK() gives both of them 1.

I was able to get this done with this sql:
SELECT * FROM TABLE WHERE USER = 'USERNAME' AND (PROJECT, DAT) in (SELECT PROJECT, MAX(DAT) as MAXDAT FROM TABLE GROUP BY PROJECT)
