Select only one row when a certain condition is met? - sql

ID NAME DATE STATUS
1 Joe 01-22 Approved
1 Joe 01-22 Pending
2 Bill 02-22 Approved
2 Bill 02-22 Sent back
3 John 01-22 Approved
4 Bob 02-22 Pending
How do I return only one row per ID, giving priority to Approved?
Example: for ID 1 I only want the row that is Approved, not the one that is Pending.
Some IDs may have only one record; for example, ID 4 has just one record and it is Pending.
What I want is:
If an ID has both an Approved and a Pending record, keep the Approved record and do not select the Pending one.
If an ID has only a Pending record, keep that record.

This will preferentially select Approved, then Pending, then everything else. If you don't want "everything else" just filter in the WHERE clause.
select id,
       name,
       date,
       status
from (
    select *,
           row_number() over
           ( partition by id
             order by case when status = 'Approved' then 1
                           when status = 'Pending' then 2
                           else 3
                      end asc,
                      date
           ) as first_by_date_with_approved_precedence
    from your_table
) tmp
where first_by_date_with_approved_precedence = 1
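To try the ranking approach on the sample data, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. It assumes SQLite 3.25 or later (for window functions); the alias rn is just a shorter illustrative name for the row-number column.

```python
import sqlite3

# In-memory SQLite database; table and column names mirror the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, name TEXT, date TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [
        (1, "Joe", "01-22", "Approved"),
        (1, "Joe", "01-22", "Pending"),
        (2, "Bill", "02-22", "Approved"),
        (2, "Bill", "02-22", "Sent back"),
        (3, "John", "01-22", "Approved"),
        (4, "Bob", "02-22", "Pending"),
    ],
)

# Rank rows within each id: Approved first, then Pending, then anything else.
query = """
SELECT id, name, date, status
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY CASE WHEN status = 'Approved' THEN 1
                             WHEN status = 'Pending'  THEN 2
                             ELSE 3
                        END,
                        date
           ) AS rn
    FROM your_table AS t
) AS tmp
WHERE rn = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the priority is spelled out in the CASE expression, this does not depend on how the status names happen to sort.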

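It could also be as easy as the following (provided status is not blank or null). Note that TOP 1 WITH TIES is SQL Server syntax, and ordering by Status works here only because 'Approved' sorts alphabetically before 'Pending' and 'Sent back'.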
Select Top 1 with ties *
from YourTable
order by row_number() over (partition by id order by Status)
Results
ID NAME DATE STATUS
1 Joe 01-22 Approved
2 Bill 02-22 Approved
3 John 01-22 Approved
4 Bob 02-22 Pending

Related

Check for condition in GROUP BY?

Take this example data:
ID Status Date
1 Pending 2/10/2020
2 Pending 2/10/2020
3 Pending 2/10/2020
2 Pending 2/10/2020
2 Pending 2/10/2020
1 Complete 2/15/2020
I need an SQL statement that groups the data but brings back the current status. So for ID 1 the GROUP BY needs a condition that returns only the Complete row, while still returning the Pending rows for IDs 2 and 3.
I am not 100% sure how to write the condition for this.
Maybe something like:
SELECT ID, Status, Date
FROM table
GROUP BY ID, Status, Date
ORDER BY ID
The problem with this is the resulting data would look like:
ID Status Date
1 Pending 2/10/2020
1 Complete 2/15/2020
2 Pending 2/10/2020
3 Pending 2/10/2020
But I need:
ID Status Date
1 Complete 2/15/2020
2 Pending 2/10/2020
3 Pending 2/10/2020
What can I do to check for the Completed status so I can only return Completed in the group by?
GROUP BY only the ID column. Use MIN() to choose Complete before Pending (this works because 'Complete' sorts alphabetically before 'Pending').
SELECT ID, MIN(Status)
FROM table
GROUP BY ID
ORDER BY ID
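As a quick check of the MIN() trick, here is a small sketch using Python's sqlite3 module. The table name status_log is hypothetical (the question only calls it "table"), and the dates are rewritten in ISO format for convenience. The second query is an alternative I am adding for comparison: it ranks the statuses explicitly, so it does not depend on alphabetical order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "status_log" is a hypothetical name; the question just calls it "table".
conn.execute("CREATE TABLE status_log (id INTEGER, status TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [
        (1, "Pending", "2020-02-10"),
        (2, "Pending", "2020-02-10"),
        (3, "Pending", "2020-02-10"),
        (2, "Pending", "2020-02-10"),
        (2, "Pending", "2020-02-10"),
        (1, "Complete", "2020-02-15"),
    ],
)

# Relies on 'Complete' sorting before 'Pending' in the column's collation.
by_min = conn.execute(
    "SELECT id, MIN(status) FROM status_log GROUP BY id ORDER BY id"
).fetchall()

# Same result with an explicit priority instead of alphabetical order.
by_rank = conn.execute(
    """
    SELECT id,
           CASE MIN(CASE status WHEN 'Complete' THEN 1 ELSE 2 END)
                WHEN 1 THEN 'Complete'
                ELSE 'Pending'
           END AS status
    FROM status_log
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(by_min)
print(by_rank)
```

The explicit ranking is more verbose, but it keeps working if a status such as 'Archived' is added later and should not win over 'Complete'.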
To use Date as 'last row indicator', you can:
-- Table variables are declared with @ (a #name would need CREATE TABLE instead)
DECLARE @Src TABLE (
    ID int,
    Status varchar(20),
    Date date
)
INSERT @Src VALUES
(1, 'Pending' ,'2/10/2020'),
(1, 'Complete' ,'2/15/2020'),
(2, 'Pending' ,'2/10/2020'),
(3, 'Pending' ,'2/10/2020');
SELECT TOP 1 WITH TIES *
FROM @Src
ORDER BY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Date DESC)
Result:
ID Status Date
----------- -------------------- ----------
1 Complete 2020-02-15
2 Pending 2020-02-10
3 Pending 2020-02-10
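TOP 1 WITH TIES is specific to SQL Server. On other engines the same "latest row per ID" idea can be written with ROW_NUMBER() in a subquery. The sketch below shows that variant with Python's sqlite3 module; it assumes SQLite 3.25 or later, a table named src (mirroring the table in the answer), and ISO-formatted dates so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "src" mirrors the table from the answer; dates are stored as ISO text.
conn.execute("CREATE TABLE src (id INTEGER, status TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?)",
    [
        (1, "Pending", "2020-02-10"),
        (1, "Complete", "2020-02-15"),
        (2, "Pending", "2020-02-10"),
        (3, "Pending", "2020-02-10"),
    ],
)

# Number rows per id from newest to oldest, then keep only the newest.
query = """
SELECT id, status, date
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
    FROM src AS s
) AS ranked
WHERE rn = 1
ORDER BY id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```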

Case statement for HIVE platform

I have a table with the following columns:
ID
Scheduled Date
Status
Target Date
I need to extract the 'Status' corresponding to the minimum 'Scheduled Date' for each ID. If no Scheduled Date is available, I need the status corresponding to the minimum 'Target Date' for that ID.
Sample data:
ID | Scheduled_Date | Status | Target_Date
1 | 12/11/2017 | Completed | 12/11/2017
1 | 12/12/2017 | Completed | 12/12/2017
2 | 12/13/2017 | Completed | 12/13/2017
3 | 12/14/2017 | Pending | 12/14/2017
3 | 12/15/2017 | Pending | 12/15/2017
4 | | Confirmed | 12/18/2017
4 | | Confirmed | 12/19/2017
5 | 12/14/2017 | Completed | 12/14/2017
5 | 12/15/2017 | Pending | 12/15/2017
Can you please correct the code that I am trying to write?
SELECT ID,
CASE WHEN ID IS NOT NULL THEN
CASE WHEN MIN(SCHEDULED_DATE) IS NOT NULL
THEN STATUS
ELSE
END
CASE WHEN MIN(TARGET_DATE) IS NOT NULL
THEN STATUS
ELSE ''
END
FROM FIRST_STATUS
Try this query.
SELECT id,
status
FROM yourtable t
WHERE COALESCE (Scheduled_Date,
Target_Date) IN
(SELECT MIN(COALESCE (Scheduled_Date,Target_Date))
FROM yourtable i
WHERE i.ID = t.id
GROUP BY i.ID);
Use row_number() analytic function:
select id,
status
from
(
select id,
status,
row_number() over(partition by id order by nvl(Scheduled_Date,Target_Date)) rn
from yourtable t
)s
where rn=1
;
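The row_number() approach can be checked outside Hive. Below is a rough sketch using Python's sqlite3 module, with COALESCE standing in for nvl. It assumes SQLite 3.25 or later, the table name yourtable from the answers, ISO-formatted dates, and that ID 4 has no scheduled date (stored as NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable "
    "(id INTEGER, scheduled_date TEXT, status TEXT, target_date TEXT)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        (1, "2017-12-11", "Completed", "2017-12-11"),
        (1, "2017-12-12", "Completed", "2017-12-12"),
        (2, "2017-12-13", "Completed", "2017-12-13"),
        (3, "2017-12-14", "Pending", "2017-12-14"),
        (3, "2017-12-15", "Pending", "2017-12-15"),
        (4, None, "Confirmed", "2017-12-18"),  # no scheduled date
        (4, None, "Confirmed", "2017-12-19"),
        (5, "2017-12-14", "Completed", "2017-12-14"),
        (5, "2017-12-15", "Pending", "2017-12-15"),
    ],
)

# Order each id's rows by scheduled date, falling back to target date per row.
query = """
SELECT id, status
FROM (
    SELECT id,
           status,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY COALESCE(scheduled_date, target_date)
           ) AS rn
    FROM yourtable
) AS s
WHERE rn = 1
ORDER BY id
"""
first_status = conn.execute(query).fetchall()
for row in first_status:
    print(row)
```

Note that the fallback is applied row by row. If an ID could mix rows with and without a scheduled date, a row with only an early target date could win over a scheduled row, so the ordering may need adjusting for that case.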

Teradata SQL MAX function and duplicate rows

My original report returns 3 records for account ABC, 1 with an approved status and 2 with a suspended status. The only difference in these lines is the status and billing number. All other data is identical.
I'm trying to create a variation of this report that would return 1 line for Account ABC with a column that displays the count for approved accounts and another column with the count for suspended accounts.
In the new report, there would be an Approved Accounts column with a value of 1 and a Suspended Accounts column with a value of 2.
I'm using a MAX function to return only 1 line. The issue I'm having is that 2 records with the suspended status are identical except for the Billing Number.
If I remove the billing number from the SQL then the results only return 1 suspended and 1 approved account. I need the SQL to return 1 line with 1 in the approved column and 2 in the suspended column
Here is some sample data:
Acct# Bill# Name Location Status
ABC Bill1 ABC Co 123456 Approved
ABC Bill2 ABC Co 123456 Suspended
ABC Bill3 ABC Co 123456 Suspended
Any suggestions would be greatly appreciated. Thanks for your help.....
You need "conditional aggregation":
select Acct#, Name, Location,
sum(case when Status = 'Approved' then 1 else 0 end) as Approved,
sum(case when Status = 'Suspended' then 1 else 0 end) as Suspended
from ...
group by 1,2,3
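Here is a small runnable sketch of conditional aggregation on the sample rows, using Python's sqlite3 module. The table name accounts and the column names acct and bill are hypothetical stand-ins for the real report source, which the answer leaves elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "accounts", "acct" and "bill" are hypothetical names for illustration.
conn.execute(
    "CREATE TABLE accounts "
    "(acct TEXT, bill TEXT, name TEXT, location TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?, ?, ?)",
    [
        ("ABC", "Bill1", "ABC Co", "123456", "Approved"),
        ("ABC", "Bill2", "ABC Co", "123456", "Suspended"),
        ("ABC", "Bill3", "ABC Co", "123456", "Suspended"),
    ],
)

# One output row per account; each SUM counts only rows with that status.
# The billing number is left out of SELECT and GROUP BY on purpose.
query = """
SELECT acct, name, location,
       SUM(CASE WHEN status = 'Approved'  THEN 1 ELSE 0 END) AS approved,
       SUM(CASE WHEN status = 'Suspended' THEN 1 ELSE 0 END) AS suspended
FROM accounts
GROUP BY acct, name, location
"""
summary = conn.execute(query).fetchall()
print(summary)
```

Because the rows are counted rather than collapsed with MAX, the two suspended records are both included even though they differ only in billing number.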

SQL JOIN - retrieve MAX DateTime from second table and the first DateTime after previous MAX for other value

I have an issue creating a proper SQL expression.
I have table TICKET with column TICKETID
TICKETID
1000
1001
I then have a table STATUSHISTORY, from which I need to retrieve the last (maximum) time the ticket entered VENDOR status and the time it exited that status. By exiting VENDOR status I mean the first INPROG status after that VENDOR status; the status that follows VENDOR is always INPROG. It is also possible that VENDOR status does not exist at all for a ticket in STATUSHISTORY (then nulls should be returned). INPROG always exists; it can come before VENDOR and also after it, once the ticket is no longer in VENDOR status.
Here is the example of STATUSHISTORY.
ID TICKETID STATUS DATETIME
1 1000 INPROG 01.01.2017 10:00
2 1000 VENDOR 02.01.2017 10:00
3 1000 INPROG 03.01.2017 10:00
4 1000 VENDOR 04.01.2017 10:00
5 1000 INPROG 05.01.2017 10:00
6 1000 HOLD 06.01.2017 10:00
7 1000 INPROG 07.01.2017 10:00
8 1001 INPROG 02.02.2017 10:00
9 1001 VENDOR 03.02.2017 10:00
10 1001 INPROG 04.02.2017 10:00
11 1001 VENDOR 05.02.2017 10:00
So the result when doing the query from TICKET table and doing the JOIN with table STATUSHISTORY should be:
ID VENDOR_ENTERED VENDOR_EXITED
1000 04.01.2017 10:00 05.01.2017 10:00
1001 05.02.2017 10:00 null
Because for ID 1000 last VENDOR status was at 04.01.2017 and the first INPROG status after the VENDOR status for that ID was at 05.01.2017 while for ID 1001 the last VENDOR status was at 05.02.2017 and after that INPROG status did not happen yet.
If VENDOR did not exist then both columns should be null in result.
I am really stuck with this, trying different JOINs but without any progress.
Thank you in advance if you can help me.
You can do this with window functions. First, assign a "vendor" group to the tickets. You can do this using a cumulative sum counting the number of "vendor" records on or before each record.
Then, aggregate the records to get one record per "vendor" group. And use row numbers to get the most recent records. So:
with vg as (
    select ticketid,
           min(datetime) as vendor_entered,
           min(case when status = 'INPROG' then datetime end) as vendor_exited
    from (select sh.*,
                 sum(case when status = 'VENDOR' then 1 else 0 end) over (partition by ticketid order by datetime) as grp
          from statushistory sh
         ) sh
    where grp > 0 -- skip rows that precede the first VENDOR record
    group by ticketid, grp
)
select vg.ticketid, vg.vendor_entered, vg.vendor_exited
from (select vg.*,
             row_number() over (partition by ticketid order by vendor_entered desc) as seqnum
      from vg
     ) vg
where seqnum = 1;
You can aggregate to get max time, then join onto all of the date values higher than that time, and then re-aggregate:
select a.TicketID,
a.VENDOR_ENTERED,
min( EXIT_TIME ) as VENDOR_EXITED
from (
select TicketID,
max( DATETIME ) as VENDOR_ENTERED
from StatusHistory
where Status = 'VENDOR'
group by TicketID
) as a
left join
(
select TicketID,
DATETIME as EXIT_TIME
from StatusHistory
where Status = 'INPROG'
) as b
on a.TicketID = b.TicketID
and EXIT_TIME >= a.VENDOR_ENTERED
group by a.TicketID,
a.VENDOR_ENTERED
DB2 is not supported in SQLfiddle, but a standard SQL example can be found here.
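For anyone who wants to experiment, here is a rough sketch of the aggregate-then-join idea run against SQLite from Python. It is not the DB2 query itself: it starts from the TICKET table with left joins so that tickets without any VENDOR row still appear with nulls, it uses ISO timestamps so that text comparison matches time order, and ticket 1002 is a made-up extra row added only to show the no-VENDOR case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticket (ticketid INTEGER)")
conn.execute(
    "CREATE TABLE statushistory "
    "(id INTEGER, ticketid INTEGER, status TEXT, datetime TEXT)"
)
# Ticket 1002 is hypothetical: it never entered VENDOR status.
conn.executemany("INSERT INTO ticket VALUES (?)", [(1000,), (1001,), (1002,)])
conn.executemany(
    "INSERT INTO statushistory VALUES (?, ?, ?, ?)",
    [
        (1, 1000, "INPROG", "2017-01-01 10:00"),
        (2, 1000, "VENDOR", "2017-01-02 10:00"),
        (3, 1000, "INPROG", "2017-01-03 10:00"),
        (4, 1000, "VENDOR", "2017-01-04 10:00"),
        (5, 1000, "INPROG", "2017-01-05 10:00"),
        (6, 1000, "HOLD", "2017-01-06 10:00"),
        (7, 1000, "INPROG", "2017-01-07 10:00"),
        (8, 1001, "INPROG", "2017-02-02 10:00"),
        (9, 1001, "VENDOR", "2017-02-03 10:00"),
        (10, 1001, "INPROG", "2017-02-04 10:00"),
        (11, 1001, "VENDOR", "2017-02-05 10:00"),
        (12, 1002, "INPROG", "2017-03-01 10:00"),
    ],
)

# v: last VENDOR time per ticket; i: INPROG rows strictly after that time.
query = """
SELECT t.ticketid,
       v.vendor_entered,
       MIN(i.datetime) AS vendor_exited
FROM ticket AS t
LEFT JOIN (
    SELECT ticketid, MAX(datetime) AS vendor_entered
    FROM statushistory
    WHERE status = 'VENDOR'
    GROUP BY ticketid
) AS v ON v.ticketid = t.ticketid
LEFT JOIN statushistory AS i
       ON i.ticketid = t.ticketid
      AND i.status = 'INPROG'
      AND i.datetime > v.vendor_entered
GROUP BY t.ticketid, v.vendor_entered
ORDER BY t.ticketid
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

When vendor_entered is NULL, the comparison in the second join is never true, so both output columns stay NULL for that ticket.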

Record has multiple statuses that gets rechecked. Only want records that meet a certain criteria

I have a table like below
Account number | Last Name | Status | date
111 | doe | acknowledged | 04-11-2013
111 | doe | acknowledged | 05-01-2013
111 | doe | paid | 05-10-2013
123 | smith | acknowledged | 05-15-2013
123 | smith | acknowledged | 05-22-2013
145 | walter | paid | 05-23-2013
There are names and account numbers that hold the same information but just have different statuses and dates.
I am trying to get the most recent date and compare it to the current date. So for today I would compare doe with 5-10, smith with 5-22 and walter with 5-23...
Now an account can get rechecked multiple times, and it only stops getting rechecked once it has been paid. So smith would be the only one to get rechecked at a later date.
I want to find all of the records that have an acknowledged status. I do not want any that have been paid.
So far my code is able to get the max date for a record, but it brings back both the acknowledged and the paid record since they are distinct. I only want records that have not been paid and that are still acknowledged.
You seem to want the most recent acknowledged status for accounts that are not paid. Here is a query for that:
select AccountNumber, Name,
max(date)
from t
group by AccountNumber, Name
having sum(case when status = 'Paid' then 1 else 0 end) = 0
You only have the two different statuses in your question. If you had more and you wanted just the acknowledged max date, then you would do:
select AccountNumber, Name,
max(case when status = 'Acknowledged' then date end)
from t
group by AccountNumber, Name
having sum(case when status = 'Paid' then 1 else 0 end) = 0
To get the most recent status, you can use row_number() to enumerate the rows:
select AccountNumber, Name,
max(case when status = 'Acknowledged' then date end),
max(case when seqnum = 1 then status end) as MostRecentStatus
from (select t.*,
row_number() over (partition by AccountNumber, Name order by date desc) as seqnum
from t
) t
group by AccountNumber, Name
having sum(case when status = 'Paid' then 1 else 0 end) = 0
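The HAVING filter can be tried on the sample rows with the sketch below, which uses Python's sqlite3 module. The column names account_number and last_name are assumptions, and the dates are rewritten in ISO format so MAX() picks the latest one. The sample data stores statuses in lowercase and SQLite compares text case-sensitively by default, so the sketch lowercases the status before comparing; whether 'Paid' matches 'paid' in your database depends on its collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names are assumed; "t" is the table name used in the answer.
conn.execute(
    "CREATE TABLE t (account_number INTEGER, last_name TEXT, status TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (111, "doe", "acknowledged", "2013-04-11"),
        (111, "doe", "acknowledged", "2013-05-01"),
        (111, "doe", "paid", "2013-05-10"),
        (123, "smith", "acknowledged", "2013-05-15"),
        (123, "smith", "acknowledged", "2013-05-22"),
        (145, "walter", "paid", "2013-05-23"),
    ],
)

# Keep only accounts with zero paid rows; report their latest acknowledged date.
query = """
SELECT account_number,
       last_name,
       MAX(CASE WHEN LOWER(status) = 'acknowledged' THEN date END) AS last_ack
FROM t
GROUP BY account_number, last_name
HAVING SUM(CASE WHEN LOWER(status) = 'paid' THEN 1 ELSE 0 END) = 0
"""
unpaid = conn.execute(query).fetchall()
print(unpaid)
```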