SQL SELECT query conditional for multiple possible values - sql

I have the following data:
id | customer_id | status
1 | 1 | Shipped
2 | 1 | In Progress
3 | 1 | Cancelled
4 | 2 | Shipped
5 | 2 | In Progress
6 | 3 | Shipped
How do I do a SQL query to SELECT a row for each customer based on the status?
If the customer has a status of 'In Progress', then return only that in the results.
If the customer does not have a status of 'In Progress', but does have a status of 'Shipped', then return that instead.
So the results would be:
id | customer_id | status
2 | 1 | In Progress
5 | 2 | In Progress
6 | 3 | Shipped

One way to tackle this problem:
filter out any status other than 'Shipped' or 'In Progress' with a WHERE clause
use FIRST_VALUE, partitioned by "customer_id" and ordered by "id" descending, to get the latest "id" and "status" per customer
remove the resulting duplicate rows with DISTINCT
SELECT DISTINCT
FIRST_VALUE(id) OVER(PARTITION BY customer_id ORDER BY id DESC) AS id,
customer_id,
FIRST_VALUE(status_) OVER(PARTITION BY customer_id ORDER BY id DESC) AS status_
FROM tab
WHERE status_ IN ('Shipped', 'In Progress')
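This should work on most common DBMSs that support window functions. Note that it returns the row with the highest id among the two statuses, so it matches the expected output only when a customer's 'In Progress' row has a higher id than their 'Shipped' row, as in the sample data.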
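If a customer's 'In Progress' row is not guaranteed to have the highest id, the priority can be made explicit in the ordering. A minimal sketch, run through Python's sqlite3 module against an in-memory copy of the sample data; the table name tab follows the answer, while the column name status, the constant names and the function name pick_status_rows are assumptions made for illustration:

```python
import sqlite3

# Sample data from the question: (id, customer_id, status)
SAMPLE_ROWS = [
    (1, 1, "Shipped"),
    (2, 1, "In Progress"),
    (3, 1, "Cancelled"),
    (4, 2, "Shipped"),
    (5, 2, "In Progress"),
    (6, 3, "Shipped"),
]

# Rank rows per customer: 'In Progress' first, then 'Shipped';
# within the same status, the highest id wins.
PRIORITY_QUERY = """
SELECT id, customer_id, status
FROM (
    SELECT id, customer_id, status,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY CASE status WHEN 'In Progress' THEN 1 ELSE 2 END,
                        id DESC
           ) AS rn
    FROM tab
    WHERE status IN ('Shipped', 'In Progress')
) ranked
WHERE rn = 1
ORDER BY id
"""


def pick_status_rows(rows):
    """Load rows into an in-memory table and return one row per customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (id INTEGER, customer_id INTEGER, status TEXT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
    result = conn.execute(PRIORITY_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pick_status_rows(SAMPLE_ROWS):
        print(row)
```

The CASE expression in the ORDER BY is the only part that encodes the business rule, so further statuses can be added there without touching the rest of the query.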

A bit complex, as I need two common table expressions:
-- your input, don't use in final query
WITH
indata(id,customer_id,status) AS (
SELECT 1,1,'Shipped'
UNION ALL SELECT 2,1,'In Progress'
UNION ALL SELECT 3,1,'Cancelled'
UNION ALL SELECT 4,2,'Shipped'
UNION ALL SELECT 5,2,'In Progress'
UNION ALL SELECT 6,3,'Shipped'
)
-- real query starts here, replace following comma with "WITH" ...
,
w_rank AS (
SELECT
customer_id
, status
, CASE status
WHEN 'Shipped' THEN 1
WHEN 'In Progress' THEN 2
WHEN 'Cancelled' THEN 0
ELSE -1
END AS rnk
FROM indata
)
,
grp AS (
SELECT
customer_id
, MAX(rnk) AS rnk
FROM w_rank
GROUP BY
customer_id
)
SELECT
indata.id
, indata.customer_id
, indata.status
FROM grp
JOIN w_rank USING(customer_id,rnk)
JOIN indata USING(customer_id,status)
ORDER BY 1;
-- out id | customer_id | status
-- out ----+-------------+------------
-- out 2 | 1 | In Progress
-- out 5 | 2 | In Progress
-- out 6 | 3 | Shipped
Your DBMS may not support the USING() clause in joins; in that case, use the ON clause instead.
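For reference, a sketch of the same query with the two USING() joins written as ON clauses, run through Python's sqlite3 module. The CTE names w_rank and grp come from the answer; indata is created here as a real table, and the constant and function names are assumptions made for illustration:

```python
import sqlite3

# Sample data from the question: (id, customer_id, status)
INDATA_ROWS = [
    (1, 1, "Shipped"),
    (2, 1, "In Progress"),
    (3, 1, "Cancelled"),
    (4, 2, "Shipped"),
    (5, 2, "In Progress"),
    (6, 3, "Shipped"),
]

# Same logic as the answer, with each USING(...) spelled out as ON ...
QUERY_WITH_ON = """
WITH w_rank AS (
    SELECT customer_id, status,
           CASE status
               WHEN 'Shipped' THEN 1
               WHEN 'In Progress' THEN 2
               WHEN 'Cancelled' THEN 0
               ELSE -1
           END AS rnk
    FROM indata
),
grp AS (
    SELECT customer_id, MAX(rnk) AS rnk
    FROM w_rank
    GROUP BY customer_id
)
SELECT indata.id, indata.customer_id, indata.status
FROM grp
JOIN w_rank
  ON w_rank.customer_id = grp.customer_id
 AND w_rank.rnk = grp.rnk
JOIN indata
  ON indata.customer_id = w_rank.customer_id
 AND indata.status = w_rank.status
ORDER BY 1
"""


def run_on_join(rows):
    """Create indata in memory and run the ON-clause version of the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE indata (id INTEGER, customer_id INTEGER, status TEXT)")
    conn.executemany("INSERT INTO indata VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY_WITH_ON).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_on_join(INDATA_ROWS):
        print(row)
```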

Related

Partition Over issue in SQL

I have a Order shipment table like below -
Order_ID | shipment_id | pkg_weight
1 | 101 | 5
1 | 101 | 5
1 | 101 | 5
1 | 102 | 3
1 | 102 | 3
I want the output table to look like below -
Order_ID | Distinct_shipment_id | total_pkg_weight
1 | 2 | 8
select
order_id
, count(distinct shipment_id)
, avg(pkg_weight) over (partition by shipment_id)
from table1
group by order_id
but I am getting the error below:
column "pkg_weight" must appear in the GROUP BY clause or be used in an aggregate function
Please help
Use a distinct select first, then aggregate:
SELECT Order_ID,
COUNT(DISTINCT shipment_id) AS Distinct_shipment_id,
SUM(pkg_weight) AS total_pkg_weight
FROM
(
SELECT DISTINCT Order_ID, shipment_id, pkg_weight
FROM table1
) t
GROUP BY Order_ID;
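To see why the inner DISTINCT matters, here is a small sketch that runs the query above through Python's sqlite3 module next to a plain SUM over the raw rows. It assumes, as the sample data suggests, that pkg_weight is repeated once per row of a shipment; the constant and function names are illustrative only:

```python
import sqlite3

# Sample data from the question: (Order_ID, shipment_id, pkg_weight)
SHIPMENT_ROWS = [
    (1, 101, 5),
    (1, 101, 5),
    (1, 101, 5),
    (1, 102, 3),
    (1, 102, 3),
]

# The answer's query: de-duplicate first, then aggregate.
DEDUP_QUERY = """
SELECT Order_ID,
       COUNT(DISTINCT shipment_id) AS Distinct_shipment_id,
       SUM(pkg_weight) AS total_pkg_weight
FROM (
    SELECT DISTINCT Order_ID, shipment_id, pkg_weight
    FROM table1
) t
GROUP BY Order_ID
"""

# For comparison: summing the raw rows counts every repeated weight.
NAIVE_QUERY = "SELECT Order_ID, SUM(pkg_weight) FROM table1 GROUP BY Order_ID"


def summarize_shipments(rows):
    """Return (deduplicated result rows, naive result rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (Order_ID INTEGER, shipment_id INTEGER, pkg_weight INTEGER)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", rows)
    deduped = conn.execute(DEDUP_QUERY).fetchall()
    naive = conn.execute(NAIVE_QUERY).fetchall()
    conn.close()
    return deduped, naive


if __name__ == "__main__":
    deduped, naive = summarize_shipments(SHIPMENT_ROWS)
    print(deduped)  # one row per order, weights counted once per shipment
    print(naive)    # weights counted once per raw row
```

Note that DISTINCT would also merge two shipments' rows only if they shared the same shipment_id and weight, so distinct shipments with equal weights are still counted separately.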

How to merge two query results joining same date

Let's say there is a table with data like below:
id | status | date
1 | 4 | 2022-05
2 | 3 | 2022-06
I want to find the count of ids for each month, by status. Something like this:
date | count(status1) = 4 | count(status2) = 3
2022-05 | 1 | null
2022-06 | null | 1
I tried doing
-- select distinct (not working)
select date, status1, status2 from
(select date, count(id) as "status1" from myTable
where status = 4 group by date) as myTable1
join
(select date, count(id) as "status2" from myTable
where status = 3 group by date) as myTable2
on myTable1.date = myTable2.date;
-- group by (not working)
but it duplicates the data I need.
I am using SQL Server.
select d.date,
sum
(
case
when d.status=4 then 1
else 0
end
)count_status_4,
sum
(
case
when d.status=3 then 1
else 0
end
)count_status_3
from your_table as d
group by d.date
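A conditional sum returns 0 for a month with no matching rows, whereas the expected output in the question shows null. One possible way to get null is to wrap each sum in NULLIF(..., 0). A minimal sketch, run through Python's sqlite3 module; the table name myTable and the statuses 4 and 3 come from the question, the rest of the names are assumptions:

```python
import sqlite3

# Sample data from the question: (id, status, date)
STATUS_ROWS = [
    (1, 4, "2022-05"),
    (2, 3, "2022-06"),
]

# Conditional aggregation; NULLIF turns a count of 0 into NULL.
PIVOT_QUERY = """
SELECT d."date",
       NULLIF(SUM(CASE WHEN d.status = 4 THEN 1 ELSE 0 END), 0) AS count_status_4,
       NULLIF(SUM(CASE WHEN d.status = 3 THEN 1 ELSE 0 END), 0) AS count_status_3
FROM myTable AS d
GROUP BY d."date"
ORDER BY d."date"
"""


def count_by_status(rows):
    """Return one row per date with a count column for each status."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE myTable (id INTEGER, status INTEGER, "date" TEXT)')
    conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", rows)
    result = conn.execute(PIVOT_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in count_by_status(STATUS_ROWS):
        print(row)  # NULL shows up as None in Python
```

Because every date is read once and grouped once, no join between two subqueries is needed, which is what caused the mismatch in the original attempt.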

SQL Query to find the Row with first change of data

UniqueId | ITEM | DATE
1 | A | 2022-01-01
2 | A | 2022-01-02
3 | B | 2022-01-03
4 | B | 2022-01-04
5 | A | 2022-01-05
6 | A | 2022-01-06
7 | B | 2022-01-07
8 | B | 2022-01-08
9 | A | 2022-01-09
10 | A | 2022-01-10
11 | A | 2022-01-11
I have the table above, where the item changes from A to B and then from B to A (and so on).
The most recent item in the table, based on the date, is A (the last row).
I need to find the date on which this last item (A) came into effect.
So in this case item A has been in effect from 2022-01-09 onwards (UniqueId 9).
How can I find the UniqueId or the date at which item A came into effect (row 9)?
Thank you.
with data as (
select *,
last_value(item) over (order by "date" rows between unbounded preceding and unbounded following) as last_item,
lag(item) over (order by "date") as prev_item
from T
)
select
max(case when item = last_item and item <> prev_item then "date" end) as max_date
from data;
or
with data as (
select *,
case when item <> lag(item) over (order by "date")
and item = last_value(item) over (order by "date" rows between unbounded preceding and unbounded following)
then 1 end as flag
from T
)
select max("date") as last_transition_date
from data
where flag = 1;
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=bd5f6398c0167d74c26a67fafac5225e
Supposing you need all the data:
with data as (
select *,
case when item <> lag(item) over (order by "date")
and item = last_value(item) over (order by "date" rows between unbounded preceding and unbounded following)
then 1 end as flag
from T
)
select *,
max(case when flag = 1 then "date" end) over () as last_transition_date
from data;
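A caveat on last_value(): when the window has only an ORDER BY, the default frame ends at the current row, so last_value(item) returns the current row's item rather than the item of the final row. Extending the frame with ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING gives the overall last item. A minimal sketch contrasting the two, run through Python's sqlite3 module (window functions need SQLite 3.25 or later); the table name T follows the answer, and the constant and function names are assumptions:

```python
import sqlite3

# Sample data from the question: items A A B B A A B B A A A on successive days
ITEMS = "AABBAABBAAA"
ITEM_ROWS = [(i + 1, item, f"2022-01-{i + 1:02d}") for i, item in enumerate(ITEMS)]

# default_frame uses the implicit frame (up to the current row);
# full_frame spans the whole ordered set.
FRAME_QUERY = """
SELECT UniqueId,
       LAST_VALUE(item) OVER (ORDER BY "date") AS default_frame,
       LAST_VALUE(item) OVER (
           ORDER BY "date"
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS full_frame
FROM T
ORDER BY UniqueId
"""


def last_value_frames(rows):
    """Return (UniqueId, default_frame, full_frame) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE T (UniqueId INTEGER, item TEXT, "date" TEXT)')
    conn.executemany("INSERT INTO T VALUES (?, ?, ?)", rows)
    result = conn.execute(FRAME_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in last_value_frames(ITEM_ROWS):
        print(row)
```

With the default frame the comparison item = last_value(item) is true on every row, so the result of the queries is then decided by the lag() comparison alone.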
Getting a flag by comparing the current item with the previous item in time, using LAG(), is indeed the way.
But it is sufficient to take the highest date and the highest unique id (both are sorted ascending together) where that flag is 1:
WITH
-- your input
indata(UniqueId,ITEM,DATE) AS (
SELECT 1,'A',DATE '2022-01-01'
UNION ALL SELECT 2,'A',DATE '2022-01-02'
UNION ALL SELECT 3,'B',DATE '2022-01-03'
UNION ALL SELECT 4,'B',DATE '2022-01-04'
UNION ALL SELECT 5,'A',DATE '2022-01-05'
UNION ALL SELECT 6,'A',DATE '2022-01-06'
UNION ALL SELECT 7,'B',DATE '2022-01-07'
UNION ALL SELECT 8,'B',DATE '2022-01-08'
UNION ALL SELECT 9,'A',DATE '2022-01-09'
UNION ALL SELECT 10,'A',DATE '2022-01-10'
UNION ALL SELECT 11,'A',DATE '2022-01-11'
)
-- real query starts here; replace following comma with "WITH"
,
w_change_ind AS (
SELECT
*
, CASE WHEN LAG(item) OVER(ORDER BY date) <> item
THEN 1
ELSE 0
END AS chg_ind
FROM indata
)
SELECT
MAX(uniqueid) AS uqid
, MAX(date) AS dt
FROM w_change_ind
WHERE chg_ind=1
;
-- out uqid | dt
-- out ------+------------
-- out 9 | 2022-01-09
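If UniqueId and date cannot be assumed to increase together, an alternative is to return the last change row itself instead of two separate MAX() values. A minimal sketch, run through Python's sqlite3 module; it uses LIMIT 1, which SQL Server would write as TOP 1, and it treats the very first row as a change so that a table whose item never changes still returns a row. The table name T and the function name are assumptions:

```python
import sqlite3

# Sample data from the question: items A A B B A A B B A A A on successive days
CHANGE_ROWS = [
    (i + 1, item, f"2022-01-{i + 1:02d}") for i, item in enumerate("AABBAABBAAA")
]

# Keep rows whose item differs from the previous row's item (or that have
# no previous row), then take the most recent of those rows.
LAST_CHANGE_QUERY = """
SELECT UniqueId, item, "date"
FROM (
    SELECT UniqueId, item, "date",
           LAG(item) OVER (ORDER BY "date") AS prev_item
    FROM T
) flagged
WHERE prev_item IS NULL OR prev_item <> item
ORDER BY "date" DESC
LIMIT 1
"""


def last_change_row(rows):
    """Return the row at which the most recent item came into effect."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE T (UniqueId INTEGER, item TEXT, "date" TEXT)')
    conn.executemany("INSERT INTO T VALUES (?, ?, ?)", rows)
    result = conn.execute(LAST_CHANGE_QUERY).fetchone()
    conn.close()
    return result


if __name__ == "__main__":
    print(last_change_row(CHANGE_ROWS))
```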
Based on your description, this is one way to do what you want.
select top 1 * from table1
where item ='A'
order by uniqueid desc
If this is not what you want, then you will have to provide additional information.

PostgreSQL Pivot by Last Date

I need to make a PIVOT table from Source like this table
FactID UserID Date Product QTY
1 11 01/01/2020 A 600
2 11 02/01/2020 A 400
3 11 03/01/2020 B 500
4 11 04/01/2020 B 200
6 22 06/01/2020 A 1000
7 22 07/01/2020 A 200
8 22 08/01/2020 B 300
9 22 09/01/2020 B 100
I need a pivot like this, where each product's QTY is the QTY on the last date:
UserID A B
11 400 200
22 200 100
My attempt in PostgreSQL:
Select
UserID,
MAX(CASE WHEN Product='A' THEN 'QTY' END) AS 'A',
MAX(CASE WHEN Product='B' THEN 'QTY' END) AS 'B'
FROM table
GROUP BY UserID
And Result
UserID A B
11 600 500
22 1000 300
I mean I get a result by the maximum QTY and not by the maximum date!
What do I need to add to get results by the maximum (last) date ??
Postgres doesn't have "first" and "last" aggregation functions. One method for doing this (without a subquery) uses arrays:
select userid,
(array_agg(qty order by date desc) filter (where product = 'A'))[1] as a,
(array_agg(qty order by date desc) filter (where product = 'B'))[1] as b
from tab
group by userid;
Another method uses select distinct with first_value():
select distinct userid,
first_value(qty) over (partition by userid order by product = 'A' desc, date desc) as a,
first_value(qty) over (partition by userid order by product = 'B' desc, date desc) as b
from tab;
With the appropriate indexes, though, distinct on might be the fastest approach:
select userid,
max(qty) filter (where product = 'A') as a,
max(qty) filter (where product = 'B') as b
from (select distinct on (userid, product) t.*
from tab t
order by userid, product, date desc
) t
group by userid;
In particular, this can use an index on (userid, product, date desc). The improvement in performance will be most notable if there are many dates for a given user.
You can use the DENSE_RANK() window function to filter by the last date for each Product and UserID before applying conditional aggregation, such as
SELECT UserID,
MAX(CASE WHEN Product='A' THEN QTY END) AS "A",
MAX(CASE WHEN Product='B' THEN QTY END) AS "B"
FROM
(
SELECT t.*, DENSE_RANK() OVER (PARTITION BY Product,UserID ORDER BY Date DESC) AS rn
FROM tab t
) q
WHERE rn = 1
GROUP BY UserID
This presumes that all date values are distinct (no ties occur for dates).
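If dates can tie, one option is ROW_NUMBER() with an explicit tie-breaker, so that exactly one row survives per user and product. A minimal sketch, run through Python's sqlite3 module rather than PostgreSQL; the dates are stored as ISO strings, the source dates are read as day/month/year, and using FactID as the tie-breaker is an assumption made for illustration, as are the constant and function names:

```python
import sqlite3

# Sample data from the question: (FactID, UserID, Date, Product, QTY)
FACT_ROWS = [
    (1, 11, "2020-01-01", "A", 600),
    (2, 11, "2020-01-02", "A", 400),
    (3, 11, "2020-01-03", "B", 500),
    (4, 11, "2020-01-04", "B", 200),
    (6, 22, "2020-01-06", "A", 1000),
    (7, 22, "2020-01-07", "A", 200),
    (8, 22, "2020-01-08", "B", 300),
    (9, 22, "2020-01-09", "B", 100),
]

# Keep the latest row per (UserID, Product); ties on Date go to the
# highest FactID. Then pivot with conditional aggregation.
LAST_DATE_PIVOT = """
SELECT UserID,
       MAX(CASE WHEN Product = 'A' THEN QTY END) AS A,
       MAX(CASE WHEN Product = 'B' THEN QTY END) AS B
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY UserID, Product
               ORDER BY "Date" DESC, FactID DESC
           ) AS rn
    FROM tab t
) q
WHERE rn = 1
GROUP BY UserID
ORDER BY UserID
"""


def pivot_last_qty(rows):
    """Return (UserID, last QTY of A, last QTY of B) per user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE tab (FactID INTEGER, UserID INTEGER, "Date" TEXT, '
        "Product TEXT, QTY INTEGER)"
    )
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(LAST_DATE_PIVOT).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_last_qty(FACT_ROWS):
        print(row)
```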

Oracle check if any of multiple string exists in another table

I am new to Oracle. I need to fetch all the error codes from the comment field and then look them up in another table to find each code's type. Depending on the type, I have to give preference to a particular type and then write that error code and type to a CSV along with other columns. Below is how the data is stored:
TABLE 1 : COMMENTS_TABLE
id | comments
1 | Manually added (BPM001). Currency code does not exists(TECH23).
2 | Invalid counterparty (EXC001). Manually added (BPM002)
TABLE 2 : ERROR_CODES
id | error_code | error_type
1 | BPM001 | MAN
2 | EXC001 | EXC
3 | EXC002 | EXC
4 | BPM002 | MAN
I am able to get all the error codes using REGEXP_SUBSTR, but I am not sure how to check them against the other table and display only one depending on the type. For example, if the type is MAN, only that error code should be returned in the select clause.
I suggest defining a hierarchy of error types
in the ORDER BY of the FIRST function (KEEP ... DENSE_RANK FIRST) to pick the best fit.
Query 1:
SELECT c.id,
MAX (
ERROR_CODE)
KEEP (DENSE_RANK FIRST
ORDER BY CASE ERROR_TYPE WHEN 'MAN' THEN 1 WHEN 'EXC' THEN 2 END)
AS ERROR_CODE,
MAX (
ERROR_TYPE)
KEEP (DENSE_RANK FIRST
ORDER BY CASE ERROR_TYPE WHEN 'MAN' THEN 1 WHEN 'EXC' THEN 2 END)
AS ERROR_TYPE
FROM ERROR_CODES e
JOIN COMMENTS_TABLE c ON c.COMMENTS LIKE '%' || e.ERROR_CODE || '%'
GROUP BY c.id
Results:
| ID | ERROR_CODE | ERROR_TYPE |
|----|------------|------------|
| 1 | BPM001 | MAN |
| 2 | BPM002 | MAN |
EDIT : You said in your comments
This is helpful, but I have multiple fields in select clause and adding
that in group by could be a problem
One option could be to use a WITH clause to define this result set and then join with other columns.
with res as
(
select ...
--query1
)
select t.other_columns, r.id, r.error_code ...
from other_table join res on ...
Alternatively, you may use row_number() (which was actually my original answer, but I changed it to KEEP .. DENSE_RANK as it is more efficient).
SELECT * FROM
( SELECT c.id
,ERROR_CODE
,ERROR_TYPE
--Other columns,
,row_number() OVER (
PARTITION BY c.id ORDER BY CASE error_type
WHEN 'MAN'
THEN 1
WHEN 'EXC'
THEN 2
ELSE 3
END
) AS rn
FROM ERROR_CODES e
INNER JOIN COMMENTS_TABLE c
ON c.COMMENTS LIKE '%' || e.ERROR_CODE || '%'
) WHERE rn = 1;
You can sort, prioritize and filter records with analytic functions.
with comments as(
select 1 as id
,'Manually added (BPM001). Currency code does not exists(TECH23).' as comments
from dual union all
select 2 as id
,'Invalid counterparty (EXC001). Manually added (BPM002)' as comments
from dual
)
,error_codes as(
select 1 as id, 'BPM001' as error_code, 'MAN' as error_type from dual union all
select 2 as id, 'EXC001' as error_code, 'EXC' as error_type from dual union all
select 3 as id, 'EXC002' as error_code, 'EXC' as error_type from dual union all
select 4 as id, 'BPM002' as error_code, 'MAN' as error_type from dual
)
-- Everything above this line is not part of the query. Just for generating test data
select *
from (select c.id as comment_id
,c.comments
,e.error_code
,row_number() over(
partition by c.id -- For each comment
order by case error_type when 'MAN' then 1 -- First prio
when 'EXC' then 2 -- Second prio
else 3 -- Everything else
end) as rn
from comments c
join error_codes e on(
e.error_code = regexp_substr(c.comments, e.error_code)
)
)
where rn = 1 -- Pick the highest priority code
/
If you could add a priority column to your error codes (or even to error_type), you could skip the case/when logic in the order by and simply replace it with the priority column.
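A minimal sketch of that priority-column idea, run through Python's sqlite3 module rather than Oracle. The priority column, its values (1 for MAN, 2 for EXC) and the function name are hypothetical; the join uses LIKE with || as in the first answer:

```python
import sqlite3

# Sample data from the question
COMMENT_ROWS = [
    (1, "Manually added (BPM001). Currency code does not exists(TECH23)."),
    (2, "Invalid counterparty (EXC001). Manually added (BPM002)"),
]

# (id, error_code, error_type, priority); priority is a hypothetical column
ERROR_CODE_ROWS = [
    (1, "BPM001", "MAN", 1),
    (2, "EXC001", "EXC", 2),
    (3, "EXC002", "EXC", 2),
    (4, "BPM002", "MAN", 1),
]

# Order by the stored priority instead of a CASE expression.
PRIORITY_COLUMN_QUERY = """
SELECT comment_id, error_code, error_type
FROM (
    SELECT c.id AS comment_id, e.error_code, e.error_type,
           ROW_NUMBER() OVER (
               PARTITION BY c.id
               ORDER BY e.priority, e.error_code
           ) AS rn
    FROM comments_table c
    JOIN error_codes e
      ON c.comments LIKE '%' || e.error_code || '%'
) ranked
WHERE rn = 1
ORDER BY comment_id
"""


def best_error_per_comment(comments, codes):
    """Return the highest-priority (comment_id, error_code, error_type)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments_table (id INTEGER, comments TEXT)")
    conn.execute(
        "CREATE TABLE error_codes (id INTEGER, error_code TEXT, "
        "error_type TEXT, priority INTEGER)"
    )
    conn.executemany("INSERT INTO comments_table VALUES (?, ?)", comments)
    conn.executemany("INSERT INTO error_codes VALUES (?, ?, ?, ?)", codes)
    result = conn.execute(PRIORITY_COLUMN_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in best_error_per_comment(COMMENT_ROWS, ERROR_CODE_ROWS):
        print(row)
```

Codes that appear in a comment but not in error_codes (TECH23 in the sample) are simply ignored by the inner join.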