Find a row that has passed different steps in the same table - sql

Hi all, here is my problem:
I have a history table in which each record registers a step for an item:
Id  Step       Item Code
1   Created    112345
2   Approved   112345
3   Completed  112345
4   Closed     112345
5   Created    112346
6   Approved   112346
8   Closed     112346
What I want to find in this table is:
All the item codes that have passed one step (for example, Approved) where the next step is not the "natural" one (for example, Completed). In the example table, item code 112346 has passed the Approved step but has skipped the Completed step.
Is there any way to write a query like this? I've used PARTITION BY to group the rows of each item, but I am unable to continue the query.
Thanks in advance for any help or suggestion.

You can use the LEAD analytic function to check if the next step is not the expected one:
SELECT id, step, item_code
FROM (
SELECT t.*,
LEAD(step) OVER (PARTITION BY item_code ORDER BY id) AS next_step
FROM table_name t
)
WHERE step = 'Approved'
AND (next_step IS NULL OR next_step != 'Completed')
Or, from Oracle 12, you can use MATCH_RECOGNIZE to perform row-by-row processing:
SELECT id, step, item_code
FROM table_name
MATCH_RECOGNIZE(
PARTITION BY item_code
ORDER BY id
ALL ROWS PER MATCH
PATTERN ( approved {- (not_completed|$) -} )
DEFINE
approved AS step = 'Approved',
not_completed AS step <> 'Completed'
)
Which, for the sample data:
CREATE TABLE table_name (Id, Step, Item_Code) AS
SELECT 1, 'Created', 112345 FROM DUAL UNION ALL
SELECT 2, 'Approved', 112345 FROM DUAL UNION ALL
SELECT 3, 'Completed', 112345 FROM DUAL UNION ALL
SELECT 4, 'Closed', 112345 FROM DUAL UNION ALL
SELECT 5, 'Created', 112346 FROM DUAL UNION ALL
SELECT 6, 'Approved', 112346 FROM DUAL UNION ALL
SELECT 8, 'Closed', 112346 FROM DUAL;
Both output:
ID  STEP      ITEM_CODE
6   Approved  112346
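For readers who want to see the LEAD logic outside the database, here is a minimal Python sketch of the same idea: pair each row with the next row of the same item (ordered by id) and keep the rows at the given step whose successor is missing or is not the expected one. The function name skipped_after is hypothetical and the rows are the sample data from the question.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (id, step, item_code)
rows = [
    (1, "Created", 112345),
    (2, "Approved", 112345),
    (3, "Completed", 112345),
    (4, "Closed", 112345),
    (5, "Created", 112346),
    (6, "Approved", 112346),
    (8, "Closed", 112346),
]


def skipped_after(rows, step, expected_next):
    """Return rows at `step` whose next row (same item, by id) is not `expected_next`."""
    hits = []
    # Sort by item code, then id: the equivalent of PARTITION BY item_code ORDER BY id.
    ordered = sorted(rows, key=itemgetter(2, 0))
    for _, group in groupby(ordered, key=itemgetter(2)):
        group = list(group)
        # Pair every row with its successor; the last row gets None, like LEAD's NULL.
        for current, nxt in zip(group, group[1:] + [None]):
            if current[1] == step and (nxt is None or nxt[1] != expected_next):
                hits.append(current)
    return hits


print(skipped_after(rows, "Approved", "Completed"))
```

As in the SQL, an Approved row that is the last row of its item is also reported, because it has no next step at all.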

Provided you have a table of step names with an integer ordering column (seqno), you can compare each step's sequence number with that of the next step:
with
/* Sample data */
steps(seqno, Step) as (
select 1, 'Created' from dual union all
select 2, 'Approved' from dual union all
select 3, 'Completed' from dual union all
select 4, 'Closed' from dual
),
tbl(Id, Step, ItemCode) as (
select 1, 'Created' , 112345 from dual union all
select 2, 'Approved' , 112345 from dual union all
select 3, 'Completed', 112345 from dual union all
select 4, 'Closed' , 112345 from dual union all
select 5, 'Created' , 112346 from dual union all
select 6, 'Approved' , 112346 from dual union all
select 8, 'Closed' , 112346 from dual
),
/* Find steps with the next step being out of order */
m as (
select max(seqno)+1 term
from steps
)
select Id, Step, ItemCode
from (
SELECT t.*, s.seqno, lead(s.seqno, 1, (select term from m)) over(partition by ItemCode order by Id) nextSeqno
FROM tbl t
JOIN steps s on s.Step = t.Step
) q
cross join m
where nextseqno != Seqno + 1 and nextseqno < m.term

Related

Oracle subselect case

So, there are 2 tables
Table1 (contains all articles, center, date 00000)
Tabla2 (contains handwritten articles (that are also in Table1), center, date)
We have a procedure that compares the article and center of Table1 and Tabla2 every day and, if they match, an update changes the Table1 date for that article and center.
Now we also want to handle one more case: when center is ' ' (empty) in Tabla2, the update should apply to every center that has that article in Table1.
Here is the OracleSQL:
update Table1 r
set date1= (SELECT max(date2) FROM Tabla2 t
where t.articulo = r.articulo
and t.center = to_char(center) -- It gets the center from a select behind
and t.date2 >= to_char(sysdate,'yyyymmdd')
group by t.center);
We want both cases to work:
If center holds a real center such as 20, only center 20 is updated.
If center is empty ('') then every center with that article is updated.
You can use:
UPDATE Table1 r
SET date1 = ( SELECT MAX(date2)
KEEP (DENSE_RANK FIRST ORDER BY t.center NULLS LAST)
FROM Tabla2 t
WHERE t.articulo = r.articulo
AND (t.center = r.center OR t.center IS NULL)
AND t.date2 >= TRUNC(sysdate)
);
Note: KEEP (DENSE_RANK FIRST ORDER BY t.center NULLS LAST) is used to prefer dates from a row with a non-NULL center over rows with a NULL center.
Which, if you have the sample data:
CREATE TABLE table1 (articulo, center, date1) AS
SELECT 1, 1, CAST(NULL AS DATE) FROM DUAL UNION ALL
SELECT 2, 2, NULL FROM DUAL UNION ALL
SELECT 3, 3, NULL FROM DUAL UNION ALL
SELECT 4, 4, NULL FROM DUAL;
CREATE TABLE tabla2 (articulo, center, date2) AS
SELECT 1, 1, DATE '2023-05-19' FROM DUAL UNION ALL
SELECT 2, 2, DATE '2023-01-01' FROM DUAL UNION ALL
SELECT 2, 2, DATE '2023-05-19' FROM DUAL UNION ALL
SELECT 3, 3, DATE '2023-01-01' FROM DUAL UNION ALL
SELECT 3, NULL, DATE '2023-05-19' FROM DUAL UNION ALL
SELECT 4, NULL, DATE '2023-01-01' FROM DUAL UNION ALL
SELECT 4, NULL, DATE '2023-05-19' FROM DUAL;
Then, after the update Table1 contains:
ARTICULO  CENTER  DATE1
1         1       2023-05-19 00:00:00
2         2       2023-05-19 00:00:00
3         3       2023-01-01 00:00:00
4         4       2023-05-19 00:00:00
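The preference rule behind that result can be sketched in Python: take the latest date among rows whose center matches exactly, and only fall back to rows with a NULL center when there is no exact match. This is an illustration under assumptions, not the Oracle implementation: pick_date is a hypothetical name, None stands in for NULL, and the date2 >= TRUNC(sysdate) filter is left out so the sample dates stay visible.

```python
from datetime import date

# Sample rows of Tabla2 from the answer: (articulo, center, date2); None models NULL.
tabla2 = [
    (1, 1, date(2023, 5, 19)),
    (2, 2, date(2023, 1, 1)),
    (2, 2, date(2023, 5, 19)),
    (3, 3, date(2023, 1, 1)),
    (3, None, date(2023, 5, 19)),
    (4, None, date(2023, 1, 1)),
    (4, None, date(2023, 5, 19)),
]


def pick_date(articulo, center, rows):
    """Latest date among exact-center rows; NULL-center rows are used only as a fallback."""
    exact = [d for a, c, d in rows if a == articulo and c == center]
    wildcard = [d for a, c, d in rows if a == articulo and c is None]
    candidates = exact or wildcard
    return max(candidates) if candidates else None


for articulo, center in [(1, 1), (2, 2), (3, 3), (4, 4)]:
    print(articulo, center, pick_date(articulo, center, tabla2))
```

Article 3 shows the point of the ordering: its NULL-center row has the later date, but the exact match on center 3 wins.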
I would use something like this:
and t.center = nvl(to_char(center),t.center)
If the center is populated, it will use that value. If center is null, the nvl will instead result in the value of t.center, so the condition becomes t.center = t.center, which is true for every row where t.center is not NULL (a NULL t.center still fails the comparison).

How to query data which is not unique up to a certain point?

Basically the current conditions of the query are
WHERE data_payload_uri BETWEEN
'/organization/team/folder/2021'
AND
'/organization/team/folder/2022'
And this gets all data for the year of 2021.
A sample of the data_payload_uri data looks like this:
/organization/team/folder/20210101/orig
/organization/team/folder/20210102/orig
/organization/team/folder/20210102/orig_v1
/organization/team/folder/20210103/orig
/organization/team/folder/20210104/orig
/organization/team/folder/20210105/orig
/organization/team/folder/20210105/orig_v1
/organization/team/folder/20210105/orig_v2
What I would like to do is only query the rows where up until the last forward-slash, the row is NOT unique.
What this means, is I want to NOT query the rows which ONLY have one orig
/organization/team/folder/20210101/orig
/organization/team/folder/20210103/orig
/organization/team/folder/20210104/orig
but I DO want to query all the other rows
/organization/team/folder/20210105/orig
/organization/team/folder/20210105/orig_v1
/organization/team/folder/20210105/orig_v2
/organization/team/folder/20210102/orig
/organization/team/folder/20210102/orig_v1
What is the best way to do this? Please let me know if anything is unclear, and thank you for any help.
You can use the analytic COUNT function:
SELECT *
FROM (
SELECT t.*,
COUNT(DISTINCT data_payload_uri) OVER (
PARTITION BY SUBSTR(data_payload_uri, 1, INSTR(data_payload_uri, '/', -1))
) AS cnt
FROM table_name t
WHERE data_payload_uri >= '/organization/team/folder/2021'
AND data_payload_uri < '/organization/team/folder/2022'
)
WHERE cnt > 1
Which, for the sample data:
CREATE TABLE table_name (id, data_payload_uri) AS
SELECT 1, '/organization/team/folder/20210101/orig' FROM DUAL UNION ALL
SELECT 2, '/organization/team/folder/20210102/orig' FROM DUAL UNION ALL
SELECT 3, '/organization/team/folder/20210102/orig_v1' FROM DUAL UNION ALL
SELECT 4, '/organization/team/folder/20210103/orig' FROM DUAL UNION ALL
SELECT 5, '/organization/team/folder/20210104/orig' FROM DUAL UNION ALL
SELECT 6, '/organization/team/folder/20210105/orig' FROM DUAL UNION ALL
SELECT 7, '/organization/team/folder/20210105/orig_v1' FROM DUAL UNION ALL
SELECT 8, '/organization/team/folder/20210105/orig_v2' FROM DUAL;
Outputs:
ID  DATA_PAYLOAD_URI                            CNT
2   /organization/team/folder/20210102/orig     2
3   /organization/team/folder/20210102/orig_v1  2
6   /organization/team/folder/20210105/orig     3
7   /organization/team/folder/20210105/orig_v1  3
8   /organization/team/folder/20210105/orig_v2  3
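The same grouping can be sketched in Python to make the partition key explicit: everything up to and including the last forward slash is the key, and a row is kept only when its key is shared by more than one distinct URI. The function name non_unique_by_folder is hypothetical; the URIs are the sample data from the question.

```python
from collections import defaultdict

uris = [
    "/organization/team/folder/20210101/orig",
    "/organization/team/folder/20210102/orig",
    "/organization/team/folder/20210102/orig_v1",
    "/organization/team/folder/20210103/orig",
    "/organization/team/folder/20210104/orig",
    "/organization/team/folder/20210105/orig",
    "/organization/team/folder/20210105/orig_v1",
    "/organization/team/folder/20210105/orig_v2",
]


def folder_of(uri):
    """Prefix up to and including the last '/', like SUBSTR(uri, 1, INSTR(uri, '/', -1))."""
    return uri[: uri.rfind("/") + 1]


def non_unique_by_folder(uris):
    """Keep URIs whose folder prefix is shared by more than one distinct URI."""
    groups = defaultdict(set)
    for uri in uris:
        groups[folder_of(uri)].add(uri)
    return [uri for uri in uris if len(groups[folder_of(uri)]) > 1]


for uri in non_unique_by_folder(uris):
    print(uri)
```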

How to apply: count(distinct ...) over (partition by ... order by) in big query?

I currently have this source table.
I am trying to get this second table from the first table, in SQL on GCP BigQuery.
My Query is the following :
SELECT
SE.MARKET_ID,
SE.LOCAL_POS_ID,
SE.BC_ID,
LEFT(SE.SALE_CREATION_DATE,6) AS DATE_ID_MONTH,
COUNT(DISTINCT
CASE
WHEN FLAG
THEN SE.CUST_ID
END)
OVER (PARTITION BY SE.MARKET_ID, SE.LOCAL_POS_ID, SE.BC_ID, LEFT(SE.SALE_CREATION_DATE,4) ORDER BY LEFT(SE.SALE_CREATION_DATE,6)) AS NB_ACTIVE_CUSTOMERS
FROM
SE
GROUP BY
SE.MARKET_ID, SE.LOCAL_POS_ID, SE.BC_ID, LEFT(SE.SALE_CREATION_DATE,6)
However, I get this error, which I have not managed to work around:
Window ORDER BY is not allowed if DISTINCT is specified at [12:107]
I can't first create an intermediate table with the following query:
SELECT DISTINCT
SE.MARKET_ID,
SE.LOCAL_POS_ID,
SE.BC_ID,
LEFT(SE.SALE_CREATION_DATE,6) AS DATE_ID_MONTH,
CASE
WHEN FLAG
THEN SE.CUST_ID
ELSE NULL
END AS VALID_CUST_ID
FROM
SE
in order to use dense_rank() afterwards, because I have 50 other indicators (and 500M rows) to add to this table (indicators based on other flags), and I obviously can't create a WITH for each of them. I need to do it in only a few WITH clauses or none (exactly as my current query is supposed to do).
Does anyone have a clue how I can handle this, please?
Consider below approach
select * except(ids),
array_length(array(
select distinct id
from unnest(split(ids)) id
)) as nb_active_customers,
format('%t', array(
select distinct id
from unnest(split(ids)) id
)) as distinct_values
from (
select market_id, local_pos_id, bc_id, date_id_month,
string_agg('' || ids) over(partition by market_id order by date_id_month) ids
from (
select market_id, local_pos_id, bc_id, left(sale_creation_date,6) AS date_id_month,
string_agg('' || cust_id) ids
from se
where flag = 1
group by market_id, local_pos_id, bc_id, date_id_month
)
) t
if applied to sample data in your question - output is
I think some of your sample data is incorrect but I did play with it and get a matching result, for the MPE data at least. You can accomplish this by first tagging the "distinctly counted" rows with an extra partition on CUST_ID and then first ordering on FLAG DESC. Then you would sum over that in the same way you hoped to apply count(distinct <expr>) over ...
WITH SE AS (
SELECT 1 LINE_ID, 'TW' MARKET_ID, 'X' LOCAL_POS_ID, 'MPE' BC_ID,
1 CUST_ID, '20200201' SALE_CREATION_DATE, 1 FLAG UNION ALL
SELECT 2, 'TW', 'X', 'MPE', 2, '20201005', 1 UNION ALL
SELECT 3, 'TW', 'X', 'MPE', 3, '20200415', 0 UNION ALL
SELECT 4, 'TW', 'X', 'MPE', 1, '20200223', 1 UNION ALL
SELECT 5, 'TW', 'X', 'MPE', 6, '20200217', 1 UNION ALL
SELECT 6, 'TW', 'X', 'MPE', 9, '20200715', 1 UNION ALL
SELECT 7, 'TW', 'X', 'MPE', 4, '20200223', 1 UNION ALL
SELECT 8, 'TW', 'X', 'MPE', 1, '20201008', 1 UNION ALL
SELECT 9, 'TW', 'X', 'MPE', 2, '20201019', 1 UNION ALL
SELECT 10, 'TW', 'X', 'MPE', 1, '20200516', 1 UNION ALL
SELECT 11, 'TW', 'X', 'MPE', 1, '20200129', 1 UNION ALL
SELECT 12, 'TW', 'X', 'MPE', 1, '20201007', 1 UNION ALL
SELECT 13, 'TW', 'X', 'MPE', 2, '20201005', 1 UNION ALL
SELECT 14, 'TW', 'X', 'MPE', 3, '20200505', 1 UNION ALL
SELECT 15, 'TW', 'X', 'MPE', 8, '20201103', 1 UNION ALL
SELECT 16, 'TW', 'X', 'MPE', 9, '20200820', 1
),
DATA AS (
SELECT *,
LEFT(SALE_CREATION_DATE, 6) AS SALE_MONTH,
LEFT(SALE_CREATION_DATE, 4) AS SALE_YEAR,
CASE ROW_NUMBER() OVER (
PARTITION BY MARKET_ID, LOCAL_POS_ID, BC_ID,
LEFT(SALE_CREATION_DATE, 4), CUST_ID
ORDER BY FLAG DESC, LEFT(SALE_CREATION_DATE, 6)
) WHEN 1 THEN FLAG END AS COUNTER /* assumes possible to have no flagged row */
FROM SE
)
SELECT MARKET_ID, LOCAL_POS_ID, BC_ID, SALE_MONTH,
SUM(SUM(COUNTER)) OVER (
PARTITION BY MARKET_ID, LOCAL_POS_ID, BC_ID, SALE_YEAR
ORDER BY SALE_MONTH
) AS NB_ACTIVE_CUSTOMERS
FROM DATA
GROUP BY MARKET_ID, LOCAL_POS_ID, BC_ID, SALE_YEAR, SALE_MONTH
ORDER BY MARKET_ID, LOCAL_POS_ID, BC_ID, SALE_YEAR, SALE_MONTH
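To check what a running distinct count should return for this sample, here is a small Python sketch that computes it directly: collect the flagged customers per month, then walk the months in order and accumulate a set that resets at each new year. It assumes a single MARKET_ID / LOCAL_POS_ID / BC_ID combination (as in the sample), and running_distinct is a hypothetical name.

```python
# (cust_id, sale_creation_date, flag) for the 16 sample rows above
rows = [
    (1, "20200201", 1), (2, "20201005", 1), (3, "20200415", 0), (1, "20200223", 1),
    (6, "20200217", 1), (9, "20200715", 1), (4, "20200223", 1), (1, "20201008", 1),
    (2, "20201019", 1), (1, "20200516", 1), (1, "20200129", 1), (1, "20201007", 1),
    (2, "20201005", 1), (3, "20200505", 1), (8, "20201103", 1), (9, "20200820", 1),
]


def running_distinct(rows):
    """Return [(yyyymm, distinct flagged customers seen so far in that year)]."""
    by_month = {}
    for cust_id, day, flag in rows:
        month = day[:6]
        # Every month gets an entry, even if none of its rows are flagged.
        by_month.setdefault(month, set())
        if flag:
            by_month[month].add(cust_id)

    seen, year, out = set(), None, []
    for month in sorted(by_month):
        if month[:4] != year:  # new year: restart the running set
            seen, year = set(), month[:4]
        seen |= by_month[month]
        out.append((month, len(seen)))
    return out


for month, count in running_distinct(rows):
    print(month, count)
```

Month 202004 contains only an unflagged sale, so it carries the previous count forward, which is also what the running SUM over a NULL monthly total does in the query.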

Convert a series of Number values in Text in Oracle SQL Query

In the Oracle database, I have string values (VARCHAR2) like 1,4,7,8. The numbers represent: 1=car, 2=bus, 3=BB, 4=SB, 5=Ba, 6=PA, 7=HB, and 8=G,
and I want to convert the example above to "car,SB,HB,G" in my query results.
I tried to use "Decode" but it does not work. Please advise how to make it work. I would appreciate it.
Thanks
Initially, I have used the following query:
Select Clientid as C#, vehicletypeExclusions as vehicle from
clients
A sample of the results:
C# Vehicle
20 1,19,20,23,24,7,5
22 1,19,20,23,24,7,5
I also tried the following that gives me the null value of vehicles:
Select Clientid as C#, Decode (VEHICLETYPEEXCLUSIONS, '1', 'car',
'3','bus', '5','ba' ,'7','HB', '8','G'
, '9','LED1102', '10','LED1104', '13','LED8-2',
'14','Flip4-12', '17','StAT1003', '19','Taxi-Min', '20','Tax_Sed',
'21','Sup-veh' , '22','T-DATS', '23','T-Mini',
'24','T-WAM') as vehicle_Ex from clients
Here's one option. Read comments within code. Sample data in lines #1 - 13; query begins at line #14.
SQL> with
2 expl (id, name) as
3 (select 1, 'car' from dual union all
4 select 2, 'bus' from dual union all
5 select 3, 'BB' from dual union all
6 select 4, 'SB' from dual union all
7 select 5, 'Ba' from dual union all
8 select 6, 'PA' from dual union all
9 select 7, 'HB' from dual union all
10 select 8, 'G' from dual
11 ),
12 temp (col) as
13 (select '1,4,7,8' from dual),
14 -- split COL to rows
15 spl as
16 (select regexp_substr(col, '[^,]+', 1, level) val,
17 level lvl
18 from temp
19 connect by level <= regexp_count(col, ',') + 1
20 )
21 -- join SPL with EXPL; aggregate the result
22 select listagg(e.name, ',') within group (order by s.lvl) result
23 from expl e join spl s on s.val = e.id;
RESULT
--------------------------------------------------------------------------------
car,SB,HB,G
SQL>
Using the function f_subst from https://stackoverflow.com/a/68537479/429100 :
create or replace
function f_subst(str varchar2, template varchar2, subst sys.odcivarchar2list) return varchar2
as
res varchar2(32767):=str;
begin
for i in 1..subst.count loop
res:=replace(res, replace(template,'%d',i), subst(i));
end loop;
return res;
end;
/
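I've replaced ora_name_list_t (nested table) with sys.odcivarchar2list (varray) to make this example easier, but I would suggest creating your own collection, for example create type varchar2_table as table of varchar2(4000);
Note that the loop replaces substrings, so it is only safe while no code is a substring of another: with codes above 9, the 1 inside 19 would also be replaced.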
Example:
select
f_subst(
'1,4,7,8'
,'%d'
,sys.odcivarchar2list('car','bus','BB','SB','Ba','PA','HB','G')
) s
from dual;
S
----------------------------------------
car,SB,HB,G
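Since the question's real data contains two-digit codes such as 19 and 20, it may help to see how substring replacement differs from replacing whole comma-separated tokens. The Python sketch below is only an illustration: both function names are hypothetical, the input is assumed to be a well-formed list of integers, and codes without a known name are left unchanged.

```python
names = ["car", "bus", "BB", "SB", "Ba", "PA", "HB", "G"]  # code i maps to names[i - 1]


def substitute_replace(csv, names):
    """Mimics a loop of successive substring REPLACE calls, one per code."""
    res = csv
    for i, name in enumerate(names, start=1):
        res = res.replace(str(i), name)
    return res


def substitute_tokens(csv, names):
    """Splits on commas and translates whole tokens only."""
    out = []
    for token in csv.split(","):
        code = int(token)
        # Codes outside the lookup range are kept as they are.
        out.append(names[code - 1] if 1 <= code <= len(names) else token)
    return ",".join(out)


print(substitute_replace("1,4,7,8", names))
print(substitute_tokens("1,4,7,8", names))
print(substitute_replace("1,19,7", names))
print(substitute_tokens("1,19,7", names))
```

Both approaches agree on 1,4,7,8, but on 1,19,7 the substring version rewrites the 1 inside 19 while the token version leaves 19 intact.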
Assume you have a lookup table (associating the numeric codes with descriptions) and a table of input strings, which I called sample_inputs in my tests, as shown below:
create table lookup (code, descr) as
select 1, 'car' from dual union all
select 2, 'bus' from dual union all
select 3, 'BB' from dual union all
select 4, 'SB' from dual union all
select 5, 'Ba' from dual union all
select 6, 'PA' from dual union all
select 7, 'HB' from dual union all
select 8, 'G' from dual
;
create table sample_inputs (str) as
select '1,4,7,8' from dual union all
select null from dual union all
select '3' from dual union all
select '5,5,5' from dual union all
select '6,2,8' from dual
;
One strategy for solving your problem is to split the input - slightly modified to make it a JSON array, so that we can use json_table to split it - then join to the lookup table and re-aggregate.
select s.str, l.descr_list
from sample_inputs s cross join lateral
( select listagg(descr, ',') within group (order by ord) as descr_list
from json_table( '[' || str || ']', '$[*]'
columns code number path '$', ord for ordinality)
join lookup l using (code)
) l
;
STR DESCR_LIST
------- ------------------------------
1,4,7,8 car,SB,HB,G
3 BB
5,5,5 Ba,Ba,Ba
6,2,8 PA,bus,G

Rank function with complex scenario

I have a table TABLE1 with the columns ID, Status and Code.
I want the output to be based on code priority. The priority is
SS -> RR -> TT -> AA (this priority is not stored in any table).
The query should first look for the Approved status and then check the Code column.
Example 1:
ID 2345: this ID has Approved status for all of its codes (SS, AA and RR), so based on the code priority SS should be returned: 2345, SS
Example 2:
ID 3333: this ID has Approved status for all of its codes (RR and TT), so based on the code priority RR should be returned: 3333, RR
ID 4444: even though this ID has the codes SS and RR, their status is TERMED, so we need to take the next priority in the list and the output should be 4444, TT
ID 5555: none of the rows for this ID has Approved status; all of them are Termed, so based on the priority 5555, SS should be picked.
So the output for 2345 and 5555 has the same code; the only difference is that we fall back to Termed rows only when none of the rows has Approved status. If an ID has only Termed rows, the row should be picked based on the code priority.
Attached the picture for reference
You may use RANK along with a CASE expression for ordering:
WITH cte AS (
SELECT t.*,
RANK() OVER (PARTITION BY ID
ORDER BY CASE Status WHEN 'Approved' THEN 1
WHEN 'Termed' THEN 2
ELSE 3 END,
CASE Code WHEN 'SS' THEN 1
WHEN 'RR' THEN 2
WHEN 'TT' THEN 3
WHEN 'AA' THEN 4 END) rnk
FROM yourTable t
-- no filter on Status here: IDs that have only Termed rows must still be ranked
)
SELECT ID, Code
FROM cte
WHERE rnk = 1;
create table table1 (id, status, code) as
select 2345, 'Approved', 'SS' from dual union all
select 2345, 'Approved', 'AA' from dual union all
select 2345, 'Approved', 'RR' from dual union all
select 3333, 'Approved', 'RR' from dual union all
select 3333, 'Approved', 'TT' from dual union all
select 4444, 'TERMED', 'SS' from dual union all
select 4444, 'TERMED', 'RR' from dual union all
select 4444, 'Approved', 'TT' from dual
;
select ID, CODE
from (
select ID, STATUS, CODE
, row_number()over(
partition by ID
order by status
, decode(code, 'SS', 1, 'RR', 2, 'TT', 3, 'AA', 4) ) rank
from table1
)
where rank = 1
;
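The ordering used by both answers (status first, then code priority) can be sketched in Python as a tuple sort key. This is a sketch under assumptions: pick_code is a hypothetical name, status is compared case-insensitively, and the two rows for ID 5555 are made up here to cover the "only Termed" case described in the question, since the sample table above does not include them.

```python
CODE_PRIORITY = {"SS": 1, "RR": 2, "TT": 3, "AA": 4}

rows = [
    (2345, "Approved", "SS"), (2345, "Approved", "AA"), (2345, "Approved", "RR"),
    (3333, "Approved", "RR"), (3333, "Approved", "TT"),
    (4444, "TERMED", "SS"), (4444, "TERMED", "RR"), (4444, "Approved", "TT"),
    # Assumed rows, not in the sample table: an ID with only Termed rows.
    (5555, "TERMED", "SS"), (5555, "TERMED", "RR"),
]


def pick_code(rows):
    """For each id, pick the code of the best row: Approved first, then code priority."""
    best = {}
    for id_, status, code in rows:
        key = (
            0 if status.upper() == "APPROVED" else 1,
            CODE_PRIORITY.get(code, len(CODE_PRIORITY) + 1),  # unknown codes sort last
        )
        if id_ not in best or key < best[id_][0]:
            best[id_] = (key, code)
    return {id_: code for id_, (_, code) in best.items()}


print(pick_code(rows))
```

Because the status is part of the key rather than a filter, an ID with no Approved rows still gets a result, chosen by code priority among its Termed rows.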