Oracle: check if any of multiple strings exists in another table

I am new to Oracle. I need to extract all the error codes from a comment field and look each one up in another table to find its type. Depending on the type, one code takes preference over the others, and that error code and its type must then be written to a CSV along with other columns. Below is how the data is stored:
TABLE 1 : COMMENTS_TABLE
id | comments
1 | Manually added (BPM001). Currency code does not exists(TECH23).
2 | Invalid counterparty (EXC001). Manually added (BPM002)
TABLE 2 : ERROR_CODES
id | error_code | error_type
1 | BPM001 | MAN
2 | EXC001 | EXC
3 | EXC002 | EXC
4 | BPM002 | MAN
I am able to extract all the error codes using REGEXP_SUBSTR, but I am not sure how to check them against the other table and, depending on the type, display only one. For example, if a comment contains a code of type MAN, only that error code should be returned in the select clause.

I suggest defining a priority order for the error types in the ORDER BY of the FIRST function (KEEP ... DENSE_RANK FIRST), so that the best fit is picked for each comment.
Query 1:
SELECT c.id,
       MAX (ERROR_CODE)
          KEEP (DENSE_RANK FIRST
                ORDER BY CASE ERROR_TYPE WHEN 'MAN' THEN 1 WHEN 'EXC' THEN 2 END)
          AS ERROR_CODE,
       MAX (ERROR_TYPE)
          KEEP (DENSE_RANK FIRST
                ORDER BY CASE ERROR_TYPE WHEN 'MAN' THEN 1 WHEN 'EXC' THEN 2 END)
          AS ERROR_TYPE
  FROM ERROR_CODES e
       JOIN COMMENTS_TABLE c ON c.COMMENTS LIKE '%' || e.ERROR_CODE || '%'
 GROUP BY c.id
Results:
| ID | ERROR_CODE | ERROR_TYPE |
|----|------------|------------|
| 1 | BPM001 | MAN |
| 2 | BPM002 | MAN |
EDIT: You said in your comments:
This is helpful, but I have multiple fields in the select clause and adding them to the GROUP BY could be a problem.
One option is to use a WITH clause to define this result set and then join it with the table that holds the other columns.
with res as
(
select ...
--query1
)
select t.other_columns, r.id, r.error_code ...
from other_table t join res r on ...
Alternatively, you can use row_number(). That was actually my original answer, but I changed it to KEEP ... DENSE_RANK as it is more efficient.
SELECT * FROM
( SELECT c.id
,ERROR_CODE
,ERROR_TYPE
--Other columns,
,row_number() OVER (
PARTITION BY c.id ORDER BY CASE error_type
WHEN 'MAN'
THEN 1
WHEN 'EXC'
THEN 2
ELSE 3
END
) AS rn
FROM ERROR_CODES e
INNER JOIN COMMENTS_TABLE c
ON c.COMMENTS LIKE '%' || e.ERROR_CODE || '%'
) WHERE rn = 1;

You can sort, prioritize and filter records with analytic functions.
with comments as(
select 1 as id
,'Manually added (BPM001). Currency code does not exists(TECH23).' as comments
from dual union all
select 2 as id
,'Invalid counterparty (EXC001). Manually added (BPM002)' as comments
from dual
)
,error_codes as(
select 1 as id, 'BPM001' as error_code, 'MAN' as error_type from dual union all
select 2 as id, 'EXC001' as error_code, 'EXC' as error_type from dual union all
select 3 as id, 'EXC002' as error_code, 'EXC' as error_type from dual union all
select 4 as id, 'BPM002' as error_code, 'MAN' as error_type from dual
)
-- Everything above this line is not part of the query. Just for generating test data
select *
from (select c.id as comment_id
,c.comments
,e.error_code
,row_number() over(
partition by c.id -- For each comment
order by case error_type when 'MAN' then 1 -- First prio
when 'EXC' then 2 -- Second prio
else 3 -- Everything else
end) as rn
from comments c
join error_codes e on(
e.error_code = regexp_substr(c.comments, e.error_code)
)
)
where rn = 1 -- Pick the highest priority code
/
If you could add a priority column to your error codes (or even to the error types), you could skip the case/when logic in the order by and simply replace it with the priority column.
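The priority-column idea can be sketched as follows. This is only a minimal sketch, run through Python's built-in sqlite3 module rather than Oracle, so the code matching uses LIKE instead of regexp_substr. The type_priority table and its priority column are hypothetical names that are not part of the original schema.

```python
import sqlite3

def best_error_per_comment(conn):
    """Return (comment_id, error_code, error_type) for the top-priority code of each comment."""
    sql = """
        SELECT comment_id, error_code, error_type
        FROM (
            SELECT c.id AS comment_id,
                   e.error_code,
                   e.error_type,
                   ROW_NUMBER() OVER (
                       PARTITION BY c.id
                       ORDER BY p.priority, e.error_code  -- priority column replaces CASE/WHEN
                   ) AS rn
            FROM comments_table c
            JOIN error_codes e ON c.comments LIKE '%' || e.error_code || '%'
            JOIN type_priority p ON p.error_type = e.error_type
        )
        WHERE rn = 1
        ORDER BY comment_id
    """
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments_table (id INTEGER, comments TEXT);
    CREATE TABLE error_codes (id INTEGER, error_code TEXT, error_type TEXT);
    -- hypothetical lookup table: lower number = higher priority
    CREATE TABLE type_priority (error_type TEXT, priority INTEGER);
    INSERT INTO comments_table VALUES
        (1, 'Manually added (BPM001). Currency code does not exists(TECH23).'),
        (2, 'Invalid counterparty (EXC001). Manually added (BPM002)');
    INSERT INTO error_codes VALUES
        (1, 'BPM001', 'MAN'), (2, 'EXC001', 'EXC'),
        (3, 'EXC002', 'EXC'), (4, 'BPM002', 'MAN');
    INSERT INTO type_priority VALUES ('MAN', 1), ('EXC', 2);
""")
rows = best_error_per_comment(conn)
print(rows)
```

Because of the inner join, an error type without a row in type_priority would be dropped entirely; a left join with a default priority would be one way to keep such codes.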

Related

SQL SELECT query conditional for multiple possible values

I have the following data:
id | customer_id | status
1 | 1 | Shipped
2 | 1 | In Progress
3 | 1 | Cancelled
4 | 2 | Shipped
5 | 2 | In Progress
6 | 3 | Shipped
How do I do a SQL query to SELECT a row for each customer based on the status?
If the customer has a status of 'In Progress', then return only that in the results.
If the customer does not have a status of 'In Progress', but does have a status of 'Shipped', then return that instead.
So the results would be:
id | customer_id | status
2 | 1 | In Progress
5 | 2 | In Progress
6 | 3 | Shipped
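One option for tackling this problem is:
filtering out any status other than 'Shipped' or 'In Progress', with a WHERE clause
using FIRST_VALUE, partitioned by "customer_id" and ordered so that 'In Progress' comes before 'Shipped' (with "id" descending as a tie-breaker), to get the preferred "id" and "status"
collapsing the duplicate records, using DISTINCT
SELECT DISTINCT
FIRST_VALUE(id) OVER(PARTITION BY customer_id ORDER BY CASE status_ WHEN 'In Progress' THEN 1 ELSE 2 END, id DESC) AS id,
customer_id,
FIRST_VALUE(status_) OVER(PARTITION BY customer_id ORDER BY CASE status_ WHEN 'In Progress' THEN 1 ELSE 2 END, id DESC) AS status_
FROM tab
WHERE status_ IN ('Shipped', 'In Progress')
Ordering by "id" alone would only work when the 'In Progress' row happens to be the latest one for the customer. This should work on most common DBMSs that support window functions.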
A bit complex, as I need two common table expressions:
-- your input, don't use in final query
WITH
indata(id,customer_id,status) AS (
SELECT 1,1,'Shipped'
UNION ALL SELECT 2,1,'In Progress'
UNION ALL SELECT 3,1,'Cancelled'
UNION ALL SELECT 4,2,'Shipped'
UNION ALL SELECT 5,2,'In Progress'
UNION ALL SELECT 6,3,'Shipped'
)
-- real query starts here, replace following comma with "WITH" ...
,
w_rank AS (
SELECT
customer_id
, status
, CASE status
WHEN 'Shipped' THEN 1
WHEN 'In Progress' THEN 2
WHEN 'Cancelled' THEN 0
ELSE -1
END AS rnk
FROM indata
)
,
grp AS (
SELECT
customer_id
, MAX(rnk) AS rnk
FROM w_rank
GROUP BY
customer_id
)
SELECT
indata.id
, indata.customer_id
, indata.status
FROM grp
JOIN w_rank USING(customer_id,rnk)
JOIN indata USING(customer_id,status)
ORDER BY 1;
-- out id | customer_id | status
-- out ----+-------------+------------
-- out 2 | 1 | In Progress
-- out 5 | 2 | In Progress
-- out 6 | 3 | Shipped
If your DBMS does not support the USING() clause in joins, use the ON clause instead.
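The same prioritisation can also be written with a single ROW_NUMBER() instead of two common table expressions. The following is a minimal sketch run through Python's sqlite3 module; the table name orders and the extra customer 4 are assumptions added for illustration, chosen so that a 'Shipped' row arrives after an 'In Progress' row.

```python
import sqlite3

# Rank each customer's rows by status priority, then keep the best one.
PICK_STATUS_SQL = """
    SELECT id, customer_id, status
    FROM (
        SELECT id, customer_id, status,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY CASE status
                                WHEN 'In Progress' THEN 1
                                WHEN 'Shipped' THEN 2
                            END,
                            id DESC
               ) AS rn
        FROM orders
        WHERE status IN ('In Progress', 'Shipped')
    )
    WHERE rn = 1
    ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 1, 'Shipped'), (2, 1, 'In Progress'), (3, 1, 'Cancelled'),
    (4, 2, 'Shipped'), (5, 2, 'In Progress'), (6, 3, 'Shipped'),
    # extra rows, not in the question: 'Shipped' comes after 'In Progress'
    (7, 4, 'In Progress'), (8, 4, 'Shipped'),
])
picked = conn.execute(PICK_STATUS_SQL).fetchall()
print(picked)
```

For customer 4 the 'In Progress' row is still chosen, which is the case that ordering by id alone would get wrong.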

How to count changes within each column in SQL

This is what the table looks like:
id | city | address | steps | date
1 | null | null | a | 2021-11-01
1 | NY | null | b | 2021-11-04
1 | Chicago | null | c | 2021-11-05
2 | SF | 33, ABC colony | x | 2021-12-01
2 | SF | 33, ABC colony | y | 2021-12-04
2 | SF | 44, Kang Street | z | 2021-12-05
3 | Austin | null | i | 2022-01-01
3 | Austin | 12, Bridgetown | j | 2022-01-04
3 | Austin | null | k | 2022-01-05
What I want is the total number of times that, for any 'id', there was an update in the fields city and address only, excluding null. We don't care about the column steps or any updates to it.
For id = 1, the city changed from null to NY to Chicago, while the address remained null; given the dates, I count this as 2. Changing from null to NY is not supposed to be counted as an update.
For id = 2, the city never changed; it was always SF. The address changed, but only once, and so we count the update as 2 again.
For id = 3, the city never changed, but the address changed from null to an address and back to null. We don't count the first null, because the customer may not have had the info, but if he/she changes it back to null, that has to be counted. Here the update count will also be 2.
I am expecting the results as:
id | change_count
1 | 2
2 | 2
3 | 2
How can I do this in SQL? The main difficulty is that the initial "null" must not be counted when I rank each id's records in ascending order of arrival, but a change back to "null" must be counted; that is where I am mainly confused.
Any help is appreciated. I am working on it, and if I get the SQL finalized, I will share it here too.
Can this work for you?
WITH
-- your input, do not use in query ...
indata(id,city,addr,steps,dt) AS (
SELECT 1,NULL ,NULL ,'a',DATE '2021-11-01'
UNION ALL SELECT 1,'NY' ,NULL ,'b',DATE '2021-11-04'
UNION ALL SELECT 1,'Chicago',NULL ,'c',DATE '2021-11-05'
UNION ALL SELECT 2,'SF' ,'33, ABC colony' ,'x',DATE '2021-12-01'
UNION ALL SELECT 2,'SF' ,'33, ABC colony' ,'y',DATE '2021-12-04'
UNION ALL SELECT 2,'SF' ,'44, Kang Street','z',DATE '2021-12-05'
UNION ALL SELECT 3,'Austin' ,NULL ,'i',DATE '2022-01-01'
UNION ALL SELECT 3,'Austin' ,'12, Bridgetown' ,'j',DATE '2022-01-04'
UNION ALL SELECT 3,'Austin' ,NULL ,'k',DATE '2022-01-05'
)
-- end of your input
-- real query starts here, replace following comma with "WITH" ...
,
olap AS (
SELECT
id
-- a NULL is not COUNTed DISTINCT, but an empty string is
, CASE WHEN city IS NULL AND LAG(city) OVER w IS NOT NULL THEN '' ELSE city END AS city
, CASE WHEN addr IS NULL AND LAG(addr) OVER w IS NOT NULL THEN '' ELSE addr END AS addr
FROM indata
WINDOW w AS (PARTITION BY id ORDER BY dt)
)
SELECT
id
, GREATEST(COUNT(DISTINCT city),COUNT(DISTINCT addr)) AS changecount
FROM olap
GROUP BY 1
ORDER BY 1
;
-- out id | changecount
-- out ----+-------------
-- out 1 | 2
-- out 2 | 2
-- out 3 | 2
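To try the query above without a database server, here is a sketch of the same idea ported to SQLite via Python's sqlite3 module. It is an adaptation under assumptions, not the answer's exact code: SQLite has no GREATEST, so the two-argument scalar MAX is used instead, and the dates are stored as ISO text.

```python
import sqlite3

# A NULL that follows a non-NULL value becomes '' so that COUNT(DISTINCT ...) counts it.
CHANGE_COUNT_SQL = """
    WITH olap AS (
        SELECT id,
               CASE WHEN city IS NULL AND (LAG(city) OVER w) IS NOT NULL
                    THEN '' ELSE city END AS city,
               CASE WHEN addr IS NULL AND (LAG(addr) OVER w) IS NOT NULL
                    THEN '' ELSE addr END AS addr
        FROM indata
        WINDOW w AS (PARTITION BY id ORDER BY dt)
    )
    SELECT id,
           MAX(COUNT(DISTINCT city), COUNT(DISTINCT addr)) AS changecount
    FROM olap
    GROUP BY id
    ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE indata (id INTEGER, city TEXT, addr TEXT, steps TEXT, dt TEXT)")
conn.executemany("INSERT INTO indata VALUES (?, ?, ?, ?, ?)", [
    (1, None, None, 'a', '2021-11-01'),
    (1, 'NY', None, 'b', '2021-11-04'),
    (1, 'Chicago', None, 'c', '2021-11-05'),
    (2, 'SF', '33, ABC colony', 'x', '2021-12-01'),
    (2, 'SF', '33, ABC colony', 'y', '2021-12-04'),
    (2, 'SF', '44, Kang Street', 'z', '2021-12-05'),
    (3, 'Austin', None, 'i', '2022-01-01'),
    (3, 'Austin', '12, Bridgetown', 'j', '2022-01-04'),
    (3, 'Austin', None, 'k', '2022-01-05'),
])
change_counts = conn.execute(CHANGE_COUNT_SQL).fetchall()
print(change_counts)
```

Note that the trick depends on the empty string being distinct from NULL. Oracle treats '' as NULL, so a different placeholder value would be needed there.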
I tried using a combination of the window function LAG and COALESCE, and I finally got the answer, but if someone has a better solution, do suggest. :)
My sql:
with cte1 as(
select *,
row_number() over(partition by id order by date) as rn
from main_table),
cte2 as (
-- caution: a comparison such as "city <> null" is never true in standard SQL; "is not null" is probably what was meant
select * from cte1 where (rn =1 and city <> null or address <> null)),
cte3 as (
SELECT id,
case when coalesce(city,'-1')=COALESCE(lag(city,1) over(partition by id order by date), city,'-1') then 0 else 1 end as cityChange,
case when coalesce(address,'-1')=COALESCE(lag(address,1) over(partition by id order by date), address,'-1') then 0 else 1 end as addressChange
from cte2)
select id,
sum(cityChange) as cityChangeCount,
sum(addressChange) as addressChangeCount
from cte3
group by id

Get top 5 records for each group and concatenate them in a row per group

I have a table Contacts that basically looks like following:
Id | Name | ContactId | Contact | Amount
---------------------------------------------
1 | A | 1 | 12323432 | 555
---------------------------------------------
1 | A | 2 | 23432434 | 349
---------------------------------------------
2 | B | 3 | 98867665 | 297
--------------------------------------------
2 | B | 4 | 88867662 | 142
--------------------------------------------
2 | B | 5 | null | 698
--------------------------------------------
Here, ContactId is unique throughout the table. Contact can be NULL, and I would like to exclude those rows.
Now, I want to select the top 5 contacts for each Id based on their Amount. I accomplished that with the following query:
WITH cte AS (
SELECT id, Contact, amount, ROW_NUMBER()
over (
PARTITION BY id
order by amount desc
) AS RowNo
FROM contacts
where contact is not null
)
select * from cte where RowNo <= 5
It works fine up to this point. Now I want to take these (<= 5) records for each group and show them in a single row by concatenating them.
Expected Result :
Id | Name | Contact
-------------------------------
1 | A | 12323432;23432434
-------------------------------
2 | B | 98867665;88867662
I am using the following query to achieve this, but it still returns all records in separate rows and also includes NULL values:
WITH cte AS (
SELECT id, Contact, amount,contactid, ROW_NUMBER()
over (
PARTITION BY id
order by amount desc
) AS RowNo
FROM contacts
where contact is not null
)
select co.id, co.name,
STUFF ((
SELECT distinct '; ' + isnull(contact,'') FROM cte
WHERE co.id= cte.id and co.contactid= cte.contactid
and RowNo <= 5
FOR XML PATH('')),1, 1, '')as contact
from contacts co inner join cte on cte.id = co.id and co.contactid= cte.contactid
The above query still gives me all top 5 contacts in different rows, including NULL values.
Is it a good idea to use a CTE and STUFF together? Please suggest if there is a better approach than this.
I found the problem with my final query:
I don't need the original Contacts table in my final SELECT, since I already have everything I need in the CTE. Also, inside STUFF() I was joining on contactid, which is the very thing I am trying to concatenate. Because of that join condition, I was getting the records in different rows. I removed these two conditions and it worked.
WITH cte AS (
SELECT id, name, Contact, amount,contactid, ROW_NUMBER()
over (
PARTITION BY id
order by amount desc
) AS RowNo
FROM contacts
where contact is not null
)
-- DISTINCT collapses the result to one row per id
select distinct co.id, co.name,
STUFF ((
SELECT ';' + contact FROM cte
WHERE co.id= cte.id
and RowNo <= 5
order by RowNo
FOR XML PATH('')),1, 1, '')as contact
from cte co where co.RowNo <= 5
You can use conditional aggregation:
select id, name,
concat(max(case when seqnum = 1 then contact + ';' end),
max(case when seqnum = 2 then contact + ';' end),
max(case when seqnum = 3 then contact + ';' end),
max(case when seqnum = 4 then contact + ';' end),
max(case when seqnum = 5 then contact + ';' end)
) as contacts
from (select c.*,
row_number() over (partition by id order by amount desc) as seqnum
from contacts c
where contact is not null
) c
group by id, name;
If you are running SQL Server 2017 or higher, you can use string_agg(): like most other aggregate functions, it ignores null values by design.
select id, name, string_agg(contact, ',') within group (order by rn) all_contacts
from (
select id, name, contact,
row_number() over (partition by id order by amount desc) as rn
from contacts
where contact is not null
) t
where rn <= 5
group by id, name
Note that you don't strictly need a CTE here; you can return the columns you need from the subquery, and use them directly in the outer query.
In earlier versions, one approach using stuff() and for xml path is:
with cte as (
select id, name, contact,
row_number() over (partition by id order by amount desc) as rn
from contacts
where contact is not null
)
select id, name,
stuff(
(
select ', ' + c1.contact
from cte c1
where c1.id = c.id and c1.rn <= 5
order by c1.rn
for xml path (''), type
).value('.', 'varchar(max)'), 1, 2, ''
) all_contacts
from cte c
group by id, name
I agree with @GMB. STRING_AGG() is what you need ...
WITH
contacts(Id,nm,ContactId,Contact,Amount) AS (
SELECT 1,'A',1,12323432,555
UNION ALL SELECT 1,'A',2,23432434,349
UNION ALL SELECT 2,'B',3,98867665,297
UNION ALL SELECT 2,'B',4,88867662,142
UNION ALL SELECT 2,'B',5,NULL ,698
)
,
with_filter_val AS (
SELECT
*
, ROW_NUMBER() OVER(PARTITION BY id ORDER BY amount DESC) AS rn
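FROM contacts
WHERE contact IS NOT NULL -- exclude NULL contacts before ranking, so they cannot take a top-5 slot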
)
SELECT
id
, nm
, STRING_AGG(CAST(contact AS CHAR(8)),',') AS contact_list
FROM with_filter_val
WHERE rn <=5
GROUP BY
id
, nm
-- out id | nm | contact_list
-- out ----+----+-------------------
-- out 1 | A | 12323432,23432434
-- out 2 | B | 98867665,88867662
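For readers without SQL Server at hand, the top-N-then-concatenate pattern can be tried with Python's sqlite3 module. This is a sketch under assumptions rather than any of the answers' exact code: the ranking is done in SQL with ROW_NUMBER(), while the concatenation is done in Python so that the order of the joined contacts is deterministic. The function name top_contacts and its parameters are made up for the example.

```python
import sqlite3
from itertools import groupby

# Rank the non-NULL contacts of each id by amount and keep the first n.
TOP_N_SQL = """
    SELECT id, name, contact
    FROM (
        SELECT id, name, contact,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY amount DESC) AS rn
        FROM contacts
        WHERE contact IS NOT NULL
    )
    WHERE rn <= ?
    ORDER BY id, rn
"""

def top_contacts(conn, n=5, sep=";"):
    """Return one (id, name, joined_contacts) tuple per id."""
    rows = conn.execute(TOP_N_SQL, (n,)).fetchall()
    return [
        (cid, name, sep.join(contact for _, _, contact in grp))
        for (cid, name), grp in groupby(rows, key=lambda r: (r[0], r[1]))
    ]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER, name TEXT, contactid INTEGER, contact TEXT, amount INTEGER)"
)
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?, ?, ?)", [
    (1, 'A', 1, '12323432', 555),
    (1, 'A', 2, '23432434', 349),
    (2, 'B', 3, '98867665', 297),
    (2, 'B', 4, '88867662', 142),
    (2, 'B', 5, None, 698),
])
result = top_contacts(conn)
print(result)
```

Filtering out NULL contacts before ranking matters: otherwise the NULL row with the highest amount would occupy one of the top-N slots.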

Query: group by one column and select another column

I have a problem with a query.
I need a query that returns the last state (so it has to check the date), grouped by a column called mat_calc.
mat_calc | STATE | DATE
1 | NEW | 25/03/2016
1 | DONE | 25/01/2016
2 | PROC | 25/04/2016
2 | PROC | 25/07/2016
2 | DONE | 25/09/2016
3 | NEW | 25/01/2016
3 | PROC | 25/06/2016
3 | DONE | 25/02/2016
3 | OK | 25/12/2016
4 | OK | 25/03/2016
So the query should return each mat_cal with its latest status:
1 | NEW
2 | DONE
3 | OK
4 | OK
My query is
select mat_cal AS mat_cal , STATO AS STATO, MAX(DATA) AS DATA
from CALC
group by mat_cal ;
It gives me an error on the GROUP BY, apparently because I am not using it properly.
How can I do it? Thanks.
Sorry, I can't format tables on Stack Overflow.
Use row_number():
select c.*
from (select c.*,
row_number() over (partition by mat_cal order by data desc) as seqnum
from calc c
) c
where seqnum = 1;
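A runnable sketch of this row_number() approach, using Python's sqlite3 module, is shown below. One assumption is made loudly here: the dates are converted from dd/mm/yyyy text to ISO yyyy-mm-dd before loading, because text dates in dd/mm/yyyy form do not sort chronologically in general. The helper to_iso is a hypothetical name for the example.

```python
import sqlite3
from datetime import datetime

def to_iso(ddmmyyyy):
    """Convert 'dd/mm/yyyy' text to ISO 'yyyy-mm-dd' so that it sorts chronologically."""
    return datetime.strptime(ddmmyyyy, "%d/%m/%Y").date().isoformat()

# Number the rows of each mat_cal from newest to oldest and keep the newest.
LATEST_STATE_SQL = """
    SELECT mat_cal, stato
    FROM (
        SELECT mat_cal, stato,
               ROW_NUMBER() OVER (PARTITION BY mat_cal ORDER BY data DESC) AS seqnum
        FROM calc
    )
    WHERE seqnum = 1
    ORDER BY mat_cal
"""

sample = [
    (1, 'NEW', '25/03/2016'), (1, 'DONE', '25/01/2016'),
    (2, 'PROC', '25/04/2016'), (2, 'PROC', '25/07/2016'), (2, 'DONE', '25/09/2016'),
    (3, 'NEW', '25/01/2016'), (3, 'PROC', '25/06/2016'),
    (3, 'DONE', '25/02/2016'), (3, 'OK', '25/12/2016'),
    (4, 'OK', '25/03/2016'),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calc (mat_cal INTEGER, stato TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO calc VALUES (?, ?, ?)",
    [(m, s, to_iso(d)) for m, s, d in sample],
)
latest = conn.execute(LATEST_STATE_SQL).fetchall()
print(latest)
```

If the column is already a real DATE type in your database, the conversion step is unnecessary.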
You can try the following:
--creating the data you published
with calc (mat_cal,STATO,date)
as
(
select '1' as mat_cal,'NEW' as STATO,'25/03/2016' as date
union
select '1','DONE','25/01/2016'
union
select '2','PROC','25/04/2016'
union
select '2','PROC','25/07/2016'
union
select '2','DONE','25/09/2016'
union
select '3','NEW','25/01/2016'
union
select '3','PROC','25/06/2016'
union
select '3','DONE','25/02/2016'
union
select '3','OK','25/12/2016'
union
select '4','OK','25/03/2016')
--the query to solve the problem
select mat_cal , STATO ,date
from CALC as c
where c.date = (select max(date) from calc as c2 where c.mat_cal = c2.mat_cal group by c2.mat_cal)
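When you use a GROUP BY clause, every selected column that is not wrapped in an aggregate function has to appear in it. Otherwise Oracle raises ORA-00979 (not a GROUP BY expression).
Adding the STATO column to the GROUP BY removes the error, but note that it returns one row per mat_cal and STATO combination rather than only the latest state: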
select mat_cal AS mat_cal , STATO AS STATO, MAX(DATA) AS DATA
from CALC
group by mat_cal, STATO ;

Max and Min value's corresponding records

I have a scenario to get the respective field value of "Max" and "Min" records
Please find the sample data below
-----------------------------------------------------------------------
ID Label ProcessedDate
-----------------------------------------------------------------------
1 Label1 11/01/2016
2 Label2 11/02/2016
3 Label3 11/03/2016
4 Label4 11/04/2016
5 Label5 11/05/2016
I have the "ID" field populated in another table as a foreign key. While querying those records in that table based on the "ID" field I need to get the "Label" field of "Max" Processed date and "Min" processed date.
-----------------------------------------------------------------------
ID LabelID GroupingField
-----------------------------------------------------------------------
1 1 101
2 2 101
3 3 101
4 4 101
5 5 101
6 1 102
7 2 102
8 3 102
9 4 102
And the final result set I expect it to look something like this.
-----------------------------------------------------------------------
GroupingField FirstProcessed LastProcessed
-----------------------------------------------------------------------
101 Label1 Label5
102 Label1 Label4
I have 'almost' managed to get this above result using rank function but still not satisfied with it. So I am looking if someone can provide me with a better option.
Thanks,
Prakazz
CREATE TABLE #Details (ID INT,LabelID INT,GroupingField INT)
CREATE TABLE #Details1 (ID INT,Label VARCHAR(100),ProcessedDate VARCHAR(100))
INSERT INTO #Details1 (ID ,Label ,ProcessedDate )
SELECT 1,'Label1','11/01/2016' UNION ALL
SELECT 2,'Label2','11/02/2016' UNION ALL
SELECT 3,'Label3','11/03/2016' UNION ALL
SELECT 4,'Label4','11/04/2016' UNION ALL
SELECT 5,'Label5','11/05/2016'
INSERT INTO #Details (ID ,LabelID ,GroupingField )
SELECT 1,1,101 UNION ALL
SELECT 2,2,101 UNION ALL
SELECT 3,3,101 UNION ALL
SELECT 4,4,101 UNION ALL
SELECT 5,5,101 UNION ALL
SELECT 6,1,102 UNION ALL
SELECT 7,2,102 UNION ALL
SELECT 8,3,102 UNION ALL
SELECT 9,4,102
;WITH CTE (GroupingField , MAXId ,MinId) AS
(
SELECT GroupingField,MAX(LabelID) MAXId,MIN(LabelID) MinId
FROM #Details
GROUP BY GroupingField
)
SELECT GroupingField ,B.Label FirstProcessed, A.Label LastProcessed
FROM CTE
JOIN #Details1 A ON MAXId = A.ID
JOIN #Details1 B ON MinId = B.ID
You can use the Row_Number() function with Partition By, combined with a Group By, as follows:
;with cte as (
select
t.Label, t.ProcessedDate,
g.GroupingField,
ROW_NUMBER() over (partition by GroupingField Order By ProcessedDate ASC) minD,
ROW_NUMBER() over (partition by GroupingField Order By ProcessedDate DESC) maxD
from tbl t
inner join GroupingFieldTbl g
on t.ID = g.LabelID
)
select GroupingField, max(FirstProcessed) FirstProcessed, max(LastProcessed) LastProcessed
from (
select
GroupingField,
FirstProcessed = CASE when minD = 1 then Label else null end,
LastProcessed = CASE when maxD = 1 then Label else null end
from cte
where
minD = 1 or maxD = 1
) t
group by GroupingField
order by GroupingField
I also used a CTE to make the code easier to write and understand.
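The two-ROW_NUMBER() technique can be checked against the question's sample data with Python's sqlite3 module. This is a sketch under assumptions: the table names labels and grouping_tbl are made up for the example, and the processed dates are stored as ISO text (reading 11/01/2016 as 1 November 2016) so that they sort chronologically.

```python
import sqlite3

# Rank each group's labels by date in both directions, then pivot rank 1 of each.
FIRST_LAST_SQL = """
    WITH ranked AS (
        SELECT g.GroupingField, t.Label,
               ROW_NUMBER() OVER (PARTITION BY g.GroupingField
                                  ORDER BY t.ProcessedDate ASC)  AS minD,
               ROW_NUMBER() OVER (PARTITION BY g.GroupingField
                                  ORDER BY t.ProcessedDate DESC) AS maxD
        FROM grouping_tbl g
        JOIN labels t ON t.ID = g.LabelID
    )
    SELECT GroupingField,
           MAX(CASE WHEN minD = 1 THEN Label END) AS FirstProcessed,
           MAX(CASE WHEN maxD = 1 THEN Label END) AS LastProcessed
    FROM ranked
    GROUP BY GroupingField
    ORDER BY GroupingField
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labels (ID INTEGER, Label TEXT, ProcessedDate TEXT)")
conn.execute("CREATE TABLE grouping_tbl (ID INTEGER, LabelID INTEGER, GroupingField INTEGER)")
conn.executemany("INSERT INTO labels VALUES (?, ?, ?)", [
    (1, 'Label1', '2016-11-01'), (2, 'Label2', '2016-11-02'),
    (3, 'Label3', '2016-11-03'), (4, 'Label4', '2016-11-04'),
    (5, 'Label5', '2016-11-05'),
])
conn.executemany("INSERT INTO grouping_tbl VALUES (?, ?, ?)", [
    (1, 1, 101), (2, 2, 101), (3, 3, 101), (4, 4, 101), (5, 5, 101),
    (6, 1, 102), (7, 2, 102), (8, 3, 102), (9, 4, 102),
])
first_last = conn.execute(FIRST_LAST_SQL).fetchall()
print(first_last)
```

Unlike taking MIN and MAX of LabelID, this ranks by the processed date itself, so it does not rely on the label IDs increasing in date order.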