Oracle query with group - sql

I have a scenario where I need to fetch all the records within an ID for the same source. Given below is my input set of records
ID SOURCE CURR_FLAG TYPE
1 IBM Y P
1 IBM Y OF
1 IBM Y P
2 IBM Y P
2 TCS Y P
3 IBM NULL P
3 IBM NULL P
3 IBM NULL P
4 IBM NULL OF
4 IBM NULL OF
4 IBM Y ON
From the above set, I need to select all records with source IBM within the same ID group. If the ID group contains at least one record with a source other than IBM, I don't want any record from that group. In addition, a group qualifies only if at least one of its records has CURR_FLAG = 'Y'.
In the above scenario, ID=3 has only IBM as its source, but no record has CURR_FLAG = 'Y', so my query should not return it. For ID=4, the query can return records, because one of them has CURR_FLAG = 'Y'.
Within each group that satisfies the above conditions, I need one more condition on TYPE. If there are records with TYPE = 'P', I need to fetch only those records. If there are none with 'P', I look for TYPE = 'OF', and otherwise TYPE = 'ON'.
I have written the query given below, but it runs for a long time and does not return any results. Is there a better way to write this query?
select
ID,
SOURCE,
CURR_FL,
TYPE
from TABLE a
where
not exists(select 1 from TABLE B where a.ID = B.ID and source <> 'IBM')
and exists(select 1 from TABLE C where a.ID = C.ID and CURR_FL = 'Y') and
(TYPE, ID) IN (
select case type when 1 then 'P' when 2 then 'OF' else 'ON' END TYPE,ID from
(select ID,
max(priority) keep (dense_rank first order by priority asc) as type
from ( select ID,TYPE,
case TYPE
when 'P' then 1
when 'OF' then 2
when 'ON' then 3
end as priority
from TABLE where ID
in(select ID from TABLE where CURR_FL='Y') AND SOURCE='IBM')
group by ID))

I think you can do a single aggregation over your table by ID, checking for the 'Y' flag and asserting that no non-IBM source appears. I do this in a CTE below, which also records the highest-priority TYPE present in each group, and then join back to your original table to return the full matching records.
WITH cte AS (
SELECT
ID,
CASE WHEN SUM(CASE WHEN TYPE = 'P' THEN 1 ELSE 0 END) > 0
THEN 1
WHEN SUM(CASE WHEN TYPE = 'OF' THEN 1 ELSE 0 END) > 0
THEN 2
WHEN SUM(CASE WHEN TYPE = 'ON' THEN 1 ELSE 0 END) > 0
THEN 3 ELSE 4 END AS p_type
FROM yourTable
GROUP BY ID
HAVING
SUM(CASE WHEN CURR_FLAG = 'Y' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN SOURCE <> 'IBM' THEN 1 ELSE 0 END) = 0
)
SELECT t1.*
FROM yourTable t1
INNER JOIN cte t2
ON t1.ID = t2.ID
WHERE
(t2.p_type = 1 AND t1.TYPE = 'P') OR
(t2.p_type = 2 AND t1.TYPE = 'OF') OR
(t2.p_type = 3 AND t1.TYPE = 'ON');
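To check the CTE approach against the sample rows from the question, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. SQLite stands in for Oracle here, which is an assumption made only so the example is easy to run; the table name yourTable comes from the answer, and the lower-case column names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (id INTEGER, source TEXT, curr_flag TEXT, type TEXT)"
)
# Sample rows copied from the question; None stands for NULL.
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (1, "IBM", "Y", "P"), (1, "IBM", "Y", "OF"), (1, "IBM", "Y", "P"),
        (2, "IBM", "Y", "P"), (2, "TCS", "Y", "P"),
        (3, "IBM", None, "P"), (3, "IBM", None, "P"), (3, "IBM", None, "P"),
        (4, "IBM", None, "OF"), (4, "IBM", None, "OF"), (4, "IBM", "Y", "ON"),
    ],
)

query = """
WITH cte AS (
    SELECT id,
           -- Highest-priority type present in the group: P, then OF, then ON.
           CASE WHEN SUM(CASE WHEN type = 'P'  THEN 1 ELSE 0 END) > 0 THEN 1
                WHEN SUM(CASE WHEN type = 'OF' THEN 1 ELSE 0 END) > 0 THEN 2
                WHEN SUM(CASE WHEN type = 'ON' THEN 1 ELSE 0 END) > 0 THEN 3
                ELSE 4 END AS p_type
    FROM yourTable
    GROUP BY id
    -- Keep groups with at least one 'Y' flag and no non-IBM source.
    HAVING SUM(CASE WHEN curr_flag = 'Y' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN source <> 'IBM' THEN 1 ELSE 0 END) = 0
)
SELECT t1.id, t1.source, t1.curr_flag, t1.type
FROM yourTable t1
INNER JOIN cte t2 ON t1.id = t2.id
WHERE (t2.p_type = 1 AND t1.type = 'P')
   OR (t2.p_type = 2 AND t1.type = 'OF')
   OR (t2.p_type = 3 AND t1.type = 'ON')
ORDER BY t1.id
"""
rows_ibm = conn.execute(query).fetchall()
for row in rows_ibm:
    print(row)
```

With the sample data, only the 'P' rows of ID 1 and the 'OF' rows of ID 4 should come back: ID 2 is dropped because of the TCS row and ID 3 because it has no 'Y' flag.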

Related

Select the greatest occurrence from a column, based on date if frequencies are the same

I have the following dataset with let's say ID = {1,[...],5} and Col1 = {a,b,c,Null} :
ID Col1 Date
1 a 01/10/2022
1 a 02/10/2022
1 a 03/10/2022
2 b 01/10/2022
2 c 02/10/2022
2 c 03/10/2022
3 a 01/10/2022
3 b 02/10/2022
3 Null 03/10/2022
4 c 01/10/2022
5 b 01/10/2022
5 Null 02/10/2022
5 Null 03/10/2022
I would like to group my rows by ID, compute new columns showing the number of occurrences of each value, and compute a new column containing a string that depends on the most frequent value of Col1: most a = Hi, most b = Hello, most c = Welcome, most Null = Unknown. If several values other than Null have the same frequency, the most recent one based on date wins.
Here is the dataset I need :
ID nb_a nb_b nb_c nb_Null greatest
1 3 0 0 0 Hi
2 0 1 2 0 Welcome
3 1 1 0 1 Hello
4 0 0 1 0 Welcome
5 0 1 0 2 Unknown
I have to do this in a compute recipe in Dataiku. The group by is handled by the group by section of the recipe, while the rest of the query needs to be done in the "custom aggregations" section of the recipe. I'm having trouble with the tie-breaking part of the code: when frequencies are equal, the most recent value should win.
My SQL code looks like this :
CASE WHEN SUM(CASE WHEN Col1 = 'a' THEN 1 ELSE 0 END) >
SUM(CASE WHEN Col1 = 'b' THEN 1 ELSE 0 END)
AND SUM(CASE WHEN Col1 = 'a' THEN 1 ELSE 0 END) >
SUM(CASE WHEN Col1 = 'c' THEN 1 ELSE 0 END)
THEN 'Hi'
WHEN SUM(CASE WHEN Col1 = 'b' THEN 1 ELSE 0 END) >
SUM(CASE WHEN Col1 = 'a' THEN 1 ELSE 0 END)
AND SUM(CASE WHEN Col1 = 'b' THEN 1 ELSE 0 END) >
SUM(CASE WHEN Col1 = 'c' THEN 1 ELSE 0 END)
THEN 'Hello'
WHEN SUM(CASE WHEN Col1 = 'c' THEN 1 ELSE 0 END) >
SUM(CASE WHEN Col1 = 'a' THEN 1 ELSE 0 END)
AND SUM(CASE WHEN Col1 = 'c' THEN 1 ELSE 0 END) >
SUM(CASE WHEN Col1 = 'b' THEN 1 ELSE 0 END)
THEN 'Welcome'
Etc, etc, repeat for other cases.
But surely there must be a better way to do this right? And I have no idea how to include the most recent one when frequencies are the same.
Thank you for your help and sorry if my message isn't clear.
I tried to reproduce this in Azure Synapse using a SQL script. Below is the approach.
A sample table is created as follows.
Create table tab1 (id int, col1 varchar(50), date_column date)
Insert into tab1 values(1,'a','2021-10-01')
Insert into tab1 values(1,'a','2021-10-02')
Insert into tab1 values(1,'a','2021-10-03')
Insert into tab1 values(2,'b','2021-10-01')
Insert into tab1 values(2,'c','2021-10-02')
Insert into tab1 values(2,'c','2021-10-03')
Insert into tab1 values(3,'a','2021-10-01')
Insert into tab1 values(3,'b','2021-10-02')
Insert into tab1 values(3,'Null','2021-10-03')
Insert into tab1 values(4,'c','2021-10-01')
Insert into tab1 values(5,'b','2021-10-01')
Insert into tab1 values(5,'Null','2021-10-02')
Insert into tab1 values(5,'Null','2021-10-03')
Step:1
Query is written to find the count of values within the group id,col1 and maximum date value within each combination of id, col1.
select
distinct id,col1,
count(*) over (partition by id,col1) as count,
case when col1='Null' then null else max(date_column) over (partition by id,col1) end as max_date
from tab1
Step:2
A row number is calculated within each id group, ordered by the count and max_date columns in descending order. This way, when two or more values have the same frequency, the one with the latest date is ranked first.
select *, row_number() over (partition by id order by count desc, max_date desc) as row_num from
(select
distinct id,col1,
count(*) over (partition by id,col1) as count,
case when col1='Null' then null else max(date_column) over (partition by id,col1) end as max_date
from tab1)q1
Step:3
Rows with row_num=1 are kept, and the value of the greatest column is assigned with the logic
most a = Hi, most b = Hello, most c = Welcome, most Null = Unknown.
Full Query
select id,
[greatest]=case when col1='a' then 'Hi'
when col1='b' then 'Hello'
when col1='c' then 'Welcome'
else 'Unknown'
end
from
(select *, row_number() over (partition by id order by count desc, max_date desc) as row_num from
(select
distinct id,col1,
count(*) over (partition by id,col1) as count,
case when col1='Null' then null else max(date_column) over (partition by id,col1) end as max_date
from tab1)q1
)q2 where row_num=1
With this approach, even when frequencies are the same, the required value is chosen based on the most recent date.
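The steps above return only the greatest column. As a rough sketch of how the nb_* counts could be produced alongside it, the example below combines a per-ID conditional aggregation with the same row-number ranking, run through Python's sqlite3 module (window functions need SQLite 3.25 or later). Several things here are assumptions rather than part of the answer: SQLite instead of Azure Synapse, real NULLs instead of the string 'Null', ISO dates for October 2022, and the CTE names freq, ranked and counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (id INTEGER, col1 TEXT, date_column TEXT)")
# Rows from the question; None stands for a real NULL.
conn.executemany(
    "INSERT INTO tab1 VALUES (?, ?, ?)",
    [
        (1, "a", "2022-10-01"), (1, "a", "2022-10-02"), (1, "a", "2022-10-03"),
        (2, "b", "2022-10-01"), (2, "c", "2022-10-02"), (2, "c", "2022-10-03"),
        (3, "a", "2022-10-01"), (3, "b", "2022-10-02"), (3, None, "2022-10-03"),
        (4, "c", "2022-10-01"),
        (5, "b", "2022-10-01"), (5, None, "2022-10-02"), (5, None, "2022-10-03"),
    ],
)

query = """
WITH freq AS (
    -- Frequency and latest date per (id, col1); NULL values get no date,
    -- so they lose ties against real values.
    SELECT id, col1, COUNT(*) AS cnt,
           CASE WHEN col1 IS NULL THEN NULL ELSE MAX(date_column) END AS max_date
    FROM tab1
    GROUP BY id, col1
),
ranked AS (
    -- Rank values within each id: highest count first, then latest date.
    SELECT id, col1,
           ROW_NUMBER() OVER (PARTITION BY id
                              ORDER BY cnt DESC, max_date DESC) AS row_num
    FROM freq
),
counts AS (
    SELECT id,
           SUM(CASE WHEN col1 = 'a' THEN 1 ELSE 0 END) AS nb_a,
           SUM(CASE WHEN col1 = 'b' THEN 1 ELSE 0 END) AS nb_b,
           SUM(CASE WHEN col1 = 'c' THEN 1 ELSE 0 END) AS nb_c,
           SUM(CASE WHEN col1 IS NULL THEN 1 ELSE 0 END) AS nb_null
    FROM tab1
    GROUP BY id
)
SELECT c.id, c.nb_a, c.nb_b, c.nb_c, c.nb_null,
       CASE r.col1 WHEN 'a' THEN 'Hi'
                   WHEN 'b' THEN 'Hello'
                   WHEN 'c' THEN 'Welcome'
                   ELSE 'Unknown' END AS greatest
FROM counts c
JOIN ranked r ON r.id = c.id AND r.row_num = 1
ORDER BY c.id
"""
rows_greatest = conn.execute(query).fetchall()
for row in rows_greatest:
    print(row)
```

The tie-break relies on NULL sorting last in descending order, which holds in SQLite and SQL Server; on engines that sort NULLs first in descending order, an explicit NULLS LAST (or an equivalent CASE in the ORDER BY) would be needed.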

Select table adding columns with data depending on duplicates in other column

Imagine this data.
Id Type
1 A
1 B
1 B
2 A
3 B
I want to select from the table and add two columns, turning it into this. How can I do it? (In Teradata)
Id | Type | Id with both A+B | Id with only A
1 | A | 1 | 0
1 | B | 1 | 0
1 | B | 1 | 0
2 | A | 0 | 1
3 | B | 0 | 0
I'm not familiar with Teradata, but in standard SQL the following query should work:
SELECT
T.*,
CASE WHEN Cnt = 2 THEN 1 ELSE 0 END AS BOTH_TYPES_PRESENT,
CASE WHEN Cnt = 1 AND Type = 'A' THEN 1 ELSE 0 END AS ONLY_A_PRESENT
FROM T
LEFT JOIN (
SELECT Id, COUNT(DISTINCT Type) Cnt FROM T WHERE Type IN ('A', 'B') GROUP BY Id
) CNT ON T.Id = CNT.Id;
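Since the answer was not tested on Teradata, here is a small sketch that runs an equivalent query on the sample rows using Python's sqlite3 module. SQLite is only a stand-in engine, and the subquery alias c and column name n_types are my own renames to keep the alias and the column visually distinct.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Id INTEGER, Type TEXT)")
# Sample rows from the question.
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "B"), (2, "A"), (3, "B")],
)

query = """
SELECT T.Id, T.Type,
       -- Both A and B occur for this Id.
       CASE WHEN c.n_types = 2 THEN 1 ELSE 0 END AS both_types_present,
       -- Only one distinct type occurs, and this row shows it is A.
       CASE WHEN c.n_types = 1 AND T.Type = 'A' THEN 1 ELSE 0 END AS only_a_present
FROM T
LEFT JOIN (
    SELECT Id, COUNT(DISTINCT Type) AS n_types
    FROM T
    WHERE Type IN ('A', 'B')
    GROUP BY Id
) c ON T.Id = c.Id
ORDER BY T.Id, T.Type
"""
rows_flags = conn.execute(query).fetchall()
for row in rows_flags:
    print(row)
```

COUNT(DISTINCT Type) is what makes the duplicate B row for Id 1 harmless: the Id still has exactly two distinct types.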

Checking if the row has the max value in a group

I'm trying to find out whether a row has the max value in a group. Here's a really simple example:
Data
VoteCount LocationId UserId
3 1 1
4 1 2
3 2 2
4 2 1
Pseudo-query
select
LocationId,
sum(case
when UserId = 1 /* and has max vote count*/
then 1 else 0
end) as IsUser1Winner,
sum(case
when UserId = 2 /* and has max vote count*/
then 1 else 0
end) as IsUser2Winner
from LocationVote
group by LocationID
It should return:
LocationId IsUser1Winner IsUser2Winner
1 0 1
2 1 0
I also couldn't find a way to generate dynamic column names here. What would be the simplest way to write this query?
You could also do this using a Case statement
WITH CTE AS
(SELECT
MAX(VoteCount) max_votes
, LocationId
FROM LocationVote
GROUP BY LocationId
)
-- Keep only the rows holding the maximum vote count of their location,
-- then collapse them to one row per location.
SELECT
A.LocationId
, MAX(CASE WHEN UserId = 1
THEN 1
ELSE 0
END) IsUser1Winner
, MAX(CASE WHEN UserId = 2
THEN 1
ELSE 0
END) IsUser2Winner
FROM LocationVote A
INNER JOIN
CTE B
ON A.VoteCount = B.max_votes
AND A.LocationId = B.LocationId
GROUP BY A.LocationId
Try this:
select t.*
from LocationVote t
cross apply (
select max(ref.VoteCount) as max_value
from LocationVote ref
where ref.LocationId = t.LocationId
) votes
where votes.max_value = t.VoteCount
However, if your table is huge and has no appropriate indexes, performance may be poor.
Another way is to get max values by groups into table variable or temp table and then join it to original table.
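The temp-table variant mentioned last could look roughly like the sketch below, run through Python's sqlite3 module. SQLite stands in for SQL Server here, and the temp table name MaxVote and its column MaxVoteCount are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LocationVote (VoteCount INTEGER, LocationId INTEGER, UserId INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO LocationVote VALUES (?, ?, ?)",
    [(3, 1, 1), (4, 1, 2), (3, 2, 2), (4, 2, 1)],
)

# Step 1: materialise the maximum vote count per location in a temp table.
conn.execute(
    """
    CREATE TEMP TABLE MaxVote AS
    SELECT LocationId, MAX(VoteCount) AS MaxVoteCount
    FROM LocationVote
    GROUP BY LocationId
    """
)

# Step 2: join back so only rows holding the maximum survive,
# then pivot the winners into one flag column per user.
query = """
SELECT v.LocationId,
       MAX(CASE WHEN v.UserId = 1 THEN 1 ELSE 0 END) AS IsUser1Winner,
       MAX(CASE WHEN v.UserId = 2 THEN 1 ELSE 0 END) AS IsUser2Winner
FROM LocationVote v
JOIN MaxVote m
  ON m.LocationId = v.LocationId
 AND m.MaxVoteCount = v.VoteCount
GROUP BY v.LocationId
ORDER BY v.LocationId
"""
rows_winners = conn.execute(query).fetchall()
for row in rows_winners:
    print(row)
```

If two users tie for the maximum at a location, both flags come out as 1 for that location, since both rows survive the join.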

How to compare two table values using PLSQL

I have to compare the values of two tables:
TABLE_A
ID TYPE
12345
67891
36524
TABLE_B
ID TYPE
12345 3
12345 7
67891 3
67891 2
67891 5
36524 3
Logic: I have to compare the TABLE_A id with the TABLE_B id:
if both 3 and 7 are found
good
else if only 3 is found
avg
else if only 7 is found
bad
These good, bad and avg values should be written back to the TYPE column of TABLE_A.
Could anyone help me write this code in PL/SQL?
Assuming that you are considering only types 3 and 7 for your calculation, you can use the following MERGE statement; there is no need for PL/SQL.
merge into table_a a
using (select id, case (listagg(type, ',') within group (order by type))
when '3,7' then 'Good'
when '3' then 'Avg'
when '7' then 'Bad'
else null
end new_type
from table_b
where type in (3,7)
group by id) b
on (a.id = b.id)
when matched then
update set type = new_type;
For Oracle versions prior to 11g Release 2, use the following:
merge into table_a a
using (select id, case (trim(both ',' from min(decode(type, 3, 3, null))||','||min(decode(type, 7, 7, null))))
when '3,7' then 'Good'
when '3' then 'Avg'
when '7' then 'Bad'
else null
end new_type
from table_b
where type in (3,7)
group by id) b
on (a.id = b.id)
when matched then
update set type = new_type;
It has been assumed that each combination of id and type in table_b is unique.
I am interpreting what you mean as saying that you want to output 'good' when TableB contains both 3 and 7, 'avg' when it contains only 3, and so on. Here is a way to get this result:
select a.id,
(case when sum(case when b.type = 3 then 1 else 0 end) > 0 and
sum(case when b.type = 7 then 1 else 0 end) > 0
then 'good'
when sum(case when b.type = 3 then 1 else 0 end) > 0
then 'avg'
when sum(case when b.type = 7 then 1 else 0 end) > 0
then 'bad'
end) as logic
from tableA a left outer join
tableB b
on a.id = b.id
group by a.id;
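To see the classification applied to the sample rows and written back to table_a, here is a minimal sketch using Python's sqlite3 module. SQLite has neither MERGE nor LISTAGG, so this swaps in an UPDATE with a correlated subquery built on conditional aggregation; that substitution, and the use of SQLite instead of Oracle, are assumptions made for the sake of a runnable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, type TEXT)")
conn.execute("CREATE TABLE table_b (id INTEGER, type INTEGER)")
# Sample rows from the question; table_a.type starts out empty (NULL).
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(12345, None), (67891, None), (36524, None)],
)
conn.executemany(
    "INSERT INTO table_b VALUES (?, ?)",
    [(12345, 3), (12345, 7), (67891, 3), (67891, 2), (67891, 5), (36524, 3)],
)

# For each table_a row, look at its table_b rows:
# both 3 and 7 -> good, only 3 -> avg, only 7 -> bad, otherwise NULL.
conn.execute(
    """
    UPDATE table_a
    SET type = (
        SELECT CASE
                 WHEN SUM(CASE WHEN b.type = 3 THEN 1 ELSE 0 END) > 0
                  AND SUM(CASE WHEN b.type = 7 THEN 1 ELSE 0 END) > 0 THEN 'good'
                 WHEN SUM(CASE WHEN b.type = 3 THEN 1 ELSE 0 END) > 0 THEN 'avg'
                 WHEN SUM(CASE WHEN b.type = 7 THEN 1 ELSE 0 END) > 0 THEN 'bad'
               END
        FROM table_b b
        WHERE b.id = table_a.id
    )
    """
)
rows_types = conn.execute("SELECT id, type FROM table_a ORDER BY id").fetchall()
for row in rows_types:
    print(row)
```

Unlike the MERGE versions, this form does not depend on id and type combinations being unique in table_b, because it only tests whether each type occurs at least once.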

To retrieve records having only two specific values

I have the following data in a table:
Example Table
ID Value
1 a
1 b
1 c
2 a
2 b
2 c
3 a
3 b
I need to retrieve the IDs that have only the two values a and b,
so I am expecting only the records with ID 3.
Can anyone help me with the query?
I guess you could do something like
select
ID,
sum(case when value = 'a' then 1
when value = 'b' then 1
else 3 end)
from
table1
group by id
having
sum (case when value = 'a' then 1
when value = 'b' then 1
else 3 end) =2
That will work:
select x.id from
(
select id from mytable where value = 'a'
union all
select id from mytable where value = 'b'
) x
group by x.id
having COUNT(*) = 2
and not exists (select * from mytable t where t.id = x.id and value <> 'a' and value <> 'b')
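As a possible alternative to both answers, the check can be done in a single GROUP BY: require exactly two distinct values and no value outside a and b. The sketch below runs it with Python's sqlite3 module; it is not taken from either answer, SQLite is just a convenient engine to run it on, and the rows for ID 4 are extra hypothetical data added to show that a duplicated value does not produce a false match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        # Rows from the question.
        (1, "a"), (1, "b"), (1, "c"),
        (2, "a"), (2, "b"), (2, "c"),
        (3, "a"), (3, "b"),
        # Hypothetical extra rows: 'a' twice, but no 'b'.
        (4, "a"), (4, "a"),
    ],
)

query = """
SELECT id
FROM mytable
GROUP BY id
-- Exactly two distinct values, none of them outside ('a', 'b'),
-- which together means the set of values is exactly {a, b}.
HAVING COUNT(DISTINCT value) = 2
   AND SUM(CASE WHEN value NOT IN ('a', 'b') THEN 1 ELSE 0 END) = 0
ORDER BY id
"""
rows_ab = conn.execute(query).fetchall()
print(rows_ab)
```

Using COUNT(DISTINCT value) rather than a plain sum of 1s is what keeps ID 4 out of the result: two rows with the same value still count as one distinct value.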