Efficient way to pull counts for all permutations of a field - sql

I have an Oracle DB with a table that contains records associated with a person (based on an ID). Each record is categorized as category = 1, 2, or 3.
I would like to pull the following counts:
- # of people with only a category 1 record (no category=2 or 3)
- # of people with only a category 2 record (no category=1 or 3)
- # of people with only a category 3 record (no category=1 or 2)
- # of people with both category 1 & 2 records (no category=3)
- # of people with both category 1 & 3 records (no category=2)
- # of people with both category 2 & 3 records (no category=1)
- # of people with all three category records (1, 2 & 3)
I could only think of the following solution (modified for each case):
select count(*) from table1
where id in (select id from table1 where category=1)
and id not in (select id from table1 where category=2)
and id not in (select id from table1 where category=3)
However, I believe this is highly inefficient. Does anyone have a quicker or better way of getting this information?
Thanks!

One way to do this is to bring the categories together, using listagg() and then reaggregate:
select categories, count(*)
from (select listagg(t1.category, ',') within group (order by t1.category) as categories, t1.id
      from table1 t1
      group by t1.id
     ) x
group by categories;
EDIT:
If the same (id, category) pair can appear more than once, deduplicate first:
select categories, count(*)
from (select listagg(t1.category, ',') within group (order by t1.category) as categories, t1.id
      from (select distinct category, id from table1) t1
      group by t1.id
     ) x
group by categories;

Here is a query that, for each ID, shows the count of distinct categories and the MIN and MAX category. This query can be used as a subquery in further processing (you didn't explain exactly how you want the results to be presented). When CT is 1, the single category is the one in the MIN_CAT column; when CT is 3, all three categories are present for that ID; and when CT is 2, the two categories that are present are in the MIN_CAT and MAX_CAT columns. Whatever else you need to do from here should be simple; for example, you can now GROUP BY CT, MIN_CAT, MAX_CAT and count IDs.
I use count(distinct category) to allow for non-unique (id, category) pairs, as illustrated in the sample data I include in a WITH clause (which is NOT part of the SQL query!).
with
test_data ( id, category ) as (
select 101, 3 from dual union all
select 101, 1 from dual union all
select 101, 3 from dual union all
select 104, 2 from dual union all
select 105, 2 from dual union all
select 105, 2 from dual union all
select 105, 1 from dual union all
select 106, 1 from dual union all
select 106, 2 from dual union all
select 106, 3 from dual union all
select 106, 3 from dual
)
select id,
count(distinct category) as ct,
min(category) as min_cat,
max(category) as max_cat
from test_data
group by id
;
ID CT MIN_CAT MAX_CAT
--- -- ------- -------
101 2 1 3
105 2 1 2
104 1 2 2
106 3 1 3
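The follow-up step mentioned above (grouping by CT, MIN_CAT and MAX_CAT and counting IDs) could look like the sketch below. It reuses the same sample data; the subquery name per_id and the column name num_people are made up for illustration, and the trick relies on there being only three categories.

```sql
with
  test_data ( id, category ) as (
    select 101, 3 from dual union all
    select 101, 1 from dual union all
    select 101, 3 from dual union all
    select 104, 2 from dual union all
    select 105, 2 from dual union all
    select 105, 2 from dual union all
    select 105, 1 from dual union all
    select 106, 1 from dual union all
    select 106, 2 from dual union all
    select 106, 3 from dual union all
    select 106, 3 from dual
  ),
  per_id as (
    -- one row per ID: how many distinct categories, and which ones bound the set
    select id,
           count(distinct category) as ct,
           min(category)            as min_cat,
           max(category)            as max_cat
    from   test_data
    group by id
  )
-- one row per combination of categories, with the number of IDs that have it
select ct, min_cat, max_cat, count(*) as num_people
from   per_id
group by ct, min_cat, max_cat
order by ct, min_cat, max_cat;
```

With this sample data every combination that occurs belongs to exactly one ID, so each of the four result rows has NUM_PEOPLE = 1.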

Oracle Setup:
CREATE TABLE test_data ( id, category ) as
select 101, 3 from dual union all
select 101, 1 from dual union all
select 101, 3 from dual union all
select 104, 2 from dual union all
select 105, 2 from dual union all
select 105, 2 from dual union all
select 105, 1 from dual union all
select 106, 1 from dual union all
select 106, 2 from dual union all
select 106, 3 from dual union all
select 106, 3 from dual union all
select 107, 1 from dual union all
select 107, 3 from dual;
Query:
SELECT c1,
c2,
c3,
LTRIM(
DECODE( c1, 1, ',1' ) || DECODE( c2, 1, ',2' ) || DECODE( c3, 1, ',3' ),
','
) AS categories,
COUNT(1) AS num_people,
LISTAGG( id, ',' ) WITHIN GROUP ( ORDER BY id ) AS people
FROM ( SELECT DISTINCT * FROM test_data )
PIVOT ( COUNT(1) FOR category IN ( 1 AS c1, 2 AS c2, 3 AS c3 ) )
GROUP BY c1, c2, c3;
Output:
C1 C2 C3 CATEGORIES NUM_PEOPLE PEOPLE
-- -- -- ---------- ---------- ----------
 0  1  0 2                   1 104
 1  0  1 1,3                 2 101,107
 1  1  0 1,2                 1 105
 1  1  1 1,2,3               1 106

Related

Oracle listagg - Can I pull data from other table based on the values selected by listagg

I have two tables and want to get data from one table based on the values returned by LISTAGG on the second table:
T1
ID Name
==============
1 Name1
2 Person2
3 Someone3
4 Mr.4
T2
ID Acct
===============
1 1234
1 5678
2 1234
3 5678
3 8769
4 1234
My listagg query on T2 has returned the following:
Acct Id
====== ========
1234 1,2,4
5678 1,3
I need the result to include the names from the other table, something like:
Acct Id Name
====== ======== ==========
1234 1,2,4 Name1, Person2, Mr.4
5678 1,3 Name1, Someone3
Why would you first aggregate the IDs and then put effort into splitting them to collect the NAMEs? Do it all in one step. Not that it can't be done (it can, in a relatively simple manner, but why?!?).
Sample data is in lines #1 - 15; the query you might need begins at line #16.
SQL> with
2 t1 (id, name) as
3 (select 1, 'Name1' from dual union all
4 select 2, 'Person2' from dual union all
5 select 3, 'Someone3' from dual union all
6 select 4, 'Mr4' from dual
7 ),
8 t2 (id, acct) as
9 (select 1, 1234 from dual union all
10 select 1, 5678 from dual union all
11 select 2, 1234 from dual union all
12 select 3, 5678 from dual union all
13 select 3, 8769 from dual union all
14 select 4, 1234 from dual
15 )
16 select b.acct,
17 listagg(b.id, ', ') within group (order by b.id) id,
18 listagg(a.name, ', ') within group (order by b.id) name
19 from t1 a join t2 b on a.id = b.id
20 group by b.acct;
ACCT ID NAME
---------- ---------- --------------------
1234 1, 2, 4 Name1, Person2, Mr4
5678 1, 3 Name1, Someone3
8769 3 Someone3
SQL>
@Littlefoot's answer is absolutely correct. But just as an addition: don't use listagg if you are going to split those aggregated values afterwards. Just use the collect() aggregate function to get the needed data as a collection.
For example:
select
cast(collect(level) as sys.odcinumberlist) as varray_of_numbers,
cast(collect(level) as ORA_MINING_NUMBER_NT) as nested_table_of_numbers,
cast(collect(sys.ku$_objnum(level)) as sys.KU$_OBJNUMSET) as nested_table_of_objnum
from dual connect by level<=3;
--Result:
VARRAY_OF_NUMBERS NESTED_TABLE_OF_NUMBERS NESTED_TABLE_OF_OBJNUM(OBJ_NUM)
------------------------- ------------------------------ ------------------------------------------------------------
ODCINUMBERLIST(1, 2, 3) ORA_MINING_NUMBER_NT(1, 2, 3) KU$_OBJNUMSET(KU$_OBJNUM(1), KU$_OBJNUM(2), KU$_OBJNUM(3))
Update: This is a query for your tables, as you asked in the comment:
select b.acct,
cast(collect(b.id) as ORA_MINING_NUMBER_NT) as nested_table_of_numbers,
cast(collect(a.name) as ORA_MINING_VARCHAR2_NT) as nested_table_of_varchar2
-- listagg(b.id, ', ') within group (order by b.id) id,
-- listagg(a.name, ', ') within group (order by b.id) name
from t1 a join t2 b on a.id = b.id
group by b.acct;
Full example:
with
t1 (id, name) as
(select 1, 'Name1' from dual union all
select 2, 'Person2' from dual union all
select 3, 'Someone3' from dual union all
select 4, 'Mr4' from dual
),
t2 (id, acct) as
(select 1, 1234 from dual union all
select 1, 5678 from dual union all
select 2, 1234 from dual union all
select 3, 5678 from dual union all
select 3, 8769 from dual union all
select 4, 1234 from dual
)
select b.acct,
cast(collect(b.id) as ORA_MINING_NUMBER_NT) as nested_table_of_numbers,
cast(collect(a.name) as ORA_MINING_VARCHAR2_NT) as nested_table_of_varchar2
-- listagg(b.id, ', ') within group (order by b.id) id,
-- listagg(a.name, ', ') within group (order by b.id) name
from t1 a join t2 b on a.id = b.id
group by b.acct;
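To illustrate why a collection is easier to work with than a LISTAGG string, here is a minimal sketch that turns the collected IDs back into rows with TABLE(), with no string splitting involved. It uses sys.odcinumberlist rather than ORA_MINING_NUMBER_NT purely as an assumption about which type is available; the subquery name agg and the aliases a and i are made up for illustration.

```sql
with
  t2 (id, acct) as (
    select 1, 1234 from dual union all
    select 1, 5678 from dual union all
    select 2, 1234 from dual union all
    select 3, 5678 from dual union all
    select 3, 8769 from dual union all
    select 4, 1234 from dual
  ),
  agg as (
    -- one row per account, with its IDs held in a collection
    select acct,
           cast(collect(id) as sys.odcinumberlist) as ids
    from   t2
    group by acct
  )
select a.acct,
       i.column_value as id   -- COLUMN_VALUE exposes each scalar element
from   agg a,
       table(a.ids) i         -- unnest the collection back into rows
order by a.acct, i.column_value;
```

The result has one row per (acct, id) pair again, which can then be joined to t1 on id in the usual way.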

converting comma separated value to multiple rows

I have a table like this:
ID NAME Dept_ID
1 a 2,3
2 b
3 c 1,2
Department is another table, with dept_id and dept_name as columns. I want a result like this:
ID Name Dept_ID
1 a 2
1 a 3
2 b
3 c 1
3 c 2
Any help, please?
You can do it as:
--Dataset Preparation
with tab(ID, NAME,Dept_ID) as (Select 1, 'a', '2,3' from dual
UNION ALL
Select 2, 'b','' from dual
UNION ALL
Select 3, 'c' , '1,2' from dual)
--Actual Query
select distinct ID, NAME, regexp_substr(DEPT_ID,'[^,]+', 1, level)
from tab
connect by regexp_substr(DEPT_ID,'[^,]+', 1, level) is not null
order by 1;
Edit, in reply to the comment "Based on which column do I need to join? In one table I have comma-separated IDs and in the other table I have just IDs":
with tab(ID, NAME,Dept_ID) as (Select 1, 'a', '2,3' from dual
UNION ALL
Select 2, 'b','' from dual
UNION ALL
Select 3, 'c' , '1,2' from dual) ,
--Table Dept
tbl_dept (dep_id,depname) as ( Select 1,'depa' from dual
UNION ALL
Select 2,'depb' from dual
UNION ALL
Select 3,'depc' from dual
) ,
--Separating column values for the join. Start your query from here using a WITH clause, since you already have the two tables.
tab_1 as (select distinct ID, NAME, regexp_substr(DEPT_ID,'[^,]+', 1, level) col3
from tab
connect by regexp_substr(DEPT_ID,'[^,]+', 1, level) is not null
order by 1)
--Joining table.
Select t.id,t.name,t.col3,dt.depname
from tab_1 t
left outer join tbl_dept dt
on t.col3 = dt.dep_id
order by 1;

with tmp_tbl as(
select
1 ID,
'a' NAME,
'2,3' DEPT_ID
from dual
union all
select
2 ID,
'b' NAME,
'' DEPT_ID
from dual
union all
select
3 ID,
'c' NAME,
'1,2' DEPT_ID
from dual)
select
tmp_out.ID,
tmp_out.NAME,
trim(tmp_out.DEPT_ID_splited)
from(
select
tmp.ID,
tmp.NAME,
regexp_substr(tmp.DEPT_ID,'[^,]+', 1, level) DEPT_ID_splited
from
tmp_tbl tmp
connect by
regexp_substr(tmp.DEPT_ID,'[^,]+', 1, level) is not null) tmp_out
group by
tmp_out.ID,
tmp_out.NAME,
tmp_out.DEPT_ID_splited
order by
tmp_out.ID,
tmp_out.DEPT_ID_splited
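Both queries above run CONNECT BY across all rows at once and then discard the extra rows with DISTINCT or GROUP BY, which can become expensive on larger tables. A commonly used variant, sketched below under the assumption that ID is unique per row and that REGEXP_COUNT is available (Oracle 11g or later), restricts the hierarchy to each source row so that no duplicates are generated in the first place.

```sql
with tab (id, name, dept_id) as (
  select 1, 'a', '2,3' from dual union all
  select 2, 'b', ''    from dual union all   -- '' is NULL in Oracle
  select 3, 'c', '1,2' from dual
)
select id,
       name,
       regexp_substr(dept_id, '[^,]+', 1, level) as dept_id
from   tab
connect by level <= regexp_count(dept_id, ',') + 1  -- one level per list element
       and prior id = id                            -- stay within the same source row
       and prior sys_guid() is not null             -- avoids the CONNECT BY loop error
order by id, level;
```

Rows with an empty DEPT_ID still appear once, with a NULL department, because every source row is a root of the hierarchy at level 1.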

Oracle SQL compare aggregated lines

Kind of stuck in relatively simple SQL...
Could someone propose some code for retrieving the GroupID of aggregated lines (group by GroupID) whose aValue differs?
For example, in the table below I'd need to get GroupID 4, as the two items within that group have different aValue values.
GroupId ItemID aValue
4 19 Hello
4 20 Hello1
5 78 Hello5
5 86 Hello5
You can use the having clause and look at the count of distinct values:
-- CTE for your sample data
with your_table (groupid, itemid, avalue) as (
select 4, 19, 'Hello' from dual
union all select 4, 20, 'Hello1' from dual
union all select 5, 78, 'Hello5' from dual
union all select 5, 86, 'Hello5' from dual
)
select groupid
from your_table
group by groupid
having count(distinct avalue) > 1;
GROUPID
----------
4
If you actually also want to see the individual values, you can use an analytic count in a subquery and filter that with where instead of having:
-- CTE for your sample data
with your_table (groupid, itemid, avalue) as (
select 4, 19, 'Hello' from dual
union all select 4, 20, 'Hello1' from dual
union all select 5, 78, 'Hello5' from dual
union all select 5, 86, 'Hello5' from dual
)
select groupid, itemid, avalue
from (
select groupid, itemid, avalue,
count(distinct avalue) over (partition by groupid) as value_count
from your_table
)
where value_count > 1;
GROUPID ITEMID AVALUE
---------- ---------- ------
4 19 Hello
4 20 Hello1
I would do this as:
select GroupId
from your_table t
group by GroupId
having min(aValue) <> max(aValue);
However, if you want all columns/expressions, then you can use EXISTS:
select t.*
from your_table t
where exists (select 1
              from your_table t1
              where t1.GroupId = t.GroupId
              and   t1.avalue <> t.avalue
             );

How to select the minimum value in a table or the next one in oracle sql

I have a table L1_CI_PER_ADDRESS with these columns
PER_ID,
SEQ_NUM,
ADDRESS_ID,
ADDRESS_TYPE_XFLG,
START_DT,
END_DT,
SEASON_START_MMDD,
SEASON_END_MMDD,
ADDRESS_PRIO_FLG,
DELIVERABLE_FLG,
VERSION,
LOAD_DATE
I want to select the row where ADDRESS_TYPE_XFLG is MAIN-AE if it exists, or MAIN-EN if the first one does not exist. Otherwise I want to select CORRESPOND-AE, or CORRESPOND-EN if neither MAIN-AE nor MAIN-EN exists.
How can I do this? I am new to Oracle SQL. I want to remove the duplicates returned when I do my select.
One of the issues is that some person ID's have all four (MAIN-AE, MAIN-EN, CORRESPOND-AE, CORRESPOND-EN), so in this case I just want MAIN-AE to be returned.
I hope my question is clear.
It's a top-n query. Use row_number():
select *
from (
select PER_ID, address_id, ADDRESS_TYPE_XFLG,
row_number() over (partition by per_id
order by case ADDRESS_TYPE_XFLG
when 'MAIN-AE' then 1
when 'MAIN-EN' then 2
when 'CORRESPOND-AE' then 3
when 'CORRESPOND-EN' then 4
end) as rn
from L1_CI_PER_ADDRESS)
where rn = 1
If a person can have two addresses with the same flag, then you need to add a tie-breaker to the ORDER BY after the CASE expression, probably something like , seq_num desc.
Test:
with L1_CI_PER_ADDRESS(PER_ID, address_id, ADDRESS_TYPE_XFLG ) as (
select 1, 1, 'CORRESPOND-AE' from dual union all
select 1, 2, 'MAIN-AE' from dual union all
select 1, 3, 'CORRESPOND-EN' from dual union all
select 1, 4, 'MAIN-EN' from dual union all
select 2, 5, 'CORRESPOND-AE' from dual union all
select 3, 6, 'MAIN-AE' from dual union all
select 4, 7, 'CORRESPOND-EN' from dual union all
select 4, 8, 'MAIN-AE' from dual
)
select PER_ID, address_id, ADDRESS_TYPE_XFLG
from (
select PER_ID, address_id, ADDRESS_TYPE_XFLG,
row_number() over (partition by per_id
order by case ADDRESS_TYPE_XFLG
when 'MAIN-AE' then 1
when 'MAIN-EN' then 2
when 'CORRESPOND-AE' then 3
when 'CORRESPOND-EN' then 4
end) as rn
from L1_CI_PER_ADDRESS)
where rn = 1
Output:
PER_ID ADDRESS_ID ADDRESS_TYPE_XFLG
---------- ---------- -----------------
1 2 MAIN-AE
2 5 CORRESPOND-AE
3 6 MAIN-AE
4 8 MAIN-AE
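The tie-breaker suggested above for people with several addresses of the same type might look like the sketch below. That a higher SEQ_NUM means a more recent address is an assumption; use whichever column actually decides between duplicates in your data.

```sql
select per_id, address_id, address_type_xflg
from (
  select per_id, address_id, address_type_xflg,
         row_number() over (
           partition by per_id
           order by case address_type_xflg
                      when 'MAIN-AE'       then 1
                      when 'MAIN-EN'       then 2
                      when 'CORRESPOND-AE' then 3
                      when 'CORRESPOND-EN' then 4
                    end,
                    seq_num desc   -- assumption: higher SEQ_NUM wins among equal types
         ) as rn
  from l1_ci_per_address
)
where rn = 1;
```

Without the second sort key, row_number() picks arbitrarily among rows with the same address type, so repeated runs could return different addresses.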

How to do select count(*) group by and select * at same time?

For example, I have table:
ID | Value
1 hi
1 yo
2 foo
2 bar
2 hehe
3 ha
6 gaga
I want my query to return ID and Value, with the result set ordered by how frequently each ID occurs.
I tried the query below but don't know how to get the ID and Value column at the same time:
SELECT COUNT(*) FROM TABLE group by ID order by COUNT(*) desc;
The count number doesn't matter to me, I just need the data to be in such order.
Desired result:
ID | Value
2 foo
2 bar
2 hehe
1 hi
1 yo
3 ha
6 gaga
As you can see, because ID 2 appears most often (3 times), it comes first in the list, then ID 1 (2 times), and so on.
You can try this:
select id, value, count(*) over (partition by id) freq_count
from
(
select 2 as ID, 'foo' as value
from dual
union all
select 2, 'bar'
from dual
union all
select 2, 'hehe'
from dual
union all
select 1 , 'hi'
from dual
union all
select 1 , 'yo'
from dual
union all
select 3 , 'ha'
from dual
union all
select 6 , 'gaga'
from dual
)
order by 3 desc;
select t.id, t.value
from TABLE t
inner join
(
SELECT id, count(*) as cnt
FROM TABLE
group by ID
)
x on x.id = t.id
order by x.cnt desc
How about something like
SELECT t.ID,
t.Value,
c.Cnt
FROM TABLE t INNER JOIN
(
SELECT ID,
COUNT(*) Cnt
FROM TABLE
GROUP BY ID
) c ON t.ID = c.ID
ORDER BY c.Cnt DESC
I see the question is already answered, but since the most obvious and simplest solution is missing, I'm posting it anyway. It uses neither self-joins nor subqueries:
create table t (id, value)
as
select 1, 'hi' from dual union all
select 1, 'yo' from dual union all
select 2, 'foo' from dual union all
select 2, 'bar' from dual union all
select 2, 'hehe' from dual union all
select 3, 'ha' from dual union all
select 6, 'gaga' from dual;
Table created.
select id
, value
from t
order by count(*) over (partition by id) desc;
ID VALU
---------- ----
2 bar
2 hehe
2 foo
1 yo
1 hi
6 gaga
3 ha
7 rows selected.
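One caveat with ordering by the analytic count alone: if two different IDs occur the same number of times, their rows are not guaranteed to stay together. A small sketch, assuming the table t created above, adds id as a secondary sort key to keep each ID's rows adjacent.

```sql
select id, value
from   t
order by count(*) over (partition by id) desc,  -- most frequent IDs first
         id;                                    -- keeps rows of equally frequent IDs grouped
```

The order of the values within a single ID is still unspecified; add a third sort key if that matters.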