Oracle SQL query count distinct branches - sql

I want to know, for each branch, how many employees have a manager (mgr) and how many have none. The table looks like this:
Emp Branch Mgr
EmpA Branch1 Mgr1
EmpB Branch2 Mgr2
EmpC Branch1 Mgr2
EmpD Branch1
EmpE Branch2 Mgr2
EmpF Branch1 Mgr2
The output I want looks like this:
Branch HasMgr HasNoMgr
Branch1 3 1
Branch2 2 0
I already tried this query, but the result is wrong:
SELECT branches,
(SELECT COUNT(*) FROM tbl WHERE mgr IS NULL),
(SELECT COUNT(*) FROM tbl WHERE mgr IS NOT NULL )
FROM tbl GROUP BY branches

Use conditional aggregation: a SUM over a CASE expression counts, per branch, the rows with and without a manager. Hope this helps.
SELECT branch,
SUM(case when Mgr is not null then 1 else 0 end) hasmgr,
SUM(case when Mgr is not null then 0 else 1 end) hasnomgr
FROM tbl
GROUP by branch;
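
To see why the query in the question returns the same totals for every branch while the SUM/CASE version does not, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption, made only so the example runs on its own); the table name tbl and the rows come from the question, and the column is called branch as in the answer above.

```python
import sqlite3

# In-memory stand-in for the asker's table (SQLite instead of Oracle is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (emp TEXT, branch TEXT, mgr TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("EmpA", "Branch1", "Mgr1"),
        ("EmpB", "Branch2", "Mgr2"),
        ("EmpC", "Branch1", "Mgr2"),
        ("EmpD", "Branch1", None),
        ("EmpE", "Branch2", "Mgr2"),
        ("EmpF", "Branch1", "Mgr2"),
    ],
)

# Uncorrelated scalar subqueries scan the whole table,
# so every branch row receives the same two totals.
uncorrelated = conn.execute(
    """
    SELECT branch,
           (SELECT COUNT(*) FROM tbl WHERE mgr IS NOT NULL) AS hasmgr,
           (SELECT COUNT(*) FROM tbl WHERE mgr IS NULL) AS hasnomgr
    FROM tbl
    GROUP BY branch
    ORDER BY branch
    """
).fetchall()

# Conditional aggregation is evaluated inside each GROUP BY group.
per_branch = conn.execute(
    """
    SELECT branch,
           SUM(CASE WHEN mgr IS NOT NULL THEN 1 ELSE 0 END) AS hasmgr,
           SUM(CASE WHEN mgr IS NULL THEN 1 ELSE 0 END) AS hasnomgr
    FROM tbl
    GROUP BY branch
    ORDER BY branch
    """
).fetchall()

print(uncorrelated)  # [('Branch1', 5, 1), ('Branch2', 5, 1)]
print(per_branch)    # [('Branch1', 3, 1), ('Branch2', 2, 0)]
```

The second result matches the output requested in the question.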

-- Sample data from the question; the header row is not data, and Oracle needs FROM dual
WITH dat AS (
SELECT 'EmpA' emp, 'Branch1' branch, 'Mgr1' manager FROM dual UNION ALL
SELECT 'EmpB', 'Branch2', 'Mgr2' FROM dual UNION ALL
SELECT 'EmpC', 'Branch1', 'Mgr2' FROM dual UNION ALL
SELECT 'EmpD', 'Branch1', NULL FROM dual UNION ALL
SELECT 'EmpE', 'Branch2', 'Mgr2' FROM dual UNION ALL
SELECT 'EmpF', 'Branch1', 'Mgr2' FROM dual
)
SELECT branch,
COUNT(manager) AS hasMgr, -- COUNT(column) skips NULLs
SUM(CASE WHEN manager IS NULL THEN 1 ELSE 0 END) AS hasNoMgr
FROM dat
GROUP BY branch

select branch,
sum(decode(mgr,null,0,1)) as "hasmgr",
sum(decode(mgr,null,1,0)) as "hasnomgr"
FROM TAB1
GROUP BY BRANCH

Related

SQL - Finding Duplicate Records based certain criteria

I have these records in the table employee_projects:
id employee_id project_id status
1 emp1 proj1 VERIFIED
2 emp2 proj2 REJECTED
3 emp1 proj1 VERIFIED
4 emp1 proj3 REJECTED
5 emp2 proj2 REQUIRED
6 emp3 proj4 SUBMITTED
7 emp4 proj5 VERIFIED
8 emp4 proj6 VERIFIED
9 emp3 proj4 REQUIRED
Here are the criteria for determining duplicates:
1. Same employee ID and same project ID with the same status (example: rows 1 and 3 are duplicates).
2. Same employee ID and same project ID but with different statuses (example: rows 6 and 9 are duplicates).
An exception to criterion 2: if a project is REQUIRED and the same project is also REJECTED for the same employee, this is NOT considered a duplicate. For example, rows 2 and 5 are NOT duplicates.
I have a query for the first criterion:
select
emp_id,
proj_id,
status,
COUNT(*)
from
employee_projects
group by
emp_id,
proj_id,
status
having
COUNT(*) > 1
What I'm struggling to construct is the SQL for the second criterion.
Maybe a self join can help you:
with t (employee_id ,project_id,status)
as
(
select 'emp1', 'proj1' , 'VERIFIED'
Union all select 'emp2', 'proj2' , 'REJECTED'
Union all select 'emp1', 'proj1' , 'VERIFIED'
Union all select 'emp1', 'proj3' , 'REJECTED'
Union all select 'emp2', 'proj2' , 'REQUIRED'
Union all select 'emp3', 'proj4' , 'SUBMITTED'
Union all select 'emp4', 'proj5' , 'VERIFIED'
Union all select 'emp4', 'proj6' , 'VERIFIED'
Union all select 'emp3', 'proj4' , 'REQUIRED'
)
select
t.employee_id,
t.project_id,
t.status,
'' as status,
'criteria#1' as SQL
from
t
group by
t.employee_id,
t.project_id,
t.status
having
COUNT(*) > 1
union all
SELECT
t.employee_id,
t.project_id,
t.status,
a.status,
'criteria#2' as SQL
FROM
t
left join t as a on
t.employee_id = a.employee_id and
t.project_id = a.project_id
where
t.status != a.status and
concat(t.status,a.status) != 'REQUIREDREJECTED' and
concat(t.status,a.status) != 'REJECTEDREQUIRED'
Try the following:
select T.emp_id, T.proj_id, T.status, D.dup_cnt
from employee_projects T join
(
select emp_id, proj_id, count(*) as dup_cnt
from employee_projects
group by emp_id, proj_id
having count(*) > 1 and
count(distinct case when status in ('REQUIRED', 'REJECTED') then status end) < 2
) D
on T.emp_id = D.emp_id and T.proj_id = D.proj_id
order by T.emp_id, T.proj_id
If you also want to treat a project that has REQUIRED, REJECTED and at least one other status as a duplicate, modify the HAVING clause as follows:
select T.emp_id, T.proj_id, T.status, D.dup_cnt
from employee_projects T join
(
select emp_id, proj_id, count(*) as dup_cnt
from employee_projects
group by emp_id, proj_id
having count(*) > 1 and
(count(distinct case when status in ('REQUIRED', 'REJECTED') then status end) < 2 or count(distinct status) > 2)
) D
on T.emp_id = D.emp_id and T.proj_id = D.proj_id
order by T.emp_id, T.proj_id
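
As a quick check of the first HAVING clause against the sample rows, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module (an assumption; the question does not name the database). The column names emp_id and proj_id follow the queries above, and an id column is added so the matching rows can be listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_projects (id INTEGER, emp_id TEXT, proj_id TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO employee_projects VALUES (?, ?, ?, ?)",
    [
        (1, "emp1", "proj1", "VERIFIED"),
        (2, "emp2", "proj2", "REJECTED"),
        (3, "emp1", "proj1", "VERIFIED"),
        (4, "emp1", "proj3", "REJECTED"),
        (5, "emp2", "proj2", "REQUIRED"),
        (6, "emp3", "proj4", "SUBMITTED"),
        (7, "emp4", "proj5", "VERIFIED"),
        (8, "emp4", "proj6", "VERIFIED"),
        (9, "emp3", "proj4", "REQUIRED"),
    ],
)

# A group is a duplicate when it has more than one row, unless its statuses
# include both REQUIRED and REJECTED (the exception from the question).
dup_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.id
        FROM employee_projects t
        JOIN (
            SELECT emp_id, proj_id
            FROM employee_projects
            GROUP BY emp_id, proj_id
            HAVING COUNT(*) > 1
               AND COUNT(DISTINCT CASE WHEN status IN ('REQUIRED', 'REJECTED')
                                       THEN status END) < 2
        ) d ON t.emp_id = d.emp_id AND t.proj_id = d.proj_id
        ORDER BY t.id
        """
    )
]

print(dup_ids)  # [1, 3, 6, 9]
```

Rows 1 and 3 satisfy the first criterion, rows 6 and 9 the second, and rows 2 and 5 are left out by the exception.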

how to transpose rows to columns for the grouped data?

While analyzing employees and supervisors, I ran into trouble with this BigQuery statement:
SELECT SupervisorName, Emp_Status, COUNT(DISTINCT EmpNO)AS NUMBER
FROM
(SELECT
EmpNO,
EmpName,
(CASE WHEN TerminationDate IS NULL THEN 'Active'
ELSE 'Terminated'
END
)AS Emp_Status,
DateOfBirth,
DATE_DIFF(CURRENT_DATE(),DateOfBirth,YEAR) AS Age,
SupervisorName
FROM `Table1`
)
GROUP BY SupervisorName, Emp_Status
ORDER BY SupervisorName, NUMBER DESC
The result is shown below:
Row SupervisorName Emp_Status NUMBER
1 null Terminated 321
2 null Active 2
3 Ahearn Active 3
4 Ahearn Terminated 2
5 Allen Active 6
6 Allen Terminated 3
......
How can I change it to look like this:
Row SupervisorName Active Termination Total
1 Null 2 321 323
2 Ahearn 3 2 5
3 Allen 6 3 9
......
The standard pattern here is to use SUM and CASE to get the result -- as below:
SELECT
SupervisorName,
SUM(CASE WHEN Emp_Status = 'Active' THEN 1 ELSE 0 END) AS Active,
SUM(CASE WHEN Emp_Status = 'Terminated' THEN 1 ELSE 0 END) AS Termination,
COUNT(*) AS Total
FROM (
SELECT
EmpNO,
EmpName,
CASE WHEN TerminationDate IS NULL THEN 'Active' ELSE 'Terminated' END AS Emp_Status,
DateOfBirth,
DATE_DIFF(CURRENT_DATE(),DateOfBirth,YEAR) AS Age,
SupervisorName
FROM `Table1`
)
GROUP BY SupervisorName
Note that I kept your sub-query, but as given you don't actually need it: the CASE expressions can test TerminationDate directly instead of the string created in the sub-query.
I assume your actual code is more complicated, so I left it as it is.
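
The version without the sub-query could look like the sketch below. It runs in SQLite through Python's sqlite3 module rather than BigQuery (an assumption made so the example is self-contained), and the six rows are invented for illustration; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (EmpNO INTEGER, SupervisorName TEXT, TerminationDate TEXT)"
)
# Hypothetical rows, not the asker's data.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, "Ahearn", None),
        (2, "Ahearn", "2019-03-01"),
        (3, "Ahearn", None),
        (4, "Allen", None),
        (5, "Allen", "2020-07-15"),
        (6, None, "2018-01-31"),
    ],
)

# The CASE expressions test TerminationDate directly, so no derived
# Emp_Status column (and no sub-query) is needed.
result = conn.execute(
    """
    SELECT SupervisorName,
           SUM(CASE WHEN TerminationDate IS NULL THEN 1 ELSE 0 END) AS Active,
           SUM(CASE WHEN TerminationDate IS NOT NULL THEN 1 ELSE 0 END) AS Termination,
           COUNT(*) AS Total
    FROM Table1
    GROUP BY SupervisorName
    ORDER BY SupervisorName
    """
).fetchall()

for row in result:
    print(row)
```

Like the query above, this counts rows rather than distinct EmpNO values, so it matches COUNT(DISTINCT EmpNO) only if each employee appears once.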
Maybe this is what you want:
select
SupervisorName,
count(distinct if(TerminationDate is null, EmpNO, null)) as active,
count(distinct if(TerminationDate is null, null, EmpNO)) as terminated,
count(distinct EmpNO) as dist_total,
count(*) as total
from
`Table1`
where
-- you should use keyword "date" and iso8601 format
LastHireDate between date'2018-01-01'
and current_date()
group by
1
order by
1, 4 desc

Display even empty records in output

I am querying data (multiple columns) for different item types through a UNION of different queries. If there are no values in any of those columns for a particular item type, that record does not show up. But, I need all rows (including empty ones) pertaining to each item type. The empty rows can show 0.
My data is:
create table sales_table ([yr] int, [qtr] varchar(40), [item_type] varchar(40), [sale_price] int);
create table profit_table ([yr] int, [qtr] varchar(40), [item_type] varchar(40), [profit] int);
create table item_table ([item_type] varchar(40));
insert into sales_table values
(2010,'Q1','abc',31),(2010,'Q1','def',23),(2010,'Q1','mno',12),(2010,'Q1','xyz',7),(2010,'Q2','abc',54),(2010,'Q2','def',67),(2010,'Q2','mno',92),(2010,'Q2','xyz',8);
insert into profit_table values
(2010,'Q1','abc',10),(2010,'Q1','def',6),(2010,'Q1','mno',23),(2010,'Q1','xyz',7),(2010,'Q2','abc',21),(2010,'Q2','def',13),(2010,'Q2','mno',15),(2010,'Q2','xyz',2);
insert into item_table values
('abc'),('def'),('ghi'),('jkl'),('mno'),('xyz');
My Query is:
SELECT a.yr, a.qtr, b.item_type, MAX(a.sales), MAX(a.avg_price), MAX(a.profit)
FROM
(SELECT [yr], [qtr],
CASE
WHEN item_type = 'abc' THEN 'ABC'
WHEN item_type = 'def' THEN 'DEF'
WHEN item_type = 'ghi' THEN 'GHI'
WHEN item_type = 'jkl' THEN 'JKL'
WHEN item_type IN ('mno', 'xyz') THEN 'Other'
END AS [item_type],
COUNT(sale_price) OVER (PARTITION BY yr, qtr, item_type) [sales],
AVG(sale_price) OVER (PARTITION BY yr, qtr, item_type) [avg_price],
NULL [profit]
FROM sales_table
WHERE yr >=2010
UNION ALL
SELECT yr, qtr,
CASE
WHEN item_type = 'abc' THEN 'ABC'
WHEN item_type = 'def' THEN 'DEF'
WHEN item_type = 'ghi' THEN 'GHI'
WHEN item_type = 'jkl' THEN 'JKL'
WHEN item_type IN ('mno', 'xyz') THEN 'Other'
END AS [item_type],
NULL [sales],
NULL [avg_price],
SUM(profit) OVER (PARTITION BY yr, qtr, item_type) [profit]
FROM profit_table
WHERE yr >=2010
) a
FULL OUTER JOIN
(SELECT
CASE
WHEN item_type = 'abc' THEN 'ABC'
WHEN item_type = 'def' THEN 'DEF'
WHEN item_type = 'ghi' THEN 'GHI'
WHEN item_type = 'jkl' THEN 'JKL'
WHEN item_type IN ('mno', 'xyz') THEN 'Other'
END AS [item_type]
FROM item_table
WHERE item_type in ('abc','def','ghi','jkl','mno','xyz')
) b
ON a.item_type = b.item_type
GROUP BY a.yr, a.qtr, b.item_type
ORDER BY a.yr, a.qtr, b.item_type;
The current output is like this:
yr qtr item_type sales avg_price profit
(null) (null) GHI (null) (null) (null)
(null) (null) JKL (null) (null) (null)
2010 Q1 ABC 1 31 10
2010 Q1 DEF 1 23 6
2010 Q1 Other 1 12 23
2010 Q2 ABC 1 54 21
2010 Q2 DEF 1 67 13
2010 Q2 Other 1 92 15
What I want is shown below:
yr qtr item_type sales avg_price profit
2010 Q1 ABC 1 31 10
2010 Q1 DEF 1 23 6
2010 Q1 GHI 0 0 0
2010 Q1 JKL 0 0 0
2010 Q1 Other 2 9.5 30
2010 Q2 ABC 1 54 21
2010 Q2 DEF 1 67 13
2010 Q2 GHI 0 0 0
2010 Q2 JKL 0 0 0
2010 Q2 Other 2 50 17
Please advise.
Here is another option using union all + group by
select
max(max([year])) over ()
, max(max([quarter])) over ()
, [item_type]
, isnull(max([sales]), 0)
, isnull(max([avg price]), 0)
, isnull(max([profit]), 0)
from (
SELECT [year], [quarter], [item_type], [sales], [avg price], [profit]
FROM sales_table
union all
select distinct null, null, item_type, null, null, null
from item_table
) t
group by [item_type]
Your query with full join should work, but you must deal with null values. And I think it should look like:
SELECT
max(a.year) over ()
, max(a.quarter) over ()
, b.item_type
, isnull(a.sales, 0)
, isnull(a.avg_price, 0)
, isnull(a.profit, 0)
FROM
(SELECT [year], [quarter], [item_type], [sales], [avg price], [profit]
FROM sales_table) a
FULL OUTER JOIN
(SELECT item_type FROM item_table) b
ON a.item_type = b.item_type
ORDER BY a.year, a.quarter, b.item_type
Got it to work.
The key was to cross join the item types with the dates (this example needs a temporary calendar table) and then left join the calculated results from sales_table and profit_table.
create table #date_table ([yr] int, [qtr] varchar(40));
insert into #date_table values
(2010,'Q1'),(2010,'Q2'), (2010,'Q3'),(2010,'Q4');
SELECT
b.yr
, b.qtr
, b.item_type
, COALESCE(MAX(a.sales),0) AS sales
, COALESCE(MAX(a.avg_price),0) AS avg_price
, COALESCE(MAX(a.profit),0) AS profit
FROM
(
SELECT
dt.[yr]
,dt.[qtr]
,CASE
WHEN it.[item_type] IN ('mno', 'xyz') THEN 'Other'
ELSE UPPER(it.[item_type])
END AS [item_type]
FROM
#date_table AS dt
CROSS JOIN
#item_table AS it
WHERE
dt.[yr] >=2010
GROUP BY
dt.[yr]
,dt.[qtr]
,CASE
WHEN it.[item_type] IN ('mno', 'xyz') THEN 'Other'
ELSE UPPER(it.[item_type])
END
) AS b
LEFT JOIN
(SELECT [yr], [qtr],
CASE
WHEN item_type IN ('mno', 'xyz') THEN 'Other'
ELSE UPPER([item_type])
END AS [item_type],
COUNT(sale_price) OVER (PARTITION BY yr, qtr, item_type) [sales],
AVG(sale_price) OVER (PARTITION BY yr, qtr, item_type) [avg_price],
NULL [profit]
FROM #sales_table
WHERE yr >=2010
UNION ALL
SELECT yr, qtr,
CASE
WHEN item_type IN ('mno', 'xyz') THEN 'Other'
ELSE UPPER([item_type])
END AS [item_type],
NULL [sales],
NULL [avg_price],
SUM(profit) OVER (PARTITION BY yr, qtr, item_type) [profit]
FROM #profit_table
WHERE yr >=2010
) a
ON
a.[yr] = b.[yr]
AND
a.[qtr] = b.[qtr]
AND
a.[item_type] = b.[item_type]
GROUP BY
b.yr, b.qtr, b.item_type
ORDER BY b.yr, b.qtr, b.item_type;
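
The cross join plus left join idea can be sketched end to end as follows. This is a variant, not the query above: it runs in SQLite through Python's sqlite3 module (an assumption), uses plain tables instead of temp tables, fills the calendar table with only Q1 and Q2, and aggregates with GROUP BY on the mapped label so that 'Other' combines mno and xyz. The CTE names label, grid, sales_agg and profit_agg are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales_table (yr INTEGER, qtr TEXT, item_type TEXT, sale_price INTEGER);
    CREATE TABLE profit_table (yr INTEGER, qtr TEXT, item_type TEXT, profit INTEGER);
    CREATE TABLE item_table (item_type TEXT);
    CREATE TABLE date_table (yr INTEGER, qtr TEXT);
    INSERT INTO sales_table VALUES
      (2010,'Q1','abc',31),(2010,'Q1','def',23),(2010,'Q1','mno',12),(2010,'Q1','xyz',7),
      (2010,'Q2','abc',54),(2010,'Q2','def',67),(2010,'Q2','mno',92),(2010,'Q2','xyz',8);
    INSERT INTO profit_table VALUES
      (2010,'Q1','abc',10),(2010,'Q1','def',6),(2010,'Q1','mno',23),(2010,'Q1','xyz',7),
      (2010,'Q2','abc',21),(2010,'Q2','def',13),(2010,'Q2','mno',15),(2010,'Q2','xyz',2);
    INSERT INTO item_table VALUES ('abc'),('def'),('ghi'),('jkl'),('mno'),('xyz');
    INSERT INTO date_table VALUES (2010,'Q1'),(2010,'Q2');
    """
)

rows = conn.execute(
    """
    WITH label AS (          -- raw item_type -> reporting label
        SELECT item_type,
               CASE WHEN item_type IN ('mno', 'xyz') THEN 'Other'
                    ELSE UPPER(item_type) END AS grp
        FROM item_table
    ),
    grid AS (                -- every (yr, qtr, label) combination, data or not
        SELECT DISTINCT d.yr, d.qtr, l.grp
        FROM date_table d CROSS JOIN label l
    ),
    sales_agg AS (
        SELECT s.yr, s.qtr, l.grp,
               COUNT(*) AS sales, AVG(s.sale_price) AS avg_price
        FROM sales_table s JOIN label l ON l.item_type = s.item_type
        GROUP BY s.yr, s.qtr, l.grp
    ),
    profit_agg AS (
        SELECT p.yr, p.qtr, l.grp, SUM(p.profit) AS profit
        FROM profit_table p JOIN label l ON l.item_type = p.item_type
        GROUP BY p.yr, p.qtr, l.grp
    )
    SELECT g.yr, g.qtr, g.grp,
           COALESCE(s.sales, 0), COALESCE(s.avg_price, 0), COALESCE(pr.profit, 0)
    FROM grid g
    LEFT JOIN sales_agg s
           ON s.yr = g.yr AND s.qtr = g.qtr AND s.grp = g.grp
    LEFT JOIN profit_agg pr
           ON pr.yr = g.yr AND pr.qtr = g.qtr AND pr.grp = g.grp
    ORDER BY g.yr, g.qtr, g.grp
    """
).fetchall()

for row in rows:
    print(row)
```

The grid drives the result, so GHI and JKL appear in every quarter with zeros even though neither table has rows for them.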

Tuning oracle subquery in select statement

I have a master table and a reference table as below.
WITH MAS as (
SELECT 10 as CUSTOMER_ID, 1 PROCESS_ID, 44 PROCESS_TYPE, 200 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 1 PROCESS_ID, 44 PROCESS_TYPE, 250 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 2 PROCESS_ID, 45 PROCESS_TYPE, 300 as AMOUNT FROM DUAL UNION ALL
SELECT 10 as CUSTOMER_ID, 2 PROCESS_ID, 45 PROCESS_TYPE, 350 as AMOUNT FROM DUAL
), REFTAB as (
SELECT 44 PROCESS_TYPE, 'A' GROUP_ID FROM DUAL UNION ALL
SELECT 44 PROCESS_TYPE, 'B' GROUP_ID FROM DUAL UNION ALL
SELECT 45 PROCESS_TYPE, 'C' GROUP_ID FROM DUAL UNION ALL
SELECT 45 PROCESS_TYPE, 'D' GROUP_ID FROM DUAL
) SELECT ...
My first select statement which works correctly is this one:
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN PROCESS_TYPE IN (SELECT PROCESS_TYPE FROM REFTAB WHERE GROUP_ID = 'A')
THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN PROCESS_TYPE IN (SELECT PROCESS_TYPE FROM REFTAB WHERE GROUP_ID = 'D')
THEN 1 ELSE NULL END) as COUNT1
FROM MAS
GROUP BY CUSTOMER_ID
However, to address a performance issue, I changed it to this select statement:
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN GROUP_ID = 'A' THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN GROUP_ID = 'D' THEN 1 ELSE NULL END) as COUNT1
FROM MAS A
LEFT JOIN REFTAB B ON A.PROCESS_TYPE = B.PROCESS_TYPE
GROUP BY CUSTOMER_ID
For the AMOUNT2 and COUNT1 columns, the values stay the same. But for AMOUNT1, the value is multiplied because of the join with the reference table.
I know I can add 1 more left join with an additional join condition on GROUP_ID. But that won't be any different from using a subquery.
Any idea how to make the query work with just 1 left join while not multiplying the AMOUNT1 value?
"I know I can add 1 more left join with an additional GROUP_ID clause, but it won't be different from a subquery."
You'd be surprised. Having 2 left joins instead of subqueries in the SELECT gives the optimizer more ways of optimizing the query. I would still try it:
select m.customer_id,
sum(m.amount) as amount1,
sum(case when grpA.group_id is not null then m.amount end) as amount2,
count(grpD.group_id) as count1
from mas m
left join reftab grpA
on grpA.process_type = m.process_type
and grpA.group_id = 'A'
left join reftab grpD
on grpD.process_type = m.process_type
and grpD.group_id = 'D'
group by m.customer_id
You can also try this query, which uses the SUM() analytic function to calculate the amount1 value before the join to avoid the duplicate value problem:
select m.customer_id,
m.customer_sum as amount1,
sum(case when r.group_id = 'A' then m.amount end) as amount2,
count(case when r.group_id = 'D' then 'X' end) as count1
from (select customer_id,
process_type,
amount,
sum(amount) over (partition by customer_id) as customer_sum
from mas) m
left join reftab r
on r.process_type = m.process_type
group by m.customer_id,
m.customer_sum
You can test both options, and see which one performs better.
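Starting from your original query, replacing the IN subqueries with correlated EXISTS checks may help, though the gain depends on the execution plan, so measure it. Also note that SUM ignores NULLs, so ELSE NULL only matters when no row matches (the sum is then NULL rather than 0); use ELSE 0 if you want 0 in that case.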
SELECT CUSTOMER_ID,
SUM(AMOUNT) as AMOUNT1,
SUM(CASE WHEN EXISTS(SELECT 1 FROM REFTAB WHERE REFTAB.GROUP_ID = 'A' AND REFTAB.PROCESS_TYPE = MAS.PROCESS_TYPE)
THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN EXISTS(SELECT 1 FROM REFTAB WHERE REFTAB.GROUP_ID = 'D' AND REFTAB.PROCESS_TYPE = MAS.PROCESS_TYPE)
THEN 1 ELSE NULL END) as COUNT1
FROM MAS
GROUP BY CUSTOMER_ID
The normal way is to aggregate the values before the join. You can also use conditional aggregation, if the rest of the query is correct:
SELECT CUSTOMER_ID,
SUM(CASE WHEN seqnum = 1 THEN AMOUNT END) as AMOUNT1,
SUM(CASE WHEN GROUP_ID = 'A' THEN AMOUNT ELSE NULL END) as AMOUNT2,
COUNT(CASE WHEN GROUP_ID = 'D' THEN 1 ELSE NULL END) as COUNT1
FROM MAS A LEFT JOIN
(SELECT B.*, ROW_NUMBER() OVER (PARTITION BY PROCESS_TYPE ORDER BY PROCESS_TYPE) as seqnum
FROM REFTAB B
) B
ON A.PROCESS_TYPE = B.PROCESS_TYPE
GROUP BY CUSTOMER_ID;
This ignores the duplicates created by the joins.
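
Aggregating before the join can be sketched like this: collapse the reference table to one row per PROCESS_TYPE first, so the join can no longer multiply the master rows. The example runs in SQLite through Python's sqlite3 module instead of Oracle (an assumption), with the sample data from the question; the flag columns in_a and in_d are names invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mas (customer_id INTEGER, process_id INTEGER,
                      process_type INTEGER, amount INTEGER);
    CREATE TABLE reftab (process_type INTEGER, group_id TEXT);
    INSERT INTO mas VALUES (10,1,44,200),(10,1,44,250),(10,2,45,300),(10,2,45,350);
    INSERT INTO reftab VALUES (44,'A'),(44,'B'),(45,'C'),(45,'D');
    """
)

# Plain join: each mas row matches two reftab rows, so the sum doubles.
naive = conn.execute(
    """
    SELECT m.customer_id, SUM(m.amount)
    FROM mas m
    LEFT JOIN reftab r ON r.process_type = m.process_type
    GROUP BY m.customer_id
    """
).fetchall()

# Pre-aggregated join: one reference row per process_type carrying
# membership flags, so every mas row is counted exactly once.
fixed = conn.execute(
    """
    SELECT m.customer_id,
           SUM(m.amount) AS amount1,
           SUM(CASE WHEN r.in_a = 1 THEN m.amount END) AS amount2,
           COUNT(CASE WHEN r.in_d = 1 THEN 1 END) AS count1
    FROM mas m
    LEFT JOIN (
        SELECT process_type,
               MAX(CASE WHEN group_id = 'A' THEN 1 ELSE 0 END) AS in_a,
               MAX(CASE WHEN group_id = 'D' THEN 1 ELSE 0 END) AS in_d
        FROM reftab
        GROUP BY process_type
    ) r ON r.process_type = m.process_type
    GROUP BY m.customer_id
    """
).fetchall()

print(naive)  # [(10, 2200)]
print(fixed)  # [(10, 1100, 450, 2)]
```

Master rows without any reference match keep their amount in amount1 here, because the flags are only consulted for amount2 and count1.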

Using SELECT UNION and returning output of two columns from one table

I am creating a query that counts the number of male and female actors in my table. My current statement is:
SELECT COUNT(ActorGender) "Male Actors"
FROM tblActor ta
WHERE ta.ActorGender IN ('m')
UNION
SELECT COUNT(ActorGender) "Female Actors"
FROM tblActor ta
WHERE ta.ActorGender IN ('f');
The output ends up being:
Male Actors
-----------
7
21
I want the output to look like:
Male Actors Female Actors
----------- -------------
7 21
I am looking for an alternative to go about this without using the CASE WHEN or THEN clauses.
Thanks in advance for the help as usual.
Another way (without CASE expression):
SELECT
( SELECT COUNT(*)
FROM tblActor
WHERE ActorGender = 'm'
) AS MaleActors
, ( SELECT COUNT(*)
FROM tblActor
WHERE ActorGender = 'f'
) AS FemaleActors
FROM
dual ;
And one more solution, with a CROSS JOIN:
SELECT m.MaleActors, f.FemaleActors
FROM
( SELECT COUNT(*) AS MaleActors
FROM tblActor
WHERE ActorGender = 'm'
) m
CROSS JOIN
( SELECT COUNT(*) AS FemaleActors
FROM tblActor
WHERE ActorGender = 'f'
) f ;
This would do:
SELECT COUNT(CASE WHEN ActorGender = 'm' THEN 1 ELSE NULL END) MaleActors,
COUNT(CASE WHEN ActorGender = 'f' THEN 1 ELSE NULL END) FemaleActors
FROM tblActor
WHERE ActorGender IN ('m','f')
Another way without using CASE:
select sum(males) as "Male Actors", sum(females) as "Female Actors"
from
(select count(actorGender) as Males, 0 as Females
from tblActor
where actorGender = 'm'
union all
select 0 as males, count(actorGender) as Females
from tblActor
where actorGender = 'f')
This should result in:
Male Actors Female Actors
----------- -------------
7 21
If you are using Oracle 11g+, then you can use PIVOT:
select *
from
(
select actorgender
from tblActor
) src
pivot
(
count(actorgender)
for actorgender in ('m' MaleActors, 'f' FemaleActors)
) piv
With the sample data from the SQL Fiddle demo (not the asker's table), the result would be:
| MALEACTORS | FEMALEACTORS |
-----------------------------
| 4 | 5 |
Or you can use a CROSS JOIN to get the same result:
select m.MaleActors, f.FemaleActors
from
(
select count(ActorGender) MaleActors, 'm' Gender
from tblActor
where ActorGender = 'm'
) m
cross join
(
select count(ActorGender) FemaleActors, 'f' Gender
from tblActor
where ActorGender = 'f'
) f