I have three tables: company, department and employee. I want to find all the employees who work in company A under department D and display the data as shown below.
There is a parent-child key relationship between these tables.
**TABLE COMPANY**
COMPANY_NAME COMPANY_ID
C1 COMP1
C2 COMP2
C3 COMP3
**TABLE DEPARTMENT**
DEPARTMENT_NAME COMPANY_NAME
D1 C1
D2 C1
D3 C2
**TABLE EMPLOYEE**
EMPLOYEE_ID DEPARTMENT_NAME
E1 D1
E2 D1
E3 D1
E4 D2
E5 D2
Company --> Department --> Employee.
I also want to display the entity type of each row as a dummy column.
ENTITY COMPANY_NAME DEPARTMENT_NAME EMPLOYEE_ID
COMPANY C1 - -
DEPARTMENT C1 D1 -
EMPLOYEE C1 D1 E1
EMPLOYEE C1 D1 E2
EMPLOYEE C1 D1 E3
DEPARTMENT C1 D2 -
EMPLOYEE C1 D2 E4
EMPLOYEE C1 D2 E5
I have tried this with GROUP BY, but that gives me the data as a vertical, tree-like structure. What I want is a heading row for each company or department, with the employees of that department listed under it. Is there a way to do this in SQL or PL/SQL?
For a left outer join, the output would be:
ENTITY COMPANY_NAME DEPARTMENT_NAME EMPLOYEE_ID
COMPANY C1 - -
DEPARTMENT C1 D1 -
EMPLOYEE C1 D1 E1
EMPLOYEE C1 D1 E2
EMPLOYEE C1 D1 E3
DEPARTMENT C1 D2 -
EMPLOYEE C1 D2 E4
EMPLOYEE C1 D2 E5
COMPANY C2 - -
DEPARTMENT C2 D3 -
COMPANY C3 - -
You can do this using aggregation with grouping sets. The query looks like:
select (case when grouping(employee_id) = 0 then 'EMPLOYEE'
when grouping(department_name) = 0 then 'DEPARTMENT'
else 'COMPANY'
end) as entity,
company_name, department_name, e.employee_id
from company c join
department d
using (company_name) join
employee e
using (department_name)
group by grouping sets ( (company_name, department_name, e.employee_id), (company_name, department_name), (company_name) )
order by company_name, department_name nulls first, employee_id nulls first;
Here is a db<>fiddle.
You can use UNION ALL as follows:
SELECT COMPANY_NAME, NULL AS DEPARTMENT_NAME, NULL AS EMPLOYEE_ID
FROM COMPANY C
WHERE EXISTS (SELECT 1 FROM EMPLOYEE E JOIN DEPARTMENT D USING (DEPARTMENT_NAME)
WHERE C.COMPANY_NAME = D.COMPANY_NAME)
UNION ALL
SELECT COMPANY_NAME, DEPARTMENT_NAME, NULL AS EMPLOYEE_ID
FROM COMPANY C JOIN DEPARTMENT D USING (COMPANY_NAME)
WHERE EXISTS (SELECT 1 FROM EMPLOYEE E
WHERE E.DEPARTMENT_NAME = D.DEPARTMENT_NAME)
UNION ALL
SELECT COMPANY_NAME, DEPARTMENT_NAME, EMPLOYEE_ID
FROM EMPLOYEE JOIN DEPARTMENT USING (DEPARTMENT_NAME)
ORDER BY COMPANY_NAME,
DEPARTMENT_NAME NULLS FIRST,
EMPLOYEE_ID NULLS FIRST;
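The UNION ALL idea can also carry the ENTITY column the question asks for. Below is a minimal sketch, run through Python's sqlite3 module only so that it is self-contained; the lower-case table names and the in-memory database are assumptions for illustration. It drops the EXISTS filters, so it produces the "left outer join" variant in which companies and departments without employees still appear. SQLite sorts NULLs first in ascending order by default, which is why no NULLS FIRST clause is written here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (company_name TEXT, company_id TEXT);
CREATE TABLE department (department_name TEXT, company_name TEXT);
CREATE TABLE employee (employee_id TEXT, department_name TEXT);
INSERT INTO company VALUES ('C1','COMP1'),('C2','COMP2'),('C3','COMP3');
INSERT INTO department VALUES ('D1','C1'),('D2','C1'),('D3','C2');
INSERT INTO employee VALUES
  ('E1','D1'),('E2','D1'),('E3','D1'),('E4','D2'),('E5','D2');
""")

# One branch per level of the hierarchy; the literal in the first column
# labels the level, and NULL fills the columns that do not apply.
HIERARCHY_SQL = """
SELECT 'COMPANY' AS entity, c.company_name AS company_name,
       NULL AS department_name, NULL AS employee_id
FROM company c
UNION ALL
SELECT 'DEPARTMENT', d.company_name, d.department_name, NULL
FROM department d
UNION ALL
SELECT 'EMPLOYEE', d.company_name, d.department_name, e.employee_id
FROM employee e
JOIN department d ON d.department_name = e.department_name
ORDER BY company_name, department_name, employee_id
"""

rows = conn.execute(HIERARCHY_SQL).fetchall()
for row in rows:
    print(row)
```

Each heading row sorts ahead of its children because its NULL columns come first, which gives the company, then department, then employee layout.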
my data is like this
Dept Sub_Dept Sal
d1 sd1 100
d1 sd1 150
d1 sd2 100
d1 sd2 200
d1 sd2 350
d2 sd1 100
d2 sd1 250
d2 sd1 250
d2 sd2 200
d2 sd2 350
My output should be the count of each sub dept, the AVG of the Sal values of each sub department (sub_dept), and the AVG of all departments (dept)
I want my output to look like this
Result
d1 sd1 2 125
d1 sd2 3 200
Total 5 180
d2 sd1 3 200
d2 sd2 2 225
Total 5 230
grand total 10 205
How to get the inner and outer AVG values ?
Use UNION ALL:
select dept, sub_dept, count(*) as cnt, avg(Sal) as av
from table_name group by dept, sub_dept
union all
select 'total', '', count(*), avg(Sal)
from table_name
Most dialects of SQL support the standard grouping sets (or at least rollup). Typical syntax would be:
select dept, sub_dept, count(*) as cnt, avg(sal) as avg_sal
from t
group by grouping sets ( (dept, sub_dept), (dept), () );
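To check the numbers at all three levels (sub-department, department, grand total), here is a small sketch that runs a three-branch UNION ALL over the sample rows. It uses Python's sqlite3 module purely to be self-contained. SQLite has no GROUPING SETS, hence the UNION ALL form. The `lvl` helper column is an assumption added only to control the sort order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dept TEXT, sub_dept TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("d1", "sd1", 100), ("d1", "sd1", 150),
        ("d1", "sd2", 100), ("d1", "sd2", 200), ("d1", "sd2", 350),
        ("d2", "sd1", 100), ("d2", "sd1", 250), ("d2", "sd1", 250),
        ("d2", "sd2", 200), ("d2", "sd2", 350),
    ],
)

# lvl: 0 = sub-department row, 1 = department total, 2 = grand total.
TOTALS_SQL = """
SELECT dept, sub_dept, cnt, avg_sal
FROM (
    SELECT dept, sub_dept, COUNT(*) AS cnt, AVG(sal) AS avg_sal, 0 AS lvl
    FROM t GROUP BY dept, sub_dept
    UNION ALL
    SELECT dept, 'Total', COUNT(*), AVG(sal), 1
    FROM t GROUP BY dept
    UNION ALL
    SELECT 'grand total', '', COUNT(*), AVG(sal), 2
    FROM t
)
ORDER BY lvl = 2, dept, lvl, sub_dept
"""

totals = conn.execute(TOTALS_SQL).fetchall()
for row in totals:
    print(row)
```

On the sample rows the department totals (180 and 230) and the grand total (205) match the expected output in the question. The averages for d1/sd2 and d2/sd2 come out as roughly 216.67 and 275, not the 200 and 225 shown there.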
I have a table like below:
Region Country Manufacturer Brand Period Spend
R1 C1 M1 B1 2016 5
R1 C1 M1 B1 2017 10
R1 C1 M1 B1 2017 20
R1 C1 M1 B2 2016 15
R1 C1 M1 B3 2017 20
R1 C2 M1 B1 2017 5
R1 C2 M2 B4 2017 25
R1 C2 M2 B5 2017 30
R2 C3 M1 B1 2017 35
R2 C3 M2 B4 2017 40
R2 C3 M2 B5 2017 45
...
I wrote the query below to aggregate them:
SELECT [Region]
,[Country]
,[Manufacturer]
,[Brand]
,Period
,SUM([Spend]) AS [Spend]
FROM myTable
GROUP BY [Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]
ORDER BY 1,2,3,4
which yields something like below:
Region Country Manufacturer Brand Period Spend
R1 C1 M1 B1 2016 5
R1 C1 M1 B1 2017 30 -- this row is an aggregate from raw table above
R1 C1 M1 B2 2016 15
R1 C1 M1 B3 2017 20
R1 C2 M1 B1 2017 4 -- aggregated result
R1 C2 M2 B4 2017 25
R1 C2 M2 B5 2017 30
R2 C3 M2 B4 2017 40
R2 C3 M2 B5 2017 45
I'd like to add another column to the above table that shows the DISTINCT COUNT of Brand grouped by Region, Country, Manufacturer and Period. The final table would then look as follows:
Region Country Manufacturer Brand Period Spend UniqBrandCount
R1 C1 M1 B1 2016 5 2 -- two brands by R1, C1, M1 in 2016
R1 C1 M1 B1 2017 30 1
R1 C1 M1 B2 2016 15 2 -- same as first row's result
R1 C1 M1 B3 2017 20 1
R1 C2 M1 B1 2017 4 1
R1 C2 M2 B4 2017 25 2
R1 C2 M2 B5 2017 30 2
R2 C3 M2 B4 2017 40 2
R2 C3 M2 B5 2017 45 2
I know how to get to final result in three steps.
Run this query (Query #1):
SELECT [Region]
,[Country]
,[Manufacturer]
,[Period]
,COUNT(DISTINCT [Brand]) AS [BrandCount]
INTO Temp1
FROM myTable
GROUP BY [Region]
,[Country]
,[Manufacturer]
,[Period]
Run this query (Query #2)
SELECT [Region]
,[Country]
,[Manufacturer]
,[Brand]
,YEAR([Period]) AS Period
,SUM([Spend]) AS [Spend]
INTO Temp2
FROM myTable
GROUP BY [Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]
Then LEFT JOIN Temp2 and Temp1 to bring in [BrandCount] from the latter like below:
SELECT a.*
,b.*
FROM Temp2 AS a
LEFT JOIN Temp1 AS b ON a.[Region] = b.[Region]
AND a.[Country] = b.[Country]
AND a.[Manufacturer] = b.[Manufacturer]
AND a.[Period] = b.[Period]
I'm pretty sure there is a more efficient way to do this, is there? Thank you in advance for your suggestions/answers!
Borrowing heavily from this question: https://dba.stackexchange.com/questions/89031/using-distinct-in-window-function-with-over
COUNT(DISTINCT ...) doesn't work as a window function here, so dense_rank is required. Ranking the brands in forward and then reverse order, adding the two ranks, and subtracting 1 gives the distinct count.
Your sum function can also be rewritten using PARTITION BY logic. This way you can use different grouping levels for each aggregation:
SELECT
[Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]
,dense_rank() OVER
(PARTITION BY
[Region]
,[Country]
,[Manufacturer]
,[Period] Order by Brand)
+ dense_rank() OVER
(PARTITION BY
[Region]
,[Country]
,[Manufacturer]
,[Period] Order by Brand Desc)
- 1
AS [BrandCount]
,SUM([Spend]) OVER
(PARTITION BY
[Region]
,[Country]
,[Manufacturer]
,[Brand]
,[Period]) as [Spend]
from
myTable
ORDER BY 1,2,3,4
You may then need to reduce the number of rows in your output, as this syntax gives the same number of rows as myTable, but with the aggregation totals appearing on each row they apply to:
R1 C1 M1 B1 2016 2 5
R1 C1 M1 B1 2017 2 30 --dup1
R1 C1 M1 B1 2017 2 30 --dup1
R1 C1 M1 B2 2016 2 15
R1 C1 M1 B3 2017 2 20
R1 C2 M1 B1 2017 1 5
R1 C2 M2 B4 2017 2 25
R1 C2 M2 B5 2017 2 30
R2 C3 M1 B1 2017 1 35
R2 C3 M2 B4 2017 2 40
R2 C3 M2 B5 2017 2 45
Selecting distinct rows from this output gives you what you need.
How the dense_rank trick works
Consider this data:
Col1 Col2
B 1
B 1
B 3
B 5
B 7
B 9
dense_rank() ranks data according to the number of distinct items before the current one, plus 1. So:
1->1, 3->2, 5->3, 7->4, 9->5.
In reverse order (using desc) this yields the reverse pattern:
1->5, 3->4, 5->3, 7->2, 9->1.
Adding the two ranks together gives the same value for every row:
1+5 = 2+4 = 3+3 = 4+2 = 5+1 = 6
Putting it in words helps here:
(number of distinct items before + 1) + (number of distinct items after + 1)
= number of distinct OTHER items before AND after + 2
= Total number of distinct items + 1
So to get the total number of distinct items, add the ascending and descending dense_ranks together and subtract 1.
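The arithmetic can be checked outside the database. The following is a small sketch in plain Python, where `dense_rank` is a hypothetical helper that mimics DENSE_RANK() OVER (ORDER BY ...) for a single partition; it is applied to the Col2 values from the example above.

```python
def dense_rank(values, reverse=False):
    """Dense rank of each value within one partition (ties share a rank)."""
    ordered = sorted(set(values), reverse=reverse)
    position = {v: i + 1 for i, v in enumerate(ordered)}
    return [position[v] for v in values]

col2 = [1, 1, 3, 5, 7, 9]

asc = dense_rank(col2)                  # ranks in ascending order
desc = dense_rank(col2, reverse=True)   # ranks in descending order

# Ascending rank + descending rank - 1 is the same on every row
# and equals the number of distinct values in the partition.
distinct_counts = [a + d - 1 for a, d in zip(asc, desc)]

print(asc)
print(desc)
print(distinct_counts)
```

Every row ends up with the same value, which is why the SQL version can attach the distinct count to each row without a GROUP BY.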
The window-functions tag on your question suggests you already have a pretty good idea.
For the DISTINCT COUNT of Brand grouped by Region, Country, Manufacturer and Period, you may write:
Select Region
,Country
,Manufacturer
,Brand
,Period
,Spend
,DENSE_RANK() Over (Partition By Region, Country, Manufacturer, Period Order By Brand asc)
+ DENSE_RANK() Over (Partition By Region, Country, Manufacturer, Period Order By Brand desc)
-1 UniqBrandCount
From myTable T1
Order By 1,2,3,4
The double dense_rank idea means that you need two sorts (assuming no index exists that provides sort order). Assuming no NULL brands (as that idea does) you can use a single dense_rank and a windowed MAX as below (demo)
WITH T1
AS (SELECT *,
DENSE_RANK() OVER (PARTITION BY [Region], [Country], [Manufacturer], [Period] ORDER BY Brand) AS [dr]
FROM myTable),
T2
AS (SELECT *,
MAX([dr]) OVER (PARTITION BY [Region], [Country], [Manufacturer], [Period]) AS UniqBrandCount
FROM T1)
SELECT [Region],
[Country],
[Manufacturer],
[Brand],
Period,
SUM([Spend]) AS [Spend],
MAX(UniqBrandCount) AS UniqBrandCount
FROM T2
GROUP BY [Region],
[Country],
[Manufacturer],
[Brand],
[Period]
ORDER BY [Region],
[Country],
[Manufacturer],
[Period],
Brand
The above has some inevitable spooling (it isn't possible to do this in a 100% streaming manner) but a single sort.
Strangely the final order by clause is needed to keep the number of sorts down to one (or zero if a suitable index exists).
I have a table like the one below. I need to find the employees who have rank R1 but have never had rank C1 or C2.
Id ECode Name Rank
1 EMP1 AA R1
2 EMP2 BB R1
3 EMP1 AA R2
4 EMP1 AA C1
5 EMP1 AA C2
6 EMP1 AA C3
7 EMP2 BB C4
8 EMP2 BB C5
9 EMP3 CC R1
10 EMP3 CC C1
11 EMP3 CC C2
12 EMP3 CC C4
13 EMP4 DD R1
14 EMP4 DD C3
One approach uses aggregation by employee:
SELECT ECode, Name
FROM yourTable
GROUP BY ECode, Name
HAVING
SUM(CASE WHEN Rank = 'R1' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN Rank IN ('C1', 'C2') THEN 1 ELSE 0 END) = 0
Try this:
SELECT *
FROM EMPLOYES A
WHERE RANK = 'R1'
AND NOT EXISTS(SELECT 1
FROM EMPLOYES B
WHERE B.ECODE = A.ECODE
AND RANK IN ('C1','C2')
AND ROWNUM = 1)
I would do:
SELECT ecode, name
FROM t
WHERE rank IN ('R1', 'C1', 'C2')
GROUP BY ecode, name
HAVING MIN(rank) = MAX(rank) AND MAX(rank) = 'R1';
Use NOT EXISTS:
select *
from mytable t
where t.rank = 'R1'
and not exists ( select ECode from mytable where ECode = t.ecode and rank in ('C1','C2') );
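The aggregation form and the NOT EXISTS form can be checked against the sample rows. This is a minimal sketch using Python's sqlite3 module. The table name `emp_rank` and the column name `rnk` (used in place of Rank, to stay clear of the RANK window function name) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_rank (id INTEGER, ecode TEXT, name TEXT, rnk TEXT)")
conn.executemany(
    "INSERT INTO emp_rank VALUES (?, ?, ?, ?)",
    [
        (1, "EMP1", "AA", "R1"), (2, "EMP2", "BB", "R1"),
        (3, "EMP1", "AA", "R2"), (4, "EMP1", "AA", "C1"),
        (5, "EMP1", "AA", "C2"), (6, "EMP1", "AA", "C3"),
        (7, "EMP2", "BB", "C4"), (8, "EMP2", "BB", "C5"),
        (9, "EMP3", "CC", "R1"), (10, "EMP3", "CC", "C1"),
        (11, "EMP3", "CC", "C2"), (12, "EMP3", "CC", "C4"),
        (13, "EMP4", "DD", "R1"), (14, "EMP4", "DD", "C3"),
    ],
)

# Conditional aggregation: at least one R1 row and no C1/C2 rows per employee.
by_having = conn.execute("""
    SELECT ecode, name
    FROM emp_rank
    GROUP BY ecode, name
    HAVING SUM(CASE WHEN rnk = 'R1' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN rnk IN ('C1', 'C2') THEN 1 ELSE 0 END) = 0
    ORDER BY ecode
""").fetchall()

# Anti-join: keep R1 rows whose employee has no C1/C2 row anywhere.
by_not_exists = conn.execute("""
    SELECT t.ecode, t.name
    FROM emp_rank t
    WHERE t.rnk = 'R1'
      AND NOT EXISTS (SELECT 1 FROM emp_rank x
                      WHERE x.ecode = t.ecode AND x.rnk IN ('C1', 'C2'))
    ORDER BY t.ecode
""").fetchall()

print(by_having)
print(by_not_exists)
```

Both queries return EMP2 and EMP4 on this data. The key point in either form is that the exclusion is tested per employee (by ECode), not per row.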
You can use combination of exists and not exists:
select *
from table t
where exists (select 1 from table where ECode = t.ECode and Rank = 'R1') AND
not exists (select 1 from table where ECode = t.ECode and Rank IN ('C1', 'C2'))
Select * from tablename
where Rank = 'R1'
and ECode not in (Select ECode from tablename
where Rank = 'C1'
or Rank = 'C2')
Following is my table in SQL Server
ID NAME SALARY
10 A 10
10 B 5
10 C 20
10 D 20
11 E 40
11 F 40
11 G 30
11 H 50
12 I 50
12 J 35
My objective is to add six other columns first_value,second_value,third_value, first_rank, second_rank, third_rank corresponding to each ID.
The output should look as follows:
ID NAME SALARY R1 R2 R3 R1_name R2_name R3_name
10 A 10 5 10 20 B A C
10 B 5 5 10 20 B A C
10 C 20 5 10 20 B A C
10 D 20 5 10 20 B A C
11 E 40 30 40 40 G E F
11 F 40 30 40 40 G E F
11 G 30 30 40 40 G E F
11 H 50 30 40 40 G E F
12 I 50 35 50 NULL J I NULL
12 J 35 35 50 NULL J I NULL
Following is the insert query:
CREATE TABLE EMP(ID NVARCHAR(10), NAME NVARCHAR(20), SALARY MONEY)
INSERT INTO EMP
VALUES
(10, 'A', 10),(11, 'E',40 ),(10,'B',5),(11,'F',40),(12,'I',50)
,(10,'C',20),(11,'G',30),(12,'J',35),(10,'D',20),(11,'H',50)
Thanks in advance.
;WITH TOUpdate AS
(
-- Pivot the three lowest salaries per ID; a missing rank stays NULL
SELECT ID,
MAX(case when RN=1 THEN SALARY END) AS R1,
MAX(case when RN=2 THEN SALARY END) AS R2,
MAX(case when RN=3 THEN SALARY END) AS R3,
MAX(case when RN=1 THEN Name END) AS R1_Name,
MAX(case when RN=2 THEN Name END) AS R2_Name,
MAX(case when RN=3 THEN Name END) AS R3_Name
FROM(
-- NAME is a tie-breaker so that equal salaries are numbered deterministically
SELECT *,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SALARY, NAME) AS RN
FROM EMP
) X
WHERE X.RN<4
GROUP BY ID
)
SELECT *
FROM EMP E
INNER JOIN TOUpdate U
ON E.ID=U.ID
Quite ugly, but it works. You could try:
CREATE TABLE #EMP(ID NVARCHAR(10), NAME NVARCHAR(20), SALARY MONEY)
INSERT INTO #EMP
VALUES
(10, 'A', 10),(11, 'E',40 ),(10,'B',5),(11,'F',40),(12,'I',50)
,(10,'C',20),(11,'G',30),(12,'J',35),(10,'D',20),(11,'H',50)
;WITH temp AS
(
SELECT e.* , row_number() over(partition by e.ID ORDER BY e.SALARY ASC) AS Rn
FROM #EMP e
)
SELECT e.*, t1.SALARY AS R1, t1.Name AS R1_Name, t2.SALARY AS R2, t2.Name AS R2_Name, t3.SALARY AS R3, t3.Name AS R3_Name
FROM #EMP e
LEFT JOIN temp t1 ON e.ID = t1.ID AND t1.Rn = 1
LEFT JOIN temp t2 ON e.ID = t2.ID AND t2.Rn = 2
LEFT JOIN temp t3 ON e.ID = t3.ID AND t3.Rn = 3
ORDER BY e.ID ASC
Demo link: Rextester
We can actually achieve your desired output by doing a single join to a CTE which ranks the salaries for each ID.
WITH cte1 AS (
SELECT ID, NAME, SALARY,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SALARY) rn
FROM EMP
),
cte2 AS (
SELECT
ID,
MAX(CASE WHEN rn = 1 THEN SALARY END) AS R1,
MAX(CASE WHEN rn = 2 THEN SALARY END) AS R2,
MAX(CASE WHEN rn = 3 THEN SALARY END) AS R3,
MAX(CASE WHEN rn = 1 THEN NAME END) AS R1_name,
MAX(CASE WHEN rn = 2 THEN NAME END) AS R2_name,
MAX(CASE WHEN rn = 3 THEN NAME END) AS R3_name
FROM cte1
GROUP BY ID
)
SELECT
t1.ID,
t1.NAME,
t1.SALARY,
t2.*
FROM EMP t1
INNER JOIN cte2 t2
ON t1.ID = t2.ID