PIVOT using JOIN in SQL

I was asked in an interview to pivot this data using basic SQL and wasn't sure how to answer. I googled some answers and found that you can use MAX or SUM with CASE expressions, but when I asked the interviewer at the end how they would solve it, they said by using joins. Can anyone show me how it's done using joins?
BEGINNING TABLE
emp_id  col_id  col_desc      attvalue  month
1       1       salary        2000      2010-05-09
1       2       bonus         0         2010-05-09
1       3       compensation  2000      2010-05-09
1       1       salary        2000      2010-05-10
1       2       bonus         500       2010-05-10
1       3       compensation  2500      2010-05-10
2       1       salary        1000      2010-05-09
2       2       bonus         500       2010-05-09
2       3       compensation  1500      2010-05-09
Code to create the beginning table
CREATE TABLE Employees(emp_id INT, col_id INT, col_desc NVARCHAR(MAX), attvalue INT, month DATE);
INSERT INTO Employees
VALUES
(1,1,'salary',2000,'2010-05-09'),
(1,2,'bonus',0,'2010-05-09'),
(1,3,'compensation',2000,'2010-05-09'),
(1,1,'salary',2000,'2010-05-10'),
(1,2,'bonus',500,'2010-05-10'),
(1,3,'compensation',2500,'2010-05-10'),
(2,1,'salary',1000,'2010-05-09'),
(2,2,'bonus',500,'2010-05-09'),
(2,3,'compensation',1500,'2010-05-09');
RESULTING TABLE
emp_id  month       salary  bonus  compensation
1       2010-05-09  2000    0      2000
1       2010-05-10  2000    500    2500
2       2010-05-09  1000    500    1500

Below are the self-join, CASE expression, and PIVOT approaches.
-- Self Join way
select s.emp_id, s.month,
s.attvalue as salary,
b.attvalue as bonus,
c.attvalue as compensation
from Employees s
inner join Employees b on s.emp_id = b.emp_id
and s.month = b.month
inner join Employees c on s.emp_id = c.emp_id
and s.month = c.month
where s.col_desc = 'salary'
and b.col_desc = 'bonus'
and c.col_desc = 'compensation'
order by s.emp_id, s.month
-- case expression way
select emp_id, month,
max(case when col_desc = 'salary' then attvalue else 0 end) as salary,
max(case when col_desc = 'bonus' then attvalue else 0 end) as bonus,
max(case when col_desc = 'compensation' then attvalue else 0 end) as compensation
from Employees
group by emp_id, month
order by emp_id, month
-- Pivot way
select *
from (
select emp_id, month, col_desc, attvalue
from Employees
) d
pivot
(
max(attvalue)
for col_desc in ([salary], [bonus], [compensation])
) p
order by emp_id, month
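To check that the self-join and the CASE expression approaches agree, here is a minimal sketch that loads the question's sample data into an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumption made for portability (the original targets SQL Server, and SQLite has no PIVOT operator), and the column types are simplified to TEXT and INT.

```python
import sqlite3

# In-memory SQLite database holding the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees(emp_id INT, col_id INT, col_desc TEXT, attvalue INT, month TEXT);
INSERT INTO Employees VALUES
 (1,1,'salary',2000,'2010-05-09'),
 (1,2,'bonus',0,'2010-05-09'),
 (1,3,'compensation',2000,'2010-05-09'),
 (1,1,'salary',2000,'2010-05-10'),
 (1,2,'bonus',500,'2010-05-10'),
 (1,3,'compensation',2500,'2010-05-10'),
 (2,1,'salary',1000,'2010-05-09'),
 (2,2,'bonus',500,'2010-05-09'),
 (2,3,'compensation',1500,'2010-05-09');
""")

# Self-join: one copy of the table per output column, matched on the pivot key.
self_join = """
SELECT s.emp_id, s.month, s.attvalue, b.attvalue, c.attvalue
FROM Employees s
JOIN Employees b ON b.emp_id = s.emp_id AND b.month = s.month AND b.col_desc = 'bonus'
JOIN Employees c ON c.emp_id = s.emp_id AND c.month = s.month AND c.col_desc = 'compensation'
WHERE s.col_desc = 'salary'
ORDER BY s.emp_id, s.month
"""

# Conditional aggregation: one pass, one CASE expression per output column.
case_expr = """
SELECT emp_id, month,
       MAX(CASE WHEN col_desc = 'salary' THEN attvalue END),
       MAX(CASE WHEN col_desc = 'bonus' THEN attvalue END),
       MAX(CASE WHEN col_desc = 'compensation' THEN attvalue END)
FROM Employees
GROUP BY emp_id, month
ORDER BY emp_id, month
"""

join_rows = conn.execute(self_join).fetchall()
case_rows = conn.execute(case_expr).fetchall()
print(join_rows)
```

One difference to keep in mind: the inner self-join drops any emp_id/month pair that lacks one of the three attributes, while the CASE version keeps the row and leaves the missing column NULL. Using LEFT JOINs from the salary rows would bring the join version closer to that behaviour.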

Related

subquery sum and newest record by different type

table employee {
id,
name
}
table payment_record {
id,
type, // 1 is salary, 2-4 is bonus
employee_id,
date_paid,
amount
}
I want to query each employee's newest salary and sum(bonus) within a date range.
For example, my payments look like this:
id, type, employee_id, date_paid, amount
1 1 1 2022-10-01 5000
2 2 1 2022-10-01 1000
3 3 1 2022-10-01 1000
4 1 1 2022-11-01 3000
5 1 2 2022-10-01 1000
6 1 2 2022-11-01 2000
7 2 2 2022-11-01 3000
For the query dates ['2022-10-01', '2022-11-01'], the result should be:
employee_id, employee_name, newest_salary, sum(bonus)
1 Jeff 3000 2000
2 Alex 2500 3000
Jeff's newest_salary is 3000 because there are two type = 1 (salary) records, 5000 and 3000, and the newest one is 3000.
Jeff's bonus sum is 1000 (type 2) + 1000 (type 3) = 2000.
The SQL I have tried so far is:
select
e.employee_id,
employee.name,
e.newest_salary,
e.bonus
from
(
select
payment_record.employee_id,
SUM(case when type in ('2', '3', '4') then amount end) as bonus,
Max(case when type = '1' then amount end) as newest_salary
from
payment_record
where
date_paid in ('2022-10-01', '2022-11-01')
group by
employee_id
) as e
join
employee
on
employee.id = e.employee_id
order by
employee_id
It's almost done, but the rule for newest_salary is not correct: I just get the max value, although usually the max value is also the newest record.

The query is below:
SELECT
t1.id employee_id,
t1.name employee_name,
t3.amount newest_salary,
t2.bonus bonus
FROM employee t1
LEFT JOIN
(
SELECT
employee_id,
MAX(CASE WHEN type=1 THEN date_paid END) date_paid,
SUM(CASE WHEN type IN (2,3,4) THEN amount END) bonus
FROM payment_record
WHERE date_paid BETWEEN '2022-10-01' AND '2022-11-01'
GROUP BY employee_id
) t2
ON t1.id=t2.employee_id
LEFT JOIN payment_record t3
ON t3.type=1 AND
t2.employee_id=t3.employee_id AND
t2.date_paid=t3.date_paid
ORDER BY t1.id
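As a quick check of the query above, the following sketch runs it unchanged against the question's sample rows in an in-memory SQLite database (SQLite and the simplified column types are assumptions; the original does not name a database).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee(id INT, name TEXT);
CREATE TABLE payment_record(id INT, type INT, employee_id INT, date_paid TEXT, amount INT);
INSERT INTO employee VALUES (1,'Jeff'),(2,'Alex');
INSERT INTO payment_record VALUES
 (1,1,1,'2022-10-01',5000),
 (2,2,1,'2022-10-01',1000),
 (3,3,1,'2022-10-01',1000),
 (4,1,1,'2022-11-01',3000),
 (5,1,2,'2022-10-01',1000),
 (6,1,2,'2022-11-01',2000),
 (7,2,2,'2022-11-01',3000);
""")

# t2 finds each employee's latest salary date and bonus total;
# t3 joins back to payment_record to fetch the amount paid on that date.
query = """
SELECT t1.id employee_id, t1.name employee_name,
       t3.amount newest_salary, t2.bonus bonus
FROM employee t1
LEFT JOIN (
    SELECT employee_id,
           MAX(CASE WHEN type=1 THEN date_paid END) date_paid,
           SUM(CASE WHEN type IN (2,3,4) THEN amount END) bonus
    FROM payment_record
    WHERE date_paid BETWEEN '2022-10-01' AND '2022-11-01'
    GROUP BY employee_id
) t2 ON t1.id = t2.employee_id
LEFT JOIN payment_record t3
       ON t3.type = 1
      AND t2.employee_id = t3.employee_id
      AND t2.date_paid = t3.date_paid
ORDER BY t1.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With this data Alex's newest salary comes out as 2000, the amount on the 2022-11-01 record. Note that if an employee had two salary records on the same newest date, the join back to payment_record would return two rows for that employee.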

I tested this solution in SQL Server. I think Postgres is close enough that it should work there too, or at least be easy to translate.
My approach is to split the payments in the desired range into salary vs bonus, and sum the bonus but use a partitioned row number to identify the newest salary payment for each employee in the desired date range and only join that one to the bonus totals. Note that I used a LEFT JOIN because an employee might not get a bonus.
DECLARE @StartDate DATE = '2022-10-01';
DECLARE @EndDate DATE = '2022-11-01';
with cteSample as ( --BEGIN sample data
SELECT * FROM ( VALUES
(1, 1, 1, CONVERT(DATE,'2022-10-01'), 5000)
, (2, 2, 1, '2022-10-01', 1000)
, (3, 3, 1, '2022-10-01', 1000)
, (4, 1, 1, '2022-11-01', 3000)
, (5, 1, 2, '2022-10-01', 1000)
, (6, 1, 2, '2022-11-01', 2000)
, (7, 2, 2, '2022-11-01', 3000)
) as TabA(Pay_ID, Pay_Type, employee_id, date_paid, amount)
) --END Sample data
, ctePayments as ( --Filter down to just the payments in the date range you are interested in
SELECT Pay_ID, Pay_Type, employee_id, date_paid, amount
FROM cteSample --Replace this with your real table of payments
WHERE date_paid >= @StartDate AND date_paid <= @EndDate
), cteSalary as ( --Identify salary payments in range and order them newest first
SELECT employee_id, amount
, ROW_NUMBER() over (PARTITION BY employee_id ORDER BY date_paid DESC) as Newness
FROM ctePayments as S
WHERE S.Pay_Type = 1
), cteBonus as ( --Identify bonus payments in range and sum them
SELECT employee_id, SUM(amount) as BonusPaid
FROM ctePayments as S
WHERE S.Pay_Type in (2,3,4)
GROUP BY employee_id
)
SELECT S.employee_id, S.amount as SalaryNewest
, COALESCE(B.BonusPaid, 0) as BonusTotal
FROM cteSalary as S --Join the salary list to the bonus list
LEFT OUTER JOIN cteBonus as B ON S.employee_id = B.employee_id
WHERE S.Newness = 1 --Keep only the newest
Result:
employee_id  SalaryNewest  BonusTotal
1            3000          2000
2            2000          3000
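The role of the LEFT JOIN and COALESCE is easier to see with an employee who has no bonus. The sketch below is a SQLite translation of the same idea, run through Python's sqlite3 module (window functions need SQLite 3.25 or later). Payment 8 for employee 3 is a hypothetical row added only for illustration; it is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payment_record(id INT, type INT, employee_id INT, date_paid TEXT, amount INT);
INSERT INTO payment_record VALUES
 (1,1,1,'2022-10-01',5000),
 (2,2,1,'2022-10-01',1000),
 (3,3,1,'2022-10-01',1000),
 (4,1,1,'2022-11-01',3000),
 (5,1,2,'2022-10-01',1000),
 (6,1,2,'2022-11-01',2000),
 (7,2,2,'2022-11-01',3000),
 (8,1,3,'2022-10-01',4000);  -- hypothetical: salary only, no bonus
""")

query = """
WITH in_range AS (            -- payments inside the requested date range
    SELECT * FROM payment_record WHERE date_paid BETWEEN ? AND ?
), salary AS (                -- rank salary payments, newest first
    SELECT employee_id, amount,
           ROW_NUMBER() OVER (PARTITION BY employee_id
                              ORDER BY date_paid DESC) AS newness
    FROM in_range
    WHERE type = 1
), bonus AS (                 -- total bonus per employee
    SELECT employee_id, SUM(amount) AS bonus_paid
    FROM in_range
    WHERE type IN (2, 3, 4)
    GROUP BY employee_id
)
SELECT s.employee_id, s.amount, COALESCE(b.bonus_paid, 0)
FROM salary s
LEFT JOIN bonus b ON b.employee_id = s.employee_id
WHERE s.newness = 1
ORDER BY s.employee_id
"""
rows = conn.execute(query, ("2022-10-01", "2022-11-01")).fetchall()
print(rows)
```

With an INNER JOIN, employee 3 would disappear from the result; with the LEFT JOIN the row survives and COALESCE turns the missing bonus total into 0.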

SQL query to calculate totals

I'm quite new to SQL querying so please go easy on me if what I've done so far is really odd :)
I have two tables - A for Income and B for Expenditure:
Business_ID  Income_Desc  Income_Amount
1            Income A     1000
1            Income B     3000
1            Income C     2000

Business_ID  Expen_Amount
1            2500
I'd like to produce a table that shows each of the income amounts, the one expenditure amount, the total income, the total expenditure and a Grand Total of total income-total expenditure.
Something like this if possible
Business_ID  Income Description  Income Amount  Expenditure Amount  Total
1            Income A            1000           2500                -
1            Income B            3000           -                   -
1            Income C            2000           -                   -
1            All Amounts         6000           2500                3500
This is what I've tried so far
SELECT a.Business_ID, COALESCE(a.Income_Desc, 'All Amounts') AS 'Income Description', SUM(a.Income_Amount) AS 'Income Amount', SUM(b.Expen_Amount) AS 'Expenditure Amount', (SUM(a.Income_Amount)-SUM(b.Expen_Amount)) AS 'Total'
FROM Income AS a LEFT JOIN Expenditure AS b ON a.Business_ID = b.Business_ID
GROUP BY a.Business_ID, a.Income_Desc WITH ROLLUP
The result I'm getting is this
Business_ID  Income Description  Income Amount  Expenditure Amount  Total
1            Income A            1000           2500                -1500
1            Income B            3000           2500                500
1            Income C            2000           2500                -500
1            All Amounts         6000           7500                -1500
             All Amounts         6000           7500                -1500
Is it possible to get an output like the one I provided above? Could you show me how to achieve it (or something very close) please?
Thanks

You can use row_number() for the join:
with ie as (
select i.business_id, i.income_desc, i.income_amount,
e.expen_amount
from (select i.*,
row_number() over (partition by business_id order by income_desc) as seqnum
from income i
) i left join
(select e.*,
row_number() over (partition by business_id order by expen_amount) as seqnum
from expenditure e
) e
on i.business_id = e.business_id and i.seqnum = e.seqnum
)
select ie.*
from ie
union all
select business_id, 'Total', sum(income_amount), sum(expen_amount)
from ie
group by business_id;
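The row_number() join pairs the n-th income row with the n-th expenditure row, so the single expenditure is attached to one income row only and is not triplicated in the sums. A minimal sketch of this in SQLite (an assumption; run through Python's sqlite3, window functions need SQLite 3.25 or later) follows. The extra total column and the is_total sort helper are additions for illustration, to match the layout the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE income(business_id INT, income_desc TEXT, income_amount INT);
CREATE TABLE expenditure(business_id INT, expen_amount INT);
INSERT INTO income VALUES (1,'Income A',1000),(1,'Income B',3000),(1,'Income C',2000);
INSERT INTO expenditure VALUES (1,2500);
""")

query = """
WITH ie AS (
    -- Pair income row n with expenditure row n inside each business.
    SELECT i.business_id, i.income_desc, i.income_amount, e.expen_amount
    FROM (SELECT income.*,
                 ROW_NUMBER() OVER (PARTITION BY business_id
                                    ORDER BY income_desc) AS seqnum
          FROM income) i
    LEFT JOIN (SELECT expenditure.*,
                      ROW_NUMBER() OVER (PARTITION BY business_id
                                         ORDER BY expen_amount) AS seqnum
               FROM expenditure) e
           ON i.business_id = e.business_id AND i.seqnum = e.seqnum
)
SELECT business_id, income_desc, income_amount, expen_amount, total
FROM (
    SELECT business_id, income_desc, income_amount, expen_amount,
           NULL AS total, 0 AS is_total          -- detail rows
    FROM ie
    UNION ALL
    SELECT business_id, 'All Amounts', SUM(income_amount), SUM(expen_amount),
           SUM(income_amount) - SUM(expen_amount), 1   -- summary row
    FROM ie
    GROUP BY business_id
) u
ORDER BY business_id, is_total, income_desc
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The NULLs in the detail rows correspond to the "-" placeholders in the desired output; a presentation layer or a COALESCE on a text cast could render them as dashes.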

You could make a sub query out of your original query and only select values where the business ID is not null. Furthermore, use CASE WHEN to identify those values < 0 and replace it with "-":
SELECT x.Business_ID
, x.`Income Description`
, x.`Income Amount`
, x.`Expenditure Amount`
, x.Total
FROM
(SELECT a. Business_ID
, COALESCE (a.Income_Desc, 'All Amounts') AS 'Income Description'
, SUM(a.Income_Amount) AS 'Income Amount'
, SUM(b.Expen_Amount) AS 'Expenditure Amount'
, CASE WHEN (sum(a.Income_Amount)-SUM(b.Expen_Amount)) < 0
THEN '-'
ELSE (sum(a.Income_Amount)-SUM(b.Expen_Amount))
END AS 'Total'
FROM Income AS a
LEFT JOIN Expenditure AS b ON a.Business_ID = b.Business_ID
WHERE a.Business_ID is not null and b.Business_ID is not null
GROUP BY a.Business_ID, a.Income_Desc WITH ROLLUP) as x
where x.Business_ID is not null

select group by column optionally in rows

I need to calculate employee salaries.
I have a table and two views that hold the data I need for the query. Here they are:
Employees_View
--------
ID Name PayRate PayUnit Commission
1 James 10 C 0
2 Mike 10000 S 0
3 Jude 20000 SC 5
4 Clara 8 C 0
When PayUnit is C (Commission) then PayRate is in Percent, this is percent of the total sale by this employee, whereas the Commission field is commission percent on total sales and its only for SC employees
Jobs
ID Created
1 2016-01-21 10:56:05
2 2016-01-21 10:56:05
3 2016-01-21 10:56:05
4 2016-01-21 10:56:05
5 2016-01-21 12:11:59
6 2016-01-25 08:03:07
7 2015-11-01 22:55:22
Jobs_Item_View
Job_ID Amount Emp_ID
1 135 4
1 500 2
3 1500 2
3 250 4
4 1000 2
5 500 4
6 500 4
7 500 1
PayUnits
Code Name
S Salary
C Commission
SC Salary plus Commission
Here is what i tried
SELECT
ev.PayRate,
ev.name AS Employee,
CASE ev.PayUnitCode WHEN 'C' THEN
SUM(jiv.Amount) - (SUM(jiv.Amount) * (ev.PayRate / 100))
WHEN 'SC' THEN
ev.payrate + SUM(jiv.Amount) - (SUM(jiv.Amount) * (ev.Commission / 100))
ELSE ev.payrate
END AS pay,
LEFT(DATENAME(month, j.Created), 3) + '-' + CAST(YEAR(j.Created) AS NVARCHAR) AS Month,
jiv.Emp_ID,
pu.Name AS PayUnit,
ev.Code, ev.Commission
FROM
dbo.Employees_View AS ev LEFT OUTER JOIN
dbo.Job_Items_View AS jiv ON jiv.Emp_ID = ev.ID LEFT OUTER JOIN
dbo.Jobs AS j ON j.ID = jiv.Job_ID LEFT OUTER JOIN
dbo.PayUnits AS pu ON pu.Code = ev.PayUnitCode
GROUP BY jiv.Emp_ID,
pu.Name,
ev.PayUnitCode,
ev.PayRate, ev.name,
LEFT(DATENAME(month, j.Created), 3) + '-' + CAST(YEAR(j.Created) AS NVARCHAR),
ev.Code, ev.Commission
This is what i got
Result
PayRate Employee pay Month Emp_ID PayUnit Commission
20000 Jude NULL NULL NULL Salary plus Commission 5.00
10 James 900 Nov-2015 1 Commission 0.00
8 Clara 2760 Jan-2016 4 Commission 0.00
10000 Mike 10000 Jan-2016 2 Salary 0.00
Expected Output
Result
PayRate Employee pay Month Emp_ID PayUnit Commission
20000 Jude 20241.75 Jan-2016 3 Salary plus Commission 5.00
10 James 900 Nov-2015 1 Commission 0.00
8 Clara 2760 Jan-2016 4 Commission 0.00
10000 Mike 10000 Jan-2016 2 Salary 0.00
The pay for the SC employee isn't correct. It is supposed to be salary (20000) plus 5% of total sales (4835), which is 241.75, giving 20,241.75.

The reason for the nulls is that one of the employees did not contribute to any jobs, so SUM(jiv.Amount) will always be null for them and any calculations involving that expression will also result in null. You can fix by doing ISNULL(SUM(jiv.Amount), 0). Similarly for the dates, they are also driven by the jobs data, but if an employee wasn't involved in any jobs they will be null also. ISNULL() could be used for these as well.
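The NULL propagation described above can be reproduced in a few lines. This sketch uses SQLite through Python's sqlite3 (an assumption; SQLite has no two-argument ISNULL, so the standard COALESCE stands in for it), and the emp and item tables are hypothetical cut-down versions of Employees_View and the job items view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp(id INT, name TEXT);          -- hypothetical, reduced Employees_View
CREATE TABLE item(emp_id INT, amount INT);    -- hypothetical, reduced job items view
INSERT INTO emp VALUES (2,'Mike'),(3,'Jude');
INSERT INTO item VALUES (2,500),(2,1500),(2,1000);  -- Jude has no job items
""")

rows = conn.execute("""
SELECT e.name,
       SUM(i.amount)              AS raw_sum,      -- NULL when nothing matched
       20000 + SUM(i.amount)      AS raw_pay,      -- NULL propagates through +
       COALESCE(SUM(i.amount), 0) AS safe_sum      -- 0 instead of NULL
FROM emp e
LEFT JOIN item i ON i.emp_id = e.id
GROUP BY e.id, e.name
ORDER BY e.id
""").fetchall()
print(rows)
```

For Jude the LEFT JOIN produces a single row of NULLs, SUM ignores NULLs and so has nothing to add, and every arithmetic expression built on that sum is NULL until it is wrapped in COALESCE (or ISNULL in SQL Server).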
I've broken the problem down by using Common Table Expressions:
DECLARE @startDateTime DATETIME = '2016-01-01 00:00:00'
DECLARE @endDateTime DATETIME = '2016-01-31 23:59:59'
;WITH sales AS
(
SELECT
ev.ID,
ISNULL(SUM(jiv.Amount), 0) AS TotalSales,
MONTH(j.Created) AS [Month],
YEAR(j.Created) AS [Year]
FROM Employees_View AS ev
LEFT JOIN Job_Items_View AS jiv ON jiv.Emp_ID = ev.ID
LEFT JOIN Jobs AS j ON j.ID = jiv.Job_ID
WHERE j.Created BETWEEN @startDateTime AND @endDateTime
GROUP BY
ev.ID,
MONTH(j.Created),
YEAR(j.Created)
),
commissions AS
(
SELECT
s.ID,
CASE ev.PayUnitCode
WHEN 'C' THEN s.TotalSales * (ev.PayRate / 100)
WHEN 'SC' THEN (SELECT SUM(Amount) FROM Job_Items_View) * (ev.Commission / 100)
ELSE 0
END AS TotalCommission
FROM sales AS s
JOIN Employees_View AS ev ON ev.ID = s.ID
),
salaries AS
(
SELECT
ID,
CASE PayUnitCode
WHEN 'C' THEN 0
ELSE PayRate
END AS Salary
FROM Employees_View
),
totals AS
(
SELECT
salaries.ID,
ISNULL(sales.Month, MONTH(@startDateTime)) AS [Month],
ISNULL(sales.Year, YEAR(@startDateTime)) AS [Year],
ISNULL(sales.TotalSales, 0) AS TotalSales,
salaries.Salary,
ISNULL(commissions.TotalCommission, 0) AS TotalCommission
FROM salaries
LEFT JOIN sales ON salaries.ID = sales.ID
LEFT JOIN commissions ON commissions.ID = sales.ID
)
SELECT
ev.PayRate,
ev.Name,
t.Salary + t.TotalCommission AS Pay,
LEFT(DATENAME(MONTH, DATEADD(MONTH , t.[Month], -1)), 3)
+ '-' + CAST(t.[Year] AS VARCHAR) AS [Month],
ev.ID AS Emp_ID,
pu.Name AS PayUnit,
ev.Commission
FROM totals AS t
JOIN Employees_View AS ev ON ev.ID = t.ID
JOIN PayUnits AS pu ON pu.Code = ev.PayUnitCode

MySQL AVG() in sub-query

What one query can produce table_c?
I have three columns: day, person, and revenue_per_person. Right now I have to use two queries since I lose 'person' when producing table_b.
table_a uses all three columns:
SELECT day, person, revenue_per_person
FROM purchase_table
GROUP BY day, person
table_b uses only two columns due to AVG() and GROUP BY:
SELECT day, AVG(revenue) as avg_revenue
FROM purchase_table
GROUP BY day
table_c created from table_a and table_b:
SELECT
CASE
WHEN revenue_per_person > avg_revenue THEN 'big spender'
ELSE 'small spender'
END as spending_bucket
FROM ????

Maybe this could help, try this one
SELECT a.day,
CASE
WHEN a.revenue_per_person > b.avg_revenue THEN 'big spender'
ELSE 'small spender'
END as spending_bucket
FROM
(
SELECT day, person, AVG(revenue) revenue_per_person
FROM purchase_table
GROUP BY day, person
) a INNER JOIN
(
SELECT day, AVG(revenue) as avg_revenue
FROM purchase_table
GROUP BY day
) b ON a.day = b.day

You might want to use analytic functions.
An Oracle example showing if a person's salary is greater than average salary in his department.
select
department_id
,employee_id
,salary
,avg_salary
,case when salary > avg_salary then 1 else 0 end case_is_greater
from (
select
department_id
,employee_id
,salary
,round(avg(salary) over(partition by department_id),2) avg_salary
from employees
)
where department_id = 30
DEPARTMENT_ID EMPLOYEE_ID SALARY AVG_SALARY CASE_IS_GREATER
------------- ----------- ---------- ---------- ---------------
30 114 11000 4150 1
30 115 3100 4150 0
30 116 2900 4150 0
30 117 2800 4150 0
30 118 2600 4150 0
30 119 2500 4150 0
6 rows selected.
Elapsed: 00:00:00.01

If you are using a database that supports window functions, you can do this as:
SELECT (CASE WHEN revenue_per_person > avg_revenue THEN 'big spender'
ELSE 'small spender'
END) as spending_bucket
FROM (select pt.*,
avg(revenue) over (partition by day, person) as revenue_per_person,
avg(revenue) over (partition by day) as avg_revenue,
row_number() over (partition by day, person order by day) as seqnum
from purchase_table pt
) t
where seqnum = 1
The purpose of seqnum is to just get one row per person/day combination.
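To see what seqnum is doing, the sketch below runs the window-function query with and without the seqnum filter. It uses SQLite through Python's sqlite3 (an assumption; window functions need SQLite 3.25 or later), and the purchase_table rows are hypothetical sample data, since the question does not provide any. The day and person columns are added to the select list for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase_table(day TEXT, person TEXT, revenue REAL);
-- Hypothetical data: alice buys twice on day d1.
INSERT INTO purchase_table VALUES
 ('d1','alice',100), ('d1','alice',50), ('d1','bob',10),
 ('d2','alice',20),  ('d2','bob',40);
""")

inner = """
SELECT pt.*,
       AVG(revenue) OVER (PARTITION BY day, person) AS revenue_per_person,
       AVG(revenue) OVER (PARTITION BY day)         AS avg_revenue,
       ROW_NUMBER() OVER (PARTITION BY day, person ORDER BY day) AS seqnum
FROM purchase_table pt
"""
bucket = """
SELECT day, person,
       CASE WHEN revenue_per_person > avg_revenue THEN 'big spender'
            ELSE 'small spender' END AS spending_bucket
FROM ({inner}) t
{where}
ORDER BY day, person
"""

# With the filter: one row per day/person combination.
rows = conn.execute(bucket.format(inner=inner, where="WHERE seqnum = 1")).fetchall()
# Without it: window functions keep every purchase row, so alice/d1 appears twice.
unfiltered = conn.execute(bucket.format(inner=inner, where="")).fetchall()
print(rows)
print(len(unfiltered))
```

Unlike GROUP BY, window functions do not collapse rows, which is why the extra row_number() filter (or a DISTINCT) is needed to get one bucket per person per day.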

SQL server full join query

I have a full join SQL query that retrieves data from the same table twice. The problem is that I am getting a NULL value where I expect the month name.
Example:
I have a table with two columns, TypeOfPost and DOB.
DOB TypeOfPost
--------- --------------
20/11/1998 Manager
1/1/2000 Sales
13/6/1999 Manager
20/1/1987 Manager
1/11/1985 Sales
Now when I write a join query like
select B.Green as Month, A.Man, B.Sal
from
(select DATENAME(month,dob) as Red,count(TypeOfPost) as Man
from tablename
where TypeOfPost='Manager'
group by DATENAME(month,dob)) as A
full join
(select DATENAME(month,dob) as Green,count(TypeOfPost) as Sal
from tablename
where TypeOfPost='Sales'
group by DATENAME(month,dob)) as B on B.Green = A.Red
Output-- Expected Output--
--------------------- ---------------------
Month Man Sal Month Man Sal
-------- ----- ------ -------- ----- ------
January 1 1 January 1 1
NULL 1 NULL June 1 NULL
November 1 1 November 1 1
Here is the problem: I want 'June' in the Month column instead of the NULL value.
Is there any way to get that?
Help me out.
Thanks.

One option is to:
1. Use a CASE expression in a subselect to determine whether each record is a manager or a sales post, substituting 1 or 0 accordingly.
2. SELECT and GROUP the final results from this subselect.
SQL Statement
SELECT Month
, SUM(Man) AS Man
, SUM(Sal) AS Sal
FROM (
SELECT DATENAME(MONTH, DOB) AS Month
, CASE WHEN TypeOfPost = 'Manager' THEN 1 ELSE 0 END AS Man
, CASE WHEN TypeOfPost = 'Sales' THEN 1 ELSE 0 END AS Sal
FROM tableName
) g
GROUP BY
Month
or
SELECT Month
, SUM(Man)
, SUM(Sal)
FROM (
SELECT DATENAME(MONTH, DOB) AS Month
, COUNT(TypeOfPost) AS Man
, 0 AS Sal
FROM tableName
WHERE TypeOfPost = 'Manager'
GROUP BY
DATENAME(MONTH, DOB)
UNION ALL
SELECT DATENAME(MONTH, DOB) AS Month
, 0 AS Man
, COUNT(TypeOfPost) AS Sal
FROM tableName
WHERE TypeOfPost = 'Sales'
GROUP BY
DATENAME(MONTH, DOB)
) g
GROUP BY
Month
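The first variant can be tried out on the question's five rows. The sketch below is a SQLite adaptation run through Python's sqlite3, with two assumptions worth flagging: SQLite has no DATENAME, so strftime('%m', ...) returns the month number instead of its name, and the dates are stored in ISO format rather than the dd/mm/yyyy shown in the question. The table name posts is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts(dob TEXT, TypeOfPost TEXT);   -- hypothetical table name
INSERT INTO posts VALUES
 ('1998-11-20','Manager'),
 ('2000-01-01','Sales'),
 ('1999-06-13','Manager'),
 ('1987-01-20','Manager'),
 ('1985-11-01','Sales');
""")

# Conditional aggregation: every row contributes to its month exactly once,
# so the month label can never be NULL the way it can in a FULL JOIN.
rows = conn.execute("""
SELECT strftime('%m', dob) AS month,
       SUM(CASE WHEN TypeOfPost = 'Manager' THEN 1 ELSE 0 END) AS man,
       SUM(CASE WHEN TypeOfPost = 'Sales'   THEN 1 ELSE 0 END) AS sal
FROM posts
GROUP BY strftime('%m', dob)
ORDER BY month
""").fetchall()
print(rows)
```

June ('06') now appears with its own label. Its Sal count is 0 rather than NULL, because the CASE expression supplies 0 for rows of the other type.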
