Getting the sum of columns based on row values - sql

I have a table that looks like the following.
EMPNUM EMPNAME LOCATION CATEGORY COUNT
123 JOHN DOE BLDG A 1 5
123 JOHN DOE BLDG A 1 6
123 JOHN DOE BLDG A 2 4
123 JOHN DOE BLDG A 3 7
123 JOHN DOE BLDG B 1 1
123 JOHN DOE BLDG B 2 3
234 EMILY DOE BLDG A 1 1
234 EMILY DOE BLDG A 2 2
234 EMILY DOE BLDG A 3 4
234 EMILY DOE BLDG B 2 3
234 EMILY DOE BLDG B 2 9
234 EMILY DOE BLDG B 3 3
I would like to transpose it into columns, producing output similar to the one below. I need the sum of COUNT for each combination of LOCATION and CATEGORY.
EMPNUM EMPNAME SUM_A1 SUM_A2 SUM_A3 SUM_B1 SUM_B2 SUM_B3
123 JOHN DOE 11 4 7 1 3 0
234 EMILY DOE 1 2 4 0 12 3
Is there any way to do this with an SQL query, or in Crystal Reports? (I would prefer to get the output using SQL.)

If you are using Oracle 11g or later, try:
select * from table1
PIVOT (SUM("COUNT")
FOR ("LOCATION","CATEGORY") IN
(('BLDG A',1) AS sum_a1,
('BLDG A',2) AS sum_a2,
('BLDG A',3) AS sum_a3,
('BLDG B',1) AS sum_b1,
('BLDG B',2) AS sum_b2,
('BLDG B',3) AS sum_b3));
Here is a fiddle
Otherwise use APC's solution

This will work providing the values in LOCATION and CATEGORY are constant:
select empnum
, empname
, sum(case when location='BLDG A' and category = 1 then count else 0 end) sum_a1
, sum(case when location='BLDG A' and category = 2 then count else 0 end) sum_a2
, sum(case when location='BLDG A' and category = 3 then count else 0 end) sum_a3
, sum(case when location='BLDG B' and category = 1 then count else 0 end) sum_b1
, sum(case when location='BLDG B' and category = 2 then count else 0 end) sum_b2
, sum(case when location='BLDG B' and category = 3 then count else 0 end) sum_b3
from your_table
group by empnum
, empname
If the values are not known or not stable when you run the query you will need to use dynamic SQL.
Note that if you are on 11g you should employ A B Cade's PIVOT solution, which is more elegant.
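As a quick check of the conditional aggregation above, here is a minimal sketch that runs the same query shape against the sample rows from the question. It uses Python's built-in sqlite3 module rather than Oracle, and it names the value column cnt instead of COUNT; both choices are assumptions made only so the example runs on its own.

```python
import sqlite3

# Sample rows from the question. The value column is called cnt here
# (an assumption) to avoid clashing with the COUNT keyword.
rows = [
    (123, "JOHN DOE", "BLDG A", 1, 5),
    (123, "JOHN DOE", "BLDG A", 1, 6),
    (123, "JOHN DOE", "BLDG A", 2, 4),
    (123, "JOHN DOE", "BLDG A", 3, 7),
    (123, "JOHN DOE", "BLDG B", 1, 1),
    (123, "JOHN DOE", "BLDG B", 2, 3),
    (234, "EMILY DOE", "BLDG A", 1, 1),
    (234, "EMILY DOE", "BLDG A", 2, 2),
    (234, "EMILY DOE", "BLDG A", 3, 4),
    (234, "EMILY DOE", "BLDG B", 2, 3),
    (234, "EMILY DOE", "BLDG B", 2, 9),
    (234, "EMILY DOE", "BLDG B", 3, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table your_table "
    "(empnum int, empname text, location text, category int, cnt int)"
)
conn.executemany("insert into your_table values (?, ?, ?, ?, ?)", rows)

# One SUM(CASE ...) per output column; rows that do not match contribute 0.
pivot_sql = """
select empnum, empname,
  sum(case when location = 'BLDG A' and category = 1 then cnt else 0 end) as sum_a1,
  sum(case when location = 'BLDG A' and category = 2 then cnt else 0 end) as sum_a2,
  sum(case when location = 'BLDG A' and category = 3 then cnt else 0 end) as sum_a3,
  sum(case when location = 'BLDG B' and category = 1 then cnt else 0 end) as sum_b1,
  sum(case when location = 'BLDG B' and category = 2 then cnt else 0 end) as sum_b2,
  sum(case when location = 'BLDG B' and category = 3 then cnt else 0 end) as sum_b3
from your_table
group by empnum, empname
order by empnum
"""
result = conn.execute(pivot_sql).fetchall()
for row in result:
    print(row)
```

The two rows printed match the output requested in the question: 11, 4, 7, 1, 3, 0 for JOHN DOE and 1, 2, 4, 0, 12, 3 for EMILY DOE.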

The other answers will work great if you have a known number of values to transform into columns. But if you have an unknown number, then you can use dynamic sql to generate the results.
You would create the following procedure:
CREATE OR REPLACE procedure test_dynamic_pivot(p_cursor in out sys_refcursor)
as
sql_query varchar2(1000) := 'select empnum, empname';
begin
for x in (select distinct location, category from yourtable order by 1, 2)
loop
sql_query := sql_query ||
' , sum(case when location = '''||x.location||''' and category='||x.category||' then cnt else 0 end) as sum_'||substr(x.location, -1, 1)||x.category;
dbms_output.put_line(sql_query);
end loop;
sql_query := sql_query || ' from yourtable group by empnum, empname';
open p_cursor for sql_query;
end;
/
And then to execute it:
variable x refcursor
exec test_dynamic_pivot(:x)
print x
The result is the same as the hard-coded version:
| EMPNUM | EMPNAME | SUM_A1 | SUM_A2 | SUM_A3 | SUM_B1 | SUM_B2 | SUM_B3 |
----------------------------------------------------------------------------
| 234 | EMILY DOE | 1 | 2 | 4 | 0 | 12 | 3 |
| 123 | JOHN DOE | 11 | 4 | 7 | 1 | 3 | 0 |
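The same idea, building the column list from the distinct (location, category) pairs, can also be sketched on the client side. The following is a rough illustration in Python with sqlite3, not a port of the procedure: the table name yourtable and column cnt follow the procedure above, the sample rows are a shortened hypothetical subset, and the alias rule (last character of the location) mirrors the substr call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourtable "
    "(empnum int, empname text, location text, category int, cnt int)"
)
# Shortened, hypothetical sample: only three (location, category) pairs occur.
conn.executemany(
    "insert into yourtable values (?, ?, ?, ?, ?)",
    [
        (123, "JOHN DOE", "BLDG A", 1, 5),
        (123, "JOHN DOE", "BLDG A", 1, 6),
        (123, "JOHN DOE", "BLDG B", 2, 3),
        (234, "EMILY DOE", "BLDG A", 2, 2),
        (234, "EMILY DOE", "BLDG B", 2, 12),
    ],
)

# Step 1: discover which pivot columns are needed.
pairs = conn.execute(
    "select distinct location, category from yourtable "
    "order by location, category"
).fetchall()

# Step 2: build one SUM(CASE ...) expression per pair.
columns = []
for location, category in pairs:
    # The alias comes from the data, so this only works while the last
    # character of the location is a letter or digit.
    alias = f"sum_{location[-1].lower()}{int(category)}"
    quoted = location.replace("'", "''")  # escape quotes in the literal
    columns.append(
        f"sum(case when location = '{quoted}' and category = {int(category)} "
        f"then cnt else 0 end) as {alias}"
    )

sql_query = (
    "select empnum, empname, " + ", ".join(columns)
    + " from yourtable group by empnum, empname order by empnum"
)

# Step 3: run the generated statement.
cursor = conn.execute(sql_query)
header = [d[0] for d in cursor.description]
result = cursor.fetchall()
print(header)
for row in result:
    print(row)
```

Because the column names and literals are spliced into the statement text, the values should come from a trusted source or be validated first; bind parameters cannot be used for column aliases.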

Related

SQL count products

I have table:
name product
john beer
john milk
john tea
john beer
emily milk
emily milk
emily tea
john beer
I need to select from this table so that the output looks like this:
name count(tea) count(beer) count(milk) count(total)
john 1 3 1 5
emily 1 0 2 3
any idea how to do this?
DB: oracle 12
Use conditional aggregation:
select name,
sum(case when product = 'tea' then 1 else 0 end) cnt_tea,
sum(case when product = 'beer' then 1 else 0 end) cnt_beer,
sum(case when product = 'milk' then 1 else 0 end) cnt_milk,
count(*) total
from mytable
group by name
Depending on your database, there may be neater options available to express the conditional counts.
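One such option, which should work in most databases, is COUNT over a CASE expression with no ELSE branch: COUNT ignores NULLs, so only matching rows are counted. A minimal sketch using Python's sqlite3 module (standing in for Oracle here, which is an assumption) and the sample rows from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (name text, product text)")
conn.executemany(
    "insert into mytable values (?, ?)",
    [
        ("john", "beer"), ("john", "milk"), ("john", "tea"), ("john", "beer"),
        ("emily", "milk"), ("emily", "milk"), ("emily", "tea"), ("john", "beer"),
    ],
)

# CASE without ELSE yields NULL for non-matching rows, and COUNT skips NULLs.
count_sql = """
select name,
  count(case when product = 'tea'  then 1 end) as cnt_tea,
  count(case when product = 'beer' then 1 end) as cnt_beer,
  count(case when product = 'milk' then 1 end) as cnt_milk,
  count(*) as total
from mytable
group by name
order by name desc
"""
result = conn.execute(count_sql).fetchall()
for row in result:
    print(row)
```

This gives the same numbers as the SUM version: 1, 3, 1, 5 for john and 1, 0, 2, 3 for emily.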

simple sql over (partition by) not working as expected

Feels like it should be simple but my mind has gone blank so would appreciate any help!
Let's say I have this dataset
Date sale_id salesperson Missed_payment_this_month
01/01/2016 1001 John 1
01/01/2016 1002 Bob 0
01/01/2016 1003 Bob 0
01/01/2016 1004 John N/A
01/02/2016 1001 John 1
01/02/2016 1002 Bob 1
01/02/2016 1003 Bob 0
01/02/2016 1004 John 1
01/03/2016 1001 John 1
01/03/2016 1002 Bob 0
01/03/2016 1003 Bob 0
01/03/2016 1004 John 1
And I want to add these two columns to the end. They count the payments missed in previous months, by sale_id and by salesperson.
Previous_missed_payment_by_sale_id Previous_missed_payment_by_salesperson
0 0
0 0
0 0
0 0
1 1
0 0
0 0
0 1
2 3
1 1
0 1
1 3
The sale_id column is fine, but calculating it per salesperson either gives me a GROUP BY error or adds extra rows. I need to keep the number of rows constant.
My best guess that returns extra columns:
select t1.Date, t1.sale_id, t1.salesperson
,sum(case when t2.Missed_payment_this_month = '1' then 1 else 0 end) previous_missed_sales_id
,sum(case when t2.Missed_payment_this_month = '1' then 1 else 0 end) OVER (PARTITION by t1.salesperson) previous_missed_salesperson
from [dbo].[simple_join_table2] t1
inner join [dbo].[simple_join_table2] t2 on
(t2.[Date] < t1.[Date] AND t1.[sale_id] = t2.[sale_id])
group by t1.Date, t1.sale_id, t1.salesperson
,case when t2.Missed_payment_this_month = '1' then 1 else 0 end
this is the output:
Date sale_id salesperson previous_missed_sales_id previous_missed_salesperson
01/02/2016 1002 Bob 0 1
01/02/2016 1003 Bob 0 1
01/03/2016 1002 Bob 0 1
01/03/2016 1002 Bob 1 1
01/03/2016 1003 Bob 0 1
01/02/2016 1001 John 1 3
01/02/2016 1004 John 0 3
01/03/2016 1001 John 2 3
01/03/2016 1004 John 0 3
01/03/2016 1004 John 1 3
Is this possible without another subquery? Another way to put it: I'm trying to mimic the SUMX and EARLIER functions of Power Pivot.
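If you are on SQL Server 2012 or later, use windowed aggregates: previous = (sum of all rows up to and including the current date) - (sum for the current date). When ORDER BY is given without an explicit frame, the SQL Server default window frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so all rows that share the current date are included in the running sum.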
with [simple_join_table2] as(
-- sample data
select cast(valuesDate as Date) valuesDate, sale_id, salesperson, Missed_payment_this_month
from (
values
('20160101',1001,'John', 1)
,('20160101',1002,'Bob ', 0)
,('20160101',1003,'Bob ', 0)
,('20160101',1004,'John',null)
,('20160201',1001,'John', 1)
,('20160201',1002,'Bob ', 1)
,('20160201',1003,'Bob ', 0)
,('20160201',1004,'John', 1)
,('20160301',1001,'John', 1)
,('20160301',1002,'Bob ', 0)
,('20160301',1003,'Bob ', 0)
,('20160301',1004,'John', 1)
) t(valuesDate, sale_id, salesperson, Missed_payment_this_month)
)
select valuesDate,sale_id, salesperson, Missed_payment_this_month,
byidprevmonth = sum(Missed_payment_this_month ) over(partition by sale_id order by valuesDate)
- sum(Missed_payment_this_month) over(partition by valuesDate, sale_id),
bypersonprevmonth = sum(Missed_payment_this_month) over(partition by salesperson order by valuesDate)
- sum(Missed_payment_this_month) over(partition by valuesDate, salesperson)
from [simple_join_table2]
order by salesperson, valuesDate
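To see the "running sum minus current date" trick produce the columns asked for in the question, here is a minimal sketch run through Python's sqlite3 module. SQLite (3.25 or later) is only a stand-in for SQL Server; it uses the same default frame when ORDER BY is present. The table name sales, the column name missed, and the COALESCE calls (added so the NULL row yields 0 instead of NULL) are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales (valuesDate text, sale_id int, salesperson text, missed int)"
)
conn.executemany(
    "insert into sales values (?, ?, ?, ?)",
    [
        ("2016-01-01", 1001, "John", 1), ("2016-01-01", 1002, "Bob", 0),
        ("2016-01-01", 1003, "Bob", 0), ("2016-01-01", 1004, "John", None),
        ("2016-02-01", 1001, "John", 1), ("2016-02-01", 1002, "Bob", 1),
        ("2016-02-01", 1003, "Bob", 0), ("2016-02-01", 1004, "John", 1),
        ("2016-03-01", 1001, "John", 1), ("2016-03-01", 1002, "Bob", 0),
        ("2016-03-01", 1003, "Bob", 0), ("2016-03-01", 1004, "John", 1),
    ],
)

# Running sum up to and including the current date (default RANGE frame),
# minus the sum for the current date alone, leaves only the earlier dates.
window_sql = """
select valuesDate, sale_id,
  coalesce(sum(missed) over (partition by sale_id order by valuesDate), 0)
    - coalesce(sum(missed) over (partition by valuesDate, sale_id), 0)
    as by_id_prev,
  coalesce(sum(missed) over (partition by salesperson order by valuesDate), 0)
    - coalesce(sum(missed) over (partition by valuesDate, salesperson), 0)
    as by_person_prev
from sales
order by valuesDate, sale_id
"""
result = [(r[2], r[3]) for r in conn.execute(window_sql)]
for pair in result:
    print(pair)
```

The twelve pairs printed line up with the two columns listed in the question, row by row, and no rows are added or removed.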

Removing Duplicates Based date

I have the following query (which I will turn into an update statement) to select the duplicates to remove, based on the minimum service date, while keeping the row with the most recent service date.
select st.SubID, st.RecordNo, st.Fname, st.Lname, st.MemberID, st.ServiceDate, IsDeduped, DedupCriteria
from stagingtable st
join (select MemberID
from stagingtable
where SubID = 99999
and waveseqid = 1
group by MemberID
having count(*) > 1) st2
on st.MemberID = st2.MemberID
and st.ServiceDate = (Select min(ServiceDate) from stagingtable s where s.subid = 99999 and s.waveseqid = 1 and st.MemberID = s.MemberID)
where SubID = 99999
and waveseqid = 1
order by RecordNo
This sometimes seems to pull in only the rows that share the same date for a MemberID:
SurveyID RecordNo Fname Lname MemberID Option9 IsDeduped DedupCriteria
99999 1 John Doe 123 10/1/2015 0 NULL x These show on the query
99999 2 John Doe 123 10/1/2015 0 NULL x These show on the query
99999 3 John Doe 123 10/8/2015 0 NULL But expected these as well
99999 4 John Doe 123 10/12/2015 0 NULL But expected these as well
99999 4 John Doe 123 10/14/2015 0 NULL But expected these as well
99999 6 John Doe 123 10/29/2015 0 NULL But expected these as well
99999 7 John Doe 123 12/14/2015 0 NULL But expected these as well
Your "AND" condition restricts the results to only the rows with the minimum service date:
and st.ServiceDate = (Select min(ServiceDate) from stagingtable s where s.subid = 99999 and s.waveseqid = 1 and st.MemberID = s.MemberID)
That's why you get two rows and not all of them.
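The effect can be reproduced with a small sketch. It assumes the goal is to flag every row except the most recent one per MemberID, which is one reading of the question; the simplified table and rows below are hypothetical and use ISO date strings so that text comparison sorts correctly in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table stagingtable (RecordNo int, MemberID int, ServiceDate text)"
)
# Hypothetical rows: two share the earliest date, one is the latest.
conn.executemany(
    "insert into stagingtable values (?, ?, ?)",
    [
        (1, 123, "2015-10-01"),
        (2, 123, "2015-10-01"),
        (3, 123, "2015-10-08"),
        (4, 123, "2015-10-12"),
        (5, 123, "2015-12-14"),
    ],
)

# Original condition: equality with MIN keeps only the earliest date.
min_only = [r[0] for r in conn.execute("""
    select st.RecordNo
    from stagingtable st
    where st.ServiceDate = (select min(s.ServiceDate)
                            from stagingtable s
                            where s.MemberID = st.MemberID)
    order by st.RecordNo
""")]

# Alternative: everything older than the latest date for the member.
older_than_latest = [r[0] for r in conn.execute("""
    select st.RecordNo
    from stagingtable st
    where st.ServiceDate < (select max(s.ServiceDate)
                            from stagingtable s
                            where s.MemberID = st.MemberID)
    order by st.RecordNo
""")]

print(min_only)
print(older_than_latest)
```

The first query returns only the two rows dated 2015-10-01, while the second returns every row except the most recent one. If several rows can share the latest date, a tie-breaker such as RecordNo would also be needed.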

Multiple joins with aggregates

I have the two following tables:
Person:
EntityId FirstName LastName
----------- ------------------ -----------------
1 Ion Ionel
2 Fane Fanel
3 George Georgel
4 Mircea Mircel
SalesQuotaHistory
SalesQuotaId EntityId SalesQuota SalesOrderDate
------------ ----------- ----------- -----------------------
1 1 1000 2014-01-01 00:00:00.000
2 1 1000 2014-01-02 00:00:00.000
3 1 1000 2014-01-03 00:00:00.000
4 3 3000 2013-01-01 00:00:00.000
5 3 3000 2013-01-01 00:00:00.000
7 4 4000 2015-01-01 00:00:00.000
8 4 4000 2015-01-02 00:00:00.000
9 4 4000 2015-01-03 00:00:00.000
10 1 1000 2015-01-01 00:00:00.000
11 1 1000 2015-01-02 00:00:00.000
I am trying to get the SalesQuota for each user in 2014 and 2015.
Using this query, I am getting an erroneous result:
SELECT p.EntityId
, p.FirstName
, SUM(sqh2014.SalesQuota) AS '2014'
, SUM(sqh2015.SalesQuota) AS '2015'
FROM Person p
LEFT OUTER JOIN SalesQuotaHistory sqh2014
ON p.EntityId = sqh2014.EntityId
AND YEAR(sqh2014.SalesOrderDate) = 2014
LEFT OUTER JOIN SalesQuotaHistory sqh2015
ON p.EntityId = sqh2015.EntityId
AND YEAR(sqh2015.SalesOrderDate) = 2015
GROUP BY p.EntityId, p.FirstName
EntityId FirstName 2014 2015
--------- ----------- ---------- --------------------
1 Ion 6000 6000
2 Fane NULL NULL
3 George NULL NULL
4 Mircea NULL 12000
In fact, Id 1 has a total SalesQuota of 3000 in 2014 and 2000 in 2015.
What I am asking here is: what is really happening behind the scenes? What is the order of operations in this specific case?
Thanks to my last post, I was able to solve this using the following query:
SELECT p.EntityId
, p.FirstName
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2014 THEN sqh.SalesQuota ELSE 0 END) AS '2014'
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2015 THEN sqh.SalesQuota ELSE 0 END) AS '2015'
FROM Person p
LEFT OUTER JOIN SalesQuotaHistory sqh
ON p.EntityId = sqh.EntityId
GROUP BY p.EntityId, p.FirstName
EntityId FirstName 2014 2015
----------- --------------------- ----------- -----------
1 Ion 3000 2000
2 Fane 0 0
3 George 0 0
4 Mircea 0 12000
But without understanding what is wrong with the first attempt, I can't get over this.
Any explanation would be greatly appreciated.
It is easy to see what is happening if you change your select to
SELECT *
and remove the GROUP BY.
Your first approach needs something like this:
Sql Fiddle Demo
SELECT p.[EntityId]
, p.FirstName
, COALESCE(s2014,0) as [2014]
, COALESCE(s2015,0) as [2015]
FROM Person p
LEFT JOIN (SELECT EntityId, SUM(SalesQuota) s2014
FROM SalesQuotaHistory
WHERE YEAR(SalesOrderDate) = 2014
GROUP BY EntityId
) as s1
ON p.[EntityId] = s1.EntityId
LEFT JOIN (SELECT EntityId, SUM(SalesQuota) s2015
FROM SalesQuotaHistory
WHERE YEAR(SalesOrderDate) = 2015
GROUP BY EntityId
) as s2
ON p.[EntityId] = s2.EntityId
Each subquery aggregates one year first, so the join adds at most one row per person, and only if data exists for that id and year.
OUTPUT
| EntityId | FirstName | 2014 | 2015 |
|----------|-----------|------|-------|
| 1 | Ion | 3000 | 2000 |
| 2 | Fane | 0 | 0 |
| 3 | George | 0 | 0 |
| 4 | Mircea | 0 | 12000 |
You have multiple rows for each year, so the first method is producing a Cartesian product.
For instance, consider EntityId 1:
1 1 1000 2014-01-01 00:00:00.000
2 1 1000 2014-01-02 00:00:00.000
3 1 1000 2014-01-03 00:00:00.000
10 1 1000 2015-01-01 00:00:00.000
11 1 1000 2015-01-02 00:00:00.000
The intermediate result from the join produces six rows, with these SalesQuotaId:
1 10
1 11
2 10
2 11
3 10
3 11
You can then do the math -- the result is off because of the multiple rows.
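The arithmetic can be checked with a minimal sketch run through Python's sqlite3 module. SQLite has no YEAR() function, so strftime('%Y', ...) stands in for it here; the aliases s14 and s15 and the date-only strings are likewise assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Person (EntityId int, FirstName text);
create table SalesQuotaHistory
  (SalesQuotaId int, EntityId int, SalesQuota int, SalesOrderDate text);
insert into Person values (1, 'Ion');
insert into SalesQuotaHistory values
  (1, 1, 1000, '2014-01-01'),
  (2, 1, 1000, '2014-01-02'),
  (3, 1, 1000, '2014-01-03'),
  (10, 1, 1000, '2015-01-01'),
  (11, 1, 1000, '2015-01-02');
""")

# The double join from the question: every 2014 row pairs with every 2015 row.
joined = """
from Person p
left join SalesQuotaHistory s14
  on p.EntityId = s14.EntityId and strftime('%Y', s14.SalesOrderDate) = '2014'
left join SalesQuotaHistory s15
  on p.EntityId = s15.EntityId and strftime('%Y', s15.SalesOrderDate) = '2015'
"""

pairs = conn.execute(
    "select s14.SalesQuotaId, s15.SalesQuotaId " + joined + " order by 1, 2"
).fetchall()
totals = conn.execute(
    "select sum(s14.SalesQuota), sum(s15.SalesQuota) " + joined
).fetchone()

print(pairs)
print(totals)
```

Three 2014 rows times two 2015 rows gives six joined rows, so each 2014 quota is counted twice (3000 x 2 = 6000) and each 2015 quota three times (2000 x 3 = 6000), which is the 6000 / 6000 seen in the question.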
You seem to know how to fix the problem. The conditional aggregation approach produces the correct answer.
You could improve the speed of your query by filtering on only the years you are looking for. Put the filter in the join condition rather than in a WHERE clause; a WHERE condition on sqh would discard the persons who have no rows in those years:
SELECT p.EntityId
, p.FirstName
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2014
THEN sqh.SalesQuota ELSE 0 END) AS '2014'
, SUM(CASE WHEN YEAR(sqh.SalesOrderDate) = 2015
THEN sqh.SalesQuota ELSE 0 END) AS '2015'
FROM Person p
LEFT OUTER JOIN SalesQuotaHistory sqh
ON p.EntityId = sqh.EntityId
AND YEAR(sqh.SalesOrderDate) IN (2014, 2015)
GROUP BY p.EntityId, p.FirstName
Otherwise, the query that you found is the way to go (good job!)
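Where the year filter is placed matters with a LEFT JOIN: in a WHERE clause it removes persons with no matching rows, while in the ON clause it keeps them with zero totals. A minimal sketch of the difference, run through Python's sqlite3 module with strftime('%Y', ...) standing in for YEAR() (an assumption, since SQLite has no YEAR function):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Person (EntityId int, FirstName text);
create table SalesQuotaHistory
  (SalesQuotaId int, EntityId int, SalesQuota int, SalesOrderDate text);
insert into Person values (1, 'Ion'), (2, 'Fane'), (3, 'George'), (4, 'Mircea');
insert into SalesQuotaHistory values
  (1, 1, 1000, '2014-01-01'), (2, 1, 1000, '2014-01-02'),
  (3, 1, 1000, '2014-01-03'), (4, 3, 3000, '2013-01-01'),
  (5, 3, 3000, '2013-01-01'), (7, 4, 4000, '2015-01-01'),
  (8, 4, 4000, '2015-01-02'), (9, 4, 4000, '2015-01-03'),
  (10, 1, 1000, '2015-01-01'), (11, 1, 1000, '2015-01-02');
""")

# Same query twice; only the keyword in front of the year filter changes.
base = """
select p.EntityId,
  sum(case when strftime('%Y', s.SalesOrderDate) = '2014'
           then s.SalesQuota else 0 end) as y2014,
  sum(case when strftime('%Y', s.SalesOrderDate) = '2015'
           then s.SalesQuota else 0 end) as y2015
from Person p
left join SalesQuotaHistory s on p.EntityId = s.EntityId
{keyword} strftime('%Y', s.SalesOrderDate) in ('2014', '2015')
group by p.EntityId
order by p.EntityId
"""

with_where = conn.execute(base.format(keyword="where")).fetchall()
with_on = conn.execute(base.format(keyword="and")).fetchall()

print(with_where)
print(with_on)
```

With WHERE, only EntityId 1 and 4 come back; with the filter in the ON clause, all four persons appear and Fane and George show 0 for both years.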

SQL query to pivot a column using CASE WHEN

I have the following table:
Bank:
name val amount
John 1 2000
Peter 1 1999
Peter 2 1854
John 2 1888
I am trying to write an SQL query to give the following result:
name amountVal1 amountVal2
John 2000 1888
Peter 1999 1854
So far I have this:
SELECT name,
CASE WHEN val = 1 THEN amount ELSE 0 END AS amountVal1,
CASE WHEN val = 2 THEN amount ELSE 0 END AS amountVal2
FROM bank
However, it gives the slightly wrong result:
name amountVal1 amountVal2
John 2000 0
Peter 1999 0
John 0 1888
Peter 0 1854
How can I modify my query to give the correct presentation?
Thanks
SELECT
name,
SUM(CASE WHEN val = 1 THEN amount ELSE 0 END) AS amountVal1,
SUM(CASE WHEN val = 2 THEN amount ELSE 0 END) AS amountVal2
FROM bank GROUP BY name
Looks like you need to join the table to itself. Try this:
select bank1.name, bank1.amount, bank2.amount
from bank bank1
inner join bank bank2 on
bank1.name = bank2.name
and bank1.val = 1
and bank2.val = 2
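The two answers agree on the sample data but differ when a name lacks one of the two val rows. A minimal sketch with Python's sqlite3 module, where the extra row for Mary is hypothetical and added only to show the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table bank (name text, val int, amount int)")
conn.executemany(
    "insert into bank values (?, ?, ?)",
    [
        ("John", 1, 2000),
        ("Peter", 1, 1999),
        ("Peter", 2, 1854),
        ("John", 2, 1888),
        ("Mary", 1, 500),  # hypothetical: has val = 1 but no val = 2
    ],
)

# Conditional aggregation: one row per name, missing values become 0.
grouped = conn.execute("""
    select name,
      sum(case when val = 1 then amount else 0 end) as amountVal1,
      sum(case when val = 2 then amount else 0 end) as amountVal2
    from bank
    group by name
    order by name
""").fetchall()

# Inner self-join: a name needs both a val = 1 and a val = 2 row to appear.
self_joined = conn.execute("""
    select bank1.name, bank1.amount, bank2.amount
    from bank bank1
    inner join bank bank2
      on bank1.name = bank2.name
     and bank1.val = 1
     and bank2.val = 2
    order by bank1.name
""").fetchall()

print(grouped)
print(self_joined)
```

The grouped query keeps Mary with 0 in amountVal2, while the inner self-join drops her. The self-join would also return several rows for a name that has more than one row for the same val.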