SQL Azure - Create complex Pivot Table - sql

This question is all for SQL Azure. I have a data set for various commodity prices by year and a unit price like:
Rice - 2007 - 0.5
Rice - 2007 - 0.3
Rice - 2007 - 0.8
Wheat - 2006 - 1.1
Wheat - 2006 - 1.4
etc
How can I create a pivot table that gives me the MAX and MIN price paid for each year for each commodity? I know how to do a pivot table that would give me something like the average; that's pretty easy. But I need my "main" pivot column to be the year, and then each year would have its 2 "sub columns" for a MIN and MAX price, and I'm not quite sure how to do that. Help!

Unless I am missing something in your explanation, you can do this easily without the PIVOT function:
select product,
year,
min(price) MinPrice,
max(price) MaxPrice
from yourtable
group by product, year
If you want the data in separate columns, then there are a few ways that you can do this.
Aggregate function with CASE:
select product,
coalesce(min(case when year=2006 then price end), 0) [2006_MinPrice],
coalesce(max(case when year=2006 then price end), 0) [2006_MaxPrice],
coalesce(min(case when year=2007 then price end), 0) [2007_MinPrice],
coalesce(max(case when year=2007 then price end), 0) [2007_MaxPrice]
from yourtable
group by product
UNPIVOT and PIVOT:
The UNPIVOT is used to transform your column data into rows. Once in the rows, you can create the new columns with the year and then pivot:
select *
from
(
select product,
cast(year as varchar(4))+'_'+col as piv_col,
value
from
(
select product,
year,
min(price) MinPrice,
max(price) MaxPrice
from yourtable
group by product, year
) x
unpivot
(
value for col in (minPrice, maxPrice)
) u
) d
pivot
(
max(value)
for piv_col in ([2006_MinPrice], [2006_MaxPrice],
[2007_MinPrice], [2007_MaxPrice])
) piv;
These give the result (the PIVOT version returns NULL rather than 0 for a year with no rows):
| PRODUCT | 2006_MINPRICE | 2006_MAXPRICE | 2007_MINPRICE | 2007_MAXPRICE |
---------------------------------------------------------------------------
| Rice | 0 | 0 | 0.3 | 0.8 |
| Wheat | 1.1 | 1.4 | 0 | 0 |
If you have an unknown number of years, then you could also implement dynamic SQL.
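As a rough sketch of the CASE approach, and of generating the column list when the years are not known up front, the query can be assembled from a list of years. This runs against an in-memory SQLite database from Python rather than SQL Azure, purely so the example is self-contained. The helper name pivot_min_max is hypothetical, and the table and column names follow the answer above. Here the CASE has no ELSE, so a year with no rows comes back as NULL (None in Python).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (product TEXT, year INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [("Rice", 2007, 0.5), ("Rice", 2007, 0.3), ("Rice", 2007, 0.8),
     ("Wheat", 2006, 1.1), ("Wheat", 2006, 1.4)],
)

def pivot_min_max(conn, years):
    """Hypothetical helper: one MIN and one MAX column per requested year."""
    cols = []
    for y in years:
        y = int(y)  # years are spliced into the SQL text, so force them to int
        cols.append(f'MIN(CASE WHEN year = {y} THEN price END) AS "{y}_MinPrice"')
        cols.append(f'MAX(CASE WHEN year = {y} THEN price END) AS "{y}_MaxPrice"')
    sql = (
        f"SELECT product, {', '.join(cols)} "
        "FROM yourtable GROUP BY product ORDER BY product"
    )
    return conn.execute(sql).fetchall()

rows = pivot_min_max(conn, [2006, 2007])
for row in rows:
    print(row)
```

In SQL Server the same idea would be expressed by building the statement as a string and executing it, with the list of years read from the table first.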

Related

Cumulative Sum Query in SQL table with distinct elements

I have a table like this, with column names as Date of Sale and insurance Salesman Names -
Date of Sale | Salesman Name | Sale Amount
2021-03-01 | Jack | 40
2021-03-02 | Mark | 60
2021-03-03 | Sam | 30
2021-03-03 | Mark | 70
2021-03-02 | Sam | 100
I want to do a group by, using the date of sale. The next column should display the cumulative count of the sellers who have made the sale till that date. But same sellers shouldn't be considered again.
For example,
The following table is incorrect,
Date of Sale | Count(Salesman Name) | Sum(Sale Amount)
2021-03-01 | 1 | 40
2021-03-02 | 3 | 200
2021-03-03 | 5 | 300
The following table is correct,
Date of Sale | Count(Salesman Name) | Sum(Sale Amount)
2021-03-01 | 1 | 40
2021-03-02 | 3 | 200
2021-03-03 | 3 | 300
I am not sure how to frame the SQL query, because there are two conditions involved here, cumulative count while ignoring the duplicates. I think the OVER clause along with the unbounded row preceding may be of some use here? Request your help
Edit - I have added the Sale Amount as a column. I need the cumulative sum for the Sale Amount also. But in this case, all the sale amounts should be considered, unlike the salesman name case where only unique names were being considered.
One approach uses a self join and aggregation:
WITH cte AS (
SELECT t1.SaleDate,
COUNT(CASE WHEN t2.Salesman IS NULL THEN 1 END) AS cnt,
SUM(t1.SaleAmount) AS amt
FROM yourTable t1
LEFT JOIN yourTable t2
ON t2.Salesman = t1.Salesman AND
t2.SaleDate < t1.SaleDate
GROUP BY t1.SaleDate
)
SELECT
SaleDate,
SUM(cnt) OVER (ORDER BY SaleDate) AS NumSalesman,
SUM(amt) OVER (ORDER BY SaleDate) AS TotalAmount
FROM cte
ORDER BY SaleDate;
The logic in the CTE is that we try to find, for each salesman, an earlier record for the same salesman. If we can't find such a record, then we assume the record in question is the first appearance. Then we aggregate by date to get the counts per day, and finally take a rolling sum of counts in the outer query.
The best way to do this uses window functions to determine the first time a sales person appears. Then, you just want cumulative sums:
select saledate,
sum(sum(case when seqnum = 1 then 1 else 0 end)) over (order by saledate) as num_salespersons,
sum(sum(sales)) over (order by saledate) as running_sales
from (select t.*,
row_number() over (partition by salesperson order by saledate) as seqnum
from t
) t
group by saledate
order by saledate;
Note that, in addition to being more concise, this should have much better performance than a solution that uses a self-join.
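A minimal sketch of the window-function idea, run on the sample data from the question. It uses Python's sqlite3 module as a stand-in database (an assumption: SQLite 3.25 or later, which is when window functions were added). The per-day aggregation and the running sums are split into two CTEs here for readability. The CTE names firsts and per_day are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (saledate TEXT, salesperson TEXT, sales INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("2021-03-01", "Jack", 40),
    ("2021-03-02", "Mark", 60),
    ("2021-03-03", "Sam", 30),
    ("2021-03-03", "Mark", 70),
    ("2021-03-02", "Sam", 100),
])

query = """
WITH firsts AS (
    -- seqnum = 1 marks the first sale of each salesperson
    SELECT saledate, sales,
           ROW_NUMBER() OVER (PARTITION BY salesperson ORDER BY saledate) AS seqnum
    FROM t
),
per_day AS (
    -- new salespersons and total sales for each day
    SELECT saledate,
           SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) AS new_salespersons,
           SUM(sales) AS day_sales
    FROM firsts
    GROUP BY saledate
)
SELECT saledate,
       SUM(new_salespersons) OVER (ORDER BY saledate) AS num_salespersons,
       SUM(day_sales) OVER (ORDER BY saledate) AS running_sales
FROM per_day
ORDER BY saledate
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The output matches the table the question marks as correct: 1 / 40, then 3 / 200, then 3 / 300.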

sql query to get difference of sum over two columns spread across two tables grouped by month [closed]

I want to write a query to get difference of sum over two columns spread across two tables grouped by month.
Schema:
TableA
mass numeric,
weight numeric,
sampleDt date
TableB
mass numeric,
weight numeric,
sampleDt date
Sample Data for Table A
100|200|2017-01-03
10 |20 |2017-01-05
200|400|2017-12-23
Sample Data for Table B
10 | 20 | 2017-01-20
10 | 20 | 2017-01-21
100 | 200 | 2017-12-12
2 | 4 | 2017-06-12
Expected Output
Month,Year |AMassTotal |AWeightTotal |BMassTotal |BWeightTotal |AMassTotal-BMassTotal
Jan,17 | 110 | 220 | 20 | 40 | 90
Jun,17 | 0 | 0 | 2 | 4 | -2
Dec,17 | 200 | 400 |100 |200 | 100
Use a full outer join and a group by:
select to_char(sampledt, 'yyyy-mm') as month_year,
coalesce(sum(a.mass),0) as a_mass_total,
coalesce(sum(a.weight),0) as a_weight_total,
coalesce(sum(b.mass),0) as b_mass_total,
coalesce(sum(b.weight),0) as b_weight_total,
coalesce(sum(a.mass),0) - coalesce(sum(b.mass),0) as mass_total_diff
from table_a a
full join table_b b using (sampledt)
group by to_char(sampledt, 'yyyy-mm');
If you want year and month in separate columns, you can use:
select extract(year from sampledt) as year,
extract(month from sampledt) as month,
coalesce(sum(a.mass),0) as a_mass_total,
coalesce(sum(a.weight),0) as a_weight_total,
coalesce(sum(b.mass),0) as b_mass_total,
coalesce(sum(b.weight),0) as b_weight_total,
coalesce(sum(a.mass),0) - coalesce(sum(b.mass),0) as mass_total_diff
from table_a a
full join table_b b using (sampledt)
group by extract(year from sampledt), extract(month from sampledt)
order by 1,2;
Except for the to_char() function the above is ANSI standard SQL.
Online example: http://rextester.com/YAN23912
For MySQL
SELECT TB.`Month,Year`,
IFNULL(AMassTotal,0),
IFNULL(AWeightTotal,0),
IFNULL(BMassTotal,0),
IFNULL(BWeightTotal,0),
(IFNULL(AMassTotal,0)-IFNULL(BMassTotal,0)) AS 'AMassTotal-BMassTotal'
FROM
(
SELECT DATE_FORMAT(sampleDt,'%M,%Y') AS `Month,Year`,
SUM(CASE WHEN mass IS NULL THEN 0 ELSE mass END) AS AMassTotal,
SUM(CASE WHEN weight IS NULL THEN 0 ELSE weight END) AS AWeightTotal
From TableA
GROUP BY DATE_FORMAT(sampleDt,'%M,%Y')
) AS TA
RIGHT JOIN
(
SELECT DATE_FORMAT(sampleDt,'%M,%Y') AS `Month,Year`,
SUM(CASE WHEN mass IS NULL THEN 0 ELSE mass END) AS BMassTotal,
SUM(CASE WHEN weight IS NULL THEN 0 ELSE weight END) AS BWeightTotal
FROM TableB
GROUP BY DATE_FORMAT(sampleDt,'%M,%Y')
) AS TB
ON TA.`Month,Year`=TB.`Month,Year`
Live Demo
http://sqlfiddle.com/#!9/2a6e24/22
Try this:
This works in SQL Server:
SELECT CONVERT(CHAR(3), sampleDt, 0)+','+CAST(DATEPART(YEAR,sampleDt) AS VARCHAR) [Month,Year]
,ISNULL(SUM(CASE WHEN D.Tab=1 THEN mass END),0) AMassTotal
,ISNULL(SUM(CASE WHEN D.Tab=1 THEN weight END),0) AWeightTotal
,ISNULL(SUM(CASE WHEN D.Tab=2 THEN mass END),0) BMassTotal
,ISNULL(SUM(CASE WHEN D.Tab=2 THEN weight END),0) BWeightTotal
,ISNULL(SUM(CASE WHEN D.Tab=1 THEN mass END),0)-ISNULL(SUM(CASE WHEN D.Tab=2 THEN mass END),0) [AMassTotal-BMassTotal]
FROM(
SELECT 1 AS Tab,* FROM TableA
UNION ALL
SELECT 2,* FROM TableB
)D
GROUP BY CONVERT(CHAR(3), sampleDt, 0)+','+CAST(DATEPART(YEAR,sampleDt) AS VARCHAR)
select CONVERT(CHAR(3), GETDATE(), 0)
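For comparison, here is a small runnable sketch of the UNION ALL plus conditional aggregation idea on the sample data from the question. SQLite (via Python's sqlite3) stands in for the real database, so strftime is used for the month key, and the 'A'/'B' tag column tab is an illustrative choice. Each side is wrapped in COALESCE separately, so a month that exists in only one table still gets a difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (mass INTEGER, weight INTEGER, sampleDt TEXT)")
conn.execute("CREATE TABLE TableB (mass INTEGER, weight INTEGER, sampleDt TEXT)")
conn.executemany("INSERT INTO TableA VALUES (?, ?, ?)", [
    (100, 200, "2017-01-03"), (10, 20, "2017-01-05"), (200, 400, "2017-12-23"),
])
conn.executemany("INSERT INTO TableB VALUES (?, ?, ?)", [
    (10, 20, "2017-01-20"), (10, 20, "2017-01-21"),
    (100, 200, "2017-12-12"), (2, 4, "2017-06-12"),
])

query = """
SELECT strftime('%Y-%m', sampleDt) AS month_year,
       COALESCE(SUM(CASE WHEN tab = 'A' THEN mass END), 0)   AS a_mass_total,
       COALESCE(SUM(CASE WHEN tab = 'A' THEN weight END), 0) AS a_weight_total,
       COALESCE(SUM(CASE WHEN tab = 'B' THEN mass END), 0)   AS b_mass_total,
       COALESCE(SUM(CASE WHEN tab = 'B' THEN weight END), 0) AS b_weight_total,
       COALESCE(SUM(CASE WHEN tab = 'A' THEN mass END), 0)
         - COALESCE(SUM(CASE WHEN tab = 'B' THEN mass END), 0) AS mass_total_diff
FROM (
    -- tag each row with its source table, then stack the two tables
    SELECT 'A' AS tab, mass, weight, sampleDt FROM TableA
    UNION ALL
    SELECT 'B', mass, weight, sampleDt FROM TableB
) AS d
GROUP BY strftime('%Y-%m', sampleDt)
ORDER BY month_year
"""
monthly = conn.execute(query).fetchall()
for row in monthly:
    print(row)
```

Because the rows are stacked rather than joined, no row can be duplicated by a join, and months that appear in only one table are kept.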

Calculating % of single row to sum of rows - Teradata SQL

I am trying to calculate the % contribution of the a's in the "qual" column to the total sales for each loc.
loc | qual | sales
- - - - - - - - - - - -
us | a | 1,000
us | b | 500
gc | a | 200
gc | b | 400
So the answer that I would be looking for is US = 66.66% (1,000/1,500) and gc = 33.33% (200/600). So the return result would be....
loc | Pct
us | 66.66%
gc | 33.33%
Thanks!
You can do this with aggregation and window functions:
select loc, count(*) as sales, sum(count(*)) over () as total_sales,
count(*) / sum(1.0*count(*)) over () as sales_proportion
from t
group by loc;
If sales is really an integer, you should convert it to a floating point or decimal representation (you can just multiply by 1.0).
EDIT:
Oops, the above does something useful, but not what the OP asks for. Here is the simplest method:
select loc,
sum(case when qual = 'a' then sales * 1.0 else 0 end) / sum(sales) as proportion_a
from t
group by loc;
You need a conditional aggregate:
select loc,
100.00
* sum(case when qual = 'a' then sales else 0 end) -- only 'a' sales
/ sum(sales) as Pct -- all sales
from tab
group by loc
Change the precision of 100.00 * according to your needs.
Caution: unless the datatype of sales is FLOAT or NUMBER, you must multiply by 100 first and then divide.
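A quick way to check the conditional aggregate against the numbers in the question is to run it on the four sample rows. The sketch below uses Python's sqlite3 module rather than Teradata (an assumption for the sake of a self-contained example), with the table name tab taken from the answer above. The literal 100.0 comes first so the division is not done in integer arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (loc TEXT, qual TEXT, sales INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", [
    ("us", "a", 1000), ("us", "b", 500),
    ("gc", "a", 200), ("gc", "b", 400),
])

rows = conn.execute("""
    SELECT loc,
           100.0 * SUM(CASE WHEN qual = 'a' THEN sales ELSE 0 END)  -- only 'a' sales
                 / SUM(sales) AS pct                                 -- all sales
    FROM tab
    GROUP BY loc
    ORDER BY loc
""").fetchall()

# Round for display only; the query itself returns the full-precision value.
pct_by_loc = {loc: round(pct, 2) for loc, pct in rows}
print(pct_by_loc)
```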
To calculate the percentage, you need the total sales per loc. But you don't want to group by loc and aggregate, because that collapses the rows you want to keep, and inner joining the aggregated result back to the original table is clumsy. There's an easier way: a correlated subquery. You can perform a subselect that sums the sales for the loc of the current row. The following code should give you what you want.
select loc,
sales * 100.00 / (
select sum(sales)
from my_table sq
where sq.loc = t.loc
) as pct
from my_table t
where qual = 'a'
Hope this helps!

Group values in to categories with annual break down

I have a table of licence applications, and I want to display the data by category for each financial year.
For my query, there are 2 key columns.
Firstly, there is a fee column and the values within this column determine the type of licence.
Between 0 and 300 is Minor
between 300 and 600 is Standard
between 600 and 2000 is Major
Secondly, there is a date field which is to be used for the financial year.
I would like the results to look like this.
Category | 2013/14 | 2012/13
Minor | 23 | 21
Standard | 10 | 11
Major | 5 | 3
I have this query below, but I can't get it right for the year part.
Would really appreciate any advice people can give me.
select category.gr as [category],
sum(case when ((year(licence.[start_date]) in ('2010'))
and (month(licence.[start_date]) in (4,5,6,7,8,9,10,11,12)))
or ((year(licence.[start_date]) in ('2011'))
and (month(licence.[start_date]) in (1,2,3))) then 1 else 0 end) AS '10/11 Count',
from ( select case
when [fee_INC] between 0 and 350 then 'Minor'
when [fee_INC] between 350 and 600 then 'Standard'
else 'Major' end as gr
from [L_LICENCE_FIN]) as category,
from [L_LICENCE_FIN] as licence
group by category.gr
SELECT
[category],
[2013/14],
[2012/13]
FROM (
SELECT
[category],
STR(YEAR(DATEADD(month,-3,[start_date])),4)
+'/'
+RIGHT(STR(YEAR(DATEADD(month,-3,[start_date]))+1,4),2)
AS [fiscal_year],
COUNT(*) AS [count]
FROM #L_LICENCE_FIN
INNER JOIN (VALUES
( 0, 300, 'Minor'),
(300, 600, 'Standard'),
(600,2000, 'Major')
) categories([fee_min], [fee_max], [category])
ON ([fee] >= [fee_min] AND [fee] < [fee_max])
GROUP BY [category],[start_date]
) p1
PIVOT(SUM([count]) FOR [fiscal_year] IN ([2013/14],[2012/13])) p2
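The two pieces of logic in the query above, the fiscal-year label (shift the date back three months, then take the year) and the half-open fee bands, can be sketched outside SQL to make them easier to check. This assumes the financial year runs from 1 April to 31 March. The function names and the sample licences are hypothetical.

```python
from collections import Counter
from datetime import date

def fiscal_year_label(d: date) -> str:
    # Assumes the financial year runs 1 April to 31 March,
    # so January to March belong to the year that started the previous April.
    start_year = d.year if d.month >= 4 else d.year - 1
    return f"{start_year}/{(start_year + 1) % 100:02d}"

def fee_category(fee: float) -> str:
    # Half-open bands, so a fee of exactly 300 or 600 lands in one band only.
    if 0 <= fee < 300:
        return "Minor"
    if 300 <= fee < 600:
        return "Standard"
    if 600 <= fee < 2000:
        return "Major"
    raise ValueError(f"fee outside the known bands: {fee}")

# Hypothetical licences: (fee, start_date)
licences = [
    (120, date(2013, 4, 1)),
    (300, date(2014, 3, 31)),
    (650, date(2013, 3, 31)),
    (80, date(2012, 12, 15)),
]

# Count licences per (category, fiscal year), the same cells the PIVOT fills.
counts = Counter((fee_category(fee), fiscal_year_label(d)) for fee, d in licences)
for key, n in sorted(counts.items()):
    print(key, n)
```

Using half-open bands avoids the overlap that BETWEEN introduces at the boundaries, where a fee of exactly 300 would match two categories.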

Compare 2 subsets of data from table?

I'm not sure if this is possible - I'm having real trouble getting my head around it.
This is for a product schedule, showing how much we are expecting to deliver on a given date. Data is imported into this schedule weekly which creates a new entry.
For example, if the schedule for the day currently totals 10, and you import 15, a new row is inserted with Qty 5, bringing the sum to 15.
The data I have is like so:
Product | Delivery Required Date | Qty
Prod1 | 1/1/13 | 10
Prod1 | 1/1/13 | -10
Prod1 | 1/1/13 | 10
Prod1 | 1/1/13 | -10
Prod1 | 1/1/13 | 25
I want to design a query which shows the variance between the previous schedule, and the current schedule.
For example, the query will sum all of the rows "Qty", excluding the last entry - and compare it to the last entry. In the data above, the variance is 25 (Existing total was 0, latest entry is 25, 0+25 =25).
Is this possible?
Thanks
I suspect there's a better answer using Common Table Expressions, but a quick & ugly solution might be
select sum(Qty) as sumLessLast
from MyTable
where EntryNo <> (select max(EntryNo) from MyTable)
If MyTable has a million rows in it you'll want a better solution.
SqlServer 2005 and 2008:
;with r1 as (
select DeliveryReqDate, sum(Qty) as TotalQty
from TableName
group by DeliveryReqDate)
, r2 as (
select DeliveryReqDate, Qty
, row_number() over (partition by DeliveryReqDate order by EntryNo desc) rn
from TableName)
select r1.DeliveryReqDate, r1.TotalQty, r2.Qty as LastQty
, r1.TotalQty - r2.Qty as TotalButLastQty
from r1
join r2 on r2.DeliveryReqDate = r1.DeliveryReqDate and r2.rn = 1
SqlServer 2012
;with r1 as (
select DeliveryReqDate, Qty
, sum(Qty) over (partition by DeliveryReqDate) as TotalQty
, row_number() over (partition by DeliveryReqDate order by EntryNo desc) rn
from TableName)
select DeliveryReqDate, TotalQty, Qty as LastQty
, TotalQty - Qty as TotalButLastQty
from r1
where rn = 1
I'm not sure that I completely understand the logic regarding the accounting of product and date, but I hope you can adapt the above queries to your needs.
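To see the window-function version on the sample data from the question, here is a small sketch using Python's sqlite3 module in place of SQL Server (an assumption: SQLite 3.25 or later for window functions). As in the answers above, an EntryNo column that orders the imports is assumed. It is not shown in the question's data, so the values 1 to 5 are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableName (EntryNo INTEGER, Product TEXT, DeliveryReqDate TEXT, Qty INTEGER)"
)
conn.executemany("INSERT INTO TableName VALUES (?, ?, ?, ?)", [
    (1, "Prod1", "2013-01-01", 10),
    (2, "Prod1", "2013-01-01", -10),
    (3, "Prod1", "2013-01-01", 10),
    (4, "Prod1", "2013-01-01", -10),
    (5, "Prod1", "2013-01-01", 25),
])

query = """
WITH r1 AS (
    SELECT DeliveryReqDate, Qty,
           SUM(Qty) OVER (PARTITION BY DeliveryReqDate) AS TotalQty,
           -- rn = 1 is the most recent entry for each delivery date
           ROW_NUMBER() OVER (PARTITION BY DeliveryReqDate ORDER BY EntryNo DESC) AS rn
    FROM TableName
)
SELECT DeliveryReqDate, TotalQty, Qty AS LastQty,
       TotalQty - Qty AS TotalButLastQty
FROM r1
WHERE rn = 1
"""
variance = conn.execute(query).fetchall()
print(variance)
```

Here the previous schedule totals 0 and the latest entry is 25, which is the variance described in the question.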