Pivoting results in more rows than needed - sql

I have a query like this:
`SELECT [/BIC/IORSVPTX] as Region,
COUNTRY_ID,
[/BIC/IOWCNTRY] as Country,
[/BIC/IOC_TRLNO] as Trial,
[/BIC/IOWQUAL] as ResourceType,
case
when [/BIC/IOWQUAL] like '%Supporter%'
then 1
when [/BIC/IOWQUAL] like '%Monitor%'
then 3
when [/BIC/IOWQUAL] like '%PM%'
then 2
end as ResourceGroup,
[1], [2], [3], [4], [5]
FROM
(
SELECT [/BIC/IORSVPTX],
COUNTRY_ID,
[/BIC/IOWCNTRY],
[/BIC/IOC_TRLNO],
[/BIC/IOWQUAL], case
when [/BIC/IOWQUAL] like '%Supporter%'
then 1
when [/BIC/IOWQUAL] like '%Monitor%'
then 3
when [/BIC/IOWQUAL] like '%PM%'
then 2
end as ResourceGroup,
left(CALMONTH,4) as StartYear,
right(CALMONTH,2) as StartMonth,
((left(CALMONTH,4) - 2013) * 12) + right(CALMONTH,2) AS YearMonth,
QUANTITY as Hours
FROM dbo.Actuals
where [/BIC/IOC_TRLNO]<>'0000' and left(CALMONTH,4)>2012 and COUNTRY_ID='10'
and ([/BIC/IOWQUAL] like '%PM%' or [/BIC/IOWQUAL] like'%Monitor%' or [/BIC/IOWQUAL] like '%Supporter%')
) up
PIVOT (sum(Hours) FOR YearMonth IN ([1],[2],[3],[4],[5])) AS pvt;`
This gives me two rows for each ResourceType and ResourceGroup, one with the hours for month 1 (Jan) and a second with the hours for month 2 (Feb), instead of a single row:
Region         COUNTRY_ID  Country  Trial  ResourceType    ResourceGroup  1      2      3     4     5
North America  10          USA      3619   Monitor         3              158.5  NULL   NULL  NULL  NULL
North America  10          USA      3619   Monitor         3              NULL   42     NULL  NULL  NULL
North America  10          USA      3619   PM / LTM / RTM  2              20     NULL   NULL  NULL  NULL
North America  10          USA      3619   PM / LTM / RTM  2              NULL   22     NULL  NULL  NULL
North America  10          USA      3619   Supporter       1              18.5   NULL   NULL  NULL  NULL
North America  10          USA      3619   Supporter       1              NULL   15.75  NULL  NULL  NULL
The desired output should look like this:
Region         COUNTRY_ID  Country  Trial  ResourceType    ResourceGroup  1      2      3     4     5
North America  10          USA      3619   Monitor         3              158.5  42     NULL  NULL  NULL
North America  10          USA      3619   PM / LTM / RTM  2              20     22     NULL  NULL  NULL
North America  10          USA      3619   Supporter       1              18.5   15.75  NULL  NULL  NULL
I will appreciate your help!

I see a few things wrong with your current query.
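First, you are using the same CASE expression in both the outer query and the subquery. There is no need to repeat it: the subquery already creates a column called ResourceGroup, so the outer query can select that column directly.
Second, PIVOT groups by every column of the subquery that is not used in the aggregate or in the FOR clause, so you will get multiple rows whenever the subquery contains extra columns with distinct values.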
For example in your subquery you are using:
left(CALMONTH,4) as StartYear,
right(CALMONTH,2) as StartMonth,
You do not have these in the final select list, but they still take part in the pivot grouping. StartMonth is different for January and February, which is what splits each ResourceType into two rows.
Based on your existing query I am guessing you want to use:
SELECT
[/BIC/IORSVPTX] as Region,
COUNTRY_ID,
[/BIC/IOWCNTRY] as Country,
[/BIC/IOC_TRLNO] as Trial,
[/BIC/IOWQUAL] as ResourceType,
ResourceGroup,
[1], [2], [3], [4], [5]
FROM
(
SELECT [/BIC/IORSVPTX],
COUNTRY_ID,
[/BIC/IOWCNTRY],
[/BIC/IOC_TRLNO],
[/BIC/IOWQUAL],
case
when [/BIC/IOWQUAL] like '%Supporter%' then 1
when [/BIC/IOWQUAL] like '%Monitor%' then 3
when [/BIC/IOWQUAL] like '%PM%' then 2
end as ResourceGroup,
((left(CALMONTH,4) - 2013) * 12) + right(CALMONTH,2) AS YearMonth,
QUANTITY as Hours
FROM dbo.Actuals
where [/BIC/IOC_TRLNO]<>'0000'
and left(CALMONTH,4)>2012
and COUNTRY_ID='10'
and ([/BIC/IOWQUAL] like '%PM%'
or [/BIC/IOWQUAL] like'%Monitor%'
or [/BIC/IOWQUAL] like '%Supporter%')
) up
PIVOT
(
sum(Hours)
FOR YearMonth IN ([1],[2],[3],[4],[5])
) AS pvt;
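The grouping effect described above can be reproduced in a small sketch. This is not the original SQL Server query: it uses Python's sqlite3 module, and because SQLite has no PIVOT operator it uses conditional aggregation with an explicit GROUP BY as a stand-in. The table name actuals and its columns are made up for the illustration.

```python
import sqlite3

# Hypothetical, simplified stand-in for dbo.Actuals.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actuals (resource TEXT, start_month TEXT, year_month INTEGER, hours REAL)"
)
conn.executemany(
    "INSERT INTO actuals VALUES (?, ?, ?, ?)",
    [
        ("Monitor", "01", 1, 158.5),
        ("Monitor", "02", 2, 42.0),
        ("Supporter", "01", 1, 18.5),
        ("Supporter", "02", 2, 15.75),
    ],
)

# Conditional aggregation plays the role of PIVOT here.
PIVOT_COLUMNS = """
    SUM(CASE WHEN year_month = 1 THEN hours END) AS m1,
    SUM(CASE WHEN year_month = 2 THEN hours END) AS m2
"""

# Grouping by start_month as well mimics a PIVOT whose subquery
# still exposes StartMonth: one row per resource and month.
split_rows = conn.execute(
    f"SELECT resource, {PIVOT_COLUMNS} FROM actuals "
    "GROUP BY resource, start_month ORDER BY resource, start_month"
).fetchall()

# Leaving start_month out of the grouping gives one row per resource.
merged_rows = conn.execute(
    f"SELECT resource, {PIVOT_COLUMNS} FROM actuals "
    "GROUP BY resource ORDER BY resource"
).fetchall()

print(split_rows)
print(merged_rows)
```

The first result has four rows with NULLs in alternating month columns, the second has two rows with both months filled in.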

Related

How to aggregate using distinct values across two columns?

I have the following data in an orders table:
revenue expenses location_1 location_2
3 6 London New York
6 11 Paris Toronto
1 8 Houston Sydney
1 4 Chicago Los Angeles
2 5 New York London
7 11 New York Boston
4 6 Toronto Paris
5 11 Toronto New York
1 2 Los Angeles London
0 0 Mexico City London
I would like to create a result set that has 3 columns:
a list of the 10 DISTINCT city names
the sum of revenue for each city
the sum of expenses for each city
The desired result is:
location revenue expenses
London 6 13
New York 17 33
Paris 10 17
Toronto 15 28
Houston 1 8
Sydney 1 8
Chicago 1 4
Los Angeles 2 6
Boston 7 11
Mexico City 0 0
Is it possible to aggregate on distinct values across two columns? If yes, how would I do it?
Here is a fiddle:
http://sqlfiddle.com/#!9/0b1105/1
Shorter (and often faster):
SELECT location, sum(revenue) AS rev, sum(expenses) AS exp
FROM (
SELECT location_1 AS location, revenue, expenses FROM orders
UNION ALL
SELECT location_2 , revenue, expenses FROM orders
) sub
GROUP BY 1;
May be faster:
WITH cte AS (
SELECT location_1, location_2, revenue AS rev, expenses AS exp
FROM orders
)
SELECT location, sum(rev) AS rev, sum(exp) AS exp
FROM (
SELECT location_1 AS location, rev, exp FROM cte
UNION ALL
SELECT location_2 , rev, exp FROM cte
) sub
GROUP BY 1;
The (materialized!) CTE adds overhead, which may outweigh the benefit. Depends on many factors like total table size, available indexes, possible bloat, available RAM, storage speed, Postgres version, ...
fiddle
You could UNION ALL two queries and then select from it...
select location, sum(rev) as rev, sum(exp) as exp
from (
select location_1 as location, sum(revenue) as rev, sum(expenses) as exp
from orders
group by location_1
union all
select location_2 as location, sum(revenue) as rev, sum(expenses) as exp
from orders
group by location_2
)z
group by location
order by 1
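As a quick check of the UNION ALL approach, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for the database used in the fiddle; the query itself is the first UNION ALL variant shown above, with the grouping column written out by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (revenue INTEGER, expenses INTEGER, location_1 TEXT, location_2 TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (3, 6, "London", "New York"),
        (6, 11, "Paris", "Toronto"),
        (1, 8, "Houston", "Sydney"),
        (1, 4, "Chicago", "Los Angeles"),
        (2, 5, "New York", "London"),
        (7, 11, "New York", "Boston"),
        (4, 6, "Toronto", "Paris"),
        (5, 11, "Toronto", "New York"),
        (1, 2, "Los Angeles", "London"),
        (0, 0, "Mexico City", "London"),
    ],
)

# Stack both location columns into one, then aggregate per city.
query = """
    SELECT location, SUM(revenue) AS rev, SUM(expenses) AS exp
    FROM (
        SELECT location_1 AS location, revenue, expenses FROM orders
        UNION ALL
        SELECT location_2, revenue, expenses FROM orders
    ) AS sub
    GROUP BY location
"""
totals = {location: (rev, exp) for location, rev, exp in conn.execute(query)}

for location, (rev, exp) in sorted(totals.items()):
    print(location, rev, exp)
```

UNION ALL (rather than UNION) matters here: UNION would drop duplicate rows and could undercount cities that appear with identical revenue and expenses more than once.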

Sum of column depending on values

Can you let me know how to write a query that outputs the sum of amount based on column values (order, Continent and Country)? Also, I want all Continent values to be shown as a single value (North America).
Example table:
ID Code Continent Country amount
----------------------------------------------------
1 1 North America NULL NULL
2 1 America USA 10
3 1 NA USA 10
4 1 Unknown USA 10
5 2 North America NULL NULL
6 2 America Canada 15
7 2 NA Canada 15
8 2 Unknown Canada 15
9 3 North America NULL NULL
10 3 America Mexico 20
11 3 NA Mexico 20
12 3 Unknown Mexico 20
Output
ID Code Continent Country SumAmount
----------------------------------------------
1 1 North America USA 30
2 2 North America Canada 45
3 3 North America Mexico 60
I have tried to approach it like this:
select ID, Code, case when Continent != 'North America' then Continent = 'North America' end as Continent, Country, sum(Amount) as SumAmount
from Table group by ID, Continent, Country
or maybe I need to make a query like this and work with this query below?
select ID, Code, Continent, Country, sum(Amount) as SumAmount
from Table where Continent !='North America'
But it is not working. How should I do this?
I would appreciate any other approaches; they would probably be better than mine.
The awkward design here (relations with no real indication of such other than the shared Code column) is going to lead to suboptimal queries like this:
DECLARE @ContinentToReport varchar(32) = 'North America';
;WITH x AS
(
SELECT Code FROM dbo.TableName
WHERE Continent = @ContinentToReport
AND Country IS NULL
)
SELECT ID = ROW_NUMBER() OVER (ORDER BY x.Code),
x.Code,
Continent = @ContinentToReport,
t.Country,
SumAmount = SUM(t.amount)
FROM dbo.TableName AS t
INNER JOIN x ON t.Code = x.Code
WHERE t.Country IS NOT NULL
GROUP BY x.Code, t.Country
ORDER BY x.Code;
Output (though I made a guess at what ID means and why it's different from the ID in the source, and I find the Continent column kind of redundant since it will always be the same):
ID   Code   Continent       Country   SumAmount
----------------------------------------------
1    1      North America   USA       30
2    2      North America   Canada    45
3    3      North America   Mexico    60
Example db<>fiddle
The simplest query which returns the correct result seems to be something like this:
select row_number() over (order by Code) ID,
Code,
'North America' Continent,
Country,
sum(amount) SumAmount
from dbo.TableName
where Country is not null
group by Code, Country
order by Code;
dbFiddle
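The grouping in the simpler query can be checked with a small sketch. It runs against SQLite through Python's sqlite3 module rather than SQL Server, uses a hypothetical table name table_name, and leaves out ROW_NUMBER because in this sample the generated ID would equal Code anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (ID INTEGER, Code INTEGER, Continent TEXT, Country TEXT, amount INTEGER)"
)
sample = []
row_id = 1
for code, country, amount in [(1, "USA", 10), (2, "Canada", 15), (3, "Mexico", 20)]:
    # One header row per Code with no country or amount,
    # followed by three detail rows with differing Continent labels.
    sample.append((row_id, code, "North America", None, None))
    for offset, continent in enumerate(["America", "NA", "Unknown"], start=1):
        sample.append((row_id + offset, code, continent, country, amount))
    row_id += 4
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?, ?, ?)", sample)

# The Continent column is replaced by a constant, so the differing
# labels no longer split the groups.
summary = conn.execute(
    """
    SELECT Code, 'North America' AS Continent, Country, SUM(amount) AS SumAmount
    FROM table_name
    WHERE Country IS NOT NULL
    GROUP BY Code, Country
    ORDER BY Code
    """
).fetchall()

for row in summary:
    print(row)
```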

Calculate value using previous and current month

I have the three tables below:
Stock Table
ID GlobalStock Date Country
1 10 2017/01/01 India
1 20 2017/01/01 India
2 5 2017/02/01 Africa
3 6 2017/08/01 Japan
4 7 2017/04/01 Japan
5 89 2017/08/01 Japan
2 10 2017/03/01 Japan
5 8 2017/03/01 Japan
1 20 2017/02/01 India
ShipFile
ID GlobalStock Date Country
2 10 2017/03/01 Africa
3 60 2017/08/01 India
11 70 2017/08/01 India
1 8 2017/02/01 India
1 9 2017/02/01 India
2 4 2017/03/01 Japan
2 5 2017/04/01 Japan
5 3 2017/03/01 Japan
3 8 2017/08/01 Japan
SalesFiles
ID GlobalStock Date Country
2 10 2017/03/01 India
2 20 2017/03/01 Africa
3 30 2017/08/01 Japan
7 5 2017/02/01 Japan
8 8 2018/01/01 Japan
1 9 2017/02/01 India
1 70 2017/02/01 Africa
13 10 2017/08/01 Japan
10 60 2017/11/01 Japan
I want to calculate: Stock (month - 1) + ShipFile (month) - SalesFiles (month)
For example:
For ID 1, suppose we take the January data from the Stock table (GlobalStock -> 10 + 20). Then we must take the February values from the other tables, and the country must be the same in all tables.
So the calculation would be:
(10 + 20) + (8 + 9) - (9) = 38
If we take the February rows of the Stock table, then we must use the March data from the other tables, and so on.
I am joining all the tables on ID and Country.
You can query using subqueries or CTEs as below:
;With cte_Stock as (
Select ID, [Date], Country, sum(GlobalStock) Sum_GlobalStock from Stock
group by Id, [Date], Country
), cte_ShipFiles as (
Select ID, [Date], Country, sum(GlobalStock) Sum_GlobalStock from ShipFile
group by Id, [Date], Country
)
, cte_SalesFiles as (
Select ID, [Date], Country, sum(GlobalStock) Sum_GlobalStock from SalesFiles
group by Id, [Date], Country
)
select s.ID, s.[Date], sf.[Date], s.Country,
YourOutput = s.Sum_GlobalStock+sf.Sum_GlobalStock-sales.Sum_GlobalStock
from cte_Stock s
join cte_ShipFiles sf
on s.ID = sf.ID
and s.Country = sf.Country
and s.[Date] = dateadd(mm,-1, sf.[Date])
join cte_SalesFiles sales
on s.ID = sales.ID
and s.Country = sales.Country
and s.[Date] = dateadd(mm,-1, sales.[Date])
Output as below:
+----+------------+------------+---------+------------+
| ID | Date | Date | Country | YourOutput |
+----+------------+------------+---------+------------+
| 1 | 2017-01-01 | 2017-02-01 | India | 38 |
| 2 | 2017-02-01 | 2017-03-01 | Africa | -5 |
+----+------------+------------+---------+------------+
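The same join logic can be tried out in a minimal sketch using SQLite through Python's sqlite3 module. This is an adaptation, not the T-SQL above: dates are stored as ISO text, SQLite's date(x, '+1 month') takes the place of DATEADD, and the table and column names (stock, ship, sales, qty, month_start) are shortened, hypothetical versions of the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
data = {
    "stock": [
        (1, 10, "2017-01-01", "India"), (1, 20, "2017-01-01", "India"),
        (2, 5, "2017-02-01", "Africa"), (3, 6, "2017-08-01", "Japan"),
        (4, 7, "2017-04-01", "Japan"), (5, 89, "2017-08-01", "Japan"),
        (2, 10, "2017-03-01", "Japan"), (5, 8, "2017-03-01", "Japan"),
        (1, 20, "2017-02-01", "India"),
    ],
    "ship": [
        (2, 10, "2017-03-01", "Africa"), (3, 60, "2017-08-01", "India"),
        (11, 70, "2017-08-01", "India"), (1, 8, "2017-02-01", "India"),
        (1, 9, "2017-02-01", "India"), (2, 4, "2017-03-01", "Japan"),
        (2, 5, "2017-04-01", "Japan"), (5, 3, "2017-03-01", "Japan"),
        (3, 8, "2017-08-01", "Japan"),
    ],
    "sales": [
        (2, 10, "2017-03-01", "India"), (2, 20, "2017-03-01", "Africa"),
        (3, 30, "2017-08-01", "Japan"), (7, 5, "2017-02-01", "Japan"),
        (8, 8, "2018-01-01", "Japan"), (1, 9, "2017-02-01", "India"),
        (1, 70, "2017-02-01", "Africa"), (13, 10, "2017-08-01", "Japan"),
        (10, 60, "2017-11-01", "Japan"),
    ],
}
for table, rows in data.items():
    conn.execute(
        f"CREATE TABLE {table} (id INTEGER, qty INTEGER, month_start TEXT, country TEXT)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", rows)

# Aggregate each table per id, month and country first, then join the
# stock month to the following month in the other two tables.
query = """
    WITH stock_m AS (SELECT id, month_start, country, SUM(qty) AS total
                     FROM stock GROUP BY id, month_start, country),
         ship_m  AS (SELECT id, month_start, country, SUM(qty) AS total
                     FROM ship GROUP BY id, month_start, country),
         sales_m AS (SELECT id, month_start, country, SUM(qty) AS total
                     FROM sales GROUP BY id, month_start, country)
    SELECT s.id, s.country, s.total + sh.total - sa.total AS result
    FROM stock_m AS s
    JOIN ship_m AS sh
      ON sh.id = s.id AND sh.country = s.country
     AND sh.month_start = date(s.month_start, '+1 month')
    JOIN sales_m AS sa
      ON sa.id = s.id AND sa.country = s.country
     AND sa.month_start = date(s.month_start, '+1 month')
    ORDER BY s.id
"""
results = conn.execute(query).fetchall()
print(results)
```

Because both joins are inner joins, a stock row with no matching shipment or sale in the following month drops out of the result; switching to LEFT JOIN with a default of 0 would keep those rows.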
Here is an approach with derived tables:
DECLARE @CurrentMonth date = '20180101'
DECLARE @NextMonth date = DATEADD(MONTH,1,@CurrentMonth)
SELECT s.Country, SUM(s.GlobalStock) + ShipSum - SaleSum
FROM stock s
LEFT JOIN (SELECT ISNULL(SUM(GlobalStock),0) ShipSum, Country
FROM ShipFile
WHERE Date >= @NextMonth
AND Date <= EOMONTH(@NextMonth)
GROUP BY Country) sh on s.Country = sh.Country
LEFT JOIN (SELECT ISNULL(SUM(GlobalStock),0) SaleSum, Country
FROM SalesFiles
WHERE Date >= @NextMonth
AND Date <= EOMONTH(@NextMonth)
GROUP BY Country) sa on s.Country = sa.Country
WHERE s.Date >= @CurrentMonth
AND s.Date <= EOMONTH(@CurrentMonth)
GROUP BY s.Country, ShipSum, SaleSum
Notes:
This uses Country for the joins because ID seems to change between tables.
It also uses a date range, assuming that the day portion of your date column is not always the first of the month. If it is always the first, that can be simplified to date = @CurrentMonth or date = @NextMonth.

SQL Server: group by, coalesce and select one of coalesce'd

I have a table called Regions:
city district1 district2 district3 district4
---------------------------------------------------------
Michigan 2 NULL NULL 2
Michigan 2 20 NULL 20
Michigan 2 NULL 3 3
Ontario 3 NULL NULL 3
Quebec 4 1 NULL 1
Quebec 4 NULL NULL 4
Edmonton NULL 7 NULL 7
Edmonton NULL NULL 11 11
district4 is (coalesce(district3, district2, district1))
And I'd like to get distinct rows grouped by city, together with district1:
city district1 district_final
--------------------------------------
Michigan 2 3
Ontario 3 3
Quebec 4 1
Edmonton NULL 11
district_final is not max; it's coalesce of group
select distinct r1.city, r1.district1, coalesce(r3.district3, r2.district2, r1.district1) district_final
from Regions r1
left outer join Regions r2 on r1.city = r2.city and r2.district2 is not null
left outer join Regions r3 on r1.city = r3.city and r3.district3 is not null
The following code should serve the purpose, I guess:
SELECT CITY,dct1 as district1,MAX(DCT) as district_final FROM
(
SELECT CITY, district1 as dct1, district1 AS DCT FROM [TABLE]
UNION
SELECT CITY, district1 as dct1, district2 AS DCT FROM [TABLE]
UNION
SELECT CITY, district1 as dct1, district3 AS DCT FROM [TABLE]
) tempTable
group by CITY,dct1;
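For comparison, here is a sketch of one possible reading of "coalesce of group". It is not taken from either query above: it collapses each district column per city first and then applies COALESCE across the collapsed values. It assumes each city has at most one distinct non-NULL value per district column, which holds for the sample data, so MAX is used only to pick that value. SQLite via Python's sqlite3 module stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Regions (city TEXT, district1 INTEGER, district2 INTEGER, district3 INTEGER)"
)
conn.executemany(
    "INSERT INTO Regions VALUES (?, ?, ?, ?)",
    [
        ("Michigan", 2, None, None),
        ("Michigan", 2, 20, None),
        ("Michigan", 2, None, 3),
        ("Ontario", 3, None, None),
        ("Quebec", 4, 1, None),
        ("Quebec", 4, None, None),
        ("Edmonton", None, 7, None),
        ("Edmonton", None, None, 11),
    ],
)

# MAX ignores NULLs, so each MAX(...) yields the city's single non-NULL
# value for that column (or NULL). COALESCE then prefers district3,
# then district2, then district1, at the group level.
query = """
    SELECT city,
           MAX(district1) AS district1,
           COALESCE(MAX(district3), MAX(district2), MAX(district1)) AS district_final
    FROM Regions
    GROUP BY city
"""
final = {city: (d1, d_final) for city, d1, d_final in conn.execute(query)}

for city, values in final.items():
    print(city, values)
```

On the sample data this gives 3 for Michigan, whereas taking MAX over all three columns combined would give 20.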

How to replace all values in grouped column except first row

I have table like this:
ID Region CreatedDate Value
--------------------------------
1 USA 2016-01-01 5
2 USA 2016-02-02 10
3 Canada 2016-02-02 2
4 USA 2016-02-03 7
5 Canada 2016-03-03 3
6 Canada 2016-03-04 10
7 USA 2016-03-04 1
8 Cuba 2016-01-01 4
I need to sum column Value grouped by Region and CreatedDate by year and month. The result will be
Region Year Month SumOfValue
--------------------------------
USA 2016 1 5
USA 2016 2 17
USA 2016 3 1
Canada 2016 2 2
Canada 2016 3 13
Cuba 2016 1 4
But I want to replace all repeated values in the Region column with an empty string, except in the first row where each value occurs. The final result must be:
Region  Year  Month  SumOfValue
--------------------------------
USA     2016  1      5
        2016  2      17
        2016  3      1
Canada  2016  2      2
        2016  3      13
Cuba    2016  1      4
Thank you for any solution. It would be an advantage if the solution also replaced the repeated values in the Year column.
You need to use SUM and GROUP BY to get the SumOfValue. For the formatting, you can use ROW_NUMBER:
WITH Cte AS(
SELECT
Region,
[Year] = YEAR(CreatedDate),
[Month] = MONTH(CreatedDate),
SumOfValue = SUM(Value),
Rn = ROW_NUMBER() OVER(PARTITION BY Region ORDER BY YEAR(CreatedDate), MONTH(CreatedDate))
FROM #tbl
GROUP BY
Region, YEAR(CreatedDate), MONTH(CreatedDate)
)
SELECT
Region = CASE WHEN Rn = 1 THEN c.Region ELSE '' END,
[Year],
[Month],
SumOfValue
FROM Cte c
ORDER BY
c.Region, Rn
ONLINE DEMO
Although this can be done in TSQL, I suggest you do the formatting on the application side.
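As a rough illustration of the application-side alternative, here is a minimal Python sketch. The function name blank_repeats is hypothetical, and it assumes the rows arrive already aggregated and ordered so that equal regions are adjacent.

```python
def blank_repeats(rows):
    """Return copies of the rows with repeated Region values replaced by ''.

    Assumes rows are (region, year, month, total) tuples, sorted so that
    all rows of one region are next to each other.
    """
    result = []
    previous = None
    for region, year, month, total in rows:
        shown = "" if region == previous else region
        result.append((shown, year, month, total))
        previous = region
    return result


# Aggregated rows as the GROUP BY query would return them.
rows = [
    ("USA", 2016, 1, 5),
    ("USA", 2016, 2, 17),
    ("USA", 2016, 3, 1),
    ("Canada", 2016, 2, 2),
    ("Canada", 2016, 3, 13),
    ("Cuba", 2016, 1, 4),
]
formatted = blank_repeats(rows)

for region, year, month, total in formatted:
    print(f"{region:<8}{year:<6}{month:<7}{total}")
```

Keeping the blanking out of the query leaves the result set usable for further filtering or sorting, since every row still carries its region until the last display step.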
Query that follows the same order as the OP.