Pivot data by category with aggregation - sql

I have the following structure:
Date, Type, Country, Sales, NumOfItems
2020-12-24, Basic, USA, 700, 5
2020-12-24, Standard, USA, 300, 3
2020-12-24, Basic, USA, 100, 1
2020-12-24, Standard, USA, 200, 6
How can I pivot this data to get the structure below, grouped by Date and Country?
Date, Country, Sales, NumOfItems, SalesBasic, NumOfItemsBasic, SalesStandard, NumOfItemsStandard
2020-12-24, USA, 1300, 15, 800, 6, 500, 9
Thanks in advance for any advice, and best regards!

You can aggregate over CASE expressions, for example:
SELECT Date, Country, SUM(Sales) AS Sales, SUM(NumOfItems) AS NumOfItems,
SUM(CASE WHEN Type = 'Basic' THEN Sales ELSE NULL END) AS SalesBasic,
...
GROUP BY
Date, Country
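To make the conditional-aggregation idea concrete, here is a minimal runnable sketch using the sample rows from the question. It runs on SQLite through Python's sqlite3 module purely so it can be executed as-is; the table name sales_data is hypothetical (the question never names the table), and the three extra CASE columns simply follow the same pattern as SalesBasic.

```python
import sqlite3

# Assumption: "sales_data" is a made-up table name; SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data "
    "(Date TEXT, Type TEXT, Country TEXT, Sales INTEGER, NumOfItems INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?, ?)",
    [
        ("2020-12-24", "Basic", "USA", 700, 5),
        ("2020-12-24", "Standard", "USA", 300, 3),
        ("2020-12-24", "Basic", "USA", 100, 1),
        ("2020-12-24", "Standard", "USA", 200, 6),
    ],
)

# One SUM(CASE ...) per output column; rows of the other Type contribute NULL and are ignored.
pivot_sql = """
SELECT Date, Country,
       SUM(Sales) AS Sales,
       SUM(NumOfItems) AS NumOfItems,
       SUM(CASE WHEN Type = 'Basic' THEN Sales END) AS SalesBasic,
       SUM(CASE WHEN Type = 'Basic' THEN NumOfItems END) AS NumOfItemsBasic,
       SUM(CASE WHEN Type = 'Standard' THEN Sales END) AS SalesStandard,
       SUM(CASE WHEN Type = 'Standard' THEN NumOfItems END) AS NumOfItemsStandard
FROM sales_data
GROUP BY Date, Country
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
print(pivot_rows)  # [('2020-12-24', 'USA', 1300, 15, 800, 6, 500, 9)]
```

If a Type can be missing entirely for some Date and Country, the corresponding column comes back as NULL; using ELSE 0 in the CASE would return 0 instead.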

Related

First region by Earliest Date per Customer (Snowflake)

I have this sample dataset and would like to display customer_name, region, and order date.
The issue with my dataset is that a customer can have multiple regions and multiple order dates. I would like to see only the customer name, the region of the customer's first order (if the first order was in US East, then US East on every row), and all order dates.
Customer 1, US East, 2021-12-10
Customer 1, US West, 2022-07-26
Expected result:
Customer 1, US East, 2021-12-10
Customer 1, US East, 2022-07-26
select
'Customer 1','US East', '2021-12-10'
union
select
'Customer 1','US West', '2022-07-26'
union
select
'Customer 2','Europe West', '2021-01-26'
union
select
'Customer 2','Europe', '2020-01-26'
Using FIRST_VALUE:
SELECT Customer, FIRST_VALUE(Region) OVER(PARTITION BY Customer
ORDER BY date) AS region
,Date
FROM tab
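A quick way to check the FIRST_VALUE approach is to run it against the four sample rows from the question. The sketch below uses SQLite (3.25 or later, for window functions) via Python's sqlite3 rather than Snowflake, and the column names customer, region and order_date are assumptions, since the question's sample SELECTs leave the columns unnamed.

```python
import sqlite3

# Assumption: column names are invented for illustration; SQLite stands in for Snowflake.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (customer TEXT, region TEXT, order_date TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        ("Customer 1", "US East", "2021-12-10"),
        ("Customer 1", "US West", "2022-07-26"),
        ("Customer 2", "Europe West", "2021-01-26"),
        ("Customer 2", "Europe", "2020-01-26"),
    ],
)

# FIRST_VALUE picks the region of the earliest order within each customer's partition
# and repeats it on every row of that customer.
first_region_sql = """
SELECT customer,
       FIRST_VALUE(region) OVER (PARTITION BY customer ORDER BY order_date) AS region,
       order_date
FROM tab
ORDER BY customer, order_date
"""
first_region_rows = conn.execute(first_region_sql).fetchall()
for row in first_region_rows:
    print(row)
```

Customer 2's earliest order (2020-01-26) is in Europe, so both of that customer's rows report Europe.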

Count total users' first order for each region, each day

I have a table called orders.
Link for the table here:
table
I want to get the total users' first order in each region, each day.
First, I tried to get the first order for each unique user by doing this:
SELECT customer_id,
MIN(order_date) first_buy,
region
FROM orders
GROUP BY 1
ORDER BY 2, 1;
This resulted in:
customer_id, first_buy, region
BD-11500, 2017-01-02, Central
DB-13060, 2017-01-03, West
GW-14605, 2017-01-03, West
HR-14770, 2017-01-03, West
SC-20380, 2017-01-03, West
VF-21715, 2017-01-03, Central
And so on.
You can see there are 4 unique users on 2017-01-03 in West.
I want to get this result:
first_buy, region, count_user
2017-01-02, Central, 1
2017-01-03, West, 4
2017-01-03, Central, 1
I haven't tested this, but I think it will give you what you want to achieve:
SELECT first_buy, region, COUNT(customer_id) AS count_user
FROM (SELECT customer_id, MIN(order_date) first_buy, region
FROM orders
GROUP BY customer_id) AS t
GROUP BY first_buy, region
Try this:
SELECT
first_buy = (SELECT MIN(order_date) FROM orders WHERE orders.region = ord.region),
ord.region,
count_user = ISNULL((SELECT COUNT(*) FROM orders WHERE orders.region = ord.region GROUP BY orders.customer_id), 0)
FROM orders ord
GROUP BY ord.region
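One detail worth pinning down is which region gets attached to a customer's first order, because selecting region next to MIN(order_date) while grouping only by customer_id leaves that choice to the engine. A possible alternative, sketched here on SQLite (3.25+) via Python's sqlite3, is to rank each customer's orders with ROW_NUMBER and keep only the first one before counting. The first six rows come from the question's output; the last two later orders are invented to show that they are not counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("BD-11500", "2017-01-02", "Central"),
        ("DB-13060", "2017-01-03", "West"),
        ("GW-14605", "2017-01-03", "West"),
        ("HR-14770", "2017-01-03", "West"),
        ("SC-20380", "2017-01-03", "West"),
        ("VF-21715", "2017-01-03", "Central"),
        ("BD-11500", "2017-02-10", "West"),  # hypothetical repeat order, must be ignored
        ("DB-13060", "2017-03-01", "East"),  # hypothetical repeat order, must be ignored
    ],
)

# rn = 1 marks each customer's earliest order, so date and region come from the same row.
first_order_sql = """
WITH ranked AS (
    SELECT customer_id, order_date, region,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS rn
    FROM orders
)
SELECT order_date AS first_buy, region, COUNT(*) AS count_user
FROM ranked
WHERE rn = 1
GROUP BY order_date, region
ORDER BY first_buy, region
"""
first_order_counts = conn.execute(first_order_sql).fetchall()
print(first_order_counts)
```

If a customer places orders in two regions on the same first day, ROW_NUMBER picks one of them arbitrarily; adding a tie-breaker column to the ORDER BY would make that deterministic.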

How to get a difference between two query result sets

I have two queries. The first returns the total sales quantity for all brands, and the second returns the total sales quantity for 'New' brands only.
I need a single query (merging Query1 and Query2 below) that shows, side by side per region, the total sales of 'New' brands, the total sales of all brands, and a new column named Difference (total sales quantity of all brands minus total sales quantity of 'New' brands).
Expected Output:
InvoiceDate, Brand, Region, Quantity for 'New' Brand, Quantity for All Brand, Difference
2021/10/01, New, A, 40, 100, 60
2021/10/01, New, B, 10, 90, 80
2021/10/01, New, C, 50, 150, 100
2021/10/01, New, D, 30, 200, 170
These are my queries:
Query1:
SELECT InvoiceDate, Brand, Region, Quantity From TotalSales -- For All Brands
Query2:
SELECT InvoiceDate, Brand, Region, Quantity From TotalSales where Brand='New' -- For New Brands
There are a couple of ways of doing this...
First - I don't think you want the "Brand" column in your result. That doesn't make much sense. Also, I think you are going to want a summation for the AllBrands total...
1. Use subqueries
select allBrands.InvoiceDate, allBrands.Region, newBrands.Quantity as NewQuantity, allBrands.Quantity as allQuantity, allBrands.Quantity-newBrands.Quantity as Difference
FROM
(SELECT InvoiceDate, Region, SUM(Quantity) as Quantity From TotalSales GROUP BY InvoiceDate, Region) as allBrands
LEFT OUTER JOIN (SELECT InvoiceDate, Region, SUM(Quantity) as Quantity From TotalSales where Brand='New' GROUP BY InvoiceDate, Region) as NewBrands ON NewBrands.InvoiceDate = allBrands.InvoiceDate AND NewBrands.Region = AllBrands.Region
or 2. use temp tables
SELECT InvoiceDate, Region, SUM(Quantity) as Quantity INTO #allBrands From TotalSales GROUP BY InvoiceDate, Region;
SELECT InvoiceDate, Region, SUM(Quantity) as Quantity INTO #newBrands From TotalSales where Brand='New' GROUP BY InvoiceDate, Region;
select allBrands.InvoiceDate, allBrands.Region, newBrands.Quantity as NewQuantity, allBrands.Quantity as allQuantity, allBrands.Quantity-newBrands.Quantity as Difference
FROM #allBrands allBrands
LEFT OUTER JOIN #newBrands newBrands ON NewBrands.InvoiceDate = allBrands.InvoiceDate AND NewBrands.Region = AllBrands.Region;
You want to get the quantity for brand = 'new' and the total quantity for all brands and compare the two.
One way to achieve this is conditional aggregation:
select
invoicedate,
'New' as brand,
region,
sum(case when brand = 'New' then quantity else 0 end) as qty_new,
sum(quantity) as qty_all,
sum(quantity) - sum(case when brand = 'New' then quantity else 0 end) as diff
from totalsales
group by invoicedate, region
having sum(case when brand = 'New' then quantity else 0 end) > 0
order by invoicedate, region;
Another is a join:
with qnew as
(
select invoicedate, brand, region, quantity
from totalsales
where brand = 'New'
)
, qall as
(
select invoicedate, region, sum(quantity) as total
from totalsales
group by invoicedate, region
)
select
qnew.*, qall.total, qall.total- qnew.quantity as diff
from qnew
join qall on qall.invoicedate = qnew.invoicedate
and qall.region = qnew.region
order by qnew.invoicedate, qnew.brand, qnew.region;
You can use simple conditional aggregation (SUM) on the data such as this:
DECLARE @TotalSales TABLE (InvoiceDate DATE, Brand NVARCHAR(16), Region NCHAR(1), Quantity INT)
INSERT INTO
@TotalSales(
InvoiceDate,
Brand,
Region,
Quantity
)
VALUES ('10/1/2021', 'New', 'A', 20),
('10/1/2021', 'New', 'A', 20),
('10/1/2021', 'Old', 'A', 30),
('10/1/2021', 'Old', 'A', 30),
('10/1/2021', 'New', 'B', 10),
('10/1/2021', 'Old', 'B', 30),
('10/1/2021', 'Old', 'B', 50),
('10/1/2021', 'New', 'C', 50),
('10/1/2021', 'Old', 'C', 100),
('10/1/2021', 'New', 'D', 10),
('10/1/2021', 'New', 'D', 10),
('10/1/2021', 'New', 'D', 10),
('10/1/2021', 'Old', 'D', 100),
('10/1/2021', 'Old', 'D', 70),
('11/1/2021', 'Old', 'A', 50)
;WITH Data AS (
SELECT
ts.InvoiceDate,
ts.Region,
SUM(ts.Quantity) AS QuantityAll,
SUM(CASE WHEN ts.Brand = 'New' THEN ts.Quantity ELSE 0 END) AS QuantityNew
FROM
@TotalSales ts
GROUP BY
ts.InvoiceDate,
ts.Region
)
SELECT
d.InvoiceDate,
d.Region,
d.QuantityAll,
d.QuantityNew,
d.QuantityAll - d.QuantityNew AS TheDifference
FROM
Data d
ORDER BY
d.InvoiceDate,
d.Region
I used a CTE so that we don't have to repeat the conditional SUM(CASE WHEN... for subtracting between QuantityNew and QuantityAll.
Output is:
InvoiceDate Region QuantityAll QuantityNew TheDifference
2021-10-01 A 100 40 60
2021-10-01 B 90 10 80
2021-10-01 C 150 50 100
2021-10-01 D 200 30 170
2021-11-01 A 50 0 50

Calculating % of total with a Grouped Case statement

So I have a CASE statement that I've Grouped. But, I was also trying to calculate the percentage of the total for each Grouped CASE result. When I run the commands I made below, it gives
Region Number Percentage
West Coast 11675 0
Not West Coast 104620 0
I don't understand why 'Percentage' comes as '0'.
Here's the code, with the 'problem line' labeled.
With [Summed Region] AS
(
SELECT
[State Province],
CASE [State Province]
WHEN 'Oregon' THEN 'West Coast'
WHEN 'Washington' THEN 'West Coast'
WHEN 'California' THEN 'West Coast'
ELSE 'Not West Coast'
END AS 'Region'
FROM
[WideWorldImportersDW].[Dimension].[City]
)
SELECT
Region,
count(Region) AS Number,
---THE PROBLEM LINE IS BELOW THIS---
count(region)/(select count(*) FROM [WideWorldImportersDW].[Dimension].
[City]) AS Percentage
FROM
[Summed Region]
GROUP BY
Region
What's the problem with that line? If I split out the two pieces, each returns the correct number. But when I divide one by the other I get '0'.
Thanks!
This is integer division: both counts are integers, so the quotient is truncated to 0. Multiply by a decimal to force decimal arithmetic:
SELECT Region, count(Region) AS Number,
count(region) * 1.0 / (select count(*) FROM [WideWorldImportersDW].[Dimension].[City]) AS Percentage
FROM [Summed Region]
GROUP BY Region;
You don't need the subquery either; a window function can compute the total:
SELECT Region, count(*) AS Number,
count(*) * 1.0 / sum(count(*)) over () AS Percentage
FROM [Summed Region]
GROUP BY Region;
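The truncation is easy to reproduce on a tiny table. The sketch below uses SQLite via Python's sqlite3, which also divides integers by truncation, with a made-up four-row city table standing in for [WideWorldImportersDW].[Dimension].[City]; the table and column names are assumptions. It puts the integer and the decimal versions side by side, using the scalar-subquery form of the total.

```python
import sqlite3

# Assumption: a hypothetical "city" table with four rows replaces the real dimension table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (state_province TEXT)")
conn.executemany(
    "INSERT INTO city VALUES (?)",
    [("Oregon",), ("Texas",), ("Ohio",), ("Nevada",)],
)

share_sql = """
WITH summed_region AS (
    SELECT CASE WHEN state_province IN ('Oregon', 'Washington', 'California')
                THEN 'West Coast' ELSE 'Not West Coast' END AS region
    FROM city
)
SELECT region,
       COUNT(*) AS row_count,
       COUNT(*) / (SELECT COUNT(*) FROM city) AS int_share,           -- integer / integer
       COUNT(*) * 1.0 / (SELECT COUNT(*) FROM city) AS decimal_share  -- promoted by * 1.0
FROM summed_region
GROUP BY region
ORDER BY region
"""
share_rows = conn.execute(share_sql).fetchall()
for row in share_rows:
    print(row)
```

With one West Coast row out of four, int_share is 0 for both groups while decimal_share is 0.25 and 0.75; multiplying by 100.0 instead of 1.0 would give a percentage.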

Join two tables

Hi, I have two SQL queries.
Query 1:
Select Year, country, state, city, sales from db.sales1 -- for the current year
Query 2:
Select Year, country, state, city, sales from db.sales2 -- for the last 4 years
Requirement:
Select Current Yr, Country, City, Sales_current, Sales_yr2015, Sales_yr2016 from the above 2 queries.
How can I do that?
Thanks
You are looking for UNION ALL:
Select Year, country, state, city, sales from db.sales1
UNION ALL
Select Year, country, state, city, sales from db.sales2
Or, if you want a more condensed report:
SELECT country, state, city,
SUM( CASE WHEN Year = 2015 THEN Sales ELSE 0 END) as Sales_yr2015,
SUM( CASE WHEN Year = 2016 THEN Sales ELSE 0 END) as Sales_yr2016
FROM (
Select Year, country, state, city, sales from db.sales1
UNION ALL
Select Year, country, state, city, sales from db.sales2
) T
GROUP BY country, state, city
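The same pattern can also carry the Sales_current column the question asks for. This is only a sketch: it runs on SQLite via Python's sqlite3, drops the db. schema prefix, uses invented rows, and assumes the current year is 2017, which the question does not state.

```python
import sqlite3

# Assumptions: invented sample rows, no "db." schema, and 2017 as the current year.
conn = sqlite3.connect(":memory:")
for table in ("sales1", "sales2"):
    conn.execute(
        f"CREATE TABLE {table} "
        "(Year INTEGER, country TEXT, state TEXT, city TEXT, sales INTEGER)"
    )
conn.execute("INSERT INTO sales1 VALUES (2017, 'US', 'NY', 'NYC', 100)")
conn.executemany(
    "INSERT INTO sales2 VALUES (?, ?, ?, ?, ?)",
    [
        (2015, "US", "NY", "NYC", 50),
        (2016, "US", "NY", "NYC", 70),
        (2016, "US", "NY", "NYC", 5),
    ],
)

# UNION ALL stacks both tables; each SUM(CASE ...) then keeps only one year's sales.
report_sql = """
SELECT country, state, city,
       SUM(CASE WHEN Year = 2017 THEN sales ELSE 0 END) AS Sales_current,
       SUM(CASE WHEN Year = 2015 THEN sales ELSE 0 END) AS Sales_yr2015,
       SUM(CASE WHEN Year = 2016 THEN sales ELSE 0 END) AS Sales_yr2016
FROM (
    SELECT Year, country, state, city, sales FROM sales1
    UNION ALL
    SELECT Year, country, state, city, sales FROM sales2
) AS t
GROUP BY country, state, city
"""
report_rows = conn.execute(report_sql).fetchall()
print(report_rows)  # [('US', 'NY', 'NYC', 100, 50, 75)]
```

UNION ALL is used rather than UNION so that identical rows in the two tables are kept and summed instead of being collapsed into one.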