Calculating % of total with a Grouped Case statement

I have a CASE statement whose results I group. I am also trying to calculate each group's percentage of the total. When I run the query below, it gives:
Region          Number  Percentage
West Coast      11675   0
Not West Coast  104620  0
I don't understand why 'Percentage' comes out as 0.
Here's the code, with the problem line labeled.
With [Summed Region] AS
(
SELECT
[State Province],
CASE [State Province]
WHEN 'Oregon' THEN 'West Coast'
WHEN 'Washington' THEN 'West Coast'
WHEN 'California' THEN 'West Coast'
ELSE 'Not West Coast'
END AS 'Region'
FROM
[WideWorldImportersDW].[Dimension].[City]
)
SELECT
Region,
count(Region) AS Number,
---THE PROBLEM LINE IS BELOW THIS---
count(region)/(select count(*) FROM [WideWorldImportersDW].[Dimension].[City]) AS Percentage
FROM
[Summed Region]
GROUP BY
Region
What's the problem with that line? If I split out the two pieces, each returns the correct number. But when I divide one by the other I get '0'.
Thanks!

This is integer division: both counts are integers, so the quotient is truncated to 0. Make one operand a decimal, for example by multiplying by 1.0:
SELECT Region, count(Region) AS Number,
count(region) * 1.0 / (select count(*) FROM [WideWorldImportersDW].[Dimension].[City]) AS Percentage
FROM [Summed Region]
GROUP BY Region;
You don't need the subquery either, so use window functions:
SELECT Region, count(*) AS Number,
count(*) * 1.0 / sum(count(*)) over () AS Percentage
FROM [Summed Region]
GROUP BY Region;
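
To see the effect in isolation, here is a minimal sketch that runs the same pattern against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here (it also truncates integer division), and the city table with its four rows is made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for Dimension.City: 1 West Coast row and 3 others.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (state_province TEXT)")
conn.executemany(
    "INSERT INTO city VALUES (?)",
    [("Oregon",), ("Texas",), ("Ohio",), ("Maine",)],
)

query = """
WITH summed_region AS (
    SELECT CASE WHEN state_province IN ('Oregon', 'Washington', 'California')
                THEN 'West Coast' ELSE 'Not West Coast' END AS region
    FROM city
)
SELECT region,
       COUNT(*)                                 AS number,
       COUNT(*) / SUM(COUNT(*)) OVER ()         AS int_ratio,  -- integer / integer
       COUNT(*) * 1.0 / SUM(COUNT(*)) OVER ()   AS fraction,   -- floating point
       COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () AS percentage
FROM summed_region
GROUP BY region
ORDER BY region
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The int_ratio column comes back as 0 for both groups, while fraction and percentage carry the expected values. Multiplying by 100.0 instead of 1.0 gives an actual percentage rather than a fraction.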

Related

Count total users' first order for each region, each day

I have a table called orders.
Link for the table here:
table
I want to get the total users' first order in each region, each day.
First, I tried to get: the first order for each unique user by doing this:
SELECT customer_id,
MIN(order_date) first_buy,
region
FROM orders
GROUP BY 1
ORDER BY 2, 1;
This resulted with:
customer_id, first_buy, region
BD-11500, 2017-01-02, Central
DB-13060, 2017-01-03, West
GW-14605, 2017-01-03, West
HR-14770, 2017-01-03, West
SC-20380, 2017-01-03, West
VF-21715, 2017-01-03, Central
And so on.
You can see there are 4 unique users on 2017-01-03 in West.
I want to get this result:
first_buy, region, count_user
2017-01-02, Central, 1
2017-01-03, West, 4
2017-01-03, Central, 1

I haven't tested this, but I think it will give you what you want to achieve:
SELECT first_buy, region, COUNT(customer_id) AS count_user
FROM (SELECT customer_id, MIN(order_date) first_buy, region
FROM orders
GROUP BY customer_id) AS t
GROUP BY first_buy, region
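
One thing to watch in the subquery is that region is neither grouped nor aggregated, so which region comes back for a customer with orders in several regions is not guaranteed. A possible alternative, sketched below with SQLite through Python's sqlite3 module, is to pick each customer's first order with ROW_NUMBER() and then count. The rows are a small made-up sample (the East order is invented to show that later orders are ignored), and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("BD-11500", "2017-01-02", "Central"),
        ("DB-13060", "2017-01-03", "West"),
        ("GW-14605", "2017-01-03", "West"),
        ("GW-14605", "2017-02-10", "East"),  # hypothetical later order, ignored
        ("VF-21715", "2017-01-03", "Central"),
    ],
)

query = """
SELECT order_date AS first_buy, region, COUNT(*) AS count_user
FROM (
    -- rn = 1 marks the earliest order of each customer
    SELECT customer_id, order_date, region,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS rn
    FROM orders
)
WHERE rn = 1
GROUP BY order_date, region
ORDER BY order_date, region
"""
first_orders = conn.execute(query).fetchall()
for row in first_orders:
    print(row)
```

If a customer has two orders on the same first day in different regions, ROW_NUMBER() picks one of them arbitrarily, so a tie-breaking column in the ORDER BY may be needed.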

Try this:
SELECT
first_buy = (SELECT MIN(order_date) FROM orders WHERE orders.region = ord.region),
ord.region,
count_user = ISNULL((SELECT COUNT(*) FROM orders WHERE orders.region = ord.region GROUP BY orders.customer_id), 0)
FROM orders ord
GROUP BY ord.region

Using SQLite, how can I calculate the maximum year on year growth rate for each year?

I am learning about SQL and I am doing a practice exercise called World Populations SQL Practice on Codecademy. There is one table with three columns: country, population, and year. I am interested in calculating the country with the maximum year-on-year growth rate each year. (This wasn't suggested by Codecademy, I just think it's an interesting idea).
I can calculate all of the year-on-year growth rates with this query:
SELECT country,
100.0 * ((SELECT population FROM population_years AS p2
WHERE p2.year = p1.year + 1
AND p2.country = p1.country)
- population) / population AS year_on_year_growth,
year
FROM population_years AS p1
WHERE year_on_year_growth IS NOT NULL
ORDER BY year_on_year_growth;
and I can calculate the maximum year-on-year growth rate for a particular year, such as 2005, with a query such as this:
SELECT country,
100.0 * ((SELECT population FROM population_years AS p2
WHERE p2.year = p1.year + 1
AND p2.country = p1.country)
- population) / population AS year_on_year_growth,
year
FROM population_years AS p1
WHERE year = 2005
AND year_on_year_growth IS NOT NULL
ORDER BY year_on_year_growth DESC
LIMIT 1;
Using Python, I can solve the problem by saving the first query as yoy_query and doing this:
yoy_result = c.execute(yoy_query).fetchall()
sorted([record for record in yoy_result if record[1] == max([row[1] for row in yoy_result if row[2] == record[2]])],key=lambda x:x[2])
and I get the desired result:
[('Montserrat', 7.34177215189872, 2000), ('Montserrat', 13.4433962264151, 2001), ('Afghanistan', 5.803891762260126, 2002), ('Montserrat', 10.467706013363028, 2003), ('Liberia', 4.7976709085316545, 2004), ('Jordan', 7.088496587486171, 2005), ('Jordan', 6.764378108744186, 2006), ('Montserrat', 12.638580931263864, 2007), ('Liberia', 4.157111008408977, 2008), ('Niger', 3.737166190281749, 2009)]
But I can't think of a way to do this using SQL. Any ideas? I think the reason it seems much easier in python is because I'm able to save the intermediate result, then run a second calculation on that.

You can do it with window functions LAG() and RANK():
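select country, year_on_year_growth, year
from (
select *, rank() over (partition by year order by year_on_year_growth desc) as rnk
from (
select *,
-- 1.0 * forces floating-point division if population is stored as an integer
100.0 * (1.0 * population / lag(population) over (partition by country order by year) - 1) as year_on_year_growth
from population_years
)
-- skip each country's first year, where lag() has no previous row
where year_on_year_growth is not null
)
-- keep only the top-ranked country in each year
where rnk = 1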
The expression:
lag(population) over (partition by country order by year)
returns the country's population the previous year (assuming that there are no gaps between the years).
So I calculated the growth rate as:
((current year's population) / (previous year's population)) - 1
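
As a rough check of the LAG() plus RANK() idea, here is a small sketch using Python's sqlite3 module with a made-up population_years table (countries A and B and their numbers are invented). It keeps only the top-ranked row per year and needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE population_years (country TEXT, population REAL, year INTEGER)"
)
conn.executemany(
    "INSERT INTO population_years VALUES (?, ?, ?)",
    [
        ("A", 100.0, 2000), ("A", 110.0, 2001), ("A", 121.0, 2002),
        ("B", 200.0, 2000), ("B", 210.0, 2001), ("B", 252.0, 2002),
    ],
)

query = """
SELECT country, year, year_on_year_growth
FROM (
    SELECT *,
           RANK() OVER (PARTITION BY year
                        ORDER BY year_on_year_growth DESC) AS rnk
    FROM (
        SELECT country, year,
               100.0 * (population - LAG(population) OVER w)
                     / LAG(population) OVER w AS year_on_year_growth
        FROM population_years
        WINDOW w AS (PARTITION BY country ORDER BY year)
    )
    WHERE year_on_year_growth IS NOT NULL  -- first year has no previous row
)
WHERE rnk = 1                              -- best country per year
ORDER BY year
"""
top_growth = conn.execute(query).fetchall()
for row in top_growth:
    print(row)
```

Note that with LAG() the growth is attached to the later year of each pair, whereas the correlated subquery in the question attaches it to the earlier year, so the year labels differ by one between the two approaches.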

I guess the simplest thing to do would actually be to just use a view as follows:
CREATE VIEW yoy_growth
AS
SELECT country,
100.0 * ((SELECT population FROM population_years AS p2
WHERE p2.year = p1.year + 1
AND p2.country = p1.country)
- population) / population AS year_on_year_growth,
year
FROM population_years AS p1
WHERE year_on_year_growth IS NOT NULL
ORDER BY year_on_year_growth;
SELECT * FROM yoy_growth AS y1
WHERE year_on_year_growth = (
SELECT MAX(year_on_year_growth)
FROM yoy_growth AS y2
WHERE y1.year = y2.year
)
ORDER BY year;
That way I get the result I want, although the query does seem a little slow.

Combine and fuse the results of two SQL queries with UNION

I am writing two separate SQL queries to get data for two different dates like so:
SELECT number, sum(sales) as sales, sum(discount) as discount, sum(margin) as margin
FROM table_a
WHERE day = '2019-08-09'
GROUP BY number
SELECT number, sum(sales) as sales, sum(discount) as discount, sum(margin) as margin
FROM table_a
WHERE day = '2018-08-10'
GROUP BY number
I tried fusing them like so to get the results for the same number in one row from two different dates:
SELECT number, sum(sales) as sales, sum(discount) as discount, sum(margin) as margin, 0 as sales_n1, 0 as discount_n1, 0 as margin_n1
FROM table_a
WHERE day = '2019-08-09'
GROUP BY number
UNION
SELECT number, 0 as sales, 0 as discount, 0 as margin, sum(sales_n1) as sales_n1, sum(discount_n1) as discount_n1, sum(margin_n1) as margin_n1
FROM table_a
WHERE day = '2018-08-10'
GROUP BY number
But it didn't work: I get the rows from the first query, with zeroes in the columns defined as zero, followed by the rows from the second query in the same fashion.
How can I correct this to get the desired output?

Use conditional aggregation:
SELECT number,
sum(case when day = '2019-08-09' then sales end) as sales_20190809,
sum(case when day = '2019-08-09' then discount end) as discount_20190809,
sum(case when day = '2019-08-09' then margin end) as margin_20190809,
sum(case when day = '2019-08-10' then sales end) as sales_20190810,
sum(case when day = '2019-08-10' then discount end) as discount_20190810,
sum(case when day = '2019-08-10' then margin end) as margin_20190810
FROM table_a
WHERE day IN ('2019-08-09', '2019-08-10')
GROUP BY number;
If you want the numbers in different rows (which you don't seem to), then use aggregation:
SELECT day, number, sum(sales) as sales, sum(discount) as discount, sum(margin) as margin
FROM table_a
WHERE day IN ('2019-08-09', '2019-08-10')
GROUP BY day, number
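
Here is a small runnable sketch of the conditional aggregation approach, using SQLite through Python's sqlite3 module. The table_a rows are invented, and only sales and margin are pivoted to keep it short; the _d1 and _d2 column suffixes are placeholder names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a "
    "(number INTEGER, day TEXT, sales REAL, discount REAL, margin REAL)"
)
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2019-08-09", 10, 1, 3),
        (1, "2019-08-09", 5, 0, 2),
        (1, "2019-08-10", 7, 2, 1),
        (2, "2019-08-10", 4, 1, 1),  # number 2 has no rows on the first day
    ],
)

query = """
SELECT number,
       SUM(CASE WHEN day = '2019-08-09' THEN sales  END) AS sales_d1,
       SUM(CASE WHEN day = '2019-08-09' THEN margin END) AS margin_d1,
       SUM(CASE WHEN day = '2019-08-10' THEN sales  END) AS sales_d2,
       SUM(CASE WHEN day = '2019-08-10' THEN margin END) AS margin_d2
FROM table_a
WHERE day IN ('2019-08-09', '2019-08-10')
GROUP BY number
ORDER BY number
"""
per_number = conn.execute(query).fetchall()
for row in per_number:
    print(row)
```

A number with no rows on one of the days gets NULL (None in Python) in that day's columns, because the CASE has no ELSE branch. Adding ELSE 0 would return 0 there instead.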

SQL Columns to Rows- for a View

I have a view which has
ID    INQCLASS  INQDETAIL  Period     BAL
1233  GROSS     water      12-3-2017  233.32
1233  GROSS     ENergy     12-3-2017  122.00
ID, INQCLASS, and Period are the same in both rows; only INQDETAIL and BAL differ.
I want to combine these into a single row that displays both the water and the energy BAL.
Any suggestions would be helpful. Thank you.

SELECT ID,
INQCLASS,
Period,
MAX(CASE WHEN INQDETAIL = 'water' then BAL else 0 end) as WaterBal,
MAX(CASE WHEN INQDETAIL = 'ENergy' then BAL else 0 end) as ENergyBal
FROM View_Name
GROUP BY ID, INQCLASS, Period
The CASE expression returns BAL only when the condition is met. With CASE alone, this would still return two rows for each item: one with a WaterBal value and 0 for ENergyBal, and the other the reverse.
When you use GROUP BY, every selected field has to either be in the GROUP BY list (in this case ID, INQCLASS, Period) or be wrapped in an aggregate function like SUM, MAX, or COUNT (in this case WaterBal and ENergyBal).
The GROUP BY collapses the rows that share the same ID, INQCLASS, Period into a single row, and MAX then takes the largest value for WaterBal and ENergyBal. Since one of the two values is always 0, MAX simply supplies the other one (as long as the balances are not negative).
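
A quick sketch of this pattern, run in SQLite through Python's sqlite3 module, is below. The util_view table and the second ID (1234, with a negative water balance) are made up for illustration. It also shows a variant without ELSE 0, which may be safer when balances can be negative, because MAX ignores NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE util_view "
    "(ID INTEGER, INQCLASS TEXT, INQDETAIL TEXT, Period TEXT, BAL REAL)"
)
conn.executemany(
    "INSERT INTO util_view VALUES (?, ?, ?, ?, ?)",
    [
        (1233, "GROSS", "water", "12-3-2017", 233.32),
        (1233, "GROSS", "ENergy", "12-3-2017", 122.00),
        (1234, "GROSS", "water", "12-3-2017", -50.0),  # hypothetical credit
        (1234, "GROSS", "ENergy", "12-3-2017", 10.0),
    ],
)

query = """
SELECT ID,
       MAX(CASE WHEN INQDETAIL = 'water'  THEN BAL ELSE 0 END) AS water_else0,
       MAX(CASE WHEN INQDETAIL = 'water'  THEN BAL END)        AS water_null,
       MAX(CASE WHEN INQDETAIL = 'ENergy' THEN BAL END)        AS energy_null
FROM util_view
GROUP BY ID, INQCLASS, Period
ORDER BY ID
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

For ID 1234 the ELSE 0 column reports 0 because 0 is larger than -50, while the variant without ELSE returns the actual balance.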

A simple pivot table ought to do it, as long as you know the INQDETAIL values ahead of time:
select ID,
INQCLASS,
[Period],
[Water] AS [Water Bal],
[Energy] as [Energy Bal]
from
(
select [ID],
[INQCLASS],
[INQDETAIL],
[Period],
[BAL]
from #util
) As Utilities
PIVOT
(
SUM([BAL])
FOR [inqdetail] IN ([Water],[Energy])
) AS Pvttbl

Try something like this:
SELECT INQDETAIL
, PERIOD
, SUM(BAL) AS water_Bal
FROM your_view
WHERE INQDETAIL LIKE 'water'
GROUP BY INQDETAIL
, PERIOD;

Try this:
SELECT *
FROM
(SELECT * FROM #temp) AS P
PIVOT
(
max(bal) FOR INQDETAIL IN ([water], [ENergy])
) AS pv1

SQL Summary of revenues by region (in ranked order from highest to lowest, calculate % of total for each region)

Summary of revenues by region (in ranked order from highest to lowest, calculate % of total for each region). Basically, I am trying to write a query that will show the revenues of each region relative to the total revenue.
I am using SQL in Microsoft Access.
My table has the following columns: ID, Region, Revenue
There are 3 regions: West, Central, East
Here's what I have so far:
SELECT Region, Sum(Revenue) AS TotalRevenue
FROM Sales
GROUP BY Region
ORDER BY Sum(Revenue) DESC
Any help would be greatly appreciated

Try this:
SELECT Region, SUM(Revenue) AS TotalRevenue,
(SUM(Revenue)/(SELECT Sum(Revenue) FROM Sales)) AS percentage
FROM Sales
GROUP BY Region
ORDER BY Sum(Revenue) DESC
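
For a quick sanity check of the scalar subquery approach, here is a sketch that runs it in SQLite through Python's sqlite3 module. SQLite is only a stand-in for Access, the Sales rows are invented, and the 100.0 factor is my addition to turn the ratio into a percentage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (ID INTEGER, Region TEXT, Revenue REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [(1, "West", 50), (2, "West", 25), (3, "Central", 15), (4, "East", 10)],
)

query = """
SELECT Region,
       SUM(Revenue) AS TotalRevenue,
       -- the scalar subquery computes the grand total once for all regions
       100.0 * SUM(Revenue) / (SELECT SUM(Revenue) FROM Sales) AS Percentage
FROM Sales
GROUP BY Region
ORDER BY SUM(Revenue) DESC
"""
by_region = conn.execute(query).fetchall()
for row in by_region:
    print(row)
```

The rows come back ranked from highest to lowest revenue, and the percentages add up to 100.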

This is one way that I'd offer. For only three regions it's really overkill, but it should work.
SELECT s.Region, Sum(s.Revenue) AS TotalRevenue, Sum(s.Revenue)/x.allRegionRevenue AS Percentage
FROM Sales AS s
INNER JOIN (SELECT Sum(Revenue) AS allRegionRevenue FROM Sales) AS x
ON s.Revenue*0 = x.allRegionRevenue*0
GROUP BY s.Region, x.allRegionRevenue
ORDER BY Sum(s.Revenue) DESC
Edit: Modified this a bit, as Access doesn't support actual "cross join" syntax, but I think we can fake it with an inner join on a condition that's always true. The kludgy trick here is to multiply the references on each side by zero, forcing all records to match. Hope this helps.
