Splitting column values into groups and summing up values from a linked table - sql

I have two SQL tables that are linked by customer_id.
customers
customer_id  account_created  company_name  city
1            11/10/2011       abc           new york
2            1/1/2018         xyz           los angeles
3            11/10/2012       finance       new jersey
4            21/04/2013       juices        san francisco
orders
order_id  customer_id  order_date  shipping date  order_value  currency
100       1            19/10/2019  20/10/2019     4000         USD
101       3            1/10/2019   2/10/2019      300          USD
102       2            13/11/2019  15/11/2019     7000         USD
103       4            12/9/2019   20/9/2019      100          USD
104       1            10/11/2019  12/11/2019     3000         USD
I would like to divide orders into two regions, East (New York, Boston and New Jersey) and West (Los Angeles, San Francisco), and then show the sum of order_value for each region, like this:
Region  sum of order_value
East    10000
West    20000
Here are the tables; sorry they are in an image, I can't format them yet (will learn asap!)

It seems really weird to call "New Jersey" a city. In any case, you want a case expression of some sort to assign the region, and then aggregation:
select (case when city in ('New York', 'Boston', 'New Jersey') then 'East'
             when city in ('Los Angeles', 'San Francisco') then 'West'
             else '???'
        end) as region,
       sum(order_value)
from customers c join
     orders o
     on o.customer_id = c.customer_id
group by (case when city in ('New York', 'Boston', 'New Jersey') then 'East'
               when city in ('Los Angeles', 'San Francisco') then 'West'
               else '???'
          end)
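To try the CASE approach against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite is an assumption (the question does not name a database), and lower(city) is added only because the sample data stores city names in lowercase while string comparison in SQLite is case-sensitive by default.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
# Only the columns needed for the aggregation are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, company_name TEXT, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_value INTEGER);
    INSERT INTO customers VALUES
        (1, 'abc', 'new york'), (2, 'xyz', 'los angeles'),
        (3, 'finance', 'new jersey'), (4, 'juices', 'san francisco');
    INSERT INTO orders VALUES
        (100, 1, 4000), (101, 3, 300), (102, 2, 7000), (103, 4, 100), (104, 1, 3000);
""")

# lower(city) is an assumption to cope with the lowercase sample data.
REGION_QUERY = """
    SELECT CASE
               WHEN lower(c.city) IN ('new york', 'boston', 'new jersey') THEN 'East'
               WHEN lower(c.city) IN ('los angeles', 'san francisco') THEN 'West'
               ELSE '???'
           END AS region,
           SUM(o.order_value) AS total_value
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY region
    ORDER BY region
"""

region_totals = conn.execute(REGION_QUERY).fetchall()
print(region_totals)
```

SQLite accepts the output alias in GROUP BY; databases that do not will need the CASE expression repeated there, as in the query above.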

Just add a third table with the fields City | Region.
Then join the three tables, group by Region and sum the order values.
No code needed.
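For anyone who wants to see the lookup-table idea spelled out anyway, a rough sketch follows, again with Python's sqlite3 as an assumed environment. The table name city_regions and its columns are hypothetical; the answer only says "City|Region".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_value INTEGER);
    -- Hypothetical lookup table: one row per city, mapping it to a region.
    CREATE TABLE city_regions (city TEXT PRIMARY KEY, region TEXT NOT NULL);

    INSERT INTO customers VALUES
        (1, 'new york'), (2, 'los angeles'), (3, 'new jersey'), (4, 'san francisco');
    INSERT INTO orders VALUES
        (100, 1, 4000), (101, 3, 300), (102, 2, 7000), (103, 4, 100), (104, 1, 3000);
    INSERT INTO city_regions VALUES
        ('new york', 'East'), ('boston', 'East'), ('new jersey', 'East'),
        ('los angeles', 'West'), ('san francisco', 'West');
""")

# Join orders -> customers -> lookup table, then aggregate per region.
LOOKUP_QUERY = """
    SELECT r.region, SUM(o.order_value) AS total_value
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    JOIN city_regions r ON r.city = c.city
    GROUP BY r.region
    ORDER BY r.region
"""

lookup_totals = conn.execute(LOOKUP_QUERY).fetchall()
print(lookup_totals)
```

With an inner join, a customer whose city is missing from the lookup table silently drops out of the totals; a LEFT JOIN plus a fallback label would keep those rows visible.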

Related

How to aggregate using distinct values across two columns?

I have the following data in an orders table:
revenue  expenses  location_1   location_2
3        6         London       New York
6        11        Paris        Toronto
1        8         Houston      Sydney
1        4         Chicago      Los Angeles
2        5         New York     London
7        11        New York     Boston
4        6         Toronto      Paris
5        11        Toronto      New York
1        2         Los Angeles  London
0        0         Mexico City  London
I would like to create a result set that has 3 columns:
a list of the 10 DISTINCT city names
the sum of revenue for each city
the sum of expenses for each city
The desired result is:
location     revenue  expenses
London       6        13
New York     17       33
Paris        10       17
Toronto      15       28
Houston      1        8
Sydney       1        8
Chicago      1        4
Los Angeles  2        6
Boston       7        11
Mexico City  0        0
Is it possible to aggregate on distinct values across two columns? If yes, how would I do it?
Here is a fiddle:
http://sqlfiddle.com/#!9/0b1105/1
Shorter (and often faster):
SELECT location, sum(revenue) AS rev, sum(expenses) AS exp
FROM  (
   SELECT location_1 AS location, revenue, expenses FROM orders
   UNION ALL
   SELECT location_2            , revenue, expenses FROM orders
   ) sub
GROUP  BY 1;
May be faster:
WITH cte AS (
   SELECT location_1, location_2, revenue AS rev, expenses AS exp
   FROM   orders
   )
SELECT location, sum(rev) AS rev, sum(exp) AS exp
FROM  (
   SELECT location_1 AS location, rev, exp FROM cte
   UNION ALL
   SELECT location_2            , rev, exp FROM cte
   ) sub
GROUP  BY 1;
The (materialized!) CTE adds overhead, which may outweigh the benefit. Depends on many factors like total table size, available indexes, possible bloat, available RAM, storage speed, Postgres version, ...
You could UNION ALL two queries and then select from it...
select location, sum(rev) as rev, sum(exp) as exp
from (
    select location_1 as location, sum(revenue) as rev, sum(expenses) as exp
    from orders
    group by location_1
    union all
    select location_2 as location, sum(revenue) as rev, sum(expenses) as exp
    from orders
    group by location_2
) z
group by location
order by 1
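As a quick check of the UNION ALL idea, the sketch below loads the ten sample rows into an in-memory SQLite database (an assumption; the answers above target Postgres) and stacks both location columns before grouping. The aliases total_revenue and total_expenses are mine, not from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (revenue INTEGER, expenses INTEGER, location_1 TEXT, location_2 TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (3, 6, "London", "New York"),
        (6, 11, "Paris", "Toronto"),
        (1, 8, "Houston", "Sydney"),
        (1, 4, "Chicago", "Los Angeles"),
        (2, 5, "New York", "London"),
        (7, 11, "New York", "Boston"),
        (4, 6, "Toronto", "Paris"),
        (5, 11, "Toronto", "New York"),
        (1, 2, "Los Angeles", "London"),
        (0, 0, "Mexico City", "London"),
    ],
)

# Stack both location columns into one, then aggregate per city.
UNPIVOT_QUERY = """
    SELECT location,
           SUM(revenue)  AS total_revenue,
           SUM(expenses) AS total_expenses
    FROM (
        SELECT location_1 AS location, revenue, expenses FROM orders
        UNION ALL
        SELECT location_2 AS location, revenue, expenses FROM orders
    ) AS sub
    GROUP BY location
"""

totals = {
    location: (revenue, expenses)
    for location, revenue, expenses in conn.execute(UNPIVOT_QUERY)
}
print(totals)
```

UNION ALL (rather than UNION) matters here: UNION would remove duplicate rows and could undercount when two orders happen to have the same city, revenue and expenses.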

Unique table from multiple ones having same and different columns (SQL)

I have multiple datasets having different rows and fields.
dataset1
Customer_ID  Date      Category  Address     City        School
4154124      1/2/2021  A         balboa st.  Canterbury  Middleton
2145124      1/2/2012  A         somewhere   world       St. Augustine
1621573      1/2/2012  A         my_street   somewhere   St. Augustine
dataset2
Customer_ID  Date        Category  Country  Zipcode
14123        12/12/2020  B         UK       EW
416412       14/12/2020  B         ES
dataset3
Customer_ID  Date        Category  School     University
4124123      07/12/2020  C         Middleton  Oxford
I would like a final dataset which includes all the columns (keeping only one copy of the common ones):
Customer_ID  Date        Category  Address     City        School         Country  Zipcode  University
4154124      1/2/2021    A         balboa st.  Canterbury  Middleton
2145124      1/2/2012    A         somewhere   world       St. Augustine
1621573      1/2/2012    A         my_street   somewhere   St. Augustine
14123        12/12/2020  B                                                UK       EW
416412       14/12/2020  B                                                ES
4124123      07/12/2020  C                                 Middleton               Oxford
Would a left join be the best way to get the expected output? How can I keep Customer_ID, Date and Category, and any duplicated column (e.g., School), only once?
You can achieve this using UNION ALL.
SELECT Customer_ID, Date, Category, Address, City, School, '' AS Country, '' AS ZipCode, '' AS university FROM dataset1
UNION ALL
SELECT Customer_ID, Date, Category, '', '', '', Country, Zipcode, '' FROM dataset2
UNION ALL
SELECT Customer_ID, Date, Category, '', '', School, '', '', University FROM dataset3
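A variation worth considering is to pad the missing columns with NULL instead of '', so that "no value" stays distinguishable from an empty string. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3, stores dates as plain text, and adds an ORDER BY purely to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset1 (Customer_ID INTEGER, "Date" TEXT, Category TEXT,
                           Address TEXT, City TEXT, School TEXT);
    CREATE TABLE dataset2 (Customer_ID INTEGER, "Date" TEXT, Category TEXT,
                           Country TEXT, Zipcode TEXT);
    CREATE TABLE dataset3 (Customer_ID INTEGER, "Date" TEXT, Category TEXT,
                           School TEXT, University TEXT);

    INSERT INTO dataset1 VALUES
        (4154124, '1/2/2021', 'A', 'balboa st.', 'Canterbury', 'Middleton'),
        (2145124, '1/2/2012', 'A', 'somewhere', 'world', 'St. Augustine'),
        (1621573, '1/2/2012', 'A', 'my_street', 'somewhere', 'St. Augustine');
    INSERT INTO dataset2 VALUES
        (14123, '12/12/2020', 'B', 'UK', 'EW'),
        (416412, '14/12/2020', 'B', 'ES', NULL);
    INSERT INTO dataset3 VALUES
        (4124123, '07/12/2020', 'C', 'Middleton', 'Oxford');
""")

# Every branch lists the same nine columns in the same order;
# columns a dataset does not have are filled with NULL.
COMBINED_QUERY = """
    SELECT Customer_ID, "Date", Category, Address, City, School,
           NULL AS Country, NULL AS Zipcode, NULL AS University
    FROM dataset1
    UNION ALL
    SELECT Customer_ID, "Date", Category, NULL, NULL, NULL,
           Country, Zipcode, NULL
    FROM dataset2
    UNION ALL
    SELECT Customer_ID, "Date", Category, NULL, NULL, School,
           NULL, NULL, University
    FROM dataset3
    ORDER BY Category, Customer_ID
"""

combined = conn.execute(COMBINED_QUERY).fetchall()
for row in combined:
    print(row)
```

Some databases need an explicit cast on the NULL placeholders so that column types line up across the branches; SQLite does not.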

How to sum values if I don't have column to group it by

I need to sum sales grouped by country, but I have to group them manually because I don't have any other way.
Unfortunately, I don't have the column 'continent', but there are not too many countries on the list so I can do it manually. I can't create any new columns in the table, so I need to do it in a query.
For example:
country | sum of sales
Germany 1000
Italy 500
Canada 700
UK 1300
USA 3000
I would like to see the total sales for Europe and North America
continent | sum of sales
Europe 2800
North America 3700
You should be able to combine a case expression and an in predicate, something along these lines (replace your_table with the actual table name):
SELECT CASE
         WHEN country IN ('Germany', 'Italy', 'UK') THEN 'Europe'
         WHEN country IN ('Canada', 'USA') THEN 'North America'
       END AS continent,
       sum("sum of sales")
FROM your_table
GROUP BY 1
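To check the grouping against the numbers in the question, here is a small sketch with Python's sqlite3. The table name sales and the column amount are hypothetical stand-ins (the question does not give the real names), and Italy is placed in the Europe list because the expected Europe total of 2800 includes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names; one row per country as in the question.
conn.execute("CREATE TABLE sales (country TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Germany", 1000), ("Italy", 500), ("Canada", 700), ("UK", 1300), ("USA", 3000)],
)

# Map each country to a continent inside the query, then aggregate.
CONTINENT_QUERY = """
    SELECT CASE
               WHEN country IN ('Germany', 'Italy', 'UK') THEN 'Europe'
               WHEN country IN ('Canada', 'USA') THEN 'North America'
               ELSE 'Other'
           END AS continent,
           SUM(amount) AS total_sales
    FROM sales
    GROUP BY continent
    ORDER BY continent
"""

continent_totals = conn.execute(CONTINENT_QUERY).fetchall()
print(continent_totals)
```

The ELSE 'Other' branch is an addition of mine; without it, any country missing from both lists ends up in a group whose continent is NULL.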

SQL ordering cities ascending and persons descending

I have been stuck on a complicated problem. I do not know which SQL version this is; it is a school edition, but that should not matter here.
I want to order cities ascending and numbers descending. By descending numbers I mean that when the same city appears several times, the row with the biggest number comes first.
I also need row numbers. I have tried SELECT ROW_NUMBER() OVER(ORDER BY COUNT(FIRST_NAME)) row with no success.
I have two tables called CUSTOMERS and EMPLOYEES. Both have the columns FIRST_NAME, LAST_NAME and CITY.
This is my current code:
SELECT
CITY, COUNT(FIRST_NAME),
CASE WHEN COUNT(FIRST_NAME) >= 0 THEN 'CUSTOMERS'
END
FROM CUSTOMERS
GROUP BY CITY
UNION
SELECT
CITY, COUNT(FIRST_NAME),
CASE WHEN COUNT(FIRST_NAME) >= 0 THEN 'EMPLOYEES'
END
FROM EMPLOYEES
GROUP BY CITY
This SQL code gives me a list like this:
CITY
NEW YORK 2 CUSTOMERS
MIAMI 1 CUSTOMERS
MIAMI 4 EMPLOYEES
LOS ANGELES 1 CUSTOMERS
CHIGACO 1 CUSTOMERS
HOUSTON 1 CUSTOMERS
DALLAS 2 CUSTOMERS
SAN JOSE 2 CUSTOMERS
SEATTLE 2 CUSTOMERS
SEATTLE 5 EMPLOYEES
BOSTON 1 CUSTOMERS
BOSTON 3 EMPLOYEES
I want it to look like this:
ROW CITY
1 NEW YORK 2 CUSTOMERS
2 MIAMI 4 EMPLOYEES
3 MIAMI 1 CUSTOMERS
4 LOS ANGELES 1 CUSTOMERS
5 CHIGACO 1 CUSTOMERS
6 HOUSTON 1 CUSTOMERS
7 DALLAS 2 CUSTOMERS
8 SAN JOSE 2 CUSTOMERS
9 SEATTLE 5 EMPLOYEES
10 SEATTLE 2 CUSTOMERS
11 BOSTON 3 EMPLOYEES
12 BOSTON 1 CUSTOMERS
You can use window functions in the ORDER BY:
SELECT c.*
FROM ((SELECT CITY, COUNT(*) as cnt, 'CUSTOMERS' as WHICH
       FROM CUSTOMERS
       GROUP BY CITY
      ) UNION ALL
      (SELECT CITY, COUNT(*), 'EMPLOYEES'
       FROM EMPLOYEES
       GROUP BY CITY
      )
     ) c
ORDER BY MAX(cnt) OVER (PARTITION BY city) DESC,
         city,
         cnt DESC;
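The question also asks for row numbers, which the query above does not produce. One possible way to add them, sketched here with Python's sqlite3 (window functions need SQLite 3.25 or newer), is to apply ROW_NUMBER() to the combined counts. This sketch takes the title literally and orders by city ascending, then count descending; the sample rows and the aliases row_num, cnt and which are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (first_name TEXT, city TEXT);
    CREATE TABLE employees (first_name TEXT, city TEXT);
    -- Made-up sample rows: MIAMI 1 customer / 4 employees,
    -- BOSTON 1 customer / 3 employees, DALLAS 2 customers.
    INSERT INTO customers VALUES
        ('Ann', 'MIAMI'), ('Bob', 'BOSTON'), ('Cy', 'DALLAS'), ('Di', 'DALLAS');
    INSERT INTO employees VALUES
        ('E1', 'MIAMI'), ('E2', 'MIAMI'), ('E3', 'MIAMI'), ('E4', 'MIAMI'),
        ('E5', 'BOSTON'), ('E6', 'BOSTON'), ('E7', 'BOSTON');
""")

# Count per city in each table, stack the results, then number the rows
# by city ascending and count descending.
NUMBERED_QUERY = """
    SELECT ROW_NUMBER() OVER (ORDER BY city, cnt DESC) AS row_num,
           city, cnt, which
    FROM (
        SELECT city, COUNT(*) AS cnt, 'CUSTOMERS' AS which
        FROM customers
        GROUP BY city
        UNION ALL
        SELECT city, COUNT(*), 'EMPLOYEES'
        FROM employees
        GROUP BY city
    ) AS counts
    ORDER BY row_num
"""

numbered_rows = conn.execute(NUMBERED_QUERY).fetchall()
for row in numbered_rows:
    print(row)
```

To number the rows in the order used by the answer above instead, the same ORDER BY expressions would have to go inside the OVER clause, which may require computing MAX(cnt) OVER (PARTITION BY city) in an inner query first.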

Joining SQL lookup table with data table

I have a lookup table, say cities, with the fields CityId and CityName:
CityId  CityName
1       New York
2       San Francisco
3       Chicago
I have an orders table with the fields CityId, CustId, CompletedOrders and PendingOrders:
CityId  CustId  CompletedOrders  PendingOrders
1       123     100              50
2       123     75               20
I want a table/report that lists the order details of a given customer in all cities, i.e. the result I need is:
CityId  CityName       CustId  CompletedOrders  PendingOrders
1       New York       123     100              50
2       San Francisco  123     75               20
3       Chicago        123     0                0
How can I do that?
SELECT
    c.CityId,
    c.CityName,
    o.CustId,
    o.CompletedOrders,
    o.PendingOrders
FROM cities c
LEFT JOIN orders o ON ( c.CityId = o.CityId )
This will return all the rows that you want, but for cities that have no matching row in orders it will return NULL values, so you would get:
CityId CityName CustId CompletedOrders PendingOrders
1 New York 123 100 50
2 San Francisco 123 75 20
3 Chicago 123 NULL NULL
The solution to get 0 instead depends on your database. With MySQL use IFNULL, with Oracle use NVL.
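COALESCE is the standard SQL function for this and is available in both MySQL and Oracle as well. As a rough sketch of the whole report for one customer, here is a version run against SQLite via Python's sqlite3 (an assumption, since the question names no database). The extra order row for customer 456 is made up to show why the customer filter sits in the ON clause rather than in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (CityId INTEGER PRIMARY KEY, CityName TEXT);
    CREATE TABLE orders (CityId INTEGER, CustId INTEGER,
                         CompletedOrders INTEGER, PendingOrders INTEGER);
    INSERT INTO cities VALUES (1, 'New York'), (2, 'San Francisco'), (3, 'Chicago');
    INSERT INTO orders VALUES
        (1, 123, 100, 50),
        (2, 123, 75, 20),
        (3, 456, 10, 5);  -- made-up row for another customer
""")

# Filtering on the customer inside ON keeps cities without a match;
# the same filter in WHERE would drop the Chicago row for customer 123.
REPORT_QUERY = """
    SELECT c.CityId,
           c.CityName,
           :cust AS CustId,
           COALESCE(o.CompletedOrders, 0) AS CompletedOrders,
           COALESCE(o.PendingOrders, 0)   AS PendingOrders
    FROM cities c
    LEFT JOIN orders o
           ON o.CityId = c.CityId
          AND o.CustId = :cust
    ORDER BY c.CityId
"""

report = conn.execute(REPORT_QUERY, {"cust": 123}).fetchall()
for row in report:
    print(row)
```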
Try this:
select c.CityId, c.CityName, o.CustId, o.CompletedOrders, o.PendingOrders
from cities c
left join orders o
on o.CityId = c.CityId