Issues with PostgreSQL subqueries - sql

I have the following chunk of code, in which I'm trying to count the sales of beef, chicken and pork in each month of the last year (I also need to determine the market share of these meats each month):
SELECT
CAST(EXTRACT('MONTH' FROM TO_TIMESTAMP(FULLDATE, 'YYYY-MM-DD')) AS INT) AS month
FROM purchases_2020
JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
(
SELECT
COUNT (purchaseid) AS total_sales
FROM purchases_2020
JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
WHERE category = 'whole milk' OR category = 'yogurt' OR category = 'domestic eggs'
GROUP BY month
) a
GROUP BY month
ORDER BY month
The expected result is the following image
EDIT to add the exact error message
but I'm getting this error message:
syntax error at or near "SELECT"
LINE 6: SELECT
^
[SQL: SELECT
CAST(EXTRACT('MONTH' FROM TO_TIMESTAMP(FULLDATE, 'YYYY-MM-DD')) AS INT) AS month
FROM purchases_2020
JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
(
SELECT
COUNT (purchaseid) AS total_sales
FROM purchases_2020
JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
WHERE category = 'whole milk' OR category = 'yogurt' OR category = 'domestic eggs'
GROUP BY month
) a
GROUP BY month
ORDER BY month
This is the data schema I'm working with.
EDIT
I'm aware I can query the total_sales like this:
SELECT
CAST(EXTRACT('MONTH' FROM TO_TIMESTAMP(FULLDATE, 'YYYY-MM-DD')) AS INT) AS month,
COUNT (purchaseid) AS total_sales
FROM purchases_2020
JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
WHERE category = 'beef' OR category = 'pork' OR category = 'chicken'
GROUP BY month
ORDER BY month
But doing it like this prevents me from writing the market_share formula in the SELECT list, because the WHERE clause is not inside a subquery.

This query should give you the count of sales by month and category. I can't test it because I don't have the data.
SELECT
c.category,
EXTRACT('MONTH' FROM FULLDATE) AS month,
count(purchaseid) AS total_sales
FROM purchases_2020 p JOIN categories c ON p.purchaseid = c.purchase_id
WHERE category in ('beef','pork','chicken')
GROUP BY month,c.category
ORDER BY month,c.category;
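If the market share is also needed, one option is to move the category test out of WHERE and into the aggregate, so the same grouped query still sees every purchase for the denominator. The following is only a sketch under assumptions: it uses Python's sqlite3 module as a stand-in for PostgreSQL, the rows are made up, and fulldate is assumed to be 'YYYY-MM-DD' text. In PostgreSQL the same idea is usually written with COUNT(*) FILTER (WHERE ...).

```python
import sqlite3

# In-memory stand-in for the two tables; every row below is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases_2020 (purchaseid INTEGER, fulldate TEXT);
CREATE TABLE categories (purchase_id INTEGER, category TEXT);
INSERT INTO purchases_2020 VALUES
  (1, '2020-01-05'), (2, '2020-01-09'), (3, '2020-01-20'), (4, '2020-01-28'),
  (5, '2020-02-02'), (6, '2020-02-14'), (7, '2020-02-20');
INSERT INTO categories VALUES
  (1, 'beef'), (2, 'pork'), (3, 'soda'), (4, 'bread'),
  (5, 'chicken'), (6, 'soda'), (7, 'bread');
""")

# No WHERE on category: the CASE expression counts only the meats,
# while COUNT(*) still counts every purchase of the month.
query = """
SELECT CAST(strftime('%m', p.fulldate) AS INTEGER) AS month,
       SUM(CASE WHEN c.category IN ('beef', 'pork', 'chicken')
                THEN 1 ELSE 0 END) AS total_sales,
       ROUND(100.0 * SUM(CASE WHEN c.category IN ('beef', 'pork', 'chicken')
                              THEN 1 ELSE 0 END) / COUNT(*), 2) AS market_share
FROM purchases_2020 p
JOIN categories c ON p.purchaseid = c.purchase_id
GROUP BY month
ORDER BY month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each output row is (month, total_sales, market_share). The 100.0 literal forces non-integer division, which matters in PostgreSQL as well.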

Related

Decluttering a SQL query

For a practice project I wrote the following query, and I was wondering if there is a way to make it more efficient than writing everything 12 times, something like a for loop for SQL.
CREATE TABLE temp (month INT, total_sales INT, market_share decimal(5,2), year_change decimal(5,2))
insert into temp (month)
Values (1)
UPDATE temp
SET total_sales = (
SELECT COUNT(purchases_2020.purchaseid)
FROM purchases_2020
JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
WHERE (categories.category = 'whole milk' OR categories.category = 'yogurt' OR categories.category = 'domestic eggs') AND (purchases_2020.fulldate BETWEEN '2020-01-01' AND '2020-01-31')
)
WHERE month = 1
UPDATE temp
SET market_share = (
SELECT (SELECT 100 * COUNT(purchases_2020.purchaseid)
FROM purchases_2020
JOIN categories ON purchases_2020.purchaseid = categories.purchase_id
WHERE (categories.category = 'whole milk' OR categories.category = 'yogurt' OR categories.category = 'domestic eggs') AND (purchases_2020.fulldate BETWEEN '2020-01-01' AND '2020-01-31'))
* 1. /
(SELECT COUNT(purchases_2020.purchaseid)
FROM purchases_2020
WHERE purchases_2020.fulldate BETWEEN '2020-01-01' AND '2020-01-31')
)
WHERE month = 1
UPDATE temp
SET year_change = (
SELECT market_share -
(SELECT
(SELECT 100 * COUNT(purchases_2019.purchase_id)
FROM purchases_2019
JOIN categories ON purchases_2019.purchase_id = categories.purchase_id
WHERE (categories.category = 'whole milk' OR categories.category = 'yogurt' OR categories.category = 'domestic eggs') AND (purchases_2019.full_date BETWEEN '2019-01-01' AND '2019-01-31'))
* 1./
(SELECT COUNT(purchases_2019.purchase_id)
FROM purchases_2019
WHERE purchases_2019.full_date BETWEEN '2019-01-01' AND '2019-01-31'))
FROM temp
WHERE month = 1
)
WHERE month = 1
EDIT
I was given the 3 tables represented in the following database schema, and I'm trying to create a table with the total sales of dairy every month, the monthly market share of the dairy products, and the difference between the 2020 monthly market share and the 2019 monthly market share (the year_change column).
There is also an arithmetic error somewhere: when checking the project I get the message "ResultSet does not contain the correct numeric values!" and I'm at my wits' end looking for it, but my priority is to declutter the query.
Your error message tells me that you are trying to run this from a reporting tool or a host language.
It also makes no sense to put the data into separate tables by years.
SQL is a declarative language that works with data as sets.
Instead of pushing the results into table temp, try writing a query like this:
with all_data as (
select p.fulldate, p.purchaseid, c.category,
extract(year from p.fulldate) as year,
extract(month from p.fulldate) as month
from purchases_2020 p
join categories c on c.purchase_id = p.purchaseid
union all
select p.full_date, p.purchase_id, c.category,
extract(year from p.full_date) as year,
extract(month from p.full_date) as month
from purchases_2019 p
join categories c on c.purchase_id = p.purchase_id
), kpis as (
select year, month,
count(purchaseid)
filter (where category in ('whole milk', 'yogurt', 'domestic eggs'))
as dairy_sales,
count(purchaseid) * 1.0 as total_sales
from all_data
group by year, month
)
select ty.month, ty.dairy_sales as total_sales,
100.0 * ty.dairy_sales / ty.total_sales as market_share,
100.0 * ( (ty.dairy_sales / ty.total_sales)
- (ly.dairy_sales / ly.total_sales)) as year_change
from kpis ty
join kpis ly
on (ly.year, ly.month) = (ty.year - 1, ty.month);
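To try the shape of this query on a small scale, here is a hedged sketch that runs the same three steps (stack both years, aggregate per year and month, self-join on the previous year) against made-up rows. It uses Python's sqlite3 module rather than PostgreSQL, so FILTER is replaced by a CASE expression and extract() by strftime(). The dates are assumed to be 'YYYY-MM-DD' text and the purchase ids are assumed not to overlap between the two years.

```python
import sqlite3

# Made-up sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchases_2019 (purchase_id INTEGER, full_date TEXT);
CREATE TABLE purchases_2020 (purchaseid INTEGER, fulldate TEXT);
CREATE TABLE categories (purchase_id INTEGER, category TEXT);
INSERT INTO purchases_2019 VALUES
  (1, '2019-01-03'), (2, '2019-01-10'), (3, '2019-01-17'), (4, '2019-01-24'),
  (5, '2019-02-05'), (6, '2019-02-19');
INSERT INTO purchases_2020 VALUES
  (7, '2020-01-02'), (8, '2020-01-11'), (9, '2020-01-18'), (10, '2020-01-25'),
  (11, '2020-02-03'), (12, '2020-02-10'), (13, '2020-02-17'), (14, '2020-02-24');
INSERT INTO categories VALUES
  (1, 'yogurt'), (2, 'soda'), (3, 'bread'), (4, 'beef'),
  (5, 'whole milk'), (6, 'soda'),
  (7, 'whole milk'), (8, 'domestic eggs'), (9, 'soda'), (10, 'bread'),
  (11, 'yogurt'), (12, 'soda'), (13, 'beef'), (14, 'bread');
""")

query = """
WITH all_data AS (
  -- Stack both years into one set with explicit year and month columns.
  SELECT CAST(strftime('%Y', p.fulldate) AS INTEGER) AS year,
         CAST(strftime('%m', p.fulldate) AS INTEGER) AS month,
         c.category AS category
  FROM purchases_2020 p
  JOIN categories c ON c.purchase_id = p.purchaseid
  UNION ALL
  SELECT CAST(strftime('%Y', p.full_date) AS INTEGER) AS year,
         CAST(strftime('%m', p.full_date) AS INTEGER) AS month,
         c.category AS category
  FROM purchases_2019 p
  JOIN categories c ON c.purchase_id = p.purchase_id
), kpis AS (
  -- One row per (year, month) with dairy and overall counts.
  SELECT year, month,
         SUM(CASE WHEN category IN ('whole milk', 'yogurt', 'domestic eggs')
                  THEN 1 ELSE 0 END) AS dairy_sales,
         COUNT(*) AS total_sales
  FROM all_data
  GROUP BY year, month
)
SELECT ty.month,
       ty.dairy_sales AS total_sales,
       ROUND(100.0 * ty.dairy_sales / ty.total_sales, 2) AS market_share,
       ROUND(100.0 * ty.dairy_sales / ty.total_sales
             - 100.0 * ly.dairy_sales / ly.total_sales, 2) AS year_change
FROM kpis ty
JOIN kpis ly ON ly.year = ty.year - 1 AND ly.month = ty.month
ORDER BY ty.month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only 2020 rows survive the join, because no 2018 data exists to pair with 2019. Here year_change is the difference between the two market shares in percentage points, as in the query above.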

Combining multiple queries

I want a table with all customers and their last charge transaction date and their last invoice date. I have the first two, but don't know how to add the last invoice date to the table. Here's what I have so far:
WITH
--Last customer transaction
cust_trans AS (
SELECT customer_id, created
FROM charges a
WHERE created = (
SELECT MAX(created) AS last_trans
FROM charges b
WHERE a.customer_id = b.customer_id)),
--All customers
all_cust AS (
SELECT customers.id AS customer, customers.email, CAST(customers.created AS DATE) AS join_date, ((1.0 * customers.account_balance)/100) AS balance
FROM customers),
--Last customer invoice
cust_inv AS (
SELECT customer_id, date
FROM invoices a
WHERE date = (
SELECT MAX(date) AS last_inv
FROM invoices b
WHERE a.customer_id = b.customer_id))
SELECT * FROM cust_trans
RIGHT JOIN all_cust ON all_cust.customer = cust_trans.customer_id
ORDER BY join_date;
This should get what you need. Notice each individual subquery is left-joined to the customer table, so you always START with the customer, and IF there is a corresponding record in each subquery for max charge date or max invoice date, it will be pulled in. Now, you may want to apply a COALESCE() for the max dates to prevent showing nulls, such as
COALESCE(maxCharges.LastChargeDate, '') AS LastChargeDate
but your call.
SELECT
c.id AS customer,
c.email,
CAST(c.created AS DATE) AS join_date,
((1.0 * c.account_balance) / 100) AS balance,
maxCharges.LastChargeDate,
maxInvoices.LastInvoiceDate
FROM
customers c
LEFT JOIN
(SELECT
customer_id,
MAX(created) LastChargeDate
FROM
charges
GROUP BY
customer_id) maxCharges ON c.id = maxCharges.customer_id
LEFT JOIN
(SELECT
customer_id,
MAX(date) LastInvoiceDate
FROM
invoices
GROUP BY
customer_id) maxInvoices ON c.id = maxInvoices.customer_id
ORDER BY
c.created
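The same pattern can be tried on a toy data set. This is a sketch only: it uses Python's sqlite3 module with made-up customers, and the tables are trimmed to the columns the joins need. It shows that customers without charges or invoices still appear, with NULL (None in Python) in the missing columns.

```python
import sqlite3

# Made-up rows: customer 1 has charges and an invoice, customer 2 only an
# invoice, customer 3 has neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, email TEXT);
CREATE TABLE charges (customer_id INTEGER, created TEXT);
CREATE TABLE invoices (customer_id INTEGER, date TEXT);
INSERT INTO customers VALUES
  (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
INSERT INTO charges VALUES (1, '2021-01-05'), (1, '2021-03-01');
INSERT INTO invoices VALUES (1, '2021-02-15'), (2, '2021-01-10');
""")

# Each subquery is reduced to one row per customer BEFORE it is joined,
# so the joins cannot multiply customer rows.
query = """
SELECT c.id, c.email, mc.last_charge_date, mi.last_invoice_date
FROM customers c
LEFT JOIN (SELECT customer_id, MAX(created) AS last_charge_date
           FROM charges
           GROUP BY customer_id) mc ON c.id = mc.customer_id
LEFT JOIN (SELECT customer_id, MAX(date) AS last_invoice_date
           FROM invoices
           GROUP BY customer_id) mi ON c.id = mi.customer_id
ORDER BY c.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Joining charges and invoices directly and aggregating afterwards would also work for MAX, but it pairs every charge with every invoice of a customer first, so pre-aggregating tends to be the safer habit.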

Finding market share and year change with SQL

Here is the database schema
The Case Problem:
What was the total number of purchases of dairy products for each month of 2020 (i.e., the total_sales)?
What was the total share of dairy products (out of all products purchased) for each month of 2020 (i.e., the market_share)?
For each month of 2020, what was the percentage increase or decrease in total monthly dairy purchases compared to the same month in 2019 (i.e., the year_change)?
We are interested only in these three categories (which are treated as dairy): 'whole milk', 'yogurt' and 'domestic eggs'.
The instruction:
Order your query by month in ascending order. Both month and total_sales should be expressed as integers, and market_share and year_change should be percentages rounded to two decimal places (e.g., 27.95% becomes 27.95).
Your query will need to return a table that resembles the following, including the same column names.
Here is the code:
with purchases_2019 as (SELECT p1.month as month,COUNT(p1.purchase_id) as count_2
FROM purchases_2019 as p1
LEFT JOIN categories as cat ON p1.purchase_id=cat.purchase_id
WHERE cat.category IN ('whole milk', 'yogurt' ,'domestic eggs')
GROUP BY p1.month
ORDER BY p1.month ASC),
purchases_2020 as ( SELECT to_char(CAST(p2.fulldate AS DATE),'MM')::int as month,
COUNT(p2.purchaseid) as total_sales,
ROUND((COUNT(p2.purchaseid)*100::numeric/18277)::numeric,2) as market_share
FROM purchases_2020 as p2
LEFT JOIN categories as cat ON p2.purchaseid=cat.purchase_id
WHERE cat.category IN ('whole milk', 'yogurt' ,'domestic eggs')
GROUP BY month
ORDER BY month ASC)
SELECT t2.month,t2.total_sales,t2.market_share,
ROUND(((t2.total_sales-t1.count_2)*100::numeric/t1.count_2) ,2) as year_change
FROM purchases_2020 as t2
INNER JOIN purchases_2019 as t1 ON t2.month=t1.month
This is the result I get:
But it's still the wrong answer, and I don't have any idea why. Can you give me some enlightenment? Thank you.
with p as
(select
extract(month from to_date(b.full_date, 'YYYY/MM/DD')) as "month",
sum(case when c.category in ('whole milk', 'yogurt', 'domestic eggs') then 1 else 0 end) as "old_sales"
from purchases_2019 b left join categories c
on b.purchase_id = c.purchase_id
group by 1
order by 1),
temp as
(select
extract(month from to_date(a.fulldate,'YYYY/MM/DD')) as "month",
sum(case when c.category in ('whole milk', 'yogurt', 'domestic eggs') then 1 else 0 end) as "total_sales",
round(100 * sum(case when c.category in ('whole milk', 'yogurt', 'domestic eggs') then 1 else 0 end)::numeric
/ count(a.purchaseid),2) as "market_share"
from
purchases_2020 a left join categories c
on a.purchaseid = c.purchase_id
group by 1
order by 1)
select
temp.month, total_sales, market_share,
round(100 * (total_sales - old_sales)::numeric / old_sales, 2) as "year_change"
from temp left join p on temp.month = p.month;
Why 18277?
This part:
ROUND((COUNT(p2.purchaseid)*100::numeric/18277)::numeric,2) as market_share
Could there be an error in the market_share calculation?
I think this code counts only the 3 categories, but shouldn't market share be the 3-category sales divided by all sales for the same month?
Just an idea.
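To make the arithmetic concrete, here is a small sketch of the two formulas from the case problem in plain Python. The function names and all the counts passed to them are made up for illustration; only the constant 18277 comes from the question's code.

```python
def market_share(dairy_sales, all_sales):
    """Dairy purchases as a percentage of ALL purchases in the same month."""
    return round(100 * dairy_sales / all_sales, 2)


def year_change(sales_this_year, sales_last_year):
    """Percentage increase or decrease versus the same month last year."""
    return round(100 * (sales_this_year - sales_last_year) / sales_last_year, 2)


# Hypothetical month: 559 dairy purchases out of 2000 purchases in total,
# against 500 dairy purchases in the same month of the previous year.
share = market_share(559, 2000)
change = year_change(559, 500)

# Dividing by one fixed constant instead of the month's own total gives a
# different figure for the same month.
share_fixed = market_share(559, 18277)

print(share, change, share_fixed)
```

The point is only that the denominator of market_share has to be the count of all purchases in that same month, which a hard-coded number cannot be for all twelve months at once.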

How to return the most ordered item for each month

I am trying to return the most ordered product per month of the year 2007. I would like to see the name of the product, how many of them were ordered that month, and the month. I am using the AdventureWorks2012 database. I have tried a few different ways, but each time multiple product orders are returned for the same month instead of the one product that had the highest order quantity that month. Sorry if this is not clear. I am trying to test myself, so I make up my own questions and try to answer them. If anyone knows a site that has questions and answers like this so I can verify mine, that would be super helpful! Thanks for any help. Here is the farthest I have been able to get with the query.
WITH Ord2007Sum
AS (SELECT sum(od.orderqty) AS sorder,
od.productid,
oh.orderdate,
od.SalesOrderID
FROM Sales.SalesOrderDetail AS od
INNER JOIN
sales.SalesOrderHeader AS oh
ON od.SalesOrderID = oh.SalesOrderID
WHERE year(oh.OrderDate) = 2007
GROUP BY ProductID, oh.OrderDate, od.SalesOrderID)
SELECT max(sorder),
s.productid,
month(h.orderdate) AS morder --, s.salesorderid
FROM Ord2007Sum AS s
INNER JOIN
sales.SalesOrderheader AS h
ON s.OrderDate = h.OrderDate
GROUP BY s.ProductID, month(h.orderdate)
ORDER BY morder;
Make a CTE that groups our products by month and creates a sum
;WITH OrderRows AS
(
SELECT
od.ProductId,
MONTH(oh.OrderDate) SalesMonth,
SUM(od.orderqty) OVER (PARTITION BY od.ProductId, MONTH(oh.OrderDate) ORDER BY oh.OrderDate) ProdMonthSum
FROM SalesOrderDetail AS od
INNER JOIN SalesOrderHeader AS oh
ON od.SalesOrderID = oh.SalesOrderID
WHERE year(oh.OrderDate) = 2007
),
Make a simple numbers table to break out each month of the year
Months AS
(
SELECT 1 AS MonthNum UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8
UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12
)
We query our months table against the data and select the top product for each month based on the sum
SELECT
m.MonthNum,
d.ProductID,
d.ProdMonthSum
FROM Months m
OUTER APPLY
(
SELECT TOP 1 r.ProductID, r.ProdMonthSum
FROM OrderRows r
WHERE r.SalesMonth = m.MonthNum
ORDER BY ProdMonthSum DESC
) d
Your GROUP BY should not include oh.OrderDate and od.SalesOrderID, because that aggregates your data at the wrong level. You want the ProductID with the highest quantity sold per month, so the GROUP BY columns become ProductID, datepart(mm,oh.OrderDate). As Andrew suggested, the Row_Number function is useful here: it numbers the rows within each month in descending order of summed quantity, restarting at 1 for every month. Finally, the outer query limits the results to the first row (the highest quantity) for each month.
WITH Ord2007Sum
AS(
SELECT sum(od.orderqty) AS sorder,
od.productid,
datepart(mm,oh.OrderDate) AS [Month],
-- the window ORDER BY must repeat the aggregate; the alias sorder is not visible here
row_number() over (partition by datepart(mm,oh.OrderDate)
Order by sum(od.orderqty) desc) AS rn
FROM Sales.SalesOrderDetail AS od
INNER JOIN
sales.SalesOrderHeader AS oh
ON od.SalesOrderID = oh.SalesOrderID
WHERE datepart(yyyy,oh.OrderDate) = 2007
GROUP BY od.ProductID, datepart(mm,oh.OrderDate)
)
SELECT productid,
sorder,
[Month]
FROM Ord2007Sum
WHERE rn = 1
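The aggregate-then-rank idea can be checked on a handful of rows. The sketch below is an illustration only: it uses Python's sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or newer), and order_detail is a hypothetical single table standing in for the joined AdventureWorks tables.

```python
import sqlite3

# Hypothetical flattened order lines: (product_id, order_date, qty).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_detail (product_id INTEGER, order_date TEXT, qty INTEGER);
INSERT INTO order_detail VALUES
  (10, '2007-01-05', 3), (10, '2007-01-20', 4), (20, '2007-01-11', 5),
  (20, '2007-02-03', 6), (10, '2007-02-09', 2),
  (20, '2008-01-02', 50);
""")

query = """
WITH monthly AS (
  -- Step 1: one row per (month, product) with the summed quantity.
  SELECT CAST(strftime('%m', order_date) AS INTEGER) AS month,
         product_id,
         SUM(qty) AS total_qty
  FROM order_detail
  WHERE strftime('%Y', order_date) = '2007'
  GROUP BY month, product_id
), ranked AS (
  -- Step 2: number the products inside each month, largest sum first;
  -- product_id breaks ties so the result is deterministic.
  SELECT month, product_id, total_qty,
         ROW_NUMBER() OVER (PARTITION BY month
                            ORDER BY total_qty DESC, product_id) AS rn
  FROM monthly
)
-- Step 3: keep only the top row of every month.
SELECT month, product_id, total_qty
FROM ranked
WHERE rn = 1
ORDER BY month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If ties should return every product that shares the top quantity, RANK() could replace ROW_NUMBER() and the tie-breaker column could be dropped.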

Output two columns for 1 field for different date ranges?

I have a SQL table "ITM_SLS" with the following fields:
ITEM
DESCRIPTION
TRANSACTION #
DATE
QTY SOLD
I want to be able to output QTY SOLD for a one month value and a year to date value so that the output would look like this:
ITEM, DESCRIPTION, QTY SOLD MONTH, QTY SOLD YEAR TO DATE
Is this possible?
You could calculate the total quantity sold using group by in a subquery. For example
select a.Item, a.Description, b.MonthQty, c.YearQty
from (
select distinct Item, Description from TheTable
) a
left join (
select Item, sum(Qty) as MonthQty
from TheTable
where datediff(m,Date,getdate()) <= 1
group by Item
) b on a.Item = b.Item
left join (
select Item, sum(Qty) as YearQty
from TheTable
where datediff(yy,Date,getdate()) = 0 -- yy is the year datepart; y would mean day of year
group by Item
) c on a.Item = c.Item
The method to limit the subquery to a particular date range differs per DBMS, this example uses the SQL Server datediff function.
Assuming the "one month" is last month...
select item
, description
, sum (case when trunc(transaction_date, 'MM')
= trunc(add_months(sysdate, -1), 'MM')
then qty_sold
else 0
end) as sold_month
, sum(qty_sold) as sold_ytd
from itm_sls
where transaction_date >= trunc(sysdate, 'yyyy')
group by item, description
/
This will give you an idea of what you can do:
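select
I.ITEM,
I.DESCRIPTION,
I."QTY SOLD" as "QTY SOLD MONTH",
-- correlated subquery: total for the same item in the same calendar year
( select sum(Y."QTY SOLD")
from ITM_SLS Y
where Y.ITEM = I.ITEM
AND extract(year from Y."DATE") = extract(year from I."DATE")
) as "QTY SOLD YEAR TO DATE"
from ITM_SLS I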
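For a runnable version of the month and year-to-date columns, here is a sketch with conditional aggregation in a single pass. It uses Python's sqlite3 module and made-up rows; the snake_case column names (sale_date, qty_sold) are stand-ins for the fields in the question, and the reference date is passed in as a parameter so the result does not depend on when the code runs.

```python
import sqlite3

# Made-up rows; sale_date is 'YYYY-MM-DD' text so string comparison
# orders dates correctly.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE itm_sls (item TEXT, description TEXT, sale_date TEXT, qty_sold INTEGER);
INSERT INTO itm_sls VALUES
  ('A', 'widget', '2023-12-30', 9),
  ('A', 'widget', '2024-01-10', 5),
  ('A', 'widget', '2024-03-02', 2),
  ('B', 'gadget', '2024-02-01', 4);
""")

# WHERE keeps only the current year up to the reference date; the CASE
# expression then adds up just the rows from the reference date's month.
query = """
SELECT item,
       description,
       SUM(CASE WHEN strftime('%Y-%m', sale_date) = strftime('%Y-%m', :today)
                THEN qty_sold ELSE 0 END) AS qty_sold_month,
       SUM(qty_sold) AS qty_sold_ytd
FROM itm_sls
WHERE sale_date >= strftime('%Y-01-01', :today)
  AND sale_date <= :today
GROUP BY item, description
ORDER BY item
"""
rows = conn.execute(query, {"today": "2024-03-15"}).fetchall()
for row in rows:
    print(row)
```

Here "month" means the calendar month of the reference date; for "last month" only the CASE condition would need to change.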