Complex SQL query with group by and having - sql

I have a table orders:
orders (
id int unsigned not null,
fcr_date TIMESTAMP,
completion_date TIMESTAMP,
factory_no varchar(255),
vendor_no varchar(255))
Please ignore the data type typos if any.
I want to write a SQL query that summarizes the data per vendor factory (a unique combination of vendor_no and factory_no). For each group I need vendor_no, factory_no, the number of orders, and the percentage of orders for which fcr_date is later than completion_date (percentage = number of orders where fcr_date > completion_date / total number of orders). After that I need to keep only the groups where the percentage is greater than, say, 20%.
I wrote the following query:
SELECT vendor_no As vendor,
factory_no As factory,
COUNT(1) as count,
SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) as filter_orders,
ROUND(filter_orders / count * 100, 4) as percent
FROM #orders
GROUP BY vendor_no,
factory_no
HAVING percent>20
But PostgreSQL complains that the table needs to have a column called percent in order to filter the results on it. Any help is appreciated.
Thanks.

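Change it to:
HAVING ROUND(100.0 * SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) / COUNT(1), 4) > 20
Because percent is a column alias rather than an actual column, HAVING cannot refer to it; you need to give it the calculation itself. The same goes for the filter_orders and count aliases, so the aggregates are spelled out, and 100.0 keeps the division from being truncated to an integer.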
Edit
OK, looking at this further, you've got at least two ways to write this: the one I'd recommend is the first, which involves wrapping in a sub-query (as someone already suggested):
Option 1
SELECT vendor,
factory,
count,
ROUND(100.0 * filter_orders / count, 4) AS percent
FROM
(
SELECT vendor_no AS vendor,
factory_no AS factory,
COUNT(1) AS count,
SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) AS filter_orders
FROM orders
GROUP BY vendor_no,
factory_no
) AS a
WHERE ROUND(100.0 * filter_orders / count, 4) > 20
Option 2
SELECT vendor_no AS vendor,
factory_no AS factory,
COUNT(1) AS count,
SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) AS filter_orders,
ROUND(100.0 * SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) / COUNT(1), 4) AS percent
FROM orders
GROUP BY vendor_no,
factory_no
HAVING ROUND(100.0 * SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) / COUNT(1), 4) > 20
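The sub-query approach can be tried end to end with a small script. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than PostgreSQL, the sample rows are made up, and the names late_vendor_factories, order_count and late_orders are hypothetical. Multiplying by 100.0 before dividing keeps the engine from doing integer division.

```python
import sqlite3

# Made-up sample rows: (id, fcr_date, completion_date, factory_no, vendor_no)
sample_orders = [
    (1, "2020-01-05", "2020-01-03", "F1", "V1"),  # late
    (2, "2020-01-02", "2020-01-03", "F1", "V1"),  # on time
    (3, "2020-01-02", "2020-01-03", "F2", "V1"),  # on time
    (4, "2020-01-09", "2020-01-03", "F1", "V2"),  # late
    (5, "2020-01-01", "2020-01-03", "F1", "V2"),  # on time
    (6, "2020-01-01", "2020-01-03", "F1", "V2"),  # on time
    (7, "2020-01-01", "2020-01-03", "F1", "V2"),  # on time
]


def late_vendor_factories(rows, threshold=20.0):
    """Return (vendor, factory, order_count, late_orders, percent) for every
    vendor/factory group whose late percentage exceeds the threshold."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER, fcr_date TEXT, completion_date TEXT,"
        " factory_no TEXT, vendor_no TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT vendor, factory, order_count, late_orders,
               ROUND(100.0 * late_orders / order_count, 4) AS percent
        FROM (
            -- Inner query: one row per vendor/factory with the raw counts.
            SELECT vendor_no AS vendor,
                   factory_no AS factory,
                   COUNT(*) AS order_count,
                   SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END)
                       AS late_orders
            FROM orders
            GROUP BY vendor_no, factory_no
        ) AS a
        -- Outer query: the aliases are real columns here, so WHERE can use them.
        WHERE 100.0 * late_orders / order_count > ?
        ORDER BY vendor, factory
    """
    result = conn.execute(query, (threshold,)).fetchall()
    conn.close()
    return result


for row in late_vendor_factories(sample_orders):
    print(row)
```

The inner query does the grouping once; the outer query only filters and formats, so the percentage expression does not have to repeat the aggregates.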

Wrap your query in an outer filtering query:
SELECT * FROM (
SELECT vendor_no AS vendor,
factory_no AS factory,
COUNT(1) AS count,
SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) AS filter_orders,
ROUND(100.0 * SUM(CASE WHEN fcr_date > completion_date THEN 1 ELSE 0 END) / COUNT(1), 4) AS percent
FROM orders
GROUP BY vendor_no,
factory_no
) x
WHERE percent > 20

I'm pretty sure you can't use aliases (like percent) in having clauses or group by clauses. And by "pretty" I mean Oracle won't let me use aliases in having/group by clauses...not sure about other vendors.

Related

SQL : hiding a calculated column

CREATE TABLE orders
(
id integer,
product_id integer,
type VARCHAR(16)
);
SELECT
(SELECT COUNT(*) FROM orders) AS "Order Count",
-- I don't want total to show up
(SELECT COUNT(*) FROM orders WHERE product_id = 500) AS "total",
(SELECT COUNT(*) FROM orders WHERE product_id = 500 AND type = 'small') * 100 / "total" AS "% Small Sold",
(SELECT COUNT(*) FROM orders WHERE product_id = 500 AND type = 'medium') * 100 / "total" AS "% Medium Sold",
(SELECT COUNT(*) FROM orders WHERE product_id = 500 AND type = 'large') * 100 / "total" AS "% Large Sold"
FROM
orders
I have this SQL report. It has a number of columns, and one of them ("total" in this case) exists only so that I can use it to calculate the other columns. I don't want it to appear in the report, though. Is there a way to compute it in some other part of the query, or to mark it as hidden? I'm using Postgres.
You can use a common table expression (CTE) for your totals, then select only the fields you want to keep in your report from the CTE. Because a SELECT list cannot refer to its own aliases, compute the raw counts in the CTE and the percentages in the outer query.
WITH totals AS (
SELECT
(SELECT COUNT(*) FROM orders) AS "Order Count",
-- "total" stays inside the CTE and is never selected by the report
(SELECT COUNT(*) FROM orders WHERE product_id = 500) AS "total",
(SELECT COUNT(*) FROM orders WHERE product_id = 500 AND type = 'small') AS small_count,
(SELECT COUNT(*) FROM orders WHERE product_id = 500 AND type = 'medium') AS medium_count,
(SELECT COUNT(*) FROM orders WHERE product_id = 500 AND type = 'large') AS large_count
)
SELECT
"Order Count",
small_count * 100 / "total" AS "% Small Sold",
medium_count * 100 / "total" AS "% Medium Sold",
large_count * 100 / "total" AS "% Large Sold"
FROM
totals
I'm not sure what you mean by hidden, but here is a better way to write this query, using conditional aggregation in a single pass over the table:
SELECT
COUNT(*) AS "Order Count",
-- I don't want total to show up
SUM(CASE WHEN PRODUCT_ID = 500 THEN 1 ELSE 0 END) AS "total",
SUM(CASE WHEN PRODUCT_ID = 500 AND type = 'small' THEN 1 ELSE 0 END) * 100 / SUM(CASE WHEN PRODUCT_ID = 500 THEN 1 ELSE 0 END) AS "% Small Sold",
SUM(CASE WHEN PRODUCT_ID = 500 AND type = 'medium' THEN 1 ELSE 0 END) * 100 / SUM(CASE WHEN PRODUCT_ID = 500 THEN 1 ELSE 0 END) AS "% Medium Sold",
SUM(CASE WHEN PRODUCT_ID = 500 AND type = 'large' THEN 1 ELSE 0 END) * 100 / SUM(CASE WHEN PRODUCT_ID = 500 THEN 1 ELSE 0 END) AS "% Large Sold"
FROM orders
I expect you will see a significant increase in performance.
Why don't you filter first? Scanning unfiltered rows is hurting your query performance.
Check this:
with maintab as (
select case
when type is not null then type
else 'total'
end as type,
count(*) as cnt
from orders
where product_id = 500 and type in ('small', 'medium', 'large')
group by rollup(type)
)
select type, cnt*100/(select cnt from maintab where type = 'total') as percentage
from maintab
where type in ('small', 'medium', 'large');
If your table contains different values in product_id and type columns, of course you should go with this one:
with cnttab as (select count(*) cnt from orders), maintab as (select type, count(*) cnt from orders
where product_id = 500 and type in ('small', 'medium', 'large')
group by type)
select type, cnt*100/(select cnt from cnttab) percentage
from maintab;
You can compare explain plans.
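To see the "hidden helper column" idea in action, here is a minimal sketch that keeps the total inside a CTE and exposes only the report columns. It assumes Python's built-in sqlite3 module instead of Postgres, made-up sample rows, and hypothetical names (size_percentages, counts, pct_small and so on).

```python
import sqlite3

# Made-up sample rows: (id, product_id, type)
sample_orders = [
    (1, 500, "small"),
    (2, 500, "small"),
    (3, 500, "medium"),
    (4, 500, "large"),
    (5, 7, "small"),  # a different product; counts only toward the order count
]


def size_percentages(rows, product_id=500):
    """Return (order_count, pct_small, pct_medium, pct_large) for one product.
    The per-product total is computed in the CTE but never selected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, product_id INTEGER, type TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    query = """
        WITH counts AS (
            -- Single pass over the table using conditional aggregation.
            SELECT COUNT(*) AS order_count,
                   SUM(CASE WHEN product_id = :pid THEN 1 ELSE 0 END) AS total,
                   SUM(CASE WHEN product_id = :pid AND type = 'small'
                            THEN 1 ELSE 0 END) AS small,
                   SUM(CASE WHEN product_id = :pid AND type = 'medium'
                            THEN 1 ELSE 0 END) AS medium,
                   SUM(CASE WHEN product_id = :pid AND type = 'large'
                            THEN 1 ELSE 0 END) AS large
            FROM orders
        )
        -- "total" is used in the arithmetic but is not part of the output.
        SELECT order_count,
               100.0 * small / total AS pct_small,
               100.0 * medium / total AS pct_medium,
               100.0 * large / total AS pct_large
        FROM counts
    """
    row = conn.execute(query, {"pid": product_id}).fetchone()
    conn.close()
    return row


print(size_percentages(sample_orders))
```

The helper column lives only in the CTE, so nothing has to be "hidden" after the fact; the outer SELECT simply never lists it.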

Advanced SQL with window function

I have table A (a dimension table) and table B (a fact table) that stores shopper transaction history.
Table A: shopper id (surrogate key), created for each unique combination of column 2, column 3 and column 4 (if a combination repeats, it gets the same shopper id).
Table B is transaction data.
I am trying to identify new customers and repeat customers for each week; expected output is below.
I am thinking of the following SQL statement for repeat customers:
Select COUNT(*) OVER (PARTITION BY shopperid,weekdate) as total_new_shopperid
For identifying new (i.e. unique) customers in the same join condition, I am stuck on the window function.
thanks,
Sam
You can use the DENSE_RANK analytical function along with aggregate function as follows:
SELECT WEEK_DATE,
COUNT(DISTINCT CASE WHEN DR = 1 THEN SHOPPER_ID END) AS TOTAL_NEW_CUSTOMER,
SUM(CASE WHEN DR = 1 THEN AMOUNT END) AS TOTAL_NEW_CUSTOMER_AMT,
COUNT(DISTINCT CASE WHEN DR > 1 THEN SHOPPER_ID END) AS TOTAL_REPEATED_CUSTOMER,
SUM(CASE WHEN DR > 1 THEN AMOUNT END) AS TOTAL_REPEATED_CUSTOMER_AMT
FROM
(
select T.*,
DENSE_RANK() OVER (PARTITION BY SHOPPER_ID ORDER BY WEEK_DATE) AS DR
FROM YOUR_TABLE T)
GROUP BY WEEK_DATE;
Cheers!!
Tejash's answer is fine (and I'm upvoting it).
However, Oracle is quite efficient with aggregation, so two levels of aggregation might have better performance (depending on the data):
select week_date,
sum(case when min_week_date = week_date then 1 else 0 end) as new_shoppers,
sum(case when min_week_date = week_date then amount else 0 end) as new_shopper_amount,
sum(case when min_week_date < week_date then 1 else 0 end) as returning_shoppers,
sum(case when min_week_date < week_date then amount else 0 end) as returning_amount
from (select shopper_id, week_date,
sum(amount) as amount,
min(week_date) over (partition by shopper_id) as min_week_date
from t
group by shopper_id, week_date
) sw
group by week_date
order by week_date;
Note: If this has better performance, it is probably due to the elimination of count(distinct).
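The DENSE_RANK approach can be checked on a tiny data set. The sketch below is an illustration only: it runs on Python's built-in sqlite3 module rather than Oracle (window functions need SQLite 3.25 or newer), and the table name purchases, the function weekly_new_vs_repeat and the sample rows are all assumptions.

```python
import sqlite3

# Made-up sample rows: (shopper_id, week_date, amount)
sample_purchases = [
    (1, "2020-01-06", 10),
    (2, "2020-01-06", 20),
    (1, "2020-01-13", 5),
    (1, "2020-01-13", 4),
    (2, "2020-01-13", 3),
    (3, "2020-01-13", 7),
]


def weekly_new_vs_repeat(rows):
    """Return (week_date, new_customers, new_amount, repeat_customers,
    repeat_amount) per week. A shopper is 'new' in their first week."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchases (shopper_id INTEGER, week_date TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    query = """
        SELECT week_date,
               COUNT(DISTINCT CASE WHEN dr = 1 THEN shopper_id END) AS new_customers,
               COALESCE(SUM(CASE WHEN dr = 1 THEN amount END), 0) AS new_amount,
               COUNT(DISTINCT CASE WHEN dr > 1 THEN shopper_id END) AS repeat_customers,
               COALESCE(SUM(CASE WHEN dr > 1 THEN amount END), 0) AS repeat_amount
        FROM (
            -- dr = 1 marks every row in a shopper's first week; later weeks get 2, 3, ...
            SELECT p.*,
                   DENSE_RANK() OVER (PARTITION BY shopper_id
                                      ORDER BY week_date) AS dr
            FROM purchases p
        ) AS ranked
        GROUP BY week_date
        ORDER BY week_date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in weekly_new_vs_repeat(sample_purchases):
    print(row)
```

COALESCE is used here only so that weeks without repeat customers report 0 instead of NULL.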

SQL - Dividing aggregated fields, very new to SQL

I have a list of line items from invoices with a field that indicates whether a line was delivered or picked up. I need to find the percentage of delivered items out of the total number of lines.
Expected output columns: SALES_NBR | Total | Deliveryrate
In the FULFILLMENT field, 1 = delivered and 0 = picked up.
SELECT SALES_NBR,
COUNT (ITEMS) as Total,
SUM (case when FULFILLMENT = '1' then 1 else 0 end) as delivered,
(SELECT delivered/total) as Deliveryrate
FROM Invoice_table
WHERE STORE IN '0123'
And SALE_DATE >='2020-02-01'
And SALE_DATE <='2020-02-07'
Group By SALES_NBR, Deliveryrate;
My query executes but never finishes for some reason. Is there any easier way to do this? Fulfillment field does not contain any NULL values.
Any help would be appreciated.
I need to find a percentage of delivered items from the total number of lines.
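The simplest method is to use avg() over a 0/1 flag (the 1.0 avoids integer averaging, and the case expression works even if FULFILLMENT is stored as a string):
select SALES_NBR,
avg(case when FULFILLMENT = '1' then 1.0 else 0 end) as delivered_ratio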
from Invoice_table
where STORE = '0123' and
SALE_DATE >='2020-02-01' and
SALE_DATE <='2020-02-07'
group by SALES_NBR;
I'm not sure if the group by sales_nbr is needed.
If you want a "nice" query, you can use a subquery like this (multiplying by 1.0 avoids integer division):
select
qry.*,
qry.delivered * 1.0 / qry.total as Deliveryrate
from (
select
SALES_NBR,
count(ITEMS) as Total,
sum(case when FULFILLMENT = '1' then 1 else 0 end) as delivered
from Invoice_table
where STORE = '0123'
and SALE_DATE >='2020-02-01'
and SALE_DATE <='2020-02-07'
group by SALES_NBR
) qry;
But I think this one, although uglier, could perform faster:
select
SALES_NBR,
count(ITEMS) as Total,
sum(case when FULFILLMENT = '1' then 1 else 0 end) as delivered,
sum(case when FULFILLMENT = '1' then 1 else 0 end) * 1.0 / count(ITEMS) as Deliveryrate
from Invoice_table
where STORE = '0123'
and SALE_DATE >='2020-02-01'
and SALE_DATE <='2020-02-07'
group by SALES_NBR
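A quick way to sanity-check the delivery-rate calculation is to run it on a few rows. This is a sketch only, assuming Python's built-in sqlite3 module, a simplified table called invoice_lines, a hypothetical function delivery_rates, and invented sample data; FULFILLMENT is stored as text to match the '1' comparison in the question.

```python
import sqlite3

# Made-up sample rows: (sales_nbr, store, sale_date, fulfillment)
sample_lines = [
    (1001, "0123", "2020-02-01", "1"),
    (1001, "0123", "2020-02-01", "0"),
    (1001, "0123", "2020-02-01", "1"),
    (1001, "0123", "2020-02-01", "1"),
    (1002, "0123", "2020-02-03", "0"),
    (1002, "0123", "2020-02-03", "0"),
    (1003, "0999", "2020-02-03", "1"),  # other store, filtered out
    (1004, "0123", "2020-03-01", "1"),  # outside the date range, filtered out
]


def delivery_rates(rows, store, start, end):
    """Return (sales_nbr, total, delivered, delivery_rate) per sale."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice_lines (sales_nbr INTEGER, store TEXT,"
        " sale_date TEXT, fulfillment TEXT)"
    )
    conn.executemany("INSERT INTO invoice_lines VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT sales_nbr,
               COUNT(*) AS total,
               SUM(CASE WHEN fulfillment = '1' THEN 1 ELSE 0 END) AS delivered,
               -- Averaging a 1.0/0.0 flag gives the delivered fraction directly.
               AVG(CASE WHEN fulfillment = '1' THEN 1.0 ELSE 0.0 END) AS delivery_rate
        FROM invoice_lines
        WHERE store = ?
          AND sale_date >= ?
          AND sale_date <= ?
        GROUP BY sales_nbr
        ORDER BY sales_nbr
    """
    result = conn.execute(query, (store, start, end)).fetchall()
    conn.close()
    return result


for row in delivery_rates(sample_lines, "0123", "2020-02-01", "2020-02-07"):
    print(row)
```

Note that the rate is not in the GROUP BY: it is an aggregate of the group, so grouping by SALES_NBR alone is enough.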

Subquery returned more than 1 value in MS SQL

In MS Sql.
SELECT a.SellerID,
SUM(TransactionFee) as TransactionFees,
SUM(Quantity*a.PriceItem) as TransactionValue,
COUNT(*) as OrdersWithTransactionFees,
SUM(Quantity) as Qty,
(SELECT SUM(a.Quantity*a.PriceItem) as WholeMonthTransactionValue
from BuyProductDetails where SellerID = a.SellerID) as aa
FROM BuyProductDetails as a
WHERE MONTH(a.OrderDate)=3
AND YEAR(a.OrderDate)=2013
AND TransactionFee IS NOT NULL
GROUP BY a.SellerID
I have the above query, but it won't run.
Basically, I have this table BuyProductDetails which stores all the orders from different Sellers.
Some orders will have TransactionFee.
Now, what I need is to calculate the total sales of these orders with TransactionFee, and the total sales for these sellers including those orders without TransactionFee.
The result set should have the following fields:
SellerID
Sum of Transaction fee
Sum of total sales
Number of Orders with Transaction fee
Qty ordered
Total sales for that seller
But when I run this sql, it returns the following error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Any help is much appreciated. Thank you.
Have you tried something like this?
SELECT a.SellerID,
SUM(a.TransactionFee) as TransactionFees,
SUM(a.Quantity*a.PriceItem) as TransactionValue,
COUNT(*) as OrdersWithTransactionFees,
SUM(a.Quantity) as Qty,
MIN(aa.WholeMonthTransactionValue) as WholeMonthTransactionValue
FROM BuyProductDetails as a,
(SELECT b.SellerID,
SUM(b.Quantity*b.PriceItem) as WholeMonthTransactionValue,
MONTH(b.OrderDate) as MonthID,
YEAR(b.OrderDate) as YearID
FROM BuyProductDetails b
GROUP BY b.SellerID,
MONTH(b.OrderDate),
YEAR(b.OrderDate)) as aa
WHERE MONTH(a.OrderDate)=3
AND YEAR(a.OrderDate)=2013
AND a.TransactionFee IS NOT NULL
AND a.SellerID = aa.SellerID
AND MONTH(a.OrderDate)=aa.MonthID
AND YEAR(a.OrderDate) = aa.YearID
GROUP BY a.SellerID
You can use a more efficient option with a CASE expression:
SELECT a.SellerID,
SUM(CASE WHEN TransactionFee IS NOT NULL THEN TransactionFee END) AS TransactionFees,
SUM(CASE WHEN TransactionFee IS NOT NULL THEN Quantity * PriceItem END) AS TransactionValue,
COUNT(CASE WHEN TransactionFee IS NOT NULL THEN 1 END) as OrdersWithTransactionFees,
SUM(CASE WHEN TransactionFee IS NOT NULL THEN Quantity END) as Qty,
SUM(Quantity * PriceItem) AS WholeMonthTransactionValue
FROM BuyProductDetails AS a
WHERE MONTH(a.OrderDate) = 3 AND YEAR(a.OrderDate) = 2013
GROUP BY a.SellerID
Or simply use the correct alias in the subquery, so that the inner SUM refers to the subquery's own rows instead of the outer query's:
SELECT a.SellerID,
SUM(TransactionFee) as TransactionFees,
SUM(Quantity*a.PriceItem) as TransactionValue,
COUNT(*) as OrdersWithTransactionFees,
SUM(Quantity) as Qty,
(SELECT SUM(d.Quantity * d.PriceItem)
FROM BuyProductDetails d
WHERE d.SellerID = a.SellerID) as WholeMonthTransactionValue
FROM BuyProductDetails as a
WHERE MONTH(a.OrderDate)=3
AND YEAR(a.OrderDate)=2013
AND TransactionFee IS NOT NULL
GROUP BY a.SellerID
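The CASE-based version can be verified with a few rows. This is a hedged sketch, not the SQL Server query itself: it uses Python's built-in sqlite3 module, which has no MONTH() or YEAR() functions, so the month filter is written as a date range instead. The table and column names (buy_product_details, seller_id and so on), the function seller_summary and the sample data are assumptions.

```python
import sqlite3

# Made-up sample rows: (seller_id, quantity, price_item, transaction_fee, order_date)
sample_details = [
    (1, 2, 10, 1.5, "2013-03-05"),    # March, has a fee
    (1, 1, 20, None, "2013-03-10"),   # March, no fee
    (1, 3, 5, 0.5, "2013-04-01"),     # April, excluded by the date range
    (2, 1, 100, None, "2013-03-20"),  # March, no fee
]


def seller_summary(rows, start, end_exclusive):
    """Per seller: fee totals for orders with a fee, plus total sales overall."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE buy_product_details (seller_id INTEGER, quantity INTEGER,"
        " price_item INTEGER, transaction_fee REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO buy_product_details VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT seller_id,
               -- The CASE expressions restrict these aggregates to fee orders.
               SUM(CASE WHEN transaction_fee IS NOT NULL
                        THEN transaction_fee END) AS transaction_fees,
               SUM(CASE WHEN transaction_fee IS NOT NULL
                        THEN quantity * price_item END) AS transaction_value,
               COUNT(CASE WHEN transaction_fee IS NOT NULL
                          THEN 1 END) AS orders_with_fees,
               SUM(CASE WHEN transaction_fee IS NOT NULL
                        THEN quantity END) AS qty,
               -- No CASE here: every order in the period counts.
               SUM(quantity * price_item) AS whole_month_value
        FROM buy_product_details
        WHERE order_date >= ? AND order_date < ?
        GROUP BY seller_id
        ORDER BY seller_id
    """
    result = conn.execute(query, (start, end_exclusive)).fetchall()
    conn.close()
    return result


for row in seller_summary(sample_details, "2013-03-01", "2013-04-01"):
    print(row)
```

Sellers with no fee orders in the period still appear, with NULL sums and a count of 0, because the fee condition was moved out of the WHERE clause and into the CASE expressions.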

Issues with MIN & MAX when using Case statement

I am trying to generate a summary report using various aggregate functions: MIN, MAX, SUM, etc. The issue I have is when I try to get a MIN and MAX of a field when I am also using the case statement. I am unable to get the MIN value of a field when I am using the case statement. I can best explain it with sample data and the sql statement:
Fields: AccountNumber, Symbol, TradeDate, TransactionType, Price, Quantity, Amount
Table: Trades
AccountNumber, Symbol, TradeDate, TransactionType, Price, Quantity, Amount
123,"XYZ",1/2/2011,"Buy",15,100,1500
123,"XYZ",1/2/2011,"Buy",10,50,500
123,"XYZ",1/2/2011,"Sell",20,100,2000
456,"ABC",1/3/2011,"Buy",10,20,200
456,"ABC",1/3/2011,"Buy",15,30,450
789,"DEF",1/4/2011,"Sell",30,100,3000
Query:
SELECT AccountNumber,
Symbol,
SUM(case when TransactionType = "Buy" then 1 else 0) as TotalBuys,
SUM(case when TransactionType = "Sell" then 1 else 0) as TotalSells,
MIN(case when TransactionType = "Buy" then Price else 0) as MinBuy,
MAX(case when TransactionType = "Buy" then Price else 0) as MaxBuy,
MIN(case when TransactionType = "Sell" then Price else 0) as MinSell,
MAX(case when TransactionType = "Sell" then Price else 0) as MaxSell,
MIN(Price) as MinPrice,
MAX(Price) as MaxPrice
FROM Trades
Group By AccountNumber, Symbol
What I am expecting is the following results:
AccountNumber, Symbol, TotalBuys, TotalSells, MinBuy, MaxBuy, MinSell, MaxSell, MinPrice, MaxPrice
123,"XYZ",2,1,10,15,20,20,10,20
456,"ABC",2,0,10,15,0,0,10,15
789,"DEF",0,1,0,0,30,30,30,30
However, I am getting the following results:
AccountNumber, Symbol, TotalBuys, TotalSells, MinBuy, MaxBuy, MinSell, MaxSell, MinPrice, MaxPrice
123,"XYZ",2,1,**0**,15,**0**,20,**0**,20
456,"ABC",2,0,10,15,0,0,10,15
789,"DEF",0,1,0,0,30,30,30,30
When there are two different TransactionTypes for each grouping, the Min fields (MinBuy,MinSell, and MinPrice) are coming out as 0 as opposed to what is expected. What am I doing wrong on the sql statement? Is there another way to get the desired results?
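The minimum of 0 and a positive number is 0, so the else 0 branch drags every MIN down to 0 as soon as the group contains a row of the other transaction type. You should change:
MIN(case when TransactionType = "Buy" then Price else 0 end)
to
MIN(case when TransactionType = "Buy" then Price else Null end)
Aggregate functions ignore NULL values, so rows of the other type no longer affect the result.
That's all.
Edited 6 years later:
As P5Coder says, the else clause can be dropped entirely, because a case expression without else returns NULL when no branch matches. The end keyword, on the other hand, is required. Here it is:
MIN(case when TransactionType = "Buy" then Price end)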
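The fix can be checked against the sample data from the question. The following is a minimal sketch using Python's built-in sqlite3 module; the snake_case column names and the function trade_summary are assumptions, and COALESCE is added only to turn the NULL of an empty group back into the 0 shown in the expected results.

```python
import sqlite3

# Sample data from the question:
# (account_number, symbol, trade_date, transaction_type, price, quantity, amount)
sample_trades = [
    (123, "XYZ", "1/2/2011", "Buy", 15, 100, 1500),
    (123, "XYZ", "1/2/2011", "Buy", 10, 50, 500),
    (123, "XYZ", "1/2/2011", "Sell", 20, 100, 2000),
    (456, "ABC", "1/3/2011", "Buy", 10, 20, 200),
    (456, "ABC", "1/3/2011", "Buy", 15, 30, 450),
    (789, "DEF", "1/4/2011", "Sell", 30, 100, 3000),
]


def trade_summary(rows):
    """Per account/symbol: buy and sell counts plus min/max prices by type."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trades (account_number INTEGER, symbol TEXT, trade_date TEXT,"
        " transaction_type TEXT, price INTEGER, quantity INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO trades VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    query = """
        SELECT account_number,
               symbol,
               SUM(CASE WHEN transaction_type = 'Buy' THEN 1 ELSE 0 END) AS total_buys,
               SUM(CASE WHEN transaction_type = 'Sell' THEN 1 ELSE 0 END) AS total_sells,
               -- No ELSE: non-matching rows yield NULL, which MIN/MAX ignore.
               COALESCE(MIN(CASE WHEN transaction_type = 'Buy' THEN price END), 0) AS min_buy,
               COALESCE(MAX(CASE WHEN transaction_type = 'Buy' THEN price END), 0) AS max_buy,
               COALESCE(MIN(CASE WHEN transaction_type = 'Sell' THEN price END), 0) AS min_sell,
               COALESCE(MAX(CASE WHEN transaction_type = 'Sell' THEN price END), 0) AS max_sell,
               MIN(price) AS min_price,
               MAX(price) AS max_price
        FROM trades
        GROUP BY account_number, symbol
        ORDER BY account_number
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in trade_summary(sample_trades):
    print(row)
```

Applying COALESCE outside the aggregate matters: putting the 0 inside the CASE is exactly what caused the original problem.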