How to use a SQL sub query? - sql

I am attempting to use a sub query to query our order database and return 3 columns for example:
Date Orders Replacements
09-MAY-14 100 5
... ... ...
Each order can be given a reason when it is created, which marks it as a replacement product: orders without a reason are new orders, and orders with a reason are replacement orders.
I am using the query below in an attempt to get this information, but I'm getting lots of error messages, and each time I think I've fixed one I create another 10, so I assume I have completely the wrong idea here.
SELECT Orders.EntryDate AS "Date", COUNT(Orders.OrderNo) AS "Orders",
(SELECT COUNT(Orders.OrderNo) AS "Replacements"
FROM Orders
WHERE Orders.Reason IS NOT NULL
AND Orders.EntryDate = '09-MAY-2014'
AND Orders.CustomerNo = 'A001'
GROUP BY Orders.EntryDate
)
FROM Orders
WHERE Orders.Reason IS NULL
AND Orders.EntryDate = '09-MAY-2014'
AND Orders.CustomerNo = 'A001'
GROUP BY Orders.EntryDate
;

Why use a subquery? Use a CASE expression instead:
SELECT Orders.EntryDate AS "Date", COUNT(Orders.OrderNo) AS "Orders",
SUM(CASE WHEN Orders.Reason IS NOT NULL THEN 1 ELSE 0 END) AS "Replacements"
FROM Orders
WHERE Orders.EntryDate = '09-MAY-2014'
AND Orders.CustomerNo = 'A001'
GROUP BY Orders.EntryDate
The subquery has to execute each time; since you need to evaluate each record anyway, the CASE expression can do that for you, and you then sum the results. Note that COUNT(Orders.OrderNo) here counts all orders, replacements included. If you need a count of non-replacement orders, use a second CASE expression instead of the COUNT.

You could sum a CASE expression instead of having another subquery with another WHERE clause:
SELECT Orders.EntryDate AS "Date",
SUM (CASE WHEN Orders.Reason IS NULL THEN 1 ELSE 0 END) AS "Orders",
SUM (CASE WHEN Orders.Reason IS NOT NULL THEN 1 ELSE 0 END) AS "Replacements"
FROM Orders
WHERE Orders.EntryDate = '09-MAY-2014'
AND Orders.CustomerNo = 'A001'
GROUP BY Orders.EntryDate

Your errors were probably due to the fact that you did not include the subquery in your group by clause. You can try that approach but this one is simpler:
select entrydate "date"
, count(orderno) "orders"
, sum(case when reason is not null then 1 else 0 end) "replacements"
etc
group by entrydate

Is this what you are trying to do?
SELECT Orders.EntryDate
, COUNT(case when Orders.reason is null then 1 end) AS orders
, COUNT(case when Orders.reason is not null then 1 end) AS Replacements
FROM Orders
WHERE Orders.EntryDate = '09-MAY-2014'
AND Orders.CustomerNo = 'A001'
GROUP BY Orders.EntryDate
The Replacements expression can be simplified to:
COUNT(Orders.reason)
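As a rough check of the conditional-aggregation idea in the answers above, here is a minimal sketch using Python's built-in sqlite3 module. The table contents and the ISO-style date strings are made up for illustration; only the column names come from the question. It also shows that COUNT(Reason) gives the same result as the longer CASE form, because COUNT skips NULLs.

```python
import sqlite3

# Hypothetical miniature of the Orders table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (OrderNo INTEGER, EntryDate TEXT, CustomerNo TEXT, Reason TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2014-05-09", "A001", None),       # new order
        (2, "2014-05-09", "A001", None),       # new order
        (3, "2014-05-09", "A001", "damaged"),  # replacement
        (4, "2014-05-09", "B002", "lost"),     # other customer, filtered out
    ],
)

row = conn.execute(
    """
    SELECT EntryDate,
           COUNT(CASE WHEN Reason IS NULL THEN 1 END)     AS orders,
           COUNT(CASE WHEN Reason IS NOT NULL THEN 1 END) AS replacements,
           COUNT(Reason)                                  AS replacements_short
    FROM Orders
    WHERE EntryDate = '2014-05-09' AND CustomerNo = 'A001'
    GROUP BY EntryDate
    """
).fetchone()

print(row)  # ('2014-05-09', 2, 1, 1)
```

One pass over the table produces both counts, so no subquery is needed.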

Related

How to exclude 0 from count() in SQL?

I have the code below, where I want to count the number of first purchases for a given period of time. I have a column in my sales table where, if the buyer is not a first-time buyer, is_first_purchase = 0.
For example:
buyer_id = 456391 is already an existing buyer who made purchases on 2 different dates.
Hence the is_first_purchase column will show as 0, as per below.
If I do a count() on is_first_purchase for this buyer_id = 456391, it should return 0 instead of 2.
My query is as follows:
with first_purchases as
(select *,
case when is_first_purchase = 1 then 'Yes' else 'No' end as first_purchase
from sales)
select
count(case when first_purchase = 'Yes' then 1 else 0 end) as no_of_first_purchases
from first_purchases
where buyer_id = 456391
and date_id between '2021-02-01' and '2021-03-01'
order by 1 desc;
It returned 2, which is not the intended output.
I would appreciate it if someone could explain how to exclude is_first_purchase = 0 from the count, thanks.
The COUNT function counts every value that is not NULL, and that includes 0. If you don't want a row to be counted, the CASE expression has to return NULL for it.
There are two ways to get the count you expect: use SUM, or keep COUNT but remove the ELSE 0 part:
SUM(case when first_purchase = 'Yes' then 1 else 0 end) as no_of_first_purchases
COUNT(case when first_purchase = 'Yes' then 1 end) as no_of_first_purchases
Based on your question, I would combine the CTE and the main query as below:
select
COUNT(case when is_first_purchase = 1 then 1 end) as no_of_first_purchases
from sales
where buyer_id = 456391
and date_id between '2021-02-01' and '2021-03-01'
order by 1 desc;
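To make the difference concrete, the following sketch runs the three variants side by side with Python's built-in sqlite3 module. The two sample rows are invented to match the buyer described in the question (two purchases, neither of them a first purchase).

```python
import sqlite3

# Hypothetical sample data: one existing buyer with two non-first purchases.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (buyer_id INTEGER, date_id TEXT, is_first_purchase INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(456391, "2021-02-03", 0), (456391, "2021-02-17", 0)],
)

counts = conn.execute(
    """
    SELECT COUNT(CASE WHEN is_first_purchase = 1 THEN 1 ELSE 0 END) AS count_else_0,
           COUNT(CASE WHEN is_first_purchase = 1 THEN 1 END)        AS count_no_else,
           SUM(CASE WHEN is_first_purchase = 1 THEN 1 ELSE 0 END)   AS sum_else_0
    FROM sales
    WHERE buyer_id = 456391
      AND date_id BETWEEN '2021-02-01' AND '2021-03-01'
    """
).fetchone()

# COUNT with ELSE 0 counts both rows; the other two forms give 0.
print(counts)  # (2, 0, 0)
```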
I think that you are using COUNT() when you want SUM().
with first_purchases as
(select *,
case when is_first_purchase = 1 then 'Yes' else 'No' end as first_purchase
from sales)
select
SUM(case when first_purchase = 'Yes' then 1 else 0 end) as no_of_first_purchases
from first_purchases
where buyer_id = 456391
and date_id between '2021-02-01' and '2021-03-01'
order by 1 desc;
You could simplify your query as:
SELECT COUNT(*) AS no_of_first_purchases
FROM sales
WHERE is_first_purchase = 1
AND buyer_id = 456391
AND date_id BETWEEN '2021-02-01' AND '2021-03-01'
ORDER BY 1 DESC;
It is better to avoid the use of functions like IF and CASE when it can be done with WHERE.
The simplest approach for Trino (f.k.a. Presto SQL) is to use an aggregate with a filter:
count(*) FILTER (WHERE first_purchase = 'Yes') AS no_of_first_purchases

CASE WHEN ... ELSE with PARTITION BY isn't working in Redshift queries

I would like to exclude the categories sub_tag1, sub_tag2 and sub_tag3 of tag from TAG_SALES_by_month, while everything else listed in the WHERE condition should still be included in the count. I couldn't achieve the desired result. Can anyone help me achieve this? It would be very much appreciated.
select o.tag,
o.SOME, o.THING, o.ILIKE, o.date, c.THE, c.MOST,
date_part(month, o.date) as Month,
date_part(day, o.date) as day,
count(o.id) over (partition by day, CUST_Id) as SALE_NO,
count(o.id) over (partition by Month, CUST_Id) as SALE_NO_by_month,
count(case when (tag <> 'sub_tag1' AND tag <> 'sub_tag2' AND tag <> 'sub_tag3') then o.id else 0 END) over (partition by Month, CUST_Id) as TAG_SALES_by_month,
c.id as CUST_Id
from order_info o
left join config c on o.SOME = c.SOME
where date >= '05/01/2021' AND tag in ('sub_tag1', 'sub_tag2', 'sub_tag3', 'sub_tag4', 'sub_tag5',
'sub_tag6') AND ILIKE = 'JACK'
group by o.tag, o.SOME, o.THING, o.ILIKE, o.date, c.THE, c.MOST, CUST_Id, o.id
order by date
Per the comments, the issue here is that COUNT counts every non-NULL value: it counts the existence versus non-existence of a value/row, not its magnitude.
So COUNT(CASE WHEN ... ELSE 0 ...) still counts the rows that fall into the ELSE branch, since 0 is a value that exists.
The solution is to use ELSE NULL, or to omit the ELSE clause, which defaults to NULL, because NULL is not counted.
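The same behaviour can be reproduced with a windowed COUNT. This is only a sketch in SQLite (via Python's sqlite3 module), not Redshift, and it assumes a SQLite build with window-function support (3.25 or later). The table is cut down to three invented rows for a single customer.

```python
import sqlite3

# Hypothetical, simplified order_info table: one customer, three tagged orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_info (id INTEGER, cust_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO order_info VALUES (?, ?, ?)",
    [(1, 10, "sub_tag1"), (2, 10, "sub_tag4"), (3, 10, "sub_tag5")],
)

rows = conn.execute(
    """
    SELECT id,
           COUNT(CASE WHEN tag NOT IN ('sub_tag1', 'sub_tag2', 'sub_tag3')
                      THEN id ELSE 0 END)
               OVER (PARTITION BY cust_id) AS with_else_0,
           COUNT(CASE WHEN tag NOT IN ('sub_tag1', 'sub_tag2', 'sub_tag3')
                      THEN id END)
               OVER (PARTITION BY cust_id) AS without_else
    FROM order_info
    ORDER BY id
    """
).fetchall()

# ELSE 0 still counts the excluded sub_tag1 row; dropping ELSE leaves it out.
print(rows)  # [(1, 3, 2), (2, 3, 2), (3, 3, 2)]
```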

how to sum all columns with same id

Hi, I have the following results:
I need to sum up all the items that have a CashAmount and the same PaymentCode = 9.
I have tried this query:
SELECT
CASE
WHEN StoreID = 1 THEN 'CWM'
WHEN StoreID = 2 THEN 'CWD' END as accountcode,
DocEntry,
PaymentCode,
case when PaymentCode <> 1 then paymentamount end as OtherPaymentAmount,
sum(case when PaymentCode = 1 then paymentamount end) as CashAmount,
tenders.sapcreditcard AS sapcreditcard,
--paymentamount,
-- sum (case when PaymentCode >= 1 then paymentamount else NULL end) as Total,
FileName, BPA_ProcessStatus, ERP_PaymentProcessed
FROM [Plu].[dbo].[payments_header] LEFT JOIN tenders ON payments_header.PaymentCode = tenders.postenderid
WHERE BPA_ProcessStatus='N' and ERP_PaymentProcessed='N'
group by PaymentCode, paymentamount, docentry, storeid,sapcreditcard, FileName, BPA_ProcessStatus,ERP_PaymentProcessed, cashamount
What am I missing?
The GROUP BY clause lists all the columns you want to use to create separate groups. Yours is as follows...
GROUP BY
PaymentCode,
paymentamount,
docentry,
storeid,
sapcreditcard,
FileName,
BPA_ProcessStatus,
ERP_PaymentProcessed,
cashamount
Any time any of these are different, you'll get a separate row.
This means that your sum(case when PaymentCode = 1 then paymentamount end) ends up making very little sense:
Your GROUP BY says you want each different payment amount on a different row.
Your SELECT says you want to aggregate multiple payment amounts.
My best guess is that you want this...
SELECT
CASE
WHEN StoreID = 1 THEN 'CWM'
WHEN StoreID = 2 THEN 'CWD'
END
AS accountcode,
DocEntry,
PaymentCode,
SUM(CASE WHEN PaymentCode <> 1 THEN paymentamount END) AS OtherPaymentAmount,
SUM(CASE WHEN PaymentCode = 1 THEN paymentamount END) AS CashAmount,
tenders.sapcreditcard,
FileName,
BPA_ProcessStatus,
ERP_PaymentProcessed
FROM
[Plu].[dbo].[payments_header]
LEFT JOIN
tenders
ON payments_header.PaymentCode = tenders.postenderid
WHERE
BPA_ProcessStatus='N'
AND ERP_PaymentProcessed='N'
GROUP BY
CASE
WHEN StoreID = 1 THEN 'CWM'
WHEN StoreID = 2 THEN 'CWD'
END,
DocEntry,
PaymentCode,
tenders.sapcreditcard,
FileName,
BPA_ProcessStatus,
ERP_PaymentProcessed
Added SUM() around the OtherPaymentAmount calculations, to match CashAmount
Changed the GROUP BY to match the non-aggregated columns in the SELECT
NOTE: In all the places where you specify a column name, you should always qualify it with the source table's name.
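The effect of the GROUP BY list on the SUM can be seen in a small sketch with Python's sqlite3 module. The table is reduced to three columns and the four rows are invented; the point is only that grouping by paymentamount keeps every distinct amount on its own row, while leaving it out lets the amounts add up.

```python
import sqlite3

# Hypothetical, reduced payments_header table with integer amounts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments_header (DocEntry INTEGER, PaymentCode INTEGER, paymentamount INTEGER)"
)
conn.executemany(
    "INSERT INTO payments_header VALUES (?, ?, ?)",
    [(100, 1, 10), (100, 1, 15), (100, 9, 20), (100, 9, 5)],
)

select_list = """
    SELECT DocEntry, PaymentCode,
           SUM(CASE WHEN PaymentCode = 1  THEN paymentamount END) AS CashAmount,
           SUM(CASE WHEN PaymentCode <> 1 THEN paymentamount END) AS OtherPaymentAmount
    FROM payments_header
"""

# paymentamount in the GROUP BY: each distinct amount is its own group.
per_amount = conn.execute(
    select_list
    + " GROUP BY DocEntry, PaymentCode, paymentamount ORDER BY PaymentCode, paymentamount"
).fetchall()

# paymentamount left out: amounts with the same DocEntry and PaymentCode are summed.
per_code = conn.execute(
    select_list + " GROUP BY DocEntry, PaymentCode ORDER BY PaymentCode"
).fetchall()

print(per_amount)  # [(100, 1, 10, None), (100, 1, 15, None), (100, 9, None, 5), (100, 9, None, 20)]
print(per_code)    # [(100, 1, 25, None), (100, 9, None, 25)]
```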

Why my CASE WHEN gave me an AGGREGATION error message?

I'm trying to build a promo grouping from a single promo_code field over one month, where a single customer_ID may have more than one transaction and may have two different promo codes.
SELECT customer_id AS buyer,
CASE
WHEN COUNT(DISTINCT flag_promo) = 2 THEN 'Mixed'
WHEN COUNT(DISTINCT flag_promo) = 1 AND flag_promo = 1 THEN 'Promo'
WHEN COUNT(DISTINCT flag_promo) = 1 AND flag_promo = 0 THEN 'Organic'
END AS promo_group
FROM TABLE
WHERE DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2
It gave me an error message :
SELECT list expression references column flag_promo which is neither grouped nor aggregated at [4:41]
Below is for BigQuery Standard SQL
#standardSQL
SELECT customer_id AS buyer,
CASE
WHEN COUNT(DISTINCT flag_promo) > 1 THEN 'Mixed'
WHEN ANY_VALUE(flag_promo) = 1 THEN 'Promo'
WHEN ANY_VALUE(flag_promo) = 0 THEN 'Organic'
END AS promo_group
FROM `project.dataset.table`
WHERE DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2
This is the query I think you intended to do:
SELECT
customer_id AS buyer,
CASE WHEN COUNT(DISTINCT flag_promo) = 2 THEN 'Mixed'
WHEN COUNT(DISTINCT flag_promo) = 1 AND MIN(flag_promo) = 1 THEN 'Promo'
WHEN COUNT(DISTINCT flag_promo) = 1 AND MIN(flag_promo) = 0 THEN 'Organic'
END AS promo_group
FROM TABLE
WHERE
DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2;
This follows your query in assuming that a flag_promo value of 1 means Promo and a value of 0 means Organic. If not, the query above is easy to adjust.
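The fix in both answers is to wrap the bare flag_promo in an aggregate so that every column in the CASE is either grouped or aggregated. Here is a minimal sketch of that idea using Python's sqlite3 module. The table name transactions and its rows are made up, and it assumes, as in the question's query, that 1 marks a promo purchase and 0 an organic one. SQLite itself is lenient about bare columns, so this only demonstrates the corrected query, not the original error.

```python
import sqlite3

# Hypothetical transactions table: customer 1 is mixed, 2 is promo-only, 3 is organic-only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id INTEGER, flag_promo INTEGER)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 1), (1, 0), (2, 1), (2, 1), (3, 0)],
)

groups = conn.execute(
    """
    SELECT customer_id AS buyer,
           CASE
               WHEN COUNT(DISTINCT flag_promo) > 1 THEN 'Mixed'
               WHEN MIN(flag_promo) = 1 THEN 'Promo'    -- only one distinct value, so MIN is that value
               WHEN MIN(flag_promo) = 0 THEN 'Organic'
           END AS promo_group
    FROM transactions
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

print(groups)  # [(1, 'Mixed'), (2, 'Promo'), (3, 'Organic')]
```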

Case statement is ignoring where clause

I am trying to create a SQL statement that returns multiple counts. The count below works as I expect, but the case statement is ignoring the where clause for my query.
I'm trying to get the total number of PacketId's that meet the where criteria. Then get a second total showing the sum of PacketId's that meet the where criteria and have a StatusId of 3.
Edit: Table1 and Table2 both share PacketId as a foreign key.
Select
Count(Distinct wpq.PacketId) AS Total,
SUM(Case When wpq.StatusId = 3 THEN 1 ELSE 0 END) as OtherCount
FROM [Table1] ppo JOIN [Table2] wpq ON ppo.PacketId = wpq.PacketId
WHERE wpq.CreateDate between '11/1/2017' and '1/1/2018' and ppo.IsSelected = 1
I suspect you may be getting a higher number than expected for the othercount but that may be due to the use of count(distinct...) which reduces the first column result, but not the second. Perhaps introducing a subquery to select only distinct values would help?
SELECT DISTINCT
wpq.PacketId
, wpq.StatusId
FROM [Table1] ppo
JOIN [Table2] wpq ON ppo.PacketId = wpq.PacketId
WHERE wpq.CreateDate BETWEEN '11/1/2017' AND '1/1/2018'
AND ppo.IsSelected = 1
;
then count from that, e.g:
SELECT
COUNT(PacketId) AS total
, COUNT(CASE WHEN StatusId = 3 THEN StatusId END) AS othercount
, SUM(CASE WHEN StatusId = 3 THEN 1 ELSE 0 END) AS othersum
FROM (
SELECT DISTINCT
wpq.PacketId
, wpq.StatusId
FROM [Table1] ppo
JOIN [Table2] wpq ON ppo.PacketId = wpq.PacketId
WHERE wpq.CreateDate BETWEEN '11/1/2017' AND '1/1/2018'
AND ppo.IsSelected = 1
) AS d
;
Note: the COUNT() function ignores nulls, so I have added an alternative calculation method to consider. I prefer to use COUNT() in such a query.
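A small sketch of the suspected effect, using Python's sqlite3 module: if the join produces duplicate rows, COUNT(DISTINCT ...) hides them but the SUM does not. The tables are cut down to two columns each and the rows are invented, with packet 1 deliberately appearing twice in Table1.

```python
import sqlite3

# Hypothetical, reduced tables; packet 1 has two matching rows in Table1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (PacketId INTEGER, IsSelected INTEGER)")
conn.execute("CREATE TABLE Table2 (PacketId INTEGER, StatusId INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?)", [(1, 1), (1, 1), (2, 1)])
conn.executemany("INSERT INTO Table2 VALUES (?, ?)", [(1, 3), (2, 5)])

# Original shape: DISTINCT only protects the first column.
raw = conn.execute(
    """
    SELECT COUNT(DISTINCT wpq.PacketId),
           SUM(CASE WHEN wpq.StatusId = 3 THEN 1 ELSE 0 END)
    FROM Table1 ppo JOIN Table2 wpq ON ppo.PacketId = wpq.PacketId
    WHERE ppo.IsSelected = 1
    """
).fetchone()

# De-duplicate first, then count from the derived table.
deduped = conn.execute(
    """
    SELECT COUNT(PacketId),
           SUM(CASE WHEN StatusId = 3 THEN 1 ELSE 0 END)
    FROM (
        SELECT DISTINCT wpq.PacketId, wpq.StatusId
        FROM Table1 ppo JOIN Table2 wpq ON ppo.PacketId = wpq.PacketId
        WHERE ppo.IsSelected = 1
    ) AS d
    """
).fetchone()

print(raw)      # (2, 2): the status-3 packet is summed twice
print(deduped)  # (2, 1)
```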
Also, I would like to note that your use of what appear to be M/D/YYYY date literals is NOT safe. The safest date literal format in T-SQL is YYYYMMDD. Similarly, using BETWEEN is not best practice for date ranges, and I would encourage you to use >= and < instead, like so:
SELECT
COUNT(PacketId) AS total
, COUNT(CASE WHEN StatusId = 3 THEN StatusId END) AS othercount
, SUM(CASE WHEN StatusId = 3 THEN 1 ELSE 0 END) AS othersum
FROM (
SELECT DISTINCT
wpq.PacketId
, wpq.StatusId
FROM [Table1] ppo
JOIN [Table2] wpq ON ppo.PacketId = wpq.PacketId
WHERE wpq.CreateDate >= '20171101' AND wpq.CreateDate < '20180101'
AND ppo.IsSelected = 1
) AS d
;
Note: I'm not sure whether you want to include 1/1/2018. If you do, use < '20180102' instead.
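To illustrate why the half-open form is easier to reason about when the column carries a time of day, here is a sketch with Python's sqlite3 module. SQLite stores these timestamps as ISO-formatted text and compares them as strings, so this is an analogy for the T-SQL datetime behaviour rather than a reproduction of it. The rows and the helper function ids are invented.

```python
import sqlite3

# Hypothetical rows with ISO-formatted timestamps stored as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table2 (PacketId INTEGER, CreateDate TEXT)")
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?)",
    [
        (1, "2017-11-01 00:00:00"),
        (2, "2017-12-31 23:59:59"),
        (3, "2018-01-01 00:00:00"),
        (4, "2018-01-01 09:30:00"),
    ],
)

def ids(condition):
    """Return the PacketIds matching a WHERE condition (illustrative helper)."""
    sql = f"SELECT PacketId FROM Table2 WHERE {condition} ORDER BY PacketId"
    return [r[0] for r in conn.execute(sql)]

# BETWEEN is inclusive at both ends: midnight on 1 Jan is in, 09:30 is out.
between = ids("CreateDate BETWEEN '2017-11-01 00:00:00' AND '2018-01-01 00:00:00'")

# Half-open ranges state the intent explicitly.
excluding_jan_1 = ids("CreateDate >= '2017-11-01' AND CreateDate < '2018-01-01'")
including_jan_1 = ids("CreateDate >= '2017-11-01' AND CreateDate < '2018-01-02'")

print(between)          # [1, 2, 3]
print(excluding_jan_1)  # [1, 2]
print(including_jan_1)  # [1, 2, 3, 4]
```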
I would suggest that you use standard date formats. Most databases support YYYY-MM-DD:
SELECT COUNT(DISTINCT wpq.PacketId) AS Total,
SUM(Case When wpq.StatusId = 3 THEN 1 ELSE 0 END) as OtherCount
FROM [Table1] ppo JOIN
[Table2] wpq
ON ppo.PacketId = wpq.PacketId
WHERE wpq.CreateDate >= '2017-11-01' AND
wpq.CreateDate <= '2018-01-01' AND
ppo.IsSelected = 1;
It is possible that the date comparisons are really being done as strings, so they do not do what you expect.
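If the comparison really is done on strings, M/D/YYYY literals sort in the wrong order. The sketch below uses SQLite (through Python's sqlite3 module), where text literals are compared character by character, to show a December 2017 date falling outside the intended range; the sample date is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Compared as text, '1/1/2018' sorts before '11/1/2017' ('/' sorts before '1'),
# so the M/D/YYYY range is empty and nothing can fall inside it.
mdy = conn.execute(
    "SELECT '12/5/2017' BETWEEN '11/1/2017' AND '1/1/2018'"
).fetchone()[0]

# YYYY-MM-DD strings sort in chronological order, so the same check succeeds.
iso = conn.execute(
    "SELECT '2017-12-05' BETWEEN '2017-11-01' AND '2018-01-01'"
).fetchone()[0]

print(mdy, iso)  # 0 1
```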