How to calculate a row count with a WHERE condition in SQL?

I have two tables in SQL Server:
order (columns: order_id, payment_id)
payment (columns: payment_id, is_pay)
I want to get all orders with two more properties:
The number of rows where is_pay is 1:
where payment_id = <...> and payment.is_pay = 1
And the total count of rows (without the first filter):
select count(*)
from payment
where payment_id = <...>
So I wrote this query:
select
*,
(select count(1) from payment p
where p.payment_id = o.payment_id and p.is_pay = 1) as total
from
[order] o
The problem is: how do I also count the rows without the is_pay = 1 filter?
I mean the total, so I can show "some of many".

First aggregate in payment and then join to order:
SELECT o.*, p.total_pay, p.total
FROM [order] o
LEFT JOIN (
SELECT payment_id, SUM(is_pay) total_pay, COUNT(*) total
FROM payment
GROUP BY payment_id
) p ON p.payment_id = o.payment_id;
Change LEFT to INNER join if all orders have at least 1 payment.
Also, if is_pay's data type is BIT, change SUM(is_pay) to:
SUM(CASE WHEN is_pay = 1 THEN 1 ELSE 0 END)

Use a join with conditional aggregation:
SELECT
o.payment_id,
COUNT(CASE WHEN p.is_pay = 1 THEN 1 END) AS pay_cnt,
COUNT(p.payment_id) AS all_cnt
FROM "order" o
LEFT JOIN payment p
ON o.payment_id = p.payment_id
GROUP BY
o.payment_id;
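
To see what conditional aggregation returns, here is a minimal sketch that runs the same idea against SQLite through Python's sqlite3 module. The sample rows are invented for illustration, and the query groups by order_id (an assumption, so that each order gets its own row); SQL Server syntax for this part is the same.

```python
import sqlite3

# In-memory database with made-up sample data (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "order" (order_id INTEGER, payment_id INTEGER);
    CREATE TABLE payment (payment_id INTEGER, is_pay INTEGER);
    INSERT INTO "order" VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO payment VALUES (10, 1), (10, 0), (10, 1), (20, 0);
""")

# COUNT ignores NULLs, so the CASE without ELSE counts only is_pay = 1 rows,
# while COUNT(p.payment_id) counts every matched payment row.
rows = conn.execute("""
    SELECT o.order_id,
           COUNT(CASE WHEN p.is_pay = 1 THEN 1 END) AS pay_cnt,
           COUNT(p.payment_id) AS all_cnt
    FROM "order" o
    LEFT JOIN payment p ON p.payment_id = o.payment_id
    GROUP BY o.order_id
    ORDER BY o.order_id
""").fetchall()

print(rows)  # [(1, 2, 3), (2, 0, 1), (3, 0, 0)]
```

Order 3 has no payments at all, so the LEFT JOIN keeps it with both counts at 0.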

You can use a lateral join (outer apply) for this:
select o.*, p.*
from orders o outer apply
(select count(*) as num_payments,
sum(case when is_pay = 1 then 1 else 0 end) as num_payments_1
from payments p
where p.payment_id = o.payment_id
) p;
Note: Assuming that is_pay only takes on the values of 0 and 1 (which seems reasonable given the name), you can simplify this to:
select o.*, p.*
from orders o outer apply
(select count(*) as num_payments,
sum(is_pay) as num_payments_1
from payments p
where p.payment_id = o.payment_id
) p;

If you are looking for counts per payment id then use this:
select
payment.payment_id,
count(*) as total,
count(case when payment.is_pay = 1 then 1 end) as total_is_pay_orders
from orders
left join payment
on orders.payment_id = payment.payment_id
group by payment.payment_id

Related

SQL Selecting & Counting From Another Table

I have this query that works excellently and gives me the results I want, however, does anybody know how I can remove any rows that have 0 orders? I am sure it is something simple, I just can't get my head around it.
In other words, should it only show the top 2 rows?
SELECT customers.id, customers.companyname, customers.orgtype,
(SELECT COALESCE(SUM(invoicetotal), 0)
FROM invoice_summary
WHERE invoice_summary.cid = customers.ID
and invoice_summary.submitted between '2022-08-01' and '2022-08-31'
) AS total,
(SELECT COUNT(invoicenumber)
FROM invoice_summary
WHERE invoice_summary.cid = customers.ID
and invoice_summary.submitted between '2022-08-01' and '2022-08-31'
) AS orders
FROM customers WHERE customers.orgtype = 10
ORDER BY total DESC
ID   | Company | Org | Total  | Orders
-----+---------+-----+--------+-------
1232 | ACME 1  | 10  | 523.36 | 3
6554 | ACME 2  | 10  | 411.03 | 2
1220 | ACME 3  | 10  | 0.00   | 0
4334 | ACME 4  | 10  | 0.00   | 0
You can use a CTE to keep the query simple:
WITH CTE_Orders AS (
SELECT customers.id, customers.companyname, customers.orgtype,
(SELECT COALESCE(SUM(invoicetotal), 0)
FROM invoice_summary
WHERE invoice_summary.cid = customers.ID
and invoice_summary.submitted between '2022-08-01' and '2022-08-31'
) AS total,
(SELECT COUNT(invoicenumber)
FROM invoice_summary
WHERE invoice_summary.cid = customers.ID
and invoice_summary.submitted between '2022-08-01' and '2022-08-31'
) AS orders
FROM customers WHERE customers.orgtype = 10
)
SELECT * FROM CTE_Orders WHERE orders > 0
ORDER BY total DESC
You will find additional information about CTEs in the Microsoft documentation: https://learn.microsoft.com/fr-fr/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-ver16
You can do this by transforming your subquery into a CROSS APPLY of a pre-aggregated table:
SELECT
c.id,
c.companyname,
c.orgtype,
ins.total,
ins.orders
FROM customers c
CROSS APPLY (
SELECT
COUNT(*) AS orders,
ISNULL(SUM(ins.invoicetotal), 0) AS total
FROM invoice_summary ins
WHERE ins.cid = c.ID
AND ins.submitted between '20220801' and '20220831'
GROUP BY () -- do not remove the GROUP BY
) ins
WHERE c.orgtype = 10
ORDER BY
ins.total DESC;
You can also do this with an INNER JOIN against the pre-aggregated table:
SELECT
c.id,
c.companyname,
c.orgtype,
ins.total,
ins.orders
FROM customers c
INNER JOIN (
SELECT
ins.cid,
COUNT(*) AS orders,
ISNULL(SUM(ins.invoicetotal), 0) AS total
FROM invoice_summary ins
WHERE ins.submitted between '20220801' and '20220831'
GROUP BY ins.cid
) ins ON ins.cid = c.ID
WHERE c.orgtype = 10
ORDER BY
ins.total DESC;
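
As a quick check of why the INNER JOIN drops customers with no orders, here is a small sketch using SQLite through Python's sqlite3 module. The rows are invented for illustration (they are not the asker's data), and SUM is used without ISNULL because every surviving group has at least one invoice.

```python
import sqlite3

# In-memory database with made-up rows; customer 1220 only has an
# invoice outside the August 2022 date range.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, companyname TEXT, orgtype INTEGER);
    CREATE TABLE invoice_summary (
        cid INTEGER, invoicenumber INTEGER, invoicetotal REAL, submitted TEXT
    );
    INSERT INTO customers VALUES
        (1232, 'ACME 1', 10), (6554, 'ACME 2', 10), (1220, 'ACME 3', 10);
    INSERT INTO invoice_summary VALUES
        (1232, 1, 300.0, '2022-08-03'),
        (1232, 2, 200.0, '2022-08-10'),
        (6554, 3, 400.0, '2022-08-15'),
        (1220, 4, 999.0, '2022-07-30');
""")

# The derived table only contains customers with at least one invoice in
# range, so the INNER JOIN removes everyone else.
rows = conn.execute("""
    SELECT c.id, c.companyname, s.total, s.orders
    FROM customers c
    INNER JOIN (
        SELECT cid, COUNT(*) AS orders, SUM(invoicetotal) AS total
        FROM invoice_summary
        WHERE submitted BETWEEN '2022-08-01' AND '2022-08-31'
        GROUP BY cid
    ) s ON s.cid = c.id
    WHERE c.orgtype = 10
    ORDER BY s.total DESC
""").fetchall()

print(rows)  # [(1232, 'ACME 1', 500.0, 2), (6554, 'ACME 2', 400.0, 1)]
```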
A quick and dirty way would be to dump your results into a temp table, delete the records you don't want, then select what remains.
Add this to the end of your select before the FROM clause:
INTO #temptable
Then delete the records you don't want:
DELETE FROM #temptable WHERE [Orders] = 0
Then just select from the temp table.
There are other ways to do this, and you should read up on the downsides of temp tables before implementing this solution.

Use 1 SQL query to join 3 tables and find the category of products that generates the most revenue for each customer segment

I am using SQLite3 for this following query.
I have a table called "products" that looks like this:
I have a table called "transactions" that looks like this:
I have a table called "segments" that looks like this:
For each active segment, I want to find the category that produces the highest revenue.
I think that I know how to do this in 3 different queries.
create table table1 as
SELECT s.seg_name, p.category, t.item_qty * t.item_price as revenue
from segments s
JOIN
transactions t
on s.cust_id = t.cust_id
JOIN products p
on p.prod_id = t.prod_id
where s.active_flag = 'Y'
order by s.seg_name, p.category
;
create table table2 as
select seg_name, category, sum(revenue) as revenue
from table1
group by seg_name, category;
select seg_name, category, max(revenue) as revenue
from table2
group by seg_name;
How can I do it in 1 query?
Here is one way:
select seg_name,category,revenue
from (
select
s.seg_name,
p.category,
sum(t.item_qty * t.item_price) as revenue,
rank() over (partition by seg_name order by sum(t.item_qty * t.item_price) desc) rn
from segments s
join transactions t on s.cust_id = t.cust_id
join products p on p.prod_id = t.prod_id
where s.active_flag = 'Y'
group by seg_name, p.category
) t where rn = 1
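
Since the question is about SQLite, the window-function approach can be tried directly with Python's sqlite3 module. This is only a sketch: the table contents are invented (the original screenshots are not available), and it assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# In-memory database with made-up rows; the 'legacy' segment is inactive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (prod_id INTEGER, category TEXT);
    CREATE TABLE transactions (
        cust_id INTEGER, prod_id INTEGER, item_qty INTEGER, item_price REAL
    );
    CREATE TABLE segments (cust_id INTEGER, seg_name TEXT, active_flag TEXT);
    INSERT INTO products VALUES (1, 'toys'), (2, 'books');
    INSERT INTO segments VALUES
        (100, 'retail', 'Y'), (200, 'wholesale', 'Y'), (300, 'legacy', 'N');
    INSERT INTO transactions VALUES
        (100, 1, 2, 10.0),
        (100, 2, 1, 5.0),
        (200, 2, 10, 5.0),
        (200, 1, 1, 10.0),
        (300, 1, 50, 10.0);
""")

# Aggregate revenue per (segment, category), rank categories inside each
# segment, and keep the top-ranked one.
rows = conn.execute("""
    SELECT seg_name, category, revenue
    FROM (
        SELECT s.seg_name,
               p.category,
               SUM(t.item_qty * t.item_price) AS revenue,
               RANK() OVER (
                   PARTITION BY s.seg_name
                   ORDER BY SUM(t.item_qty * t.item_price) DESC
               ) AS rn
        FROM segments s
        JOIN transactions t ON s.cust_id = t.cust_id
        JOIN products p ON p.prod_id = t.prod_id
        WHERE s.active_flag = 'Y'
        GROUP BY s.seg_name, p.category
    ) ranked
    WHERE rn = 1
    ORDER BY seg_name
""").fetchall()

print(rows)  # [('retail', 'toys', 20.0), ('wholesale', 'books', 50.0)]
```

Because RANK() is used, two categories tied for the highest revenue in a segment would both be returned; ROW_NUMBER() would pick just one of them.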

Get the second last record in a date column within an inner join

I need to pull the second-to-last record in a date column called OrderDate. However, I need to return only one date (I am searching a table with all the purchase orders, dates and costs, from which I have to return only the second-to-last order and its cost). The query as written today (and working) returns the newest date.
select distinct
a.PurchaseNum, a.ItemID, a.SupplierNum, a.Location, a.OrderDate, a.Cost
from
PurchaseOrder a
inner join
(select
l.SupplierNum, l.ItemID, l.Location, maxdate = max(l.OrderDate)
from
PurchaseOrder l
where
l.Cost <> 0
group by
l.SupplierNum, l.itemid, l.Location) l on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid
and l.Location = a.Location
and a.OrderDate = l.maxdate
I have tried lag() and offset, but they have limitations inside a join: they force me to use ORDER BY and include the OrderDate column, which is not what I want because we need only one date.
A bit of context: I have a report in which I need to show the last and second-to-last cost of a purchase order for each supplier. Returning the last cost of an order is easy; the problem is going back to the second-to-last one, and that is where I am stuck right now.
Any thoughts?
If I'm understanding you correctly, here's one option using row_number to return the 2 highest orderdate records:
select *
from (
select *,
row_number() over (partition by SupplierNum, ItemID, Location
order by OrderDate desc) rn
from PurchaseOrder
where cost <> 0
) t
where rn <= 2
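
To illustrate how the row_number approach could feed the report described in the question (last and second-to-last cost side by side), here is a hedged sketch run against SQLite via Python's sqlite3 module. The rows are invented, the pivot with MAX(CASE ...) is an addition for illustration rather than part of the answer above, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# In-memory database with made-up purchase orders; PurchaseNum 5 has a
# zero cost and is therefore ignored.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PurchaseOrder (
        PurchaseNum INTEGER, ItemID TEXT, SupplierNum INTEGER,
        Location TEXT, OrderDate TEXT, Cost REAL
    );
    INSERT INTO PurchaseOrder VALUES
        (1, 'A', 1, 'X', '2023-01-01', 10.0),
        (2, 'A', 1, 'X', '2023-02-01', 12.0),
        (3, 'A', 1, 'X', '2023-03-01', 15.0),
        (4, 'B', 2, 'Y', '2023-01-15', 7.0),
        (5, 'B', 2, 'Y', '2023-02-15', 0.0),
        (6, 'B', 2, 'Y', '2023-03-15', 9.0);
""")

# rn = 1 is the newest non-zero-cost order per group, rn = 2 the one before it.
rows = conn.execute("""
    SELECT SupplierNum, ItemID, Location,
           MAX(CASE WHEN rn = 1 THEN Cost END) AS last_cost,
           MAX(CASE WHEN rn = 2 THEN Cost END) AS second_last_cost
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY SupplierNum, ItemID, Location
                   ORDER BY OrderDate DESC
               ) AS rn
        FROM PurchaseOrder
        WHERE Cost <> 0
    ) t
    WHERE rn <= 2
    GROUP BY SupplierNum, ItemID, Location
    ORDER BY SupplierNum
""").fetchall()

print(rows)  # [(1, 'A', 'X', 15.0, 12.0), (2, 'B', 'Y', 9.0, 7.0)]
```

If only the second-to-last row is needed, filtering on rn = 2 instead of rn <= 2 is enough.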
The inner query orders by date descending and the outer query orders ascending (this returns the second most recent order overall, not per group):
select distinct top 1 a.*
from PurchaseOrder a
inner join
(
select top 2 l.SupplierNum, l.itemid, l.Location, l.OrderDate
from PurchaseOrder l
where
l.Cost <> 0
order by l.OrderDate desc) l
on a.SupplierNum = l.SupplierNum and a.itemid = l.itemid and l.Location = a.Location and a.OrderDate = l.OrderDate
order by a.OrderDate
or
SELECT TOP 1 * FROM (SELECT * FROM PurchaseOrder a
EXCEPT SELECT TOP (SELECT (COUNT(*)-2) FROM PurchaseOrder a where
l.Cost <> 0
group by l.SupplierNum, l.itemid, l.Location) * FROM PurchaseOrder) A
or
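SELECT *
FROM PurchaseOrder a
WHERE a.Cost <> 0
AND a.OrderDate = ( SELECT MAX(b.OrderDate)
FROM PurchaseOrder b
WHERE b.Cost <> 0
AND b.SupplierNum = a.SupplierNum and b.itemid = a.itemid and b.Location = a.Location
AND b.OrderDate < ( SELECT MAX(l.OrderDate)
FROM PurchaseOrder l
WHERE l.Cost <> 0
AND l.SupplierNum = a.SupplierNum and l.itemid = a.itemid and l.Location = a.Location
)
) ;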
or
SELECT TOP (1) *
FROM PurchaseOrder
WHERE OrderDate < ( SELECT MAX(OrderDate)
FROM PurchaseOrder where ....
)
ORDER BY OrderDate DESC ;

Get max value from another query

I have a problem with a query. I need to get the maximum value and its product_name from this query:
select
products.product_name,
sum(product_invoice.product_amount) as total_amount
from
product_invoice
inner join
products on product_invoice.product_id = products.product_id
inner join
invoices on product_invoice.invoice_id = invoices.invoice_id
where
month(invoices.invoice_date) = 2
group by
products.product_name
This query returns a result like this:
product_name | total_amount
--------------+--------------
chairs | 70
ladders | 500
tables | 150
How do I get just ladders 500 from this?
with outputTable as (
select
products.product_name,
sum(product_invoice.product_amount) as total_amount
from product_invoice
inner join products
on product_invoice.product_id = products.product_id
inner join invoices
on product_invoice.invoice_id = invoices.invoice_id
where month(invoices.invoice_date) = 2
group by products.product_name
)
select product_name, total_amount
from outputTable
where total_amount = (select max(total_amount) from outputTable)
You can use order by and fetch first 1 row only:
select p.product_name,
sum(pi.product_amount) as total_amount
from product_invoice pi inner join
products p
on pi.product_id = p.product_id inner join
invoices i
on pi.invoice_id = i.invoice_id
where month(i.invoice_date) = 2 -- shouldn't you have the year here too?
group by p.product_name
order by total_amount desc
fetch first 1 row only;
Not all databases support the ANSI-standard fetch first clause. You may need to use limit, select top, or some other construct.
Note that I have also introduced table aliases -- they make the query easier to write and to read. Also, if you are selecting the month, shouldn't you also be selecting the year?
In older versions of SQL Server, you would use select top 1:
select top (1) p.product_name,
sum(pi.product_amount) as total_amount
from product_invoice pi inner join
products p
on pi.product_id = p.product_id inner join
invoices i
on pi.invoice_id = i.invoice_id
where month(i.invoice_date) = 2 -- shouldn't you have the year here too?
group by p.product_name
order by total_amount desc;
To get all rows with the top amount, use SELECT TOP (1) WITH TIES ...
If you are using SQL Server, then TOP can offer a solution:
SELECT TOP 1
p.product_name,
SUM(pi.product_amount) AS total_amount
FROM product_invoice pi
INNER JOIN products p
ON pi.product_id = p.product_id
INNER JOIN invoices i
ON pi.invoice_id = i.invoice_id
WHERE
MONTH(i.invoice_date) = 2
GROUP BY
p.product_name
ORDER BY
SUM(pi.product_amount) DESC;
Note: If there could be more than one product tied for the top amount, and you want all ties, then use TOP 1 WITH TIES, e.g.
SELECT TOP 1 WITH TIES
... (the same query I have above)
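
For anyone who wants to try the "sort descending and keep the first row" idea without a SQL Server instance, here is a sketch ported to SQLite through Python's sqlite3 module. It is a port, not the answer's exact query: LIMIT 1 stands in for TOP 1, strftime('%m', ...) stands in for MONTH(), and the rows are invented so that the February totals match the question's output.

```python
import sqlite3

# In-memory database with made-up rows; the March invoice must be ignored.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER, product_name TEXT);
    CREATE TABLE invoices (invoice_id INTEGER, invoice_date TEXT);
    CREATE TABLE product_invoice (
        invoice_id INTEGER, product_id INTEGER, product_amount INTEGER
    );
    INSERT INTO products VALUES (1, 'chairs'), (2, 'ladders'), (3, 'tables');
    INSERT INTO invoices VALUES
        (1, '2020-02-05'), (2, '2020-02-20'), (3, '2020-03-01');
    INSERT INTO product_invoice VALUES
        (1, 1, 70), (1, 2, 200), (2, 2, 300), (2, 3, 150), (3, 1, 1000);
""")

# Sum per product for February, sort by the sum descending, keep one row.
rows = conn.execute("""
    SELECT p.product_name,
           SUM(pi.product_amount) AS total_amount
    FROM product_invoice pi
    INNER JOIN products p ON pi.product_id = p.product_id
    INNER JOIN invoices i ON pi.invoice_id = i.invoice_id
    WHERE strftime('%m', i.invoice_date) = '02'
    GROUP BY p.product_name
    ORDER BY total_amount DESC
    LIMIT 1
""").fetchall()

print(rows)  # [('ladders', 500)]
```

Dropping DESC from the ORDER BY would return the smallest total (chairs, 70) instead, which is why the sort direction matters here.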

Segment purchases based on new vs returning

I'm trying to write a query that can select a particular date and count how many of those customers have placed orders previously and how many are new. For simplicity, here is the table layout:
id (auto) | cust_id | purchase_date
-----------------------------------
1 | 1 | 2010-11-15
2 | 2 | 2010-11-15
3 | 3 | 2010-11-14
4 | 1 | 2010-11-13
5 | 3 | 2010-11-12
I was trying to select orders by date and then join any previous orders for the same user_id from earlier dates, then count how many had previous orders vs. how many didn't. This was my failed attempt:
SELECT SUM(
CASE WHEN id IS NULL
THEN 1
ELSE 0
END ) AS new, SUM(
CASE WHEN id IS NOT NULL
THEN 1
ELSE 0
END ) AS returning
FROM (
SELECT o1 . *
FROM orders AS o
LEFT JOIN orders AS o1 ON ( o1.user_id = o.user_id
AND DATE( o1.created ) = "2010-11-15" )
WHERE DATE( o.created ) < "2010-11-15"
GROUP BY o.user_id
) AS t
Given a reference date (2010-11-15), we are interested in the number of distinct customers who placed an order on that date (A), in how many of those have placed an order previously (B), and in how many have not (C). Clearly, A = B + C.
Q1: Count of customers who placed an order on the reference date
SELECT COUNT(DISTINCT Cust_ID)
FROM Orders
WHERE Purchase_Date = '2010-11-15';
Q2: List of customers who placed an order on the reference date
SELECT DISTINCT Cust_ID
FROM Orders
WHERE Purchase_Date = '2010-11-15';
Q3: List of customers who placed an order on the reference date and had ordered before
SELECT DISTINCT o1.Cust_ID
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15';
Q4: Count of customers who placed an order on the reference date and had ordered before
SELECT COUNT(DISTINCT o1.Cust_ID)
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15';
Q5: Combining Q1 and Q4
There are several ways to do the combining. One is to use Q1 and Q4 as (complicated) expressions in the select-list; another is to use them as tables in the FROM clause which don't need a join between them because each is a single-row, single-column table that can be joined in a Cartesian product. Another would be a UNION, where each row is tagged with what it calculates.
SELECT (SELECT COUNT(DISTINCT Cust_ID)
FROM Orders
WHERE Purchase_Date = '2010-11-15') AS Total_Customers,
(SELECT COUNT(DISTINCT o1.Cust_ID)
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15') AS Returning_Customers
FROM Dual;
(I'm blithely assuming MySQL has a DUAL table - similar to Oracle's. If not, it is trivial to create a table with a single column containing a single row of data. Update 2: bashing the MySQL 5.5 Manual shows that 'FROM Dual' is supported but not needed; MySQL is happy without a FROM clause.)
Update 1: added qualifier 'o1.Cust_ID' in key locations to avoid 'ambiguous column name' as indicated in the comment.
How about
SELECT * FROM
(SELECT * FROM
(SELECT CUST_ID, COUNT(*) AS ORDER_COUNT, 1 AS OLD_CUSTOMER, 0 AS NEW_CUSTOMER
FROM ORDERS
GROUP BY CUST_ID
HAVING ORDER_COUNT > 1)
UNION ALL
(SELECT CUST_ID, COUNT(*) AS ORDER_COUNT, 0 AS OLD_CUSTOMER, 1 AS NEW_CUSTOMER
FROM ORDERS
GROUP BY CUST_ID
HAVING ORDER_COUNT = 1)) G
INNER JOIN
(SELECT CUST_ID, ORDER_DATE
FROM ORDERS) O
USING (CUST_ID)
WHERE ORDER_DATE = [date of interest] AND
OLD_CUSTOMER = [0 or 1, depending on what you want] AND
NEW_CUSTOMER = [0 or 1, depending on what you want]
Not sure if that'll do the whole thing, but it might provide a starting point.
Share and enjoy.
select count(distinct o1.cust_id) as repeat_count,
count(distinct o.cust_id)-count(distinct o1.cust_id) as new_count
from orders o
left join (select cust_id
from orders
where purchase_date < "2010-11-15"
group by cust_id) o1
on o.cust_id = o1.cust_id
where o.purchase_date = "2010-11-15"
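
The last query can be checked against the five sample rows from the question. The sketch below loads them into SQLite through Python's sqlite3 module (an assumption, since the thread is about MySQL) and uses single-quoted date literals, which both databases accept.

```python
import sqlite3

# The five sample rows from the question's table layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, cust_id INTEGER, purchase_date TEXT);
    INSERT INTO orders VALUES
        (1, 1, '2010-11-15'),
        (2, 2, '2010-11-15'),
        (3, 3, '2010-11-14'),
        (4, 1, '2010-11-13'),
        (5, 3, '2010-11-12');
""")

# o1 lists customers with any earlier order; customers on the reference
# date that find no match in o1 are new.
row = conn.execute("""
    SELECT COUNT(DISTINCT o1.cust_id) AS repeat_count,
           COUNT(DISTINCT o.cust_id) - COUNT(DISTINCT o1.cust_id) AS new_count
    FROM orders o
    LEFT JOIN (SELECT cust_id
               FROM orders
               WHERE purchase_date < '2010-11-15'
               GROUP BY cust_id) o1
      ON o.cust_id = o1.cust_id
    WHERE o.purchase_date = '2010-11-15'
""").fetchone()

print(row)  # (1, 1)
```

Customer 1 ordered on 2010-11-13 and again on the reference date, so it counts as returning; customer 2 has no earlier order and counts as new.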