How to group by having the same id? - sql

I want the customerid who bought product X and Y and Z, from the following schema:
Sales(customerid, productName, rid);
I could do the intersection:
select customerid from sales where productName='X'
INTERSECT
select customerid from sales where productName='Y'
INTERSECT
select customerid from sales where productName='Z'
Is this the best I could do?

Not sure if this works in PostgreSQL, but try:
select customerid
from sales
where productName in ('X', 'Y', 'Z')
group by customerid
having count(distinct productName) = 3
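As a quick sanity check, the GROUP BY / HAVING approach above can be tried against an in-memory SQLite database. This is only a minimal sketch: the sample rows are hypothetical, and SQLite stands in for whatever database is actually used.

```python
import sqlite3

# Hypothetical sample rows for the Sales(customerid, productName, rid) schema.
rows = [
    (1, "X", 1), (1, "Y", 2), (1, "Z", 3),   # customer 1 bought all three
    (2, "X", 4), (2, "X", 5), (2, "Y", 6),   # customer 2 never bought Z
    (3, "Z", 7),                             # customer 3 bought only Z
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customerid INTEGER, productName TEXT, rid INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Keep only rows for the three products, then require three distinct
# product names per customer.
query = """
    SELECT customerid
    FROM sales
    WHERE productName IN ('X', 'Y', 'Z')
    GROUP BY customerid
    HAVING COUNT(DISTINCT productName) = 3
"""
buyers_of_all_three = [r[0] for r in conn.execute(query)]
print(buyers_of_all_three)
```

COUNT(DISTINCT ...) matters here: customer 2 has three matching rows but only two distinct products, so a plain COUNT(*) = 3 would wrongly include them.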

You could also select from sales 3 times:
select s1.customerID
from sales s1, sales s2, sales s3
where s1.productName = 'X'
and S2.productName = 'Y'
and S3.productName = 'Z'
and (S1.customerID = S2.customerID
and s2.customerID = s3.customerID);
Or rewrite it using explicit join syntax (untested, so it might not be 100% right):
select s1.customerID
from sales s1
inner join sales S2
on s1.customerId = S2.customerID
inner join sales S3
on s2.customerID = S3.customerId
where s1.productName = 'X'
and S2.productName = 'Y'
and S3.productName = 'Z';


Use 1 SQL query to join 3 tables and find the category of products that generates the most revenue for each customer segment

I am using SQLite3 for the following query.
I have a table called "products" that looks like this:
I have a table called "transactions" that looks like this:
I have a table called "segments" that looks like this:
For each active segment, I want to find the category that produces the highest revenue.
I think that I know how to do this in 3 different queries.
create table table1 as
SELECT s.seg_name, p.category, t.item_qty * t.item_price as revenue
from segments s
JOIN
transactions t
on s.cust_id = t.cust_id
JOIN products p
on p.prod_id = t.prod_id
where s.active_flag = 'Y'
order by s.seg_name, p.category
;
create table table2 as
select seg_name, category, sum(revenue) as revenue
from table1
group by seg_name, category;
select seg_name, category, max(revenue) as revenue
from table2
group by seg_name;
How can I do it in 1 query?
Here is one way:
select seg_name,category,revenue
from (
select
s.seg_name,
p.category,
sum(t.item_qty * t.item_price) as revenue,
rank() over (partition by seg_name order by sum(t.item_qty * t.item_price) desc) rn
from segments s
join transactions t on s.cust_id = t.cust_id
join products p on p.prod_id = t.prod_id
where s.active_flag = 'Y'
group by seg_name, p.category
) t where rn = 1
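The ranking idea can be checked end to end with Python's sqlite3 module. The sketch below is a variant of the query above that splits the aggregation and the ranking into two CTEs. The table contents are hypothetical (the original tables were not shown), and it assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE segments (cust_id INTEGER, seg_name TEXT, active_flag TEXT);
    CREATE TABLE products (prod_id INTEGER, category TEXT);
    CREATE TABLE transactions (cust_id INTEGER, prod_id INTEGER,
                               item_qty INTEGER, item_price INTEGER);

    -- Hypothetical sample data.
    INSERT INTO segments VALUES (1, 'Gold', 'Y'), (2, 'Gold', 'Y'),
                                (3, 'Silver', 'Y'), (4, 'Bronze', 'N');
    INSERT INTO products VALUES (10, 'Toys'), (11, 'Books');
    INSERT INTO transactions VALUES
        (1, 10, 2, 5),    -- Gold,   Toys:  10
        (2, 11, 1, 30),   -- Gold,   Books: 30
        (3, 10, 3, 5),    -- Silver, Toys:  15
        (3, 11, 1, 4),    -- Silver, Books:  4
        (4, 11, 10, 10);  -- Bronze is inactive and should be ignored
""")

query = """
    WITH revenue AS (
        -- Step 1: revenue per active segment and category.
        SELECT s.seg_name, p.category,
               SUM(t.item_qty * t.item_price) AS revenue
        FROM segments s
        JOIN transactions t ON s.cust_id = t.cust_id
        JOIN products p ON p.prod_id = t.prod_id
        WHERE s.active_flag = 'Y'
        GROUP BY s.seg_name, p.category
    ),
    ranked AS (
        -- Step 2: rank categories inside each segment by revenue.
        SELECT seg_name, category, revenue,
               RANK() OVER (PARTITION BY seg_name ORDER BY revenue DESC) AS rn
        FROM revenue
    )
    SELECT seg_name, category, revenue
    FROM ranked
    WHERE rn = 1
    ORDER BY seg_name
"""
top_categories = conn.execute(query).fetchall()
print(top_categories)
```

Because RANK() is used, two categories that tie for the highest revenue in a segment would both be returned; ROW_NUMBER() would pick just one of them.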

Multiple SQL filter on same column

How do I display the customer_id of customers who bought products A and B but didn't buy product C, ordered by ascending customer ID?
I tried the code below, but it does not give me any result.
select customer_id, product_name from orders where customer_id = 'A' and 'product_name '= 'B'
select customer_id from orders where product_name = 'A'
intersect
select customer_id from orders where product_name = 'B'
except
select customer_id from orders where product_name = 'C'
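You can use analytic functions as follows (note that COUNT(DISTINCT ...) OVER is accepted by Oracle but rejected by several other databases, such as PostgreSQL and SQL Server):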
Select * from
(select customer_id, product_name ,
Count(distinct case when product_name in ('A','B') then product_name end)
Over (partition by customer_id) as cntab ,
Count(case when product_name = 'C' then product_name end)
Over (partition by customer_id) as cntc
from orders t) t
Where cntab = 2 and cntc = 0;
One method uses EXISTS and NOT EXISTS:
select o.*
from orders o
where exists (select 1
from orders o2
where o2.customer_id = o.customer_id and o2.product_name = 'A'
) and
exists (select 1
from orders o2
where o2.customer_id = o.customer_id and o2.product_name = 'B'
) and
not exists (select 1
from orders o2
where o2.customer_id = o.customer_id and o2.product_name = 'C'
)
order by customer_id;
To me, this query reads well ...
SELECT customer_id
FROM (
SELECT TableA.customer_id
FROM (
SELECT customer_id
FROM orders
WHERE product_name = 'A'
) AS TableA
INNER JOIN (
SELECT customer_id
FROM orders
WHERE product_name = 'B'
) AS TableB
ON TableA.customer_id = TableB.customer_id
) AS TableX
WHERE customer_id NOT IN (SELECT customer_id FROM orders WHERE product_name = 'C')
Explanation. Get a list of customer ids that bought products A and B with self inner join. Then, pare that list down by removing any rows where those customers bought product C.
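To compare the approaches in this thread, the sketch below runs the INTERSECT / EXCEPT query and an EXISTS / NOT EXISTS query on the same rows and checks that they agree. The sample data is hypothetical, SQLite is used only as a convenient test bed, and the EXISTS variant selects DISTINCT customer_id rather than o.* so that both queries return one row per customer.

```python
import sqlite3

# Hypothetical orders: (customer_id, product_name)
rows = [
    (1, "A"), (1, "B"),             # A and B, no C  -> expected
    (2, "A"), (2, "B"), (2, "C"),   # also bought C  -> excluded
    (3, "A"),                       # missing B      -> excluded
    (4, "B"), (4, "A"), (4, "A"),   # A and B, no C  -> expected
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, product_name TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

set_query = """
    SELECT customer_id FROM orders WHERE product_name = 'A'
    INTERSECT
    SELECT customer_id FROM orders WHERE product_name = 'B'
    EXCEPT
    SELECT customer_id FROM orders WHERE product_name = 'C'
    ORDER BY customer_id
"""

exists_query = """
    SELECT DISTINCT o.customer_id
    FROM orders o
    WHERE EXISTS (SELECT 1 FROM orders o2
                  WHERE o2.customer_id = o.customer_id AND o2.product_name = 'A')
      AND EXISTS (SELECT 1 FROM orders o2
                  WHERE o2.customer_id = o.customer_id AND o2.product_name = 'B')
      AND NOT EXISTS (SELECT 1 FROM orders o2
                      WHERE o2.customer_id = o.customer_id AND o2.product_name = 'C')
    ORDER BY o.customer_id
"""

via_sets = [r[0] for r in conn.execute(set_query)]
via_exists = [r[0] for r in conn.execute(exists_query)]
print(via_sets, via_exists)
```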

SQL customer who never order product p1 and p2 together

I have two tables. I want to find customers who never ordered products p1 and p2 together (there are a million rows in this table).
customer_id 2,4,1,4,2,1,3,2,1
product_id p1,p3,p2,p1,p2,p3,p4,p2
If I understand you correctly, given the very limited information, here is a solution. I broke it into pieces so that it is easier to follow.
-- to get customers who ordered P1
Select customer_id from tbl where product_id = 'P1'
-- to get customers who ordered P2
Select customer_id from tbl where product_id = 'P2'
-- to get customers who ordered both P1 & P2
Select p1.customer_id from tbl p1
inner join tbl p2 on p1.customer_id = p2.customer_id
where p1.product_id = 'P1' and p2.product_id = 'P2'
-- to get customers who did not order both P1 & P2 together
Select * from tbl m
Left Join
(
Select p1.customer_id from tbl p1
inner join tbl p2 on p1.customer_id = p2.customer_id
where p1.product_id = 'P1' and p2.product_id = 'P2'
) q on m.customer_id = q.customer_id
Where q.customer_id is null
If you have one table with customer ids and product ids, then you can use aggregation:
select customer_id
from t
group by customer_id
having sum(case when product_id = 'P1' then 1 else 0 end) = 0 or
sum(case when product_id = 'P2' then 1 else 0 end) = 0;
That is, get the customers who have not ordered at least one of the two products.
Note that "ordered together" implies that they are in the same order. However, your data does not provide any information about orders or the timing of purchases.
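The aggregation query above can be exercised on a small in-memory table. This is a sketch with hypothetical rows (the data in the question is incomplete), using SQLite via Python purely for illustration and adding an ORDER BY so the output is stable.

```python
import sqlite3

# Hypothetical rows: (customer_id, product_id)
rows = [
    (1, "P1"), (1, "P2"), (1, "P3"),   # ordered both P1 and P2 -> excluded
    (2, "P1"), (2, "P3"),              # P1 but never P2        -> kept
    (3, "P4"),                         # neither                -> kept
    (4, "P2"),                         # P2 but never P1        -> kept
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer_id INTEGER, product_id TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# A customer qualifies when the count of P1 rows or the count of P2 rows is zero.
query = """
    SELECT customer_id
    FROM t
    GROUP BY customer_id
    HAVING SUM(CASE WHEN product_id = 'P1' THEN 1 ELSE 0 END) = 0
        OR SUM(CASE WHEN product_id = 'P2' THEN 1 ELSE 0 END) = 0
    ORDER BY customer_id
"""
never_both = [r[0] for r in conn.execute(query)]
print(never_both)
```

Note that customer 3, who ordered neither product, is included. Whether that is wanted depends on how "never ordered p1 and p2 together" is interpreted.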
If I understood the question correctly, you want to find customers who have ordered P1 but never P2, or P2 but never P1:
Select b.customerid, b.productid
From table1 b
inner join
(Select customerid, count(distinct productid) as product_count
From table1
Where productid in ('P1', 'P2')
group by
customerid
Having
count(distinct productid) = 1) a
on (b.customerid = a.customerid)
And b.productid in ('P1', 'P2')

SQL. Find the customers who bought same brands and at-least 2 products in each brand

I have two tables :
Sales
columns: (Sales_id, Date, Customer_id, Product_id, Purchase_amount):
Product
columns: (Product_id, Product_Name, Brand_id,Brand_name)
I have to write a query to find the customers who bought the brands 'X' and 'Y' (both) and at least 2 products of each brand. Is the following query correct? Any recommended changes?
SELECT S.Customer_id "Customer ID"
FROM Sales S LEFT JOIN Product P
ON S.Product_id = P.Product_id
AND P.Brand_Name IN ('X','Y')
GROUP BY S.Customer_id
HAVING COUNT(DISTINCT S.Product_id)>=2 -----at least 2 products in each brand
AND COUNT(S.Customer_id) =2 ---------------customers who bought both brands
Any help will be appreciated. Thanks in advance
Use the COUNT() window function to count the number of distinct brands, and the number of distinct products of each brand, that each customer has bought.
Then filter out the customers who haven't bought both brands and GROUP BY customer with a HAVING clause that filters out the customers who haven't bought at least 2 products of each brand.
Also your join should be an INNER join and not a LEFT join.
select t.customer_id "Customer ID"
from (
select s.customer_id,
count(distinct p.brand_id) over (partition by s.customer_id) brands_counter,
count(distinct p.product_id) over (partition by s.customer_id, p.brand_id) products_counter
from sales s inner join product p
on p.product_id = s.product_id
where p.brand_name in ('X', 'Y')
) t
where t.brands_counter = 2
group by t.customer_id
having min(t.products_counter) >= 2
Starting from your existing query, you can use the following HAVING clause:
HAVING
COUNT(DISTINCT CASE WHEN p.brand_name = 'X' THEN S.product_id END) >= 2
AND COUNT(DISTINCT CASE WHEN p.brand_name = 'Y' THEN S.product_id END) >= 2
This ensures that the customer bought at least two products of each brand. It also implicitly guarantees that they placed orders for both brands, so no additional logic is needed for that.
You could also express this with MIN() and MAX():
HAVING
MIN(CASE WHEN p.brand_name = 'X' THEN S.product_id END)
<> MAX(CASE WHEN p.brand_name = 'X' THEN S.product_id END)
AND MIN(CASE WHEN p.brand_name = 'Y' THEN S.product_id END)
<> MAX(CASE WHEN p.brand_name = 'Y' THEN S.product_id END)
You can use two levels of aggregation:
SELECT Customer_id
FROM (SELECT S.Customer_id, P.Brand_Name, COUNT(DISTINCT S.Product_Id) as num_products
FROM Sales S INNER JOIN
Product P
ON S.Product_id = P.Product_id
WHERE P.Brand_Name IN ('X', 'Y')
GROUP BY S.Customer_id, P.Brand_Name
) s
GROUP BY Customer_Id
HAVING COUNT(*) = 2 AND MIN(num_products) >= 2;
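A small test harness makes it easier to see which customers the "at least 2 products of each brand" condition keeps. The sketch below uses the conditional COUNT(DISTINCT ...) form of the HAVING clause with an inner join. The rows are hypothetical, only the columns needed for the query are created, and SQLite is used just for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER, brand_name TEXT);
    CREATE TABLE sales (sales_id INTEGER, customer_id INTEGER, product_id INTEGER);

    -- Hypothetical catalogue: two X products, two Y products, one Z product.
    INSERT INTO product VALUES (1, 'X'), (2, 'X'), (3, 'Y'), (4, 'Y'), (5, 'Z');

    INSERT INTO sales VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4),   -- customer 1: 2 X, 2 Y
        (5, 2, 1), (6, 2, 1), (7, 2, 3), (8, 2, 4),   -- customer 2: same X product twice
        (9, 3, 1), (10, 3, 2), (11, 3, 5),            -- customer 3: no Y at all
        (12, 4, 1), (13, 4, 2), (14, 4, 3),
        (15, 4, 4), (16, 4, 5);                       -- customer 4: 2 X, 2 Y, plus Z
""")

# Count distinct products per brand inside HAVING; CASE yields NULL for
# the other brand, and COUNT ignores NULLs.
query = """
    SELECT s.customer_id
    FROM sales s
    INNER JOIN product p ON p.product_id = s.product_id
    WHERE p.brand_name IN ('X', 'Y')
    GROUP BY s.customer_id
    HAVING COUNT(DISTINCT CASE WHEN p.brand_name = 'X' THEN s.product_id END) >= 2
       AND COUNT(DISTINCT CASE WHEN p.brand_name = 'Y' THEN s.product_id END) >= 2
    ORDER BY s.customer_id
"""
qualifying = [r[0] for r in conn.execute(query)]
print(qualifying)
```

Customer 2 is the interesting case: two sales rows for brand X, but only one distinct X product, so the DISTINCT inside COUNT is what excludes them.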

Segment purchases based on new vs returning

I'm trying to write a query that can select a particular date and count how many of those customers have placed orders previously and how many are new. For simplicity, here is the table layout:
id (auto) | cust_id | purchase_date
-----------------------------------
1 | 1 | 2010-11-15
2 | 2 | 2010-11-15
3 | 3 | 2010-11-14
4 | 1 | 2010-11-13
5 | 3 | 2010-11-12
I was trying to select orders by a date and then join any previous orders on the same user_id from previous dates, then count how many customers had earlier orders versus how many didn't. This was my failed attempt:
SELECT SUM(
CASE WHEN id IS NULL
THEN 1
ELSE 0
END ) AS new, SUM(
CASE WHEN id IS NOT NULL
THEN 1
ELSE 0
END ) AS returning
FROM (
SELECT o1 . *
FROM orders AS o
LEFT JOIN orders AS o1 ON ( o1.user_id = o.user_id
AND DATE( o1.created ) = "2010-11-15" )
WHERE DATE( o.created ) < "2010-11-15"
GROUP BY o.user_id
) AS t
Given a reference date (2010-11-15), we are interested in the number of distinct customers who placed an order on that date (A), how many of those had placed an order previously (B), and how many had not (C). Clearly, A = B + C.
Q1: Count of customers who placed an order on reference date
SELECT COUNT(DISTINCT Cust_ID)
FROM Orders
WHERE Purchase_Date = '2010-11-15';
Q2: List of customers placing order on reference date
SELECT DISTINCT Cust_ID
FROM Orders
WHERE Purchase_Date = '2010-11-15';
Q3: List of customers who placed an order on reference date who had ordered before
SELECT DISTINCT o1.Cust_ID
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15';
Q4: Count of customers who placed an order on reference date who had ordered before
SELECT COUNT(DISTINCT o1.Cust_ID)
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15';
Q5: Combining Q1 and Q4
There are several ways to do the combining. One is to use Q1 and Q4 as (complicated) expressions in the select-list; another is to use them as tables in the FROM clause which don't need a join between them because each is a single-row, single-column table that can be joined in a Cartesian product. Another would be a UNION, where each row is tagged with what it calculates.
SELECT (SELECT COUNT(DISTINCT Cust_ID)
FROM Orders
WHERE Purchase_Date = '2010-11-15') AS Total_Customers,
(SELECT COUNT(DISTINCT o1.Cust_ID)
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15') AS Returning_Customers
FROM Dual;
(I'm blithely assuming MySQL has a DUAL table - similar to Oracle's. If not, it is trivial to create a table with a single column containing a single row of data. Update 2: bashing the MySQL 5.5 Manual shows that 'FROM Dual' is supported but not needed; MySQL is happy without a FROM clause.)
Update 1: added qualifier 'o1.Cust_ID' in key locations to avoid 'ambiguous column name' as indicated in the comment.
How about
SELECT * FROM
(SELECT * FROM
(SELECT CUST_ID, COUNT(*) AS ORDER_COUNT, 1 AS OLD_CUSTOMER, 0 AS NEW_CUSTOMER
FROM ORDERS
GROUP BY CUST_ID
HAVING ORDER_COUNT > 1)
UNION ALL
(SELECT CUST_ID, COUNT(*) AS ORDER_COUNT, 0 AS OLD_CUSTOMER, 1 AS NEW_CUSTOMER
FROM ORDERS
GROUP BY CUST_ID
HAVING ORDER_COUNT = 1)) G
INNER JOIN
(SELECT CUST_ID, PURCHASE_DATE
FROM ORDERS) O
USING (CUST_ID)
WHERE PURCHASE_DATE = [date of interest] AND
OLD_CUSTOMER = [0 or 1, depending on what you want] AND
NEW_CUSTOMER = [0 or 1, depending on what you want]
Not sure if that'll do the whole thing, but it might provide a starting point.
Share and enjoy.
select count(distinct o1.cust_id) as repeat_count,
count(distinct o.cust_id)-count(distinct o1.cust_id) as new_count
from orders o
left join (select cust_id
from orders
where purchase_date < "2010-11-15"
group by cust_id) o1
on o.cust_id = o1.cust_id
where o.purchase_date = "2010-11-15"
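The last query can be checked against the sample table from the question. The sketch below loads those five rows into an in-memory SQLite database (an assumption, since the question appears to target MySQL) and passes the reference date as a bound parameter instead of a literal.

```python
import sqlite3

# Rows from the question: (id, cust_id, purchase_date)
rows = [
    (1, 1, "2010-11-15"),
    (2, 2, "2010-11-15"),
    (3, 3, "2010-11-14"),
    (4, 1, "2010-11-13"),
    (5, 3, "2010-11-12"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, cust_id INTEGER, purchase_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# o  = customers who ordered on the reference date
# o1 = customers with at least one earlier order (NULL when there is none)
query = """
    SELECT COUNT(DISTINCT o1.cust_id) AS repeat_count,
           COUNT(DISTINCT o.cust_id) - COUNT(DISTINCT o1.cust_id) AS new_count
    FROM orders o
    LEFT JOIN (SELECT cust_id
               FROM orders
               WHERE purchase_date < ?
               GROUP BY cust_id) o1
           ON o.cust_id = o1.cust_id
    WHERE o.purchase_date = ?
"""
reference_date = "2010-11-15"
repeat_count, new_count = conn.execute(query, (reference_date, reference_date)).fetchone()
print(repeat_count, new_count)
```

On 2010-11-15, customers 1 and 2 placed orders. Customer 1 had ordered on 2010-11-13, so they count as returning, while customer 2 is new. Storing dates as ISO 8601 text is what lets the plain string comparison order them correctly here.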