Select Customer ID who hasn't purchased product X - sql

I have a table of customer IDs and Products Purchased. A customer ID can purchase multiple products over time.
customerID, productID
In BigQuery I need to find the CustomerID for those who have not purchased product A.
I've been going around in circles trying to do self joins, inner joins, but I'm clueless.
Any help appreciated.

select customerID
from your_table
group by customerID
having sum(case when productID = 'A' then 1 else 0 end) = 0
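To exclude customers based on a substring of the product name instead, swap the HAVING condition for this one (CONTAINS is BigQuery legacy SQL):
having sum(case when productID contains 'XYZ' then 1 else 0 end) = 0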
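A minimal sketch of the conditional-aggregation idea above, run through Python's built-in sqlite3 module as a stand-in for BigQuery (an assumption made only so the example is runnable). The table name purchases and the sample rows are hypothetical, and instr() replaces the substring test because SQLite has no CONTAINS operator.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for BigQuery here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customerID INTEGER, productID TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [(1234, "A"), (1234, "B"), (4567, "A"), (7896, "C"), (5432, "Bxyz")],
)

# Customers with zero rows where productID is exactly 'A'.
never_bought_a = [row[0] for row in conn.execute("""
    SELECT customerID
    FROM purchases
    GROUP BY customerID
    HAVING SUM(CASE WHEN productID = 'A' THEN 1 ELSE 0 END) = 0
    ORDER BY customerID
""")]

# Customers with zero rows whose productID contains the substring 'xyz'.
never_bought_xyz = [row[0] for row in conn.execute("""
    SELECT customerID
    FROM purchases
    GROUP BY customerID
    HAVING SUM(CASE WHEN instr(productID, 'xyz') > 0 THEN 1 ELSE 0 END) = 0
    ORDER BY customerID
""")]

print(never_bought_a)    # [5432, 7896]
print(never_bought_xyz)  # [1234, 4567, 7896]
```

Customer 1234 bought both A and B, so the group as a whole is rejected by the first query; a plain WHERE productID <> 'A' would have let that customer through.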

Below is for BigQuery Standard SQL
#standardSQL
SELECT CustomerID
FROM `project.dataset.yourTable`
GROUP BY CustomerID
HAVING COUNTIF(Product = 'A') = 0
You can test / play with it using dummy data as below
#standardSQL
WITH `project.dataset.yourTable` AS (
SELECT 1234 CustomerID, 'A' Product UNION ALL
SELECT 11234, 'A' UNION ALL
SELECT 4567, 'A' UNION ALL
SELECT 7896, 'C' UNION ALL
SELECT 5432, 'B'
)
SELECT CustomerID
FROM `project.dataset.yourTable`
GROUP BY CustomerID
HAVING COUNTIF(Product = 'A') = 0
How would I adjust this so it matches any productID that contains "xyz"?
#standardSQL
WITH `project.dataset.yourTable` AS (
SELECT 1234 CustomerID, 'Axyz' Product UNION ALL
SELECT 11234, 'A' UNION ALL
SELECT 4567, 'A' UNION ALL
SELECT 7896, 'Cxyz' UNION ALL
SELECT 5432, 'B'
)
SELECT CustomerID
FROM `project.dataset.yourTable`
GROUP BY CustomerID
HAVING COUNTIF(REGEXP_CONTAINS(Product, 'xyz')) = 0

If you have a customer table, you might want:
select c.*
from customers c
where not exists (select 1 from t where t.customer_id = c.customer_id and t.productID = 'A');
This will return customers who have made no purchases as well as those who have purchased all but product A. Of course, the definition of a customer in your data might be that the customer has made a purchase, in which case I like Juergen's solution.
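To make that difference concrete, here is a small sketch using Python's sqlite3 module (SQLite standing in for the real database is an assumption, as are the table names customers and purchases and the three sample customers). Customer 3 has no purchases at all, so only the NOT EXISTS version can return that customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE purchases (customer_id INTEGER, productID TEXT);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO purchases VALUES (1, 'A'), (1, 'B'), (2, 'B');
""")

# Driven from the customer table: includes customers with no purchases.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT c.customer_id
    FROM customers c
    WHERE NOT EXISTS (SELECT 1
                      FROM purchases p
                      WHERE p.customer_id = c.customer_id
                        AND p.productID = 'A')
    ORDER BY c.customer_id
""")]

# Driven from the purchase table: only customers with at least one purchase.
group_by_ids = [row[0] for row in conn.execute("""
    SELECT customer_id
    FROM purchases
    GROUP BY customer_id
    HAVING SUM(CASE WHEN productID = 'A' THEN 1 ELSE 0 END) = 0
    ORDER BY customer_id
""")]

print(not_exists_ids)  # [2, 3]
print(group_by_ids)    # [2]
```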

Related

I need a query that can calculate a value based on conditions

I have data such as:
Type  Amount
a     1000
a     5000
b     4000
b     2000
c     300
I would like to sum the amounts where Type is a or b, and subtract the amounts where Type is c.
I only know how to sum based on one condition, ie:
select sum(amount)
from xxxx
where type = 'a'
Do I need to do a sub-select or is there an easier way?
You can use a CASE expression inside SUM:
select sum(case when type in ('a', 'b') then amount when type = 'c' then -amount end)
from table_name;
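As a quick check of the signed-sum idea, the sketch below loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module (the table name ledger is hypothetical, and SQLite is only a stand-in engine).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (type TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?)",
    [("a", 1000), ("a", 5000), ("b", 4000), ("b", 2000), ("c", 300)],
)

# Types a and b count positively, type c counts negatively, all in one pass.
total = conn.execute("""
    SELECT SUM(CASE WHEN type IN ('a', 'b') THEN amount
                    WHEN type = 'c' THEN -amount
               END)
    FROM ledger
""").fetchone()[0]

print(total)  # 11700  (1000 + 5000 + 4000 + 2000 - 300)
```

Rows of any other type fall through the CASE to NULL, which SUM ignores, so no sub-select is needed.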
Use GROUP BY:
SELECT
Type,
sum(Amount)
FROM table
GROUP BY Type
https://www.postgresql.org/docs/10/queries-table-expressions.html#QUERIES-GROUP
https://www.postgresql.org/docs/10/tutorial-agg.html
WITH CTE(Type , Amount) AS
(
SELECT 'a' ,1000 UNION ALL
SELECT 'a' , 5000 UNION ALL
SELECT 'b' , 4000 UNION ALL
SELECT 'b' , 2000 UNION ALL
SELECT 'c' , 300
)
SELECT
SUM(CASE WHEN C.TYPE IN ('a','b')THEN C.Amount
ELSE 0
END) -
SUM(CASE WHEN C.TYPE='c' THEN C.Amount
ELSE 0
END)
FROM CTE AS C
For MariaDB:
SELECT
debit - credit
FROM
(SELECT sum(Amount) AS debit FROM Your_table WHERE Type IN ('a', 'b')) AS condition1,
(SELECT sum(Amount) AS credit FROM Your_table WHERE Type NOT IN ( 'a', 'b' )) AS condition2;

I want to get difference of sum of a column from two tables

I have 2 tables: transactions and transactions_archive. Each has the fields accountno, drcr (whose value is either C or D) and amount. I want the difference between the sum of all 'C' amounts across both tables and the sum of all 'D' amounts across both tables.
Which query can I use to get this?
I tried this unsuccessfully:
select (
select accountno,drcr,sum(amount)as total from
(
select accountno,drcr,amount
from ebank.tbtransactions
where drcr='C'
union all
select accountno,drcr,amount
from ebank.tbtransactions_archive
where drcr='C'
)
)
-
(select accountno,drcr,sum(amount)as total
from (
select accountno,drcr,amount
from ebank.tbtransactions
where drcr='D'
union all
select accountno,drcr,amount
from ebank.tbtransactions_archive
where drcr='D'
)
)
group by accountno,drcr;
If I understand correctly, you want to subtract all the "D"s from the "C"s. Combine the tables using UNION ALL and use conditional aggregation:
select accountno,
sum(case when drcr = 'C' then amount else -amount end) as total
from ((select accountno, drcr, amount
from ebank.tbtransactions
) union all
(select accountno, drcr, amount
from ebank.tbtransactions_archive
)
) t
where drcr in ('D', 'C')
group by accountno;
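A minimal sketch of this UNION ALL plus conditional aggregation approach, using Python's sqlite3 module. The sample amounts are invented for illustration, the ebank schema prefix is dropped, and the parentheses around each SELECT are removed because SQLite (the stand-in engine here) does not accept them inside a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (accountno INTEGER, drcr TEXT, amount INTEGER);
    CREATE TABLE transactions_archive (accountno INTEGER, drcr TEXT, amount INTEGER);
    INSERT INTO transactions VALUES (1001, 'C', 500), (1001, 'D', 200), (1002, 'D', 50);
    INSERT INTO transactions_archive VALUES (1001, 'C', 100), (1002, 'C', 300), (1002, 'D', 25);
""")

# Stack both tables, then add credits and subtract debits per account.
totals = conn.execute("""
    SELECT accountno,
           SUM(CASE WHEN drcr = 'C' THEN amount ELSE -amount END) AS total
    FROM (
        SELECT accountno, drcr, amount FROM transactions
        UNION ALL
        SELECT accountno, drcr, amount FROM transactions_archive
    ) AS t
    WHERE drcr IN ('C', 'D')
    GROUP BY accountno
    ORDER BY accountno
""").fetchall()

print(totals)  # [(1001, 400), (1002, 225)]
```

UNION ALL rather than UNION matters here: UNION would silently drop identical rows that happen to exist in both tables and change the sums.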
SELECT TOP 1 (amount - nextamount) AS diff
FROM (
SELECT drcr, amount, LEAD(amount, 1, 0) OVER (ORDER BY drcr) AS nextamount
FROM (
SELECT drcr, SUM(amount) AS amount
FROM (SELECT drcr, amount FROM transactions UNION ALL SELECT drcr, amount FROM transactions_archive) AS u
GROUP BY drcr) AS s) AS d
ORDER BY drcr

Compare suppliers from 2 tables in Oracle SQL

I have 2 tables.
Both tables contain suppliers with items. The table suppliers_with_awards lists the suppliers that are allowed to deliver an item; one item can have several suppliers. The table suppliers_with_incoming_goods lists the suppliers that actually supply the items. Sometimes a non-awarded supplier supplies an item.
Case 1
I need to check whether the item is in both tables and pick only those items whose suppliers differ.
Case 2
Same as case 1, but also pick up non-awarded suppliers.
My data:
CREATE TABLE suppliers_with_awards ( supplier, item ) AS
SELECT 'supplier1', 'item1' FROM DUAL UNION ALL
SELECT 'supplier2', 'item1' FROM DUAL UNION ALL
SELECT 'supplier3', 'item2' FROM DUAL UNION ALL
SELECT 'supplier4', 'item3' FROM DUAL ;
CREATE TABLE suppliers_with_incoming_goods ( supplier, item ) AS
SELECT 'supplier1', 'item1' FROM DUAL UNION ALL
SELECT 'supplier2', 'item1' FROM DUAL UNION ALL
SELECT 'supplier5', 'item2' FROM DUAL UNION ALL
SELECT 'supplier6', 'item4' FROM DUAL ;
With a simple join, item1 yields the unnecessary combinations supplier1-supplier2 and vice versa, although in reality supplier1 got the award and supplier1 delivers, and the same goes for supplier2. So I used row_number to exclude such cross combinations; if you have a better solution, let me know.
with award as (
select supplier, item, row_number() over (partition by item order by supplier) r
from suppliers_with_awards
),
goods as (
select supplier, item, row_number() over (partition by item order by supplier) r
from suppliers_with_incoming_goods
)
select a.supplier, a.item, g.supplier
from award a
join goods g on a.item = g.item and a.r = g.r and a.supplier <> g.supplier;
SUPPLIER   ITEM   SUPPLIER
supplier3  item2  supplier5
This query finds item2 because its suppliers differ, which is what I want (case 1). Again, if there is a better solution for this, please let me know.
But I also need to somehow get the non-awarded supplier6 with item4 (case 2).
Thanks
The current query may not return all results once different values are inserted, for example supplier7 and supplier8 for item1 in the table suppliers_with_awards. I don't recommend an analytic function in this case; instead you can convert the query into the following, which uses NOT EXISTS. It also uses UNION ALL, since you may need to return more than two suppliers, each listed on its own separate line.
--# Case 1
WITH item_supplier AS
(
SELECT g.item AS item,
a.supplier AS supplier_a,g.supplier AS supplier_g
FROM suppliers_with_awards a
JOIN suppliers_with_incoming_goods g
ON a.item = g.item
)
SELECT DISTINCT item, supplier_a AS supplier
FROM item_supplier i
WHERE NOT EXISTS ( SELECT 0
FROM item_supplier
WHERE supplier_g = i.supplier_a)
UNION ALL
SELECT DISTINCT item, supplier_g
FROM item_supplier i
WHERE NOT EXISTS ( SELECT 0
FROM item_supplier
WHERE supplier_a = i.supplier_g)
For the second case, just convert the INNER JOIN to a RIGHT or FULL JOIN, and filter out the NULL values of item and supplier in the main query, such as:
--# Case 2
WITH item_supplier AS
(
SELECT g.item AS item,
a.supplier AS supplier_a,g.supplier AS supplier_g
FROM suppliers_with_awards a
RIGHT JOIN suppliers_with_incoming_goods g
ON a.item = g.item
), its AS
(
SELECT DISTINCT item, supplier_a as supplier
FROM item_supplier i
WHERE NOT EXISTS ( SELECT 0
FROM item_supplier
WHERE supplier_g = i.supplier_a)
UNION ALL
SELECT DISTINCT item, supplier_g
FROM item_supplier i
WHERE NOT EXISTS ( SELECT 0
FROM item_supplier
WHERE supplier_a = i.supplier_g)
)
SELECT *
FROM its
WHERE item IS NOT NULL
AND supplier IS NOT NULL
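The two cases can be tried side by side with the sketch below, which uses Python's sqlite3 module and the sample data from the question. Two things are assumptions of this sketch rather than part of the answer: SQLite stands in for Oracle, and the RIGHT JOIN is written as an equivalent LEFT JOIN with the incoming-goods table on the left, because older SQLite versions have no RIGHT JOIN. The helper name mismatched_suppliers is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers_with_awards (supplier TEXT, item TEXT);
    CREATE TABLE suppliers_with_incoming_goods (supplier TEXT, item TEXT);
    INSERT INTO suppliers_with_awards VALUES
        ('supplier1', 'item1'), ('supplier2', 'item1'),
        ('supplier3', 'item2'), ('supplier4', 'item3');
    INSERT INTO suppliers_with_incoming_goods VALUES
        ('supplier1', 'item1'), ('supplier2', 'item1'),
        ('supplier5', 'item2'), ('supplier6', 'item4');
""")

def mismatched_suppliers(join_keyword):
    """Run the NOT EXISTS query with the given join from goods to awards."""
    query = f"""
        WITH item_supplier AS (
            SELECT g.item AS item, a.supplier AS supplier_a, g.supplier AS supplier_g
            FROM suppliers_with_incoming_goods g
            {join_keyword} suppliers_with_awards a ON a.item = g.item
        ), its AS (
            SELECT DISTINCT item, supplier_a AS supplier
            FROM item_supplier i
            WHERE NOT EXISTS (SELECT 0 FROM item_supplier WHERE supplier_g = i.supplier_a)
            UNION ALL
            SELECT DISTINCT item, supplier_g
            FROM item_supplier i
            WHERE NOT EXISTS (SELECT 0 FROM item_supplier WHERE supplier_a = i.supplier_g)
        )
        SELECT item, supplier FROM its
        WHERE item IS NOT NULL AND supplier IS NOT NULL
    """
    return sorted(conn.execute(query).fetchall())

case1 = mismatched_suppliers("JOIN")       # inner join: items present in both tables
case2 = mismatched_suppliers("LEFT JOIN")  # keeps goods rows without any award

print(case1)  # [('item2', 'supplier3'), ('item2', 'supplier5')]
print(case2)  # [('item2', 'supplier3'), ('item2', 'supplier5'), ('item4', 'supplier6')]
```

In case 2 the outer join produces a row for item4 with a NULL awarded supplier; the final IS NOT NULL filter drops that NULL entry while keeping supplier6.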

Comparing between rows in same table in Oracle SQL

I'm trying to find the best way to compare rows by CustomerID and Status. In other words, show a CustomerID only when the status is the same across all of its rows. If not, don't show the CustomerID.
Example data
CUSTOMERID STATUS
1000 ACTIVE
1000 ACTIVE
1000 NOT ACTIVE
2000 ACTIVE
2000 ACTIVE
RESULT I'm hoping for
CUSTOMERID STATUS
2000 ACTIVE
You can do this with a WHERE NOT EXISTS:
Select Distinct CustomerId, Status
From YourTable A
Where Not Exists
(
Select *
From YourTable B
Where A.CustomerId = B.CustomerId
And A.Status <> B.Status
)
SELECT DISTINCT o.*
FROM
(
SELECT
CustomerId
FROM
TableName
GROUP BY
CustomerId
HAVING
COUNT(DISTINCT Status) = 1
) t
INNER JOIN TableName o
ON t.CustomerId = o.CustomerId
The only "Code" here is the last 4 lines in the code block. The other is establishing sample data.
with T1 as (
Select 1000 as CUSTOMERID, 'ACTIVE' as STATUS from dual union all
select 1000, 'ACTIVE' from dual union all
select 1000, 'NOT ACTIVE' from dual union all
select 2000, 'ACTIVE' from dual union all
select 2000, 'ACTIVE' from dual )
SELECT customerID, max(status) as status
FROM T1
GROUP BY customerID
HAVING count(distinct Status) = 1
I used a CTE (common table expression) to set up the sample data and called it T1.
The order of operations matters here:
1. The table T1 is identified.
2. The engine groups by customerID.
3. The engine limits the results to the groups whose status has exactly one distinct value.
4. The engine picks the max status. Min or max doesn't matter, as there is only one possible value per group. Note that we have to use an aggregate here, since grouping by status would not give the desired results.
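The steps above can be sketched with Python's sqlite3 module, using the sample rows from the question (the table name customer_status is hypothetical, and SQLite is only a stand-in for Oracle).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_status (customerid INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO customer_status VALUES (?, ?)",
    [(1000, "ACTIVE"), (1000, "ACTIVE"), (1000, "NOT ACTIVE"),
     (2000, "ACTIVE"), (2000, "ACTIVE")],
)

# Group per customer, keep groups with a single distinct status,
# then use MAX only to surface that one status value.
rows = conn.execute("""
    SELECT customerid, MAX(status) AS status
    FROM customer_status
    GROUP BY customerid
    HAVING COUNT(DISTINCT status) = 1
    ORDER BY customerid
""").fetchall()

print(rows)  # [(2000, 'ACTIVE')]
```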
Here's a pretty simple one using IN:
SELECT DISTINCT CustomerID, Status
FROM My_Table
WHERE CustomerID IN
(SELECT CustomerID
FROM My_Table
GROUP BY CustomerID
HAVING COUNT(Distinct Status) = 1)
Addition: based on your comment, it seems what you really want is all the IDs that do not have a 'NOT ACTIVE' row, which is actually easier:
SELECT DISTINCT CustomerID, Status
FROM My_Table
WHERE CustomerID NOT IN
(SELECT CustomerID
FROM My_Table
WHERE Status = 'NOT ACTIVE')
This is a SQL Server answer; I believe it should also work in Oracle.
SELECT
a.AGMTNUM
FROM TableA a
WHERE NOT EXISTS (SELECT 1 FROM TableB b WHERE b.Status = 'NOT ACTIVE' AND a.AGMTNUM = b.AGMTNUM)
AND EXISTS (SELECT 1 FROM TableB c WHERE c.Status = 'ACTIVE' AND a.AGMTNUM = c.AGMTNUM)
This will only return values that have at least one 'ACTIVE' value and no 'NOT ACTIVE' values.

Intersect Select Statements on Specific Columns

I've a table of SalesDetails, looking like this:
InvoiceID, LineID, Product
1,1,Apple
1,2,Banana
2,1,Apple
2,2,Mango
3,1,Apple
3,2,Banana
3,3,Mango
My requirement is to return rows where an Invoice contained sales of both: Apple AND Banana, but if there are other products on such an invoice, I don't want those.
So the result should be:
1,1,Apple
1,2,Banana
3,1,Apple
3,2,Banana
I tried the following:
Select * from SalesDetails where Product = 'Apple'
Intersect
Select * from SalesDetails where Product = 'Banana'
Didn't work, because it seems Intersect needs to match all the columns.
What I'm hoping to do is:
Select * from SalesDetails where Product = 'Apple'
Intersect ----On InvoiceID-----
Select * from SalesDetails where Product = 'Banana'
Is there a way to do this?
Or do I have to first Intersect on InvoiceIDs only using my criteria, then select the rows of those InvoiceIDs where the criteria is matched again, I.e.:
Select * From SalesDetails
Where Product In ('Apple', 'Banana') And InvoiceID In
(
Select InvoiceID from SalesDetails where Product = 'Apple'
Intersect
Select InvoiceID from SalesDetails where Product = 'Banana'
)
Which seems somewhat wasteful as it's examining the criteria twice.
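A small sketch of why the first attempt returns nothing and the two-step version works, using Python's sqlite3 module with the sample rows above (SQLite as the engine is an assumption of this sketch).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesDetails (InvoiceID INTEGER, LineID INTEGER, Product TEXT)")
conn.executemany(
    "INSERT INTO SalesDetails VALUES (?, ?, ?)",
    [(1, 1, "Apple"), (1, 2, "Banana"), (2, 1, "Apple"), (2, 2, "Mango"),
     (3, 1, "Apple"), (3, 2, "Banana"), (3, 3, "Mango")],
)

# INTERSECT compares whole rows, and no row is both an Apple and a Banana row.
full_row_intersect = conn.execute("""
    SELECT * FROM SalesDetails WHERE Product = 'Apple'
    INTERSECT
    SELECT * FROM SalesDetails WHERE Product = 'Banana'
""").fetchall()

# Intersecting on InvoiceID only, then fetching the matching lines.
rows = conn.execute("""
    SELECT * FROM SalesDetails
    WHERE Product IN ('Apple', 'Banana')
      AND InvoiceID IN (
          SELECT InvoiceID FROM SalesDetails WHERE Product = 'Apple'
          INTERSECT
          SELECT InvoiceID FROM SalesDetails WHERE Product = 'Banana'
      )
    ORDER BY InvoiceID, LineID
""").fetchall()

print(full_row_intersect)  # []
print(rows)  # [(1, 1, 'Apple'), (1, 2, 'Banana'), (3, 1, 'Apple'), (3, 2, 'Banana')]
```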
Okay this time I've managed to get reuse of the Apple/Banana info by using a CTE.
with sd as (
Select * from SalesDetails
where (Product in ('Apple', 'Banana'))
)
Select * from sd where invoiceid in (Select invoiceid from
sd group by invoiceid having Count(distinct product) = 2)
Do it with conditional aggregation:
select *
from SalesDetails
where product in ('apple', 'banana') and invoiceid in(
select invoiceid
from SalesDetails
group by invoiceid
having sum(case when product in('apple', 'banana') then 1 else 0 end) >= 2)
I think OP's suggestion is about the best one can do. The following might be faster, although I expect the difference to be slight and I have not done any benchmarking.
Select * From SalesDetails
Where Product ='Apple' And InvoiceID In
(
Select InvoiceID from SalesDetails where Product = 'Banana'
)
union all
select * from SalesDetails
Where Product ='Banana' And InvoiceID In
(
Select InvoiceID from SalesDetails where Product = 'Apple'
)
A self-join will solve the problem.
SELECT T1.*
FROM SalesDetails T1
INNER JOIN SalesDetails T2 ON T1.InvoiceId = T2.InvoiceId
AND (T1.Product = 'Apple' AND T2.Product = 'Banana'
OR T1.Product = 'Banana' AND t2.Product = 'Apple')
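Here is the self-join run against the sample data, as a sketch using Python's sqlite3 module (SQLite stands in for the original engine; the ORDER BY is added only to make the output deterministic).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesDetails (InvoiceID INTEGER, LineID INTEGER, Product TEXT)")
conn.executemany(
    "INSERT INTO SalesDetails VALUES (?, ?, ?)",
    [(1, 1, "Apple"), (1, 2, "Banana"), (2, 1, "Apple"), (2, 2, "Mango"),
     (3, 1, "Apple"), (3, 2, "Banana"), (3, 3, "Mango")],
)

# Each Apple line is kept if a Banana line exists on the same invoice,
# and each Banana line is kept if an Apple line exists on the same invoice.
self_join_rows = conn.execute("""
    SELECT T1.*
    FROM SalesDetails T1
    INNER JOIN SalesDetails T2
        ON T1.InvoiceID = T2.InvoiceID
       AND ((T1.Product = 'Apple' AND T2.Product = 'Banana')
         OR (T1.Product = 'Banana' AND T2.Product = 'Apple'))
    ORDER BY T1.InvoiceID, T1.LineID
""").fetchall()

print(self_join_rows)
# [(1, 1, 'Apple'), (1, 2, 'Banana'), (3, 1, 'Apple'), (3, 2, 'Banana')]
```

If an invoice could list the same product on more than one line, the join would repeat the matching rows; SELECT DISTINCT would be one way to guard against that.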
declare @t table (Id int,val int,name varchar(10))
insert into @t (id,val,name)values
(1,1,'Apple'),
(1,2,'Banana'),
(2,1,'Apple'),
(2,2,'Mango'),
(3,1,'Apple'),
(3,2,'Banana'),
(3,3,'Mango')
;with cte as (
select ID,val,name,ROW_NUMBER()OVER (PARTITION BY id ORDER BY val)RN from @t)
,cte2 AS(
select TOP 1 c.Id,c.val,c.name,C.RN from cte c
WHERE RN = 1
UNION ALL
select c.Id,c.val,c.name,C.RN from cte c
WHERE c.Id <> c.val)
select Id,val,name from (
select Id,val,name,COUNT(RN)OVER (PARTITION BY Id )R from cte2 )R
WHERE R = 2
Another way is to use PIVOT like this:
DECLARE @DataSource TABLE
(
[InvoiceID] TINYINT
,[LineID] TINYINT
,[Product] VARCHAR(12)
);
INSERT INTO @DataSource ([InvoiceID], [LineID], [Product])
VALUES (1,1,'Apple')
,(1,2,'Banana')
,(2,1,'Apple')
,(2,2,'Mango')
,(3,1,'Apple')
,(3,2,'Banana')
,(3,3,'Mango');
SELECT *
FROM @DataSource
PIVOT
(
MAX([LineID]) FOR [Product] IN ([Apple], [Banana])
) PVT
WHERE [Apple] IS NOT NULL
AND [Banana] IS NOT NULL;
It returns one row per invoice, with the LineID of each product in its own column; you can UNPIVOT the result if you want the original row layout.
Or you can use a window function like this:
;WITH DataSource AS
(
SELECT *
,SUM(1) OVER (PARTITION BY [InvoiceID]) AS [Match]
FROM @DataSource
WHERE [Product] = 'Apple' OR [Product] = 'Banana'
)
SELECT *
FROM DataSource
WHERE [Match] = 2
First, COUNT the number of rows per InvoiceID that match the criteria Product = 'Apple' or 'Banana'. Then join back to the table and keep only the rows where the COUNT is >= 2, i.e. the number of products in your criteria.
SELECT sd.*
FROM (
SELECT InvoiceID, CC = COUNT(*)
FROM SalesDetails
WHERE Product IN('Apple', 'Banana')
GROUP BY InvoiceID
)t
INNER JOIN SalesDetails sd
ON sd.InvoiceID = t.InvoiceID
WHERE
t.CC >= 2
AND sd.Product IN('Apple', 'Banana')
WITH cte
AS
(
SELECT *
FROM [dbo].[SalesDetails]
WHERE [Product]='banana')
,cte1
AS
(SELECT *
FROM [dbo].[SalesDetails]
WHERE [Product]='apple')
SELECT *
FROM cte c INNER JOIN cte1 c1
ON c.[InvoiceID]=c1.[InvoiceID]
Here is a method using window functions:
select sd.*
from (select sd.*,
max(case when product = 'Apple' then 1 else 0 end) over (partition by invoiceid) as HasApple,
max(case when product = 'Banana' then 1 else 0 end) over (partition by invoiceid) as HasBanana
from salesdetails sd
) sd
where (product = 'Apple' and HasBanana > 0) or
(product = 'Banana' and HasApple > 0);
If you only want to write the condition once and are sure that each Product appears only once on any invoice, you can use this:
SELECT * FROM (
SELECT InvoiceID, Product
,COUNT(*) OVER (PARTITION BY InvoiceID) matchcount
FROM SalesDetails
WHERE Product IN ('Apple','Banana') ) WHERE matchcount = 2;
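The caveat about each product appearing only once matters in practice. The sketch below, using Python's sqlite3 module, adds a hypothetical invoice 4 with two Apple lines and no Banana, and compares counting rows with counting distinct products. It uses GROUP BY instead of the window function above purely to keep the example short; SQLite as the engine is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesDetails (InvoiceID INTEGER, LineID INTEGER, Product TEXT)")
conn.executemany(
    "INSERT INTO SalesDetails VALUES (?, ?, ?)",
    [(1, 1, "Apple"), (1, 2, "Banana"), (2, 1, "Apple"), (2, 2, "Mango"),
     (3, 1, "Apple"), (3, 2, "Banana"), (3, 3, "Mango"),
     (4, 1, "Apple"), (4, 2, "Apple")],  # hypothetical invoice: Apple twice, no Banana
)

def matching_invoices(count_expr):
    """Invoices whose Apple/Banana lines satisfy count_expr = 2."""
    query = f"""
        SELECT InvoiceID
        FROM SalesDetails
        WHERE Product IN ('Apple', 'Banana')
        GROUP BY InvoiceID
        HAVING {count_expr} = 2
        ORDER BY InvoiceID
    """
    return [row[0] for row in conn.execute(query)]

by_row_count = matching_invoices("COUNT(*)")                 # counts lines
by_distinct = matching_invoices("COUNT(DISTINCT Product)")   # counts products

print(by_row_count)  # [1, 3, 4]
print(by_distinct)   # [1, 3]
```

Counting rows lets invoice 4 through even though it has no Banana, while counting distinct products does not.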
This is what I ended up using, inspired by @Leon Bambrick:
(Expanded a little to support multiple products in the criteria)
WITH cteUnionBase AS
(SELECT * FROM SalesDetails
WHERE Product IN ('Apple Red','Apple Yellow','Apple Green','Banana Small','Banana Large')),
cteBanana AS
(SELECT * FROM cteUnionBase
WHERE Product IN ('Banana Small','Banana Large')),
cteApple AS
(SELECT * FROM cteUnionBase
WHERE Product IN ('Apple Red','Apple Yellow','Apple Green')),
cteIntersect AS
(
SELECT InvoiceID FROM cteApple
Intersect
SELECT InvoiceID FROM cteBanana
)
SELECT cteUnionBase.*
FROM cteUnionBase INNER JOIN cteIntersect
on cteUnionBase.InvoiceID = cteIntersect.InvoiceID