Joining SQL tables and removing repetitions - sql

I am trying to get a list of the customers with the same phone number, as there are instances of the same customer being created two or three times with slightly different names.
The query below has almost the intended behavior:
SELECT C1.CUSTOMER_NAME, C2.CUSTOMER_NAME, C1.PHONE_NUMBER
FROM CUSTOMER C1
JOIN CUSTOMER C2
ON C1.PHONE_NUMBER = C2.PHONE_NUMBER
WHERE C1.CUSTOMER_NAME != C2.CUSTOMER_NAME
AND C1.PHONE_NUMBER != ''
ORDER BY C1.CUSTOMER_NAME
But I get repetitions like:
Customer A - Customer B
Customer A - Customer C
Customer B - Customer A
Customer B - Customer C
Customer C - Customer A
Customer C - Customer B
When all I want to get is the first two lines, which are enough to cover all the cases.
Thanks in advance for the help.

I'm not sure you want just the first two lines . . . because the last line seems different.
In any case, you can replace the != with < to get what you want:
SELECT C1.CUSTOMER_NAME, C2.CUSTOMER_NAME, C1.PHONE_NUMBER
FROM CUSTOMER C1 JOIN
CUSTOMER C2
ON C1.PHONE_NUMBER = C2.PHONE_NUMBER AND
C1.CUSTOMER_NAME < C2.CUSTOMER_NAME
WHERE C1.PHONE_NUMBER <> ''
ORDER BY C1.CUSTOMER_NAME;
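To make the effect of `<` concrete, here is a minimal sketch using Python's standard sqlite3 module. The sample rows are made up for illustration; only the table and column names come from the question. Note that three customers on one number yield three pairs (A-B, A-C, B-C), not two.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_NAME TEXT, PHONE_NUMBER TEXT);
    INSERT INTO CUSTOMER VALUES
        ('Customer A', '555-0100'),
        ('Customer B', '555-0100'),
        ('Customer C', '555-0100'),
        ('Customer D', '555-0199'),
        ('Customer E', '');
""")

# "<" keeps each unordered pair once: (A, B) survives, (B, A) does not.
pairs = conn.execute("""
    SELECT C1.CUSTOMER_NAME, C2.CUSTOMER_NAME
    FROM CUSTOMER C1
    JOIN CUSTOMER C2
      ON C1.PHONE_NUMBER = C2.PHONE_NUMBER
     AND C1.CUSTOMER_NAME < C2.CUSTOMER_NAME
    WHERE C1.PHONE_NUMBER <> ''
    ORDER BY C1.CUSTOMER_NAME, C2.CUSTOMER_NAME
""").fetchall()

for pair in pairs:
    print(pair)
```

Customer D (unique number) and Customer E (empty number) do not appear in the output.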
If you just want all the customers on a given phone number -- when there is more than one customer -- then you do not need a join:
select c.phone_number, c.customer_name
from (select c.*, count(*) over (partition by phone_number) as cnt
from customer c
) c
where cnt > 1
order by c.phone_number, c.customer_name;

You could use a subquery (or a JOIN with the same logic) to get the duplicate numbers first, then report on all the customers with that number:
SELECT CUSTOMER_NAME, PHONE_NUMBER
FROM CUSTOMER
WHERE PHONE_NUMBER IN (SELECT PHONE_NUMBER
FROM CUSTOMER
WHERE PHONE_NUMBER != ''
GROUP BY PHONE_NUMBER
HAVING COUNT(*) > 1)
ORDER BY PHONE_NUMBER
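An aggregate such as COUNT is evaluated per group, so a filter on it belongs in HAVING after a GROUP BY rather than in WHERE. The sketch below runs the subquery approach in that form with Python's sqlite3 module; the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (CUSTOMER_NAME TEXT, PHONE_NUMBER TEXT);
    INSERT INTO CUSTOMER VALUES
        ('Customer A', '555-0100'),
        ('Customer B', '555-0100'),
        ('Customer C', '555-0100'),
        ('Customer D', '555-0199'),
        ('Customer E', ''),
        ('Customer F', '');
""")

# The inner query finds non-empty numbers used more than once;
# the outer query lists every customer on those numbers.
duplicates = conn.execute("""
    SELECT CUSTOMER_NAME, PHONE_NUMBER
    FROM CUSTOMER
    WHERE PHONE_NUMBER IN (SELECT PHONE_NUMBER
                           FROM CUSTOMER
                           WHERE PHONE_NUMBER != ''
                           GROUP BY PHONE_NUMBER
                           HAVING COUNT(*) > 1)
    ORDER BY PHONE_NUMBER, CUSTOMER_NAME
""").fetchall()

for row in duplicates:
    print(row)
```

Customers E and F share an empty number but are excluded by the `!= ''` filter.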

Related

How to get all rows from one table which have all relations?

I have 3 tables:
companies (id, name)
union_products (id, name)
products (id, company_id, union_product_id, price_per_one_product)
I need to get all companies which have products with union_product_id in (1,2) and total price of products (per company) is less than 100.
What I am trying to do now:
select * from "companies" where exists
(
select id from "products"
where "companies"."id" = "products"."company_id"
and "union_product_id" in (1, 2)
group by id
having COUNT(distinct union_product_id) = 2 AND SUM(price_per_one_product) < 100
)
The problem I'm stuck with is that I get 0 rows from the query above, but it works if I change COUNT(distinct union_product_id) = 2 to 1.
DB fiddle: https://www.db-fiddle.com/f/iRjfzJe2MTmnwEcDXuJoxn/0
Try joining the three tables as follows:
SELECT C.id, C.name FROM
products P JOIN union_products U
ON P.union_product_id=U.id
JOIN companies C
ON P.company_id=C.id
WHERE P.union_product_id IN (1, 2)
GROUP BY C.id, C.name
HAVING COUNT(DISTINCT P.union_product_id) = 2 AND
SUM(P.price_for_one_product) < 100
ORDER BY C.id
See a demo.
SELECT c.name FROM "companies" c
JOIN "products" p ON c.id = p.company_id
WHERE union_product_id IN (1, 2) AND price_for_one_product < 100
GROUP BY c.name
HAVING COUNT(DISTINCT p.union_product_id) = 2
This returns the names of all companies that provide both union_product_id 1 and 2, counting only products whose price_for_one_product / price_per_one_product is less than 100. Note that the price filter here applies to each product row, not to the company's total.
Note: You might need to replace price_for_one_product with price_per_one_product: the question uses price_per_one_product, but the table definition in the db-fiddle link uses price_for_one_product.
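The original attempt returns 0 rows because its subquery groups by the product id: every group then holds a single product row, so COUNT(DISTINCT union_product_id) can never reach 2. The sketch below (Python's sqlite3 module, invented sample rows, column named price_per_one_product as in the question) compares the original query, grouping per company with a total-price condition, and filtering the price per row.

```python
import sqlite3

# Hypothetical sample data; table/column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, company_id INTEGER,
                           union_product_id INTEGER,
                           price_per_one_product INTEGER);
    INSERT INTO companies VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cargo');
    INSERT INTO products VALUES
        (1, 1, 1, 30), (2, 1, 2, 40), (3, 1, 3, 500),
        (4, 2, 1, 60), (5, 2, 2, 70),
        (6, 3, 1, 10);
""")

# Original attempt: grouping by product id gives one-row groups.
original = conn.execute("""
    SELECT * FROM companies WHERE EXISTS (
        SELECT id FROM products
        WHERE companies.id = products.company_id
          AND union_product_id IN (1, 2)
        GROUP BY id
        HAVING COUNT(DISTINCT union_product_id) = 2
           AND SUM(price_per_one_product) < 100)
""").fetchall()

# Group per company; the SUM is the company's total over products 1 and 2.
by_total = conn.execute("""
    SELECT c.id, c.name
    FROM companies c
    JOIN products p ON p.company_id = c.id
    WHERE p.union_product_id IN (1, 2)
    GROUP BY c.id, c.name
    HAVING COUNT(DISTINCT p.union_product_id) = 2
       AND SUM(p.price_per_one_product) < 100
    ORDER BY c.id
""").fetchall()

# Price filter in WHERE: checks each product row, not the total.
by_row = conn.execute("""
    SELECT c.id, c.name
    FROM companies c
    JOIN products p ON p.company_id = c.id
    WHERE p.union_product_id IN (1, 2)
      AND p.price_per_one_product < 100
    GROUP BY c.id, c.name
    HAVING COUNT(DISTINCT p.union_product_id) = 2
    ORDER BY c.id
""").fetchall()

print(original, by_total, by_row)
```

Bolt's two products cost 60 and 70: each is under 100, but the total of 130 is not, so only the per-row variant returns it.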

Select columns with multiple purchases on a given date

table structure
I need to get the names (FIO) of all the people who purchased product A and product B on the same day (there must be 2 purchases in one day) in April, or any other specified month.
What I tried to do is
with
purchased_items as
(
select customer_key
from purchase p
join product pr
on p.product_key = pr.product_key
where pr.name in ('Teddy bear', 'LEGO')
AND p.date_sold BETWEEN '01.04.2019' AND '30.04.2019'
group by customer_key
having count(distinct p.product_key) = 2
)
select *
from customer c
where exists (
select *
from purchased_items pui
where c.customer_key = pui.customer_key
);
But that only gives clients that bought 2 items in a month (not a single day).
I also suspect that it can be done by querying for Date, Client_name (FIO), array_agg(product.Name /*group by date */ ), but I am not sure how to implement it.
Thank you in advance for any help !
EDIT: figured it out.
with a as (
SELECT c.FIO, p.date_sold, pr.Name
FROM customer c
JOIN purchase p ON c.customer_key = p.customer_key
JOIN product pr ON p.product_key = pr.product_key
WHERE p.date_sold BETWEEN '01.04.2019' AND '30.04.2019'
AND (pr.name LIKE '%LEGO%' OR pr.name LIKE '%Teddy bear%')
),
count_name as (
select fio, count(distinct Name) as count, date_sold
from a
group by date_sold, fio)
select DISTINCT FIO
from count_name
where count=2
Although it's probably very suboptimal.
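The same result can probably be had in one pass by adding the sale date to the GROUP BY, so that the distinct-product count is taken per customer per day. This is a sketch only: the table structure is assumed from the column names used above, the sample rows are invented, and the dates are stored as ISO text because SQLite (used here through Python's sqlite3 module) compares them as strings.

```python
import sqlite3

# Assumed schema and hypothetical rows; dates are ISO text for SQLite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_key INTEGER PRIMARY KEY, fio TEXT);
    CREATE TABLE product (product_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (customer_key INTEGER, product_key INTEGER,
                           date_sold TEXT);
    INSERT INTO customer VALUES (1, 'Ivanov'), (2, 'Petrov'), (3, 'Sidorov');
    INSERT INTO product VALUES (1, 'Teddy bear'), (2, 'LEGO'), (3, 'Puzzle');
    INSERT INTO purchase VALUES
        (1, 1, '2019-04-10'), (1, 2, '2019-04-10'),   -- both on one day
        (2, 1, '2019-04-05'), (2, 2, '2019-04-20'),   -- different days
        (2, 3, '2019-04-05'),                         -- other product
        (3, 1, '2019-05-02'), (3, 2, '2019-05-02');   -- outside April
""")

# Grouping by customer AND date makes the count apply to a single day.
same_day_buyers = conn.execute("""
    SELECT DISTINCT c.fio
    FROM customer c
    JOIN purchase p ON p.customer_key = c.customer_key
    JOIN product pr ON pr.product_key = p.product_key
    WHERE pr.name IN ('Teddy bear', 'LEGO')
      AND p.date_sold BETWEEN '2019-04-01' AND '2019-04-30'
    GROUP BY c.customer_key, c.fio, p.date_sold
    HAVING COUNT(DISTINCT pr.product_key) = 2
    ORDER BY c.fio
""").fetchall()

print(same_day_buyers)
```

The DISTINCT removes repeats when a customer qualifies on more than one day.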

Selecting Distinct Records Not In A Set

I have a table of customers with their customer contact options.
Customers can be contacted in one of three ways:
Telephone (1)
SMS (2)
Email (3)
The FK id is in brackets.
If I wanted to pull out a list of distinct customer IDs for both, say, SMS and Email, I could do the following:
SELECT DISTINCT customer_id
FROM contact_options
WHERE contact_option_type_id IN (2,3)
But how do I do the inverse? Say I want a (DISTINCT) list of customers who don't have a Telephone contact. Can I do this without using a sub-query?
I realise the example is contrived, in practice I have very many different contact options (around 80).
One option is to use aggregation:
SELECT customer_id
FROM contact_options
GROUP BY customer_id
HAVING SUM(CASE WHEN contact_option_type_id = 1 THEN 1 ELSE 0 END) = 0;
We can also try doing this using EXISTS:
SELECT DISTINCT customer_id
FROM contact_options c1
WHERE NOT EXISTS (SELECT 1 FROM contact_options c2
WHERE c1.customer_id = c2.customer_id AND c2.contact_option_type_id = 1);
You can use not exists:
select distinct c.customer_id
from contact_options c
where not exists (select 1
from contact_options
where customer_id = c.customer_id and
contact_option_type_id = 1
);
I don't think EXISTS has a performance issue if you have proper indexes declared.
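As a quick check that the aggregation and NOT EXISTS approaches agree, here is a small sketch using Python's sqlite3 module with invented rows. Both only see customers that have at least one row in contact_options; a customer with no contact options at all would need the customers table itself, which is not shown in the question.

```python
import sqlite3

# Hypothetical rows: customers 1 and 4 have a Telephone (1) option.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact_options (customer_id INTEGER,
                                  contact_option_type_id INTEGER);
    INSERT INTO contact_options VALUES
        (1, 1), (1, 2),
        (2, 2), (2, 3),
        (3, 3),
        (4, 1);
""")

# Aggregation: keep customers whose count of type-1 rows is zero.
via_aggregation = conn.execute("""
    SELECT customer_id
    FROM contact_options
    GROUP BY customer_id
    HAVING SUM(CASE WHEN contact_option_type_id = 1 THEN 1 ELSE 0 END) = 0
    ORDER BY customer_id
""").fetchall()

# NOT EXISTS: keep customers with no type-1 row of their own.
via_not_exists = conn.execute("""
    SELECT DISTINCT c1.customer_id
    FROM contact_options c1
    WHERE NOT EXISTS (SELECT 1
                      FROM contact_options c2
                      WHERE c2.customer_id = c1.customer_id
                        AND c2.contact_option_type_id = 1)
    ORDER BY c1.customer_id
""").fetchall()

print(via_aggregation, via_not_exists)
```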

SQL - Select highest value when data across 3 tables

I have 3 tables:
Person (with a column PersonKey)
Telephone (with columns Tel_NumberKey, Tel_Number, Tel_NumberType e.g. 1=home, 2=mobile)
xref_Person+Telephone (columns PersonKey, Tel_NumberKey, CreatedDate, ModifiedDate)
I'm looking to get the most recent (e.g. the highest Tel_NumberKey) from the xref_Person+Telephone for each Person and use that Tel_NumberKey to get the actual Tel_Number from the Telephone table.
The problem I am having is that I keep getting duplicates for the same Tel_NumberKey. I also need to be sure I get both the home and mobile from the Telephone table, which I've been looking to do via 2 individual joins for each Tel_NumberType - again getting duplicates.
Been trying the following but to no avail:
-- For HOME
SELECT
p.PersonKey, pn.Phone_Number, pn.Tel_NumberKey
FROM
Persons AS p
INNER JOIN
xref_Person+Telephone AS x ON p.PersonKey = x.PersonKey
INNER JOIN
Telephone AS pn ON x.Tel_NumberKey = pn.Tel_NumberKey
WHERE
pn.Tel_NumberType = 1 -- e.g. Home phone number
AND pn.Tel_NumberKey = (SELECT MAX(pn1.Tel_NumberKey) AS Tel_NumberKey
FROM Person AS p1
INNER JOIN xref_Person+Telephone AS x1 ON p1.PersonKey = x1.PersonKey
INNER JOIN Telephone AS pn1 ON x1.Tel_NumberKey = pn1.Tel_NumberKey
WHERE pn1.Tel_NumberType = 1
AND p1.PersonKey = p.PersonKey
AND pn1.Tel_Number = pn.Tel_Number)
ORDER BY
p.PersonKey
And have been looking over the following links but again keep getting duplicates.
SQL select max(date) and corresponding value
How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?
SQL Server: SELECT only the rows with MAX(DATE)
I'm sure this must be possible, but I've been at this for a couple of days and can't believe it's that difficult to get the most recent / highest value when referencing 3 tables. Any help greatly appreciated.
select *
from
( SELECT p.PersonKey, pn.Tel_Number, pn.Tel_NumberKey
, row_number() over (partition by p.PersonKey order by pn.Tel_NumberKey desc) rn
FROM
Person AS p
INNER JOIN
[xref_Person+Telephone] AS x ON p.PersonKey = x.PersonKey
INNER JOIN
Telephone AS pn ON x.Tel_NumberKey = pn.Tel_NumberKey
WHERE
pn.Tel_NumberType = 1 -- home; use 2 for mobile
) tt
where tt.rn = 1
ORDER BY
tt.PersonKey
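To get both the home and the mobile number in one query, one option is to partition by person and number type, so that each (person, type) pair keeps only its highest Tel_NumberKey. The sketch below uses Python's sqlite3 module (window functions need SQLite 3.25 or later); the rows are invented, the date columns of the xref table are omitted, and the Person table is left out because the xref table already carries PersonKey.

```python
import sqlite3

# Hypothetical rows; schema simplified from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Telephone (Tel_NumberKey INTEGER PRIMARY KEY,
                            Tel_Number TEXT, Tel_NumberType INTEGER);
    CREATE TABLE [xref_Person+Telephone] (PersonKey INTEGER,
                                          Tel_NumberKey INTEGER);
    INSERT INTO Telephone VALUES
        (10, '111', 1), (11, '222', 1), (12, '333', 2),
        (13, '444', 2), (14, '555', 1);
    INSERT INTO [xref_Person+Telephone] VALUES
        (1, 10), (1, 11), (1, 12),
        (2, 13), (2, 14);
""")

# rn = 1 marks the highest Tel_NumberKey within each (person, type).
latest = conn.execute("""
    SELECT PersonKey, Tel_NumberType, Tel_Number, Tel_NumberKey
    FROM (SELECT x.PersonKey, t.Tel_NumberType, t.Tel_Number,
                 t.Tel_NumberKey,
                 ROW_NUMBER() OVER (
                     PARTITION BY x.PersonKey, t.Tel_NumberType
                     ORDER BY t.Tel_NumberKey DESC) AS rn
          FROM [xref_Person+Telephone] x
          JOIN Telephone t ON t.Tel_NumberKey = x.Tel_NumberKey) AS ranked
    WHERE rn = 1
    ORDER BY PersonKey, Tel_NumberType
""").fetchall()

for row in latest:
    print(row)
```

Person 1 has two home numbers (keys 10 and 11); only key 11 is kept.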
You have to use the max() function and then order by rownum in descending order, like this:
select f.empno
from(select max(empno) empno from emp e
group by rownum)f
order by rownum desc
It will give you all employees, from the highest employee number to the lowest. Now apply it to your case and let me know.

comparing data from tables in sql server

I am facing great peril.
I have 2 TABLES--- purchaseTbl and CustomerTbl , which contain:
purchaseTbl : C_ID (int - FK) , Purchase_amt (int)
CustomerTbl: C_ID (int - PK), [other details].
So I want to calculate the sum of all purchases where the C_ID matches in both tables.
Thank you
Gru
Use a GROUP BY clause in your query, like this:
SELECT CustomerTbl.C_ID, SUM(Purchase_amt) AS PurchaseSUM
FROM CustomerTbl, purchaseTbl
WHERE purchaseTbl.C_ID = CustomerTbl.C_ID
GROUP BY CustomerTbl.C_ID
SELECT C.C_ID,
--You can add more columns (like customer name) here if you wish
SUM(Purchase_amt) AS SUMP
FROM CustomerTbl C
JOIN purchaseTbl P
ON P.C_ID = C.C_ID
GROUP BY C.C_ID
--If you added more columns in the select add them here too separated with comma
If you just want to know the total amount and not split it into customers then:
SELECT SUM(Purchase_amt) AS SUMP
FROM CustomerTbl C
JOIN purchaseTbl P
ON P.C_ID = C.C_ID
The above will get the total amount only if there is a corresponding C_ID in CustomerTbl.
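A small sketch of both queries, run through Python's sqlite3 module on invented rows. The foreign key is deliberately not declared here so that an orphan purchase (C_ID 9) can be inserted to show what the inner join leaves out.

```python
import sqlite3

# Hypothetical rows; no FK constraint so the orphan row can exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustomerTbl (C_ID INTEGER PRIMARY KEY);
    CREATE TABLE purchaseTbl (C_ID INTEGER, Purchase_amt INTEGER);
    INSERT INTO CustomerTbl VALUES (1), (2), (3);
    INSERT INTO purchaseTbl VALUES (1, 100), (1, 50), (2, 30), (9, 999);
""")

# One total per customer that has at least one purchase.
per_customer = conn.execute("""
    SELECT C.C_ID, SUM(P.Purchase_amt) AS SUMP
    FROM CustomerTbl C
    JOIN purchaseTbl P ON P.C_ID = C.C_ID
    GROUP BY C.C_ID
    ORDER BY C.C_ID
""").fetchall()

# Single grand total over all matching purchases.
total = conn.execute("""
    SELECT SUM(P.Purchase_amt) AS SUMP
    FROM CustomerTbl C
    JOIN purchaseTbl P ON P.C_ID = C.C_ID
""").fetchone()[0]

print(per_customer, total)
```

Customer 3 has no purchases and so has no row in the per-customer result, and the orphan purchase of 999 is not counted in either query.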