Remove Duplicates from LEFT OUTER JOIN - sql

My question is quite similar to Restricting a LEFT JOIN, with a variation.
Assume I have a table SHOP and another table LOCATION. LOCATION is a sort of child table of SHOP with two columns of interest: a division key (calling it just KEY) and a "SHOP" number, which matches the number "NO" in table SHOP.
I tried this left outer join:
SELECT S.NO, L.KEY
FROM SHOP S
LEFT OUTER JOIN LOCATN L ON S.NO = L.SHOP
but I'm getting a lot of duplicates since there are many locations that belong to a single shop. I want to eliminate them and just get a list of "shop, key" entries without duplicates.
The data is correct but duplicates appear as follows:
SHOP KEY
1 XXX
1 XXX
2 YYY
3 ZZZ
3 ZZZ etc.
I would like the data to appear like this instead:
SHOP KEY
1 XXX
2 YYY
3 ZZZ etc.
SHOP table:
NO
1
2
3
LOCATION table:
LOCATION SHOP KEY
L-1 1 XXX
L-2 1 XXX
L-3 2 YYY
L-4 3 YYY
L-5 3 YYY
(ORACLE 10g Database)

You need to GROUP BY S.NO and L.KEY:
SELECT S.NO, L.KEY
FROM SHOP S
LEFT OUTER JOIN LOCATN L
ON S.NO = L.SHOP
GROUP BY S.NO, L.KEY
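Since no aggregate is being computed here, SELECT DISTINCT should give the same result as the GROUP BY. Below is a minimal sketch using the sample rows from the question. The column types are assumptions, and KEY may need quoting or renaming on databases where it is a reserved word.

```sql
-- Sample data copied from the question; column types are assumed.
CREATE TABLE SHOP (NO INTEGER);
CREATE TABLE LOCATN (LOCATION VARCHAR(10), SHOP INTEGER, KEY VARCHAR(10));

INSERT INTO SHOP VALUES (1);
INSERT INTO SHOP VALUES (2);
INSERT INTO SHOP VALUES (3);
INSERT INTO LOCATN VALUES ('L-1', 1, 'XXX');
INSERT INTO LOCATN VALUES ('L-2', 1, 'XXX');
INSERT INTO LOCATN VALUES ('L-3', 2, 'YYY');
INSERT INTO LOCATN VALUES ('L-4', 3, 'YYY');
INSERT INTO LOCATN VALUES ('L-5', 3, 'YYY');

-- DISTINCT collapses identical (NO, KEY) pairs into one row each.
SELECT DISTINCT S.NO, L.KEY
FROM SHOP S
LEFT OUTER JOIN LOCATN L ON S.NO = L.SHOP
ORDER BY S.NO;
```

With this data the query returns one row per shop, because every shop maps to a single key.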

EDIT: Following the update to your scenario
I think you should be able to do this with a simple subquery (though I haven't tested this against an Oracle database). Something like the following:
UPDATE shop s
SET divnkey = (SELECT DISTINCT L.KEY FROM LOCATN L WHERE S.NO = L.SHOP)
The above will raise an error in the event of a shop being associated with locations that are in multiple divisions.
If you just want to ignore this possibility and select an arbitrary one in that event you could use
UPDATE shop s
SET divnkey = (SELECT MAX(L.KEY) FROM LOCATN L WHERE S.NO = L.SHOP)
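Before choosing between the two UPDATE statements, it may help to check whether any shop actually spans more than one division. The sketch below assumes the LOCATN layout from the question. The column types and the extra row L-6 are hypothetical, added only so that the check has something to report.

```sql
-- Column types are assumed; row L-6 is a made-up conflict for illustration.
CREATE TABLE LOCATN (LOCATION VARCHAR(10), SHOP INTEGER, KEY VARCHAR(10));

INSERT INTO LOCATN VALUES ('L-1', 1, 'XXX');
INSERT INTO LOCATN VALUES ('L-2', 1, 'XXX');
INSERT INTO LOCATN VALUES ('L-3', 2, 'YYY');
INSERT INTO LOCATN VALUES ('L-4', 3, 'YYY');
INSERT INTO LOCATN VALUES ('L-6', 3, 'ZZZ');

-- Lists shops whose locations carry more than one distinct division key.
SELECT L.SHOP, COUNT(DISTINCT L.KEY) AS KEY_COUNT
FROM LOCATN L
GROUP BY L.SHOP
HAVING COUNT(DISTINCT L.KEY) > 1;
```

Here only shop 3 is listed, with a KEY_COUNT of 2. If the query returns no rows on the real data, the DISTINCT version of the UPDATE should not hit the multiple-division error.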

I had this problem too but I couldn't use GROUP BY to fix it because I was also returning TEXT type fields. (Same goes for using DISTINCT).
This code gave me duplicates:
select mx.*, case isnull(ty.ty_id,0) when 0 then 'N' else 'Y' end as inuse
from master_x mx
left outer join thing_y ty on mx.rpt_id = ty.rpt_id
I fixed it by rewriting it thusly:
select mx.*,
case when exists (select 1 from thing_y ty where mx.rpt_id = ty.rpt_id) then 'Y' else 'N' end as inuse
from master_x mx
As you can see, I didn't care about the data within the 2nd table (thing_y), just whether there was at least one match on the rpt_id within it. (FYI: rpt_id was also not the primary key on the 1st table, master_x.)

Related

SQL Selecting & Counting In the same query

thanks in advance for any help on this, I am a bit of a newbie to MS SQL and I want to do something that I think is achievable but don't have the know how.
I have a simple table called "suppliers" where I can do (SELECT id, name FROM suppliers ORDER BY id ASC)
id  name
1   ACME
2   First Stop Business Supplies
3   All in One Supply Warehouse
4   Farm First Supplies
I have another table called "products"
id  name    supplier_id
1   Item 1  2
2   Item 2  1
3   Item 3  1
4   Item 4  3
5   Item 5  2
I want to list all the suppliers and, on the same row, the total number of products for each supplier, if that makes sense. I am just not sure how to pass suppliers.id through the query to get the count.
I am hoping to get to this:
id  name                          total_products
1   ACME                          2
2   First Stop Business Supplies  2
3   All in One Supply Warehouse   1
4   Farm First Supplies           0
I really appreciate any help on this.
Three concepts to grasp here. Left Join, group by, and Count().
select s.id, s.name, Count(p.id) as total_products
from suppliers s
left join products p on s.id=p.supplier_id -- the left join keeps suppliers with no matches
group by s.id, s.name
LEFT JOIN is a join where all of the rows from the first table are kept even if there are no matches in the second.
GROUP BY is an aggregation tool; it lists the columns that define each group.
Count() counts the rows in each group. Count(p.id) counts only the rows where p.id is not NULL, so a supplier with no products gets 0, whereas Count(*) would count the NULL-filled row and return 1.
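To see how COUNT(*) and COUNT(p.id) differ after a left join, here is a self-contained sketch on the question's data. The column types are assumptions, and the row_count column is added purely for comparison.

```sql
-- Tables and rows follow the question; column types are assumed.
CREATE TABLE suppliers (id INT, name VARCHAR(50));
CREATE TABLE products (id INT, name VARCHAR(50), supplier_id INT);

INSERT INTO suppliers (id, name) VALUES
  (1, 'ACME'),
  (2, 'First Stop Business Supplies'),
  (3, 'All in One Supply Warehouse'),
  (4, 'Farm First Supplies');
INSERT INTO products (id, name, supplier_id) VALUES
  (1, 'Item 1', 2), (2, 'Item 2', 1), (3, 'Item 3', 1),
  (4, 'Item 4', 3), (5, 'Item 5', 2);

SELECT s.id, s.name,
       COUNT(*)    AS row_count,      -- counts joined rows, so supplier 4 gets 1
       COUNT(p.id) AS total_products  -- counts non-NULL p.id only, so supplier 4 gets 0
FROM suppliers s
LEFT JOIN products p ON s.id = p.supplier_id
GROUP BY s.id, s.name
ORDER BY s.id;
```

The two counts agree for suppliers 1 to 3 and differ only for supplier 4, which has no products.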
Try this:
SELECT id, name, C.total_products
FROM Suppliers S
OUTER APPLY (
SELECT Count(id) AS total_products
FROM Products P
WHERE P.supplier_id = S.id
) C

How to Limit Results Per Match on a Left Join - SQL Server

I have a table with student info [STU] and a table with parent info [PAR]. I want to return an email address for each student, but just one. So I run this query:
SELECT [STU].[ID], [PAR].[EM]
FROM (SELECT [STU].* FROM DB1.STU) STU
LEFT JOIN (SELECT [PAR].* FROM DB1.PAR) PAR ON [STU].[ID] = [PAR].[ID]
This gives me the below table:
Student ID ParentEmail
1 jim#email.com
1 sarah#email.com
2 paul#email.com
2 tim#email.com
3 bill#email.com
3 frank#email.com
3 joyce#email.com
4 greg#email.com
5 tony#email.com
5 sam#email.com
Each student has multiple parent emails, but I only want one. In other words, I want the output to look like this:
Student ID ParentEmail
1 jim#email.com
2 paul#email.com
3 frank#email.com
4 greg#email.com
5 sam#email.com
I've tried so many things. I've tried using GROUP BY and MIN/MAX and I've tried complex CASE statements, and I've tried COALESCE but I just can't seem to figure it out.
I think OUTER APPLY is the simplest method:
SELECT [STU].[ID], [PAR].[EM]
FROM DB1.STU OUTER APPLY
(SELECT TOP (1) [PAR].*
FROM DB1.PAR
WHERE [STU].[ID] = [PAR].[ID]
) PAR;
Normally, there would be an ORDER BY in the subquery to give you control over which email you get: the longest, shortest, oldest, or whatever. Without an ORDER BY it returns one arbitrary email per student, which is what you are asking for.
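If a repeatable choice matters, an ORDER BY can be added inside the applied subquery. The sketch below assumes SQL Server. It uses unqualified STU and PAR tables with assumed column types and a subset of the question's rows, and the alphabetical ordering is only an illustrative choice.

```sql
-- Names follow the question; types and the unqualified table names are assumed.
CREATE TABLE STU (ID INT);
CREATE TABLE PAR (ID INT, EM VARCHAR(100));

INSERT INTO STU (ID) VALUES (1), (2), (3);
INSERT INTO PAR (ID, EM) VALUES
  (1, 'jim#email.com'),  (1, 'sarah#email.com'),
  (2, 'paul#email.com'), (2, 'tim#email.com'),
  (3, 'bill#email.com'), (3, 'frank#email.com'), (3, 'joyce#email.com');

SELECT STU.ID, PAR.EM
FROM STU
OUTER APPLY (
    SELECT TOP (1) P.EM
    FROM PAR P
    WHERE P.ID = STU.ID
    ORDER BY P.EM   -- alphabetical order, chosen only for illustration
) PAR;
```

Each student gets the alphabetically first parent email. Swapping the ORDER BY column or direction changes which one is picked.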
If you just want one column from the parent table, a simple approach is a correlated subquery:
select
s.id student_id,
(select max(p.em) from db1.par p where p.id = s.id) parent_email
from db1.stu s
This gives you the alphabetically greatest parent email per student (NULL if the student has no parent row).

JOIN query, SQL Server is dropping some rows of my first table

I have two tables, customer_details and address_details. I want to display customer details with their corresponding address, so I was using a LEFT JOIN. But when I execute this query, SQL Server drops the rows where street_no of the customer_details table doesn't match the street_no in the address_details table, and displays only the rows where street_no of customer_details = street_no of address_details. I need to display the complete customer_details table, and where street_no doesn't match it should display an empty string or anything. Am I doing anything wrong in my SQL join?
Table customer_details:
case_id customer_name mob_no street_no
-------------------------------------------------
1 John 242342343 4324234234234
1 Rohan 343233333 43332
1 Ankit 234234233 2342332423433
1 Suresh 234234324 2342342342342
1 Ranjeet 343424323 32233
1 Ramu 234234333 2342342342343
Table address_details:
s_no street_no address city case_id
------------------------------------------------------
1 4324234234234 Roni road Delhi 1
2 2342332423433 Natan street Lucknow 1
3 2342342342342 Koliko road Herdoi 1
SQL JOIN query:
select
a.*, b.address
from
customer_details a
left join
address_details b on a.street_no = b.street_no
where
b.case_id = 1
Now that it became clear that you used b.case_id=1, I will explain why it filters:
The LEFT JOIN itself returns some rows that contain all NULL values for table b in the result set, which is what you want and expect.
But by using WHERE b.case_id=1, the rows containing NULL values for table b are filtered out because none of them matches the condition (all those rows have b.case_id=NULL so they don't match).
It might work to instead use WHERE a.case_id=1, but we don't know if a.case_id and b.case_id are always the same value for matching rows (they might not be; and if they are always the same, then we just identified a potential redundancy).
There are two ways to fix this for sure.
(1) Move b.case_id = 1 into the left join condition:
left join address_details b on a.street_no = b.street_no and b.case_id = 1
(2) Keep b.case_id = 1 in the WHERE but also allow for NULLED-out b values:
left join address_details b on a.street_no = b.street_no
where b.case_id = 1
or b.street_no IS NULL
Personally I'd go for (1) because that is the clearest way to express that you want to filter b on two conditions, without affecting the rows of a that are being returned.
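The difference between filtering in WHERE and filtering in the join condition can be checked on a small subset of the question's data. This is a sketch with assumed column types. street_no is stored as text here only because the sample values are too large for a plain INT.

```sql
-- Subset of the question's rows; column types are assumed.
CREATE TABLE customer_details (case_id INT, customer_name VARCHAR(50),
                               mob_no VARCHAR(20), street_no VARCHAR(20));
CREATE TABLE address_details (s_no INT, street_no VARCHAR(20),
                              address VARCHAR(50), city VARCHAR(50), case_id INT);

INSERT INTO customer_details VALUES
  (1, 'John',  '242342343', '4324234234234'),
  (1, 'Rohan', '343233333', '43332'),
  (1, 'Ankit', '234234233', '2342332423433');
INSERT INTO address_details VALUES
  (1, '4324234234234', 'Roni road',    'Delhi',   1),
  (2, '2342332423433', 'Natan street', 'Lucknow', 1);

-- Filter in WHERE: Rohan's row has b.case_id = NULL and is dropped (2 rows).
SELECT a.customer_name, b.address
FROM customer_details a
LEFT JOIN address_details b ON a.street_no = b.street_no
WHERE b.case_id = 1;

-- Filter in ON: every customer is kept, Rohan with a NULL address (3 rows).
SELECT a.customer_name, b.address
FROM customer_details a
LEFT JOIN address_details b ON a.street_no = b.street_no AND b.case_id = 1;
```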
I do think that Wilhelm Poggenpohl's answer is kind of right. You just need to change the last join condition from a.case_id=1 to b.case_id=1:
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and b.case_id=1
This query will show every row from customer_details and the corresponding address if there is a match on street_no and the address meets the condition case_id=1.
This is because of the where clause. Try this:
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and a.case_id=1

Remove all instances of a record if condition is met

I'm trying to remove all instances of a record if the value in the flag field is 4. (This means they have unsubscribed from the email list.)
Sample data:
Customer# Email Name CustomerType Flag
001 email#email.com Bob Vet 1
001 email#email.com Bob Med 2
001 email#email.com Bob Pod 4
So since there is an instance that this record has a Flag of 4, all 3 should be removed from this query. They don't need to actually be deleted from the database, I just don't need the data to come up in my query. How do I approach this?
Assuming that the customer number is what links the records together, you can use a not exists clause:
select *
from tbl t1
where not exists (select *
from tbl t2
where t2.[Customer#] = t1.[Customer#]
and t2.Flag = 4)
Three approaches:
use NOT EXISTS
use a WHERE ... NOT IN
use a join
Below is the join. Sstan provided the NOT EXISTS, and Gordon more or less provided the WHERE ... IN; change it to NOT IN and to a SELECT and you'd have that one as well.
Without knowing the table size, transaction volume, and index information, I can't say which would offer the best performance, though NOT EXISTS is the strong favorite.
SELECT A.*
FROM TableName A
LEFT JOIN TableName B
on A.Customer# = B.Customer#
and B.Flag = 4
WHERE B.Customer# is null
This does a self join, but only to the set of records that are flagged as 4. It then excludes the records that have a match, returning only the rows for customer numbers that don't have a 4.
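To make the mechanics of the self join visible, the sketch below first shows the joined rows before the WHERE filter and then the final anti-join. It assumes SQL Server. The column types and the second customer (002) are hypothetical, added so that something survives the filter.

```sql
-- Column types are assumed; customer 002 is a made-up row without a flag 4.
CREATE TABLE TableName ([Customer#] VARCHAR(10), Email VARCHAR(100),
                        Name VARCHAR(50), CustomerType VARCHAR(10), Flag INT);

INSERT INTO TableName VALUES
  ('001', 'email#email.com', 'Bob', 'Vet', 1),
  ('001', 'email#email.com', 'Bob', 'Med', 2),
  ('001', 'email#email.com', 'Bob', 'Pod', 4),
  ('002', 'other#email.com', 'Ann', 'Vet', 1);

-- Step 1: every row of A, paired with its customer's flag-4 row if one exists.
-- MatchedCustomer is '001' for Bob's three rows and NULL for customer 002.
SELECT A.[Customer#], A.Flag, B.[Customer#] AS MatchedCustomer
FROM TableName A
LEFT JOIN TableName B
  ON A.[Customer#] = B.[Customer#] AND B.Flag = 4;

-- Step 2: keep only the rows that found no flag-4 partner.
SELECT A.*
FROM TableName A
LEFT JOIN TableName B
  ON A.[Customer#] = B.[Customer#] AND B.Flag = 4
WHERE B.[Customer#] IS NULL;
```

The second query returns only the row for customer 002.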
Here is one method:
delete from sample
where customer# in (select customer# from sample as s2 where flag = 4);
EDIT:
You can readily adapt this to a select:
select s.*
from sample s
where customer# not in (select customer# from sample as s2 where flag = 4);
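One caveat with NOT IN: if the subquery returns a NULL, the predicate is never true and the query returns no rows at all. The sketch below guards against that. It assumes SQL Server. The column types, customer 002, and the NULL row are hypothetical, added to show the effect.

```sql
-- Reduced to two columns for brevity; types and the last two rows are assumed.
CREATE TABLE sample ([customer#] VARCHAR(10), flag INT);

INSERT INTO sample ([customer#], flag) VALUES
  ('001', 1), ('001', 4),
  ('002', 1),   -- hypothetical customer with no flag 4
  (NULL, 4);    -- hypothetical row with a missing customer number

-- Without the IS NOT NULL guard, the NULL in the subquery makes NOT IN
-- evaluate to unknown for every row, and nothing is returned.
select s.*
from sample s
where s.[customer#] not in (select s2.[customer#]
                            from sample s2
                            where s2.flag = 4
                              and s2.[customer#] is not null);
```

With the guard in place the query returns the row for customer 002. NOT EXISTS does not have this pitfall, which is one more reason it is often preferred.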

pad database out with NULL criteria

If I have the following sample table (order by ID)
ID Date Type
-- ---- ----
1 01/01/2000 A
2 22/04/1995 A
2 14/02/2001 B
Here you can immediately see that ID=1 does not have a Type=B, but ID=2 does. What I want to do is fill in a line to show this:
ID Date Type
-- ---- ----
1 01/01/2000 A
1 NULL B
2 22/04/1995 A
2 14/02/2001 B
There could potentially be hundreds of different types, so I may end up inserting hundreds of rows per person if they lack hundreds of types!
Is there a general solution to do this?
Could I possibly outer join the table on itself and do it that way?
You can do this with a cross join to generate all the rows and a left join to get the actual data values:
select i.id, s.date, t.type
from (select distinct id from sample) i cross join
(select distinct type from sample) t left join
sample s
on s.id = i.id and
s.type = t.type;
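If the padded rows should be stored rather than just selected, the same cross join can feed an INSERT. This is a sketch under assumptions: the column types are guessed, the dates are written in ISO format, and date and type are used as column names as in the question, which may need quoting on some databases.

```sql
-- Sample rows from the question; types and the ISO date format are assumed.
CREATE TABLE sample (id INT, date DATE, type VARCHAR(10));

INSERT INTO sample (id, date, type) VALUES
  (1, '2000-01-01', 'A'),
  (2, '1995-04-22', 'A'),
  (2, '2001-02-14', 'B');

-- Insert one NULL-dated row for every (id, type) pair that has no row yet.
INSERT INTO sample (id, date, type)
SELECT i.id, CAST(NULL AS DATE), t.type
FROM (SELECT DISTINCT id FROM sample) i
CROSS JOIN (SELECT DISTINCT type FROM sample) t
LEFT JOIN sample s ON s.id = i.id AND s.type = t.type
WHERE s.id IS NULL;

SELECT id, date, type FROM sample ORDER BY id, type;
```

After the insert the table holds four rows, including the new row for ID 1 and Type B with a NULL date.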