SQL Inner join in a nested select statement - sql

I'm trying to do an inner join in a nested select statement. Basically, there are first and last reason IDs that hold a certain number (e.g. 200). In another table, there are definitions for the IDs. I'm trying to pull the last ID along with its corresponding comment (e.g. 200 - Patient Cancelled), then the first ID along with its comment.
This is what I have so far:
Select BUSN_ID
AREA_NAME
DATE
AREA_STATUS
(Select B.REASON_ID
A.LAST_REASON_ID
FROM BUSN_INFO A, BUSN_REASONS B
WHERE A.LAST_REASON _ID=B.REASON_ID,
(Select B.REASON_ID
A. FIRST_REASON_ID
FROM BUSN_INFO A, BUSN_REASONS B
WHERE A_FIRST_REASON_ID = B.REASON_ID)
FROM BUSN_INFO
I believe an inner join is best, but I'm stuck on how it would actually work.
Required result would look like (this is example dummy data):
First ID   Busn Reason         Last ID   Busn Reason
1          Patient Sick        2         Patient Cancelled
2          Patient Cancelled   2         Patient Cancelled
3          Patient No Show     1         Patient Sick
Justin_Cave's SECOND example is the way I used to solve this problem.

If you want to use inline select statements, each inline select has to return a single column and should just join back to the table that is the basis of your query. In the query you posted, you're selecting the same numeric identifier multiple times. My guess is that you really want to query a string column from the lookup table. I'll assume that the column is called reason_description:
Select BUSN_ID,
AREA_NAME,
DATE,
AREA_STATUS,
a.last_reason_id,
(Select B.REASON_description
FROM BUSN_REASONS B
WHERE A.LAST_REASON_ID=B.REASON_ID),
a.first_reason_id,
(Select B.REASON_description
FROM BUSN_REASONS B
WHERE A.FIRST_REASON_ID = B.REASON_ID)
FROM BUSN_INFO A
More conventionally, though, you'd just join to the busn_reasons table twice:
SELECT i.busn_id,
i.area_name,
i.date,
i.area_status,
i.last_reason_id,
last_reason.reason_description,
i.first_reason_id,
first_reason.reason_description
FROM busn_info i
JOIN busn_reasons first_reason
ON( i.first_reason_id = first_reason.reason_id )
JOIN busn_reasons last_reason
ON( i.last_reason_id = last_reason.reason_id )
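To check the double-join approach against the dummy data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The reason_description column is the same assumption made above, and the busn_id values (10, 11, 12) are made up for illustration.

```python
import sqlite3

# In-memory database with a cut-down version of the two tables.
# reason_description and the busn_id values are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE busn_reasons (reason_id INTEGER PRIMARY KEY, reason_description TEXT);
    CREATE TABLE busn_info (busn_id INTEGER PRIMARY KEY,
                            first_reason_id INTEGER,
                            last_reason_id INTEGER);
    INSERT INTO busn_reasons VALUES
        (1, 'Patient Sick'), (2, 'Patient Cancelled'), (3, 'Patient No Show');
    INSERT INTO busn_info VALUES (10, 1, 2), (11, 2, 2), (12, 3, 1);
""")

# Join the lookup table twice, once per reason column, under different aliases.
rows = conn.execute("""
    SELECT i.first_reason_id, first_reason.reason_description,
           i.last_reason_id,  last_reason.reason_description
    FROM busn_info i
    JOIN busn_reasons first_reason ON i.first_reason_id = first_reason.reason_id
    JOIN busn_reasons last_reason  ON i.last_reason_id  = last_reason.reason_id
    ORDER BY i.busn_id
""").fetchall()

for row in rows:
    print(row)
```

Each alias acts as an independent copy of the lookup table, so the two descriptions are resolved separately. If either reason ID can be NULL, a LEFT JOIN would be needed to keep those rows.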

Related

How to Limit Results Per Match on a Left Join - SQL Server

I have a table with student info [STU] and a table with parent info [PAR]. I want to return an email address for each student, but just one. So I run this query:
SELECT [STU].[ID], [PAR].[EM]
FROM (SELECT [STU].* FROM DB1.STU)
STU LEFT JOIN (SELECT [PAR].* FROM DB1.PAR) PAR ON [STU].[ID] = [PAR].[ID]
This gives me the below table:
Student ID   ParentEmail
1            jim@email.com
1            sarah@email.com
2            paul@email.com
2            tim@email.com
3            bill@email.com
3            frank@email.com
3            joyce@email.com
4            greg@email.com
5            tony@email.com
5            sam@email.com
Each student has multiple parent emails, but I only want one. In other words, I want the output to look like this:
Student ID   ParentEmail
1            jim@email.com
2            paul@email.com
3            frank@email.com
4            greg@email.com
5            sam@email.com
I've tried so many things. I've tried using GROUP BY and MIN/MAX and I've tried complex CASE statements, and I've tried COALESCE but I just can't seem to figure it out.
I think OUTER APPLY is the simplest method:
SELECT [STU].[ID], [PAR].[EM]
FROM DB1.STU OUTER APPLY
(SELECT TOP (1) [PAR].*
FROM DB1.PAR
WHERE [STU].[ID] = [PAR].[ID]
) PAR;
Normally, there would be an ORDER BY in the subquery, to give you control over which email you want: the longest, shortest, oldest, or whatever. Without an ORDER BY, it returns one arbitrary email per student, which is what you are asking for.
If you just want one column from the parent table, a simple approach is a correlated subquery:
select
s.id student_id,
(select max(p.em) from db1.par p where p.id = s.id) parent_email
from db1.stu s
This gives you the greatest (alphabetically last) parent email per student, or NULL for a student with no parent rows.
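As a quick check of the correlated-subquery approach, here is a small sketch using Python's sqlite3 module. The table contents and the example.com addresses are made up for illustration, and student 6 is a hypothetical student with no parent rows.

```python
import sqlite3

# Tiny stand-ins for the STU and PAR tables; all data here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stu (id INTEGER PRIMARY KEY);
    CREATE TABLE par (id INTEGER, em TEXT);
    INSERT INTO stu VALUES (1), (4), (6);
    INSERT INTO par VALUES
        (1, 'jim@example.com'),
        (1, 'sarah@example.com'),
        (4, 'greg@example.com');
""")

# One row per student: the subquery collapses all parent emails to a single value.
rows = conn.execute("""
    SELECT s.id AS student_id,
           (SELECT MAX(p.em) FROM par p WHERE p.id = s.id) AS parent_email
    FROM stu s
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

Because the subquery is scalar, the outer query can never produce more than one row per student, which is what the plain LEFT JOIN could not guarantee.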

What is the best way to join tables

This is more of a general question.
I am looking for the best way to join 4, maybe 5, different tables. I am trying to create a Power BI report pulling live information from an IBM AS400, where customer service can type in one of our part numbers, see how many parts we have in inventory and, if there are none, see the lead time and whether any orders have already been entered for that part number.
SERI is our inventory table with 37180 records.
(active inventory that is available)
METHDM is our kit table with 37459 records.
(this table contains the bill of materials for custom kits; KIT A123 contains different part numbers, which are in SERI as well.)
STKA is our part lead time table with 76796 records.
(lead time means how long will it take for parts to come in)
OCRI is our sales order table with 6497 records.
(This table contains all customer orders)
I have some knowledge of writing queries, but this one is more challenging than what I have created in the past. Should I start with the table that has the most records and left join the rest?
From STKA 76796 records
Left join METHDM 37459 records on STKA
left join SERI 37180 records on STKA
left join OCRI 6497 records on STKA
Select
STKA.v6part as part,
STKA.v6plnt as plant,
STKA.v6tdys as pur_leadtime,
STKA.v6prpt as Pur_PrepLeadtime,
STKA.v6lead as Mfg_leadtime,
STKA.v6prpt as Mfg_PrepLeadTime,
METHDM.AQMTLP AS COMPONENT,
METHDM.AQQPPC AS QTYNEEDED,
SERI.HTLOTN AS BATCH,
SERI.HTUNIT AS UOM,
(HTQTY - HTQTYC) as ONHAND,
OCRI.DDORD# AS SALESORDER,
OCRI.DDRDAT AS PROMISED
from stka
left join METHDM on STKA.V6PART = METHDM.AQPART
left join SERI on STKA.V6PART = SERI.HTPART
left join OCRI on STKA.V6PART = OCRI.DDPART
Is this the best way to join the tables?
I think you already have your answer, but conceptually, there are a few issues here to deal with, and I figured I would give you a few examples, using data a little bit like yours, but massively simplified.
CREATE TABLE #STKA (V6PART INT, OTHER_DATA VARCHAR(50));
CREATE TABLE #METHDM (AQPART INT, KIT_ID INT, SOME_DATE DATETIME, OTHER_DATA VARCHAR(50));
CREATE TABLE #SERI (HTPART INT, OTHER_DATA VARCHAR(50));
CREATE TABLE #OCRI (DDPART INT, OTHER_DATA VARCHAR(50));
INSERT INTO #STKA SELECT 1, NULL UNION ALL SELECT 2, NULL UNION ALL SELECT 3, NULL; --1, 2, 3 Ids
INSERT INTO #METHDM SELECT 1, 1, '20200108 10:00', NULL UNION ALL SELECT 1, 2, '20200108 11:00', NULL UNION ALL SELECT 2, 1, '20200108 13:00', NULL; --1 Id appears twice, 2 Id once, no 3 Id
INSERT INTO #SERI SELECT 1, NULL UNION ALL SELECT 3, NULL; --1 and 3 Ids
INSERT INTO #OCRI SELECT 1, NULL UNION ALL SELECT 4, NULL; --1 and 4 Ids
So fundamentally we have a few issues here:
o the first problem is that the IDs in the tables differ, one table has an ID #4 but this isn't in any of the others;
o the second issue is that we have multiple rows for the same ID in one table;
o the third issue is that some tables are "missing" IDs that are in other tables, which you already covered by using LEFT JOINs, so I will ignore this.
--This will select ID 1 twice, 2 once, 3 once, and miss 4 completely
SELECT
*
FROM
#STKA
LEFT JOIN #METHDM ON #METHDM.AQPART = #STKA.V6PART
LEFT JOIN #SERI ON #SERI.HTPART = #STKA.V6PART
LEFT JOIN #OCRI ON #OCRI.DDPART = #STKA.V6PART;
So the problem here is that we don't have every ID in our "anchor" table STKA, and in fact there's no single table that has every ID in it. Now your data might be fine here, but if it isn't then you can simply add a step to find every ID, and use this as the anchor.
--This will select each ID, but still doubles up on ID 1
WITH Ids AS (
SELECT V6PART AS ID FROM #STKA
UNION
SELECT AQPART AS ID FROM #METHDM
UNION
SELECT HTPART AS ID FROM #SERI
UNION
SELECT DDPART AS ID FROM #OCRI)
SELECT
*
FROM
Ids I
LEFT JOIN #STKA ON #STKA.V6PART = I.Id
LEFT JOIN #METHDM ON #METHDM.AQPART = I.Id
LEFT JOIN #SERI ON #SERI.HTPART = I.Id
LEFT JOIN #OCRI ON #OCRI.DDPART = I.Id;
That's using a common-table expression, but a subquery would also do the job. However, this still leaves us with an issue where ID 1 appears twice in the list, because it has multiple rows in one of the sub-tables.
One way to fix this is to pick the row with the latest date, or any other ORDER you can apply to the data:
--Pick the best row for the table where it has multiple rows, now we get one row per ID
WITH Ids AS (
SELECT V6PART AS ID FROM #STKA
UNION
SELECT AQPART AS ID FROM #METHDM
UNION
SELECT HTPART AS ID FROM #SERI
UNION
SELECT DDPART AS ID FROM #OCRI),
BestMETHDM AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY AQPART ORDER BY SOME_DATE DESC) AS ORDER_ID
FROM
#METHDM)
SELECT
*
FROM
Ids I
LEFT JOIN #STKA ON #STKA.V6PART = I.Id
LEFT JOIN BestMETHDM ON BestMETHDM.AQPART = I.Id AND BestMETHDM.ORDER_ID = 1
LEFT JOIN #SERI ON #SERI.HTPART = I.Id
LEFT JOIN #OCRI ON #OCRI.DDPART = I.Id;
Of course you could also add some aggregation (SUM, MAX, MIN, AVG, etc.) to fix this problem (if it is indeed an issue). Also, I used a common-table expression, but this would work just as well with a subquery.
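The steps above can be reproduced end to end with Python's sqlite3 module. This is only a sketch under a few assumptions: the temp tables are recreated as ordinary tables without the # prefix, only the key columns are kept, and the bundled SQLite must be version 3.25 or later for ROW_NUMBER to be available.

```python
import sqlite3

# Same simplified data as above, recreated as ordinary tables (no # prefix).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stka (v6part INTEGER);
    CREATE TABLE methdm (aqpart INTEGER, kit_id INTEGER, some_date TEXT);
    CREATE TABLE seri (htpart INTEGER);
    CREATE TABLE ocri (ddpart INTEGER);
    INSERT INTO stka VALUES (1), (2), (3);
    INSERT INTO methdm VALUES
        (1, 1, '2020-01-08 10:00'),
        (1, 2, '2020-01-08 11:00'),
        (2, 1, '2020-01-08 13:00');
    INSERT INTO seri VALUES (1), (3);
    INSERT INTO ocri VALUES (1), (4);
""")

# Naive version: anchored on stka, so ID 1 doubles up and ID 4 is missed.
naive = conn.execute("""
    SELECT s.v6part
    FROM stka s
    LEFT JOIN methdm m ON m.aqpart = s.v6part
    LEFT JOIN seri e   ON e.htpart = s.v6part
    LEFT JOIN ocri o   ON o.ddpart = s.v6part
    ORDER BY s.v6part
""").fetchall()

# Anchored on the union of all IDs, keeping only the latest methdm row per part.
fixed = conn.execute("""
    WITH ids AS (
        SELECT v6part AS id FROM stka
        UNION SELECT aqpart FROM methdm
        UNION SELECT htpart FROM seri
        UNION SELECT ddpart FROM ocri
    ),
    best_methdm AS (
        SELECT aqpart, kit_id,
               ROW_NUMBER() OVER (PARTITION BY aqpart
                                  ORDER BY some_date DESC) AS order_id
        FROM methdm
    )
    SELECT i.id, s.v6part, m.kit_id, e.htpart, o.ddpart
    FROM ids i
    LEFT JOIN stka s        ON s.v6part = i.id
    LEFT JOIN best_methdm m ON m.aqpart = i.id AND m.order_id = 1
    LEFT JOIN seri e        ON e.htpart = i.id
    LEFT JOIN ocri o        ON o.ddpart = i.id
    ORDER BY i.id
""").fetchall()

print(naive)
print(fixed)
```

The naive query returns four rows for three parts, while the anchored version returns exactly one row per ID, including ID 4, which exists only in the orders table.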
Expanding on a comment made on the question..
I would say I will start with SERI as that table contains the entire inventory for our facility and should cover the other tables
However the question said
SERI is our inventory table with 37180 records. (active inventory that is available)
In my experience, active inventory isn't the same as all parts.
Normally, in a query like this, I'd expect the first table to be a Parts Master table of some sort that contains every possible part ID.

JOIN query, SQL Server is dropping some rows of my first table

I have two tables, customer_details and address_details. I want to display customer details with their corresponding address, so I was using a LEFT JOIN. But when I execute this query, SQL Server drops the rows where street_no of the customer_details table doesn't match street_no in the address_details table, and displays only the rows where street_no of customer_details = street_no of address_details. I need to display the complete customer_details table, and where street_no doesn't match, it should display an empty string or anything. Am I doing anything wrong in my SQL join?
Table customer_details:
case_id  customer_name  mob_no     street_no
-------------------------------------------------
1        John           242342343  4324234234234
1        Rohan          343233333  43332
1        Ankit          234234233  2342332423433
1        Suresh         234234324  2342342342342
1        Ranjeet        343424323  32233
1        Ramu           234234333  2342342342343
Table address_details:
s_no  street_no      address       city     case_id
------------------------------------------------------
1     4324234234234  Roni road     Delhi    1
2     2342332423433  Natan street  Lucknow  1
3     2342342342342  Koliko road   Herdoi   1
SQL JOIN query:
select
a.*, b.address
from
customer_details a
left join
address_details b on a.street_no = b.street_no
where
b.case_id = 1
Now that it became clear that you used b.case_id=1, I will explain why it filters:
The LEFT JOIN itself returns some rows that contain all NULL values for table b in the result set, which is what you want and expect.
But by using WHERE b.case_id=1, the rows containing NULL values for table b are filtered out because none of them matches the condition (all those rows have b.case_id=NULL so they don't match).
It might work to instead use WHERE a.case_id=1, but we don't know if a.case_id and b.case_id are always the same value for matching rows (they might not be; and if they are always the same, then we just identified a potential redundancy).
There are two ways to fix this for sure.
(1) Move b.case_id = 1 into the left join condition:
left join address_details b on a.street_no = b.street_no and b.case_id = 1
(2) Keep b.case_id = 1 in the WHERE but also allow for NULLED-out b values:
left join address_details b on a.street_no = b.street_no
where b.case_id = 1
or b.street_no IS NULL
Personally I'd go for (1) because that is the clearest way to express that you want to filter b on two conditions, without affecting the rows of a that are being returned.
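The difference between filtering in WHERE and filtering in the join condition can be seen with a small sketch in Python's sqlite3 module. The tables are cut down to the columns that matter, and only three customers and two addresses from the question are loaded.

```python
import sqlite3

# Cut-down versions of the two tables; street_no is stored as text here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_details (case_id INTEGER, customer_name TEXT, street_no TEXT);
    CREATE TABLE address_details (s_no INTEGER, street_no TEXT, address TEXT, case_id INTEGER);
    INSERT INTO customer_details VALUES
        (1, 'John',  '4324234234234'),
        (1, 'Rohan', '43332'),
        (1, 'Ankit', '2342332423433');
    INSERT INTO address_details VALUES
        (1, '4324234234234', 'Roni road',    1),
        (2, '2342332423433', 'Natan street', 1);
""")

# Filter in WHERE: unmatched customers have b.case_id = NULL and are dropped.
where_rows = conn.execute("""
    SELECT a.customer_name, b.address
    FROM customer_details a
    LEFT JOIN address_details b ON a.street_no = b.street_no
    WHERE b.case_id = 1
    ORDER BY a.customer_name
""").fetchall()

# Filter in ON: the condition only restricts which b rows can match.
on_rows = conn.execute("""
    SELECT a.customer_name, b.address
    FROM customer_details a
    LEFT JOIN address_details b ON a.street_no = b.street_no AND b.case_id = 1
    ORDER BY a.customer_name
""").fetchall()

print(where_rows)
print(on_rows)
```

With the filter in WHERE, Rohan disappears; with the filter in ON, he is kept with a NULL address.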
I do think that Wilhelm Poggenpohl's answer is almost right. You just need to change the last join condition from a.case_id=1 to b.case_id=1:
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and b.case_id=1
This query will show every row from customer_details and the corresponding address if there is a match on street_no and the address meets the condition case_id=1.
This is because of the where clause. Try this:
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and a.case_id=1

Will this left join on same table ever return data?

In SQL Server, on a re-engineering project, I'm walking through some old sprocs, and I've come across this bit. I've hopefully captured the essence in this example:
Example Table
SELECT * FROM People
Id | Name
-------------------------
1 | Bob Slydell
2 | Jim Halpert
3 | Pamela Landy
4 | Bob Wiley
5 | Jim Hawkins
Example Query
SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL
Please disregard formatting, style, and query efficiency issues here. This example is merely an attempt to capture the exact essence of the real query I'm working with.
After looking over the real, more complex version of the query, I boiled it down to the above, and I cannot for the life of me see how it would ever return any data. The LEFT JOIN should always exclude everything that was just selected because of the b.Name IS NULL check, right (given that it is the same table)? If a row were returned where b.Name IS NULL evaluates to true, shouldn't that mean that the data found in People a was never found in People b (impossible)?
Just to be very clear, I'm not looking for a "solution". The code is what it is. I'm merely trying to understand its behavior for the purpose of re-engineering it.
If this code indeed never returns results, then I'll conclude it was written incorrectly and use that knowledge during the re-engineering.
If there is a valid data scenario where it would/could return results, then that will be news to me and I'll have to go back to the books on SQL Joins! #DrivenCrazy
Yes. There are circumstances where this query will retrieve rows.
The query
SELECT a.*
FROM (
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.PName = b.PName
WHERE b.PName IS NULL;
is roughly (maybe even exactly) equivalent to...
select distinct Id, PName
from People
where Id > 3 and PName is null;
Why?
Tested it using this code (mysql).
create table People (Id int, PName varchar(50));
insert into People (Id, Pname)
values (1, 'Bob Slydell'),
(2, 'Jim Halpert'),
(3,'Pamela Landy'),
(4,'Bob Wiley'),
(5,'Jim Hawkins');
insert into People (Id, PName) values (6,null);
Now run the query. You get
6, Null
I don't know if your schema allows null Name.
What value can a.PName have such that a.PName = b.PName finds no match and b.PName is Null?
Well it's written right there. b.PName is Null.
Can we prove that there is no other case where a row is returned?
Suppose there is a value for (Id,PName) such that PName is not null and a row is returned.
In order to satisfy the condition...
where b.PName is null
...such a value must include a PName that does not match any PName in the People table.
All a.PName and all b.PName values are drawn from People.PName ...
So a.PName would have to fail to match itself.
The only scalar value in SQL that does not equal itself is Null.
Therefore if there are no rows with Null PName this query will not return a row.
That's my proposed casual proof.
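The same experiment can be repeated with Python's sqlite3 module. This is a sketch that assumes the PName column allows NULL, as in the test table above; it runs the query once before and once after the NULL row is inserted.

```python
import sqlite3

# The anti-join query from the question, with Name renamed to PName as above.
QUERY = """
    SELECT a.*
    FROM (
        SELECT DISTINCT Id, PName
        FROM People
        WHERE Id > 3
    ) a
    LEFT JOIN People b ON a.PName = b.PName
    WHERE b.PName IS NULL
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Id INTEGER, PName TEXT)")
conn.executemany(
    "INSERT INTO People (Id, PName) VALUES (?, ?)",
    [(1, "Bob Slydell"), (2, "Jim Halpert"), (3, "Pamela Landy"),
     (4, "Bob Wiley"), (5, "Jim Hawkins")],
)

# Every non-null name matches itself, so nothing survives the IS NULL filter.
before = conn.execute(QUERY).fetchall()

# A NULL name never equals anything, itself included, so this row stays unmatched.
conn.execute("INSERT INTO People (Id, PName) VALUES (?, ?)", (6, None))
after = conn.execute(QUERY).fetchall()

print(before)
print(after)
```

The first run returns no rows; the second returns only the row with the NULL name, which agrees with the argument above.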
This is very confusing code. So #DrivenCrazy is appropriate.
The meaning of the query is exactly "return people with id > 3 and a null as name", i.e. it may return data but only if there are null-values in the name:
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3 and PName is null
The proof for this is rather simple, if we consider the meaning of the left join condition ... LEFT JOIN People b ON a.PName = b.PName together with the (overall) condition where b.PName is null:
Generally, a condition where PName = PName is true if and only if PName is not null, and it has exactly the same meaning as where PName is not null. Hence, the left join will match only tuples where pname is not null, but any matching row will subsequently be filtered out by the overall condition where pname is null.
Hence, the left join cannot introduce any new rows in the query, and it cannot reduce the set of rows of the left hand side (as a left join never does). So the left join is superfluous, and the only effective condition is where PName is null.
LEFT JOIN ON returns the rows that INNER JOIN ON returns plus unmatched rows of the left table extended by NULL for the right table columns. If the ON condition does not allow a matched row to have NULL in some column (like b.NAME here being equal to something) then the only NULLs in that column in the result are from unmatched left hand rows. So keeping rows with NULL for that column as the result gives exactly the rows unmatched by the INNER JOIN ON. (This is an idiom. In some cases it can also be expressed via NOT IN or EXCEPT.)
In your case the left table has distinct People rows with a.Id > 3 and the right table has all People rows. So the only a rows unmatched in a.Name = b.Name are those where a.Name IS NULL. So the WHERE returns those rows extended by NULLs.
SELECT * FROM
(SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL) a
LEFT JOIN People b ON 1=0;
But then you SELECT a.*. So the entire query is just
SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL;
Sure. A left join will return data even if the join is done on the same table.
According to your query
SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL
it returns no rows for the sample data because of the final filter b.Name IS NULL. Without that filter it would return the 2 records with Id > 3.

Two group by tables stitch another table

I have 3 tables I need to put together.
The first table is my main transaction table where I need to get distinct transaction id numbers and company id. It has all the important keys. The transaction ids are not unique.
The second table has item info which is linked to transaction id numbers which are not unique and I need to pull items.
The third table has company info which has company id.
Now I've solved some of this: the first one through a group by on id, and the second through a subquery which creates unique ids and joins onto the first one.
The issue I'm having is the third one by company. I cannot seem to create a query that works in the above combinations. Any ideas?
As suggested here is my code. It works but that's because for the company I used count which doesn't give the correct number. How else can I get the company number to come out correct?
SELECT
dep.ItemIDAPK,
dep.TotalOne,
dep.company,
company.vendname,
appd.ItemIDAPK,
appd.ItemName
FROM (
SELECT
csi.ItemIDAPK,
sum(f.TotalOne) as TotalOne,
count(f.DimCurrentcompanyID) company
FROM dbo.ReportOne F with (nolock)
INNER JOIN dbo.DSaleItem csi with (nolock)
on f.DSaleItemID = csi.DSaleItemID
INNER JOIN dbo.DimCurrentcompany cv
ON f.DimCurrentcompanyID = cv.DimCurrentcompanyID
INNER JOIN dbo.DimDate dat
on f.DimDateID = dat.DimDateID
where (
dat.date >='2013-01-29 00:00:00.000'
and dat.date <= '2013-01-30 00:00:00.000'
)
GROUP BY csi.ItemIDAPK
) as dep
INNER JOIN (
SELECT
vend.DimCurrentcompanyID,
vend.Name vendname
FROM dbo.DimCurrentcompany vend
) As company
on dep.company = company.DimCurrentcompanyID
INNER JOIN (
SELECT
c2.ItemIDAPK,
ItemName
FROM (
SELECT DISTINCT ItemIDAPK
FROM dbo.dimitem AS C
) AS c1
JOIN dbo.dimitem AS c2 ON c1.ItemIDAPK = c2.ItemIDAPK
) as appd
ON dep.ItemIDAPK = appd.ItemIDAPK
For further information, my output is shown in the following example. The code executes, but the CompanyID is incorrect because I just wrapped it in a count to make the above code execute:
Current Results:
Item Number   TLS   CompanyID   Company Name   Item Number   Item Name
111111        300   303         Johnson Corp   29323         Soap
Proposed Results:
Item Number   TLS   CompanyID   Company Name   Item Number   Item Name
111111        300   29          Johnson Corp   29323         Soap