Better way to do a multi-join in this SQL Query? - sql

I am trying to pull data from one table (a.table) and join it to another (b.table). To do that, I need to join through a third table (c.table), which links Plan_Code and Policy_Riders. Please see the code below:
USE [CDS]
GO
SELECT riders.ExpiryDt--
,riders.TerminationDt
,[ModalPremium]--
,plan_code
FROM a.table as riders
JOIN c.table as policy
ON policy.Policy_Num = riders.Policy_Num
JOIN b.table AS plan_code
on policy.Plan_Code_ID = plan_code.Plan_Code_ID
WHERE plan_code.Plan_Code LIKE '%EIUL3%'
OR plan_code.Plan_Code LIKE '%LBIUL%'
OR plan_code.Plan_Code LIKE '%MEIUL3%'
GO
To get the Plan_Code field from b.table into my output, I first need to join a.table to c.table, then c.table to b.table. Is there a better way to write the joins between these three tables? Any help would be appreciated. Thank you!

First off, use a derived table for the filters:
...
JOIN (SELECT columns
FROM b.table
WHERE Plan_Code LIKE '%EIUL3%'
OR Plan_Code LIKE '%LBIUL%'
OR Plan_Code LIKE '%MEIUL3%'
) AS plan_code ON policy.Plan_Code_ID = plan_code.Plan_Code_ID
This will generally make sure those filters are applied against the smallest set of data, instead of after all the tables are joined. Another option would be to make the above a temp table, then join to it. Same concept, just helping the optimizer work efficiently. In smaller queries you'll see no difference, but in larger ones (especially those like this, with many filters from a single table) it can make a large difference. Many optimizers push such filters down on their own, so compare the execution plans before and after.
Second, your filters specifically. Using a LIKE with both leading and trailing wildcards (%ex%) is not good, because the leading wildcard prevents an index seek on that column. If the data allows it, use a trailing wildcard only (ex%), which can use an index. A leading-only pattern (%ex) has the same problem as %ex%.
Other than that, your joins are fine and are the correct approach to getting columns from each table.
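As a rough check of the derived-table suggestion, here is a minimal sketch using Python's built-in sqlite3 module. The table names (riders, policy, plans) and the sample rows are hypothetical stand-ins for a.table, c.table and b.table, and SQLite stands in for SQL Server, so it only shows that both forms return the same rows, not how either performs. It also leaves out the '%MEIUL3%' pattern, since any value containing MEIUL3 already matches '%EIUL3%'.

```python
import sqlite3

# Hypothetical stand-ins for a.table (riders), c.table (policy), b.table (plans).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE riders (Policy_Num INTEGER, ExpiryDt TEXT);
    CREATE TABLE policy (Policy_Num INTEGER, Plan_Code_ID INTEGER);
    CREATE TABLE plans  (Plan_Code_ID INTEGER, Plan_Code TEXT);
    INSERT INTO riders VALUES (1, '2030-01-01'), (2, '2031-01-01'), (3, '2032-01-01');
    INSERT INTO policy VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO plans  VALUES (10, 'EIUL3-A'), (20, 'TERM20'), (30, 'XLBIULX');
""")

# '%MEIUL3%' is omitted: it is already covered by '%EIUL3%'.
FILTER = "Plan_Code LIKE '%EIUL3%' OR Plan_Code LIKE '%LBIUL%'"

# Version 1: filter in the WHERE clause after joining all three tables.
where_rows = conn.execute(f"""
    SELECT r.Policy_Num, pc.Plan_Code
    FROM riders AS r
    JOIN policy AS p ON p.Policy_Num = r.Policy_Num
    JOIN plans AS pc ON pc.Plan_Code_ID = p.Plan_Code_ID
    WHERE {FILTER}
    ORDER BY r.Policy_Num
""").fetchall()

# Version 2: filter inside a derived table, then join to it.
derived_rows = conn.execute(f"""
    SELECT r.Policy_Num, pc.Plan_Code
    FROM riders AS r
    JOIN policy AS p ON p.Policy_Num = r.Policy_Num
    JOIN (SELECT Plan_Code_ID, Plan_Code FROM plans WHERE {FILTER}) AS pc
      ON pc.Plan_Code_ID = p.Plan_Code_ID
    ORDER BY r.Policy_Num
""").fetchall()

print(where_rows)
print(derived_rows)
```

Both queries describe the same result set, so any speed difference on a real server comes down to the plan the optimizer picks.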

Related

What's the best way to amalgamate the 2 queries below

I wrote the query below as part of a larger query to create a table. As I'm new to SQL, this was done in a very step-by-step manner so that I could easily understand the individual steps in the query and what each part was doing.
However, I've now been tasked to make the below 2 parts of the query more efficient by joining them together, and this is where I'm struggling.
I feel like I should be creating a single table rather than 2 and that the single table should contain all of the columns/values that I require. However, I am not at all sure of the syntax required to make this happen or the order in which I need to re-write the query.
Is anyone able to offer any advice?
Many thanks
sys_type as (select nvl(dw_start_date,sysdate) date_updated, id, descr
from scd2.scd2_table_a
inner join year_month_period
on 1=1
WHERE batch_end_date BETWEEN dw_start_date and NVL(dw_end_date,sysdate)),
sys_type_2 as (select -1 as sys_typ_id,
'Unassigned' as sys_typ_desc,
sysdate as date_updated
from dual
union
select id as sys_typ_id, descr as sys_typ_desc, date_updated
from sys_type),
Assuming you are using an Oracle database, the queries above seem fine. I don't think you can make them more efficient just by 'joining' them (joining defined very loosely here). Is there a performance issue?
I think you can get better results by tuning your first inline query, 'sys_type'.
You have a Cartesian product there. Do you need that? Why not move the condition from the WHERE clause into the join clause?
Basically:
sys_type as (select nvl(dw_start_date,sysdate) date_updated, id, descr
from scd2.scd2_table_a
inner join year_month_period
on (batch_end_date BETWEEN dw_start_date and NVL(dw_end_date,sysdate)))
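To see that moving the condition into the join clause does not change the result, here is a small sketch with Python's sqlite3 module. The sample rows are made up, and SQLite's COALESCE and date('now') stand in for Oracle's NVL and sysdate, so treat it as an illustration rather than the original environment. For an inner join the two forms are logically equivalent; whether one runs faster depends on the optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scd2_table_a (id INTEGER, descr TEXT,
                               dw_start_date TEXT, dw_end_date TEXT);
    CREATE TABLE year_month_period (batch_end_date TEXT);
    -- Hypothetical rows: one closed-out record and one still-open record.
    INSERT INTO scd2_table_a VALUES
        (1, 'old',     '2019-01-01', '2019-12-31'),
        (2, 'current', '2020-01-01', NULL);
    INSERT INTO year_month_period VALUES ('2020-06-30');
""")

# COALESCE / date('now') are SQLite stand-ins for NVL / sysdate.
SELECT_LIST = "SELECT COALESCE(dw_start_date, date('now')) AS date_updated, id, descr"
RANGE = "batch_end_date BETWEEN dw_start_date AND COALESCE(dw_end_date, date('now'))"

# Original shape: join on 1=1, then filter in WHERE.
cross_then_filter = conn.execute(f"""
    {SELECT_LIST}
    FROM scd2_table_a
    INNER JOIN year_month_period ON 1 = 1
    WHERE {RANGE}
""").fetchall()

# Suggested shape: the range condition is the join condition.
join_on_range = conn.execute(f"""
    {SELECT_LIST}
    FROM scd2_table_a
    INNER JOIN year_month_period ON {RANGE}
""").fetchall()

print(cross_then_filter)
print(join_on_range)
```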

Natural Join Explanation

I am a student trying to learn Microsoft SQL and I am frustrated that I cannot understand how natural join works. I have a problem with a solution but I cannot understand how this solution was created through natural join. My friend sent me his old solution but can't remember how he got the result. I really want to figure out how this natural join works. Can someone please explain how this answer was achieved?
The question is taken from "Database Design, Application, and Administration", 6th Ed.:
Show the result of a natural join that combines the Customer and OrderTbl tables.
Edit: The book doesn't give a query statement on how to get the results for the natural join in the chapters I have read so far. Which makes things much more confusing for me as I cannot simply add the table and send a query.
Edit2: The reason why some entries in the order table have null values is that the question implies some orders were processed through the internet. A friend of mine told me that was the reason why he got the question right, as opposed to some people who argued against it. :S
If I read your question correctly, you are trying to learn two concepts at the same time. The first is INNER JOIN (sometimes called equijoin). The second is natural join.
It's easier to explain natural join, assuming you already know how INNER JOIN works. A natural join is simply an equijoin where the column names indicate to you what the join condition ought to be. In your case, the fact that CustNo appears in both tables is the only clue you need in order to devise the correct join condition. You also include the join field only once in the result.
Column names are actually quite arbitrary, and could have been made very different in this case. For example, if the column Customer.CustNo had been named Customer.ID instead, you wouldn't be able to do a natural join.
For a correct solution in your case, see the answer provided by JamieC.
If you simply want the query that produces the final table in your question, here it is:
SELECT
o.OrdNo,
o.OrdDate,
o.EmpNo,
o.CustNo,
c.CustFirstName,
c.CustLastName,
c.CustCity,
c.CustState,
c.CustZip,
c.CustBal
FROM OrderTbl o
INNER JOIN Customer c
ON o.CustNo = c.CustNo
ORDER BY c.CustNo
By way of explanation: this query selects all data from Customer and OrderTbl, joining the two on CustNo, which is (presumably) the primary key in Customer and a foreign key in OrderTbl. The ordering of the result is a little more tricky. Based almost purely on guesswork, I suspect the result is ordered by CustNo as well.
The Employee table does not feature at all in the result, however as the OrderTbl table has some blanks for EmpNo, you would almost certainly want a LEFT JOIN/RIGHT JOIN (as appropriate) if you wanted to retrieve any information about the employee from the orders table.
MS SQL Server doesn't support NATURAL JOIN. However, if you were using a platform that would support it, a simple:
SELECT * FROM Customer NATURAL JOIN OrderTbl;
should do the trick.
https://en.wikipedia.org/wiki/Join_(SQL)#Natural_join is quite good.
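Since SQL Server cannot run the NATURAL JOIN form, here is a small sketch that tries it in SQLite (which does support it) through Python's sqlite3 module. The tables are cut down to a few hypothetical columns and rows, not the book's data. It shows the two points made above: the shared column CustNo appears only once in the result, and the rows match an explicit inner join on CustNo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, made-up versions of the book's tables.
    CREATE TABLE Customer (CustNo TEXT, CustFirstName TEXT);
    CREATE TABLE OrderTbl (OrdNo TEXT, CustNo TEXT);
    INSERT INTO Customer VALUES ('C1', 'Ann'), ('C2', 'Bob');
    INSERT INTO OrderTbl VALUES ('O1', 'C1'), ('O2', 'C1');
""")

# NATURAL JOIN matches on every column name the two tables share (here: CustNo).
cur = conn.execute(
    "SELECT * FROM Customer NATURAL JOIN OrderTbl ORDER BY OrdNo"
)
natural_columns = [d[0] for d in cur.description]
natural_rows = cur.fetchall()

# The equivalent explicit inner join, listing the join column once by hand.
inner_rows = conn.execute("""
    SELECT c.CustNo, c.CustFirstName, o.OrdNo
    FROM Customer AS c
    INNER JOIN OrderTbl AS o ON o.CustNo = c.CustNo
    ORDER BY o.OrdNo
""").fetchall()

print(natural_columns)
print(natural_rows)
```

Customer C2 has no orders in this sample, so it drops out of both results, just as it would with any inner join.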

How do you check if subquery records are in range of outer fields?

Two tables, tbljob and tblscan.
tblJob has three fields:
job_no
job_start_seq
job_end_seq
tblScan has two fields:
serialnumber
facility_id
How do I query all the serial numbers for each job?
This is what I've come up with, but it's not correct.
SELECT dbo.tblScan.facility_id, dbo.tblScan.serialnumber, dbo.tblJob.job_no
FROM dbo.tblScan CROSS JOIN
dbo.tblJob
WHERE EXISTS
(SELECT job_id, job_no, mailer, job_start_seq, job_end_seq
FROM dbo.tblJob AS tblJob_1
WHERE (dbo.tblScan.serialnumber BETWEEN job_start_seq AND job_end_seq))
Thanks to anyone that can help. I've got way too many hours in this. If you're wondering if the data structure can change, sadly, it cannot.
I've never used SQL Server, but that being said, I would write something like this:
SELECT dbo.tblScan.facility_id, dbo.tblScan.serialnumber, dbo.tblJob.job_no
FROM dbo.tblScan, dbo.tblJob
WHERE dbo.tblScan.serialnumber BETWEEN job_start_seq AND job_end_seq
This will give you "duplicates" in terms of facility_id and job_no assuming you can have multiple serial numbers for each. Also the example uses your code but you may need to add some things such as dbo.tblJob. in front of job_start_seq and job_end_seq.
Now you have all the serial numbers, and you probably need to do a little work on your data depending on what you need to do with it.
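FYI, listing two tables in FROM like this forms a cross product, which the WHERE clause then filters down to the matching rows.
Try it like this (CROSS JOIN does not take an ON clause, so this uses INNER JOIN):
SELECT TS.facility_id, TS.serialnumber, TJ.job_no
FROM dbo.tblScan TS INNER JOIN dbo.tblJob TJ ON
TS.serialnumber BETWEEN TJ.job_start_seq AND TJ.job_end_seq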
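The range join can be tried out with a few made-up rows. This sketch uses Python's sqlite3 module instead of SQL Server, and the job numbers, sequence ranges and facility IDs are hypothetical. Each scan is matched to the job whose sequence range contains its serial number, and a scan outside every range is simply not returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblJob (job_no TEXT, job_start_seq INTEGER, job_end_seq INTEGER);
    CREATE TABLE tblScan (serialnumber INTEGER, facility_id TEXT);
    -- Hypothetical sample data.
    INSERT INTO tblJob VALUES ('J1', 100, 199), ('J2', 200, 299);
    INSERT INTO tblScan VALUES (150, 'F1'), (250, 'F1'), (999, 'F2');
""")

# Join each scan to the job whose [start, end] range contains its serial number.
rows = conn.execute("""
    SELECT TS.facility_id, TS.serialnumber, TJ.job_no
    FROM tblScan AS TS
    INNER JOIN tblJob AS TJ
        ON TS.serialnumber BETWEEN TJ.job_start_seq AND TJ.job_end_seq
    ORDER BY TS.serialnumber
""").fetchall()

print(rows)
```

If scans that belong to no job should still be listed, a LEFT JOIN from tblScan would keep them with a NULL job_no.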

DB2 Performance CASE vs COALESCE

I'm modifying an existing statement that joins user information in one table so that the user info can come from another table. One is a permanent table and the other is temporary (records get moved from one to the other). I changed my join to a left join and then left joined the second contact info table. I need to select the permanent field if it exists and the temporary if the permanent isn't there. 154306 is the user id of all incoming records on the main table I'm selecting from. Here are my 2 options for selecting fields:
SELECT
CASE WHEN U.USRID = 154306
THEN T.TMPFNAME
ELSE U.FNAME
END AS FNAME,
COALESCE (U.LNAME, T.TMPLNAME) AS LNAME
FROM FILES.ORDERS O
LEFT JOIN FILES.USERS U ON U.USRID <> 154306 AND U.USRID = O.ORDUSR
LEFT JOIN FILES.TMPUSERS T ON O.ORDNUM = T.TMPORD
I'm thinking the case seems more "correct" as it's actually controlling the flow, but since the coalesce has less logic to follow it might perform faster. Either should accomplish the same result because the 2 left joins ensure we'll get the info for the user no matter what, but don't get the permanent user info for orders which are still assigned to the temp user. It looks like we have 10 fields to case/coalesce so I'm thinking the method with better performance is the way to go, which I think is coalesce but I'm not even sure on that. Is either way better for any reason?
The performance of case versus coalesce() just will not make a difference to a query that is joining three large tables. Such queries are dominated by the time for reading and matching the rows in the table.
By the way, the two are not exactly the same. If you have NULL values in users.Fname, then the case logic would keep them but the coalesce() logic would fill in the values from the other table.
Your criterion should be clarity of expression. Because you think the case makes more sense, I would suggest you go with that.
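The NULL difference can be seen with a few hypothetical rows. This sketch uses Python's sqlite3 module rather than DB2, with unqualified table names and invented data. One assumption to flag: with the join as written in the question, U.USRID can never equal 154306 (the join condition excludes it, and unmatched rows give NULL), so the CASE here tests O.ORDUSR instead, which is a guess at the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDERS   (ORDNUM INTEGER, ORDUSR INTEGER);
    CREATE TABLE USERS    (USRID INTEGER, FNAME TEXT);
    CREATE TABLE TMPUSERS (TMPORD INTEGER, TMPFNAME TEXT);
    -- Hypothetical data: user 7 is permanent but has a NULL first name.
    INSERT INTO USERS    VALUES (7, NULL), (8, 'Cara');
    INSERT INTO ORDERS   VALUES (1, 7), (2, 154306), (3, 8);
    INSERT INTO TMPUSERS VALUES (1, 'TmpAnn'), (2, 'TmpBob');
""")

rows = conn.execute("""
    SELECT O.ORDNUM,
           -- Assumption: test O.ORDUSR, since U.USRID is never 154306 after this join.
           CASE WHEN O.ORDUSR = 154306 THEN T.TMPFNAME ELSE U.FNAME END AS CASE_FNAME,
           COALESCE(U.FNAME, T.TMPFNAME) AS COALESCE_FNAME
    FROM ORDERS O
    LEFT JOIN USERS U ON U.USRID <> 154306 AND U.USRID = O.ORDUSR
    LEFT JOIN TMPUSERS T ON O.ORDNUM = T.TMPORD
    ORDER BY O.ORDNUM
""").fetchall()

for row in rows:
    print(row)
```

Order 1 is where the two expressions disagree: CASE keeps the permanent user's NULL first name, while COALESCE falls back to the temporary value.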

Is there some equivalent to subquery correlation when making a derived table?

I need to flatten out 2 rows in a vertical table (and then join to a third table). I generally do this by making a derived table for each field I need. There are only two fields, so I figure this isn't that unreasonable.
But I know that the rows I want back in the derived table, are the subset that's in my join with my third table.
So I'm trying to figure out the best derived tables to make so that the query runs most efficiently.
I figure the more restrictive I make the derived table's where clause, the smaller the derived table will be, the better response I'll get.
Really what I want is to correlate the where clause of the derived table with the join with the 3rd table, but you can't do that in sql, which is too bad. But I'm no sql master, maybe there's some trick I don't know about.
The other option is just to make the derived table(s) with no where clause and it just ends up joining the entire table twice (once for each field), and when I do my join against them the join filters every thing out.
So really what I'm asking I guess is what's the best way to make a derived table where I know pretty much specifically what rows I want, but sql won't let me get at them.
An example:
table1
------
id tag value
-- ----- -----
1 first john
1 last smith
2 first sally
2 last smithers
table2
------
id occupation
-- ----------
1 carpenter
2 homemaker
select table2.occupation, firsttable.first, lasttable.last from
table2, (select id, value as first from table1 where tag = 'first') firsttable,
(select id, value as last from table1 where tag = 'last') lasttable
where table2.id = firsttable.id and table2.id = lasttable.id
What I want to do is make the firsttable where clause where tag='first' and id = table2.id
Derived tables do not store intermediate results the way you expect. They are just a way to make code simpler. Using a derived table doesn't mean that its expression will be executed first and its output then joined to the remaining tables. The optimizer will automatically flatten derived tables in most cases.
However, there are cases where the optimizer may store the results of the subquery, materializing it instead of flattening it. This usually happens when the subquery contains aggregate functions or something similar. In your case the query is simple enough that the optimizer will flatten it.
Also, storing the derived table's results won't make your query faster, and it could make it worse. Your real problem is too much normalization. Fix that and the query becomes just a join of two tables.
Why do you have this kind of normalization? Why are you storing column values as rows? Try to denormalize table1 so that it has two columns, first and last. That would be the best solution here.
Also, do you have proper indexes on the id and tag columns? If so, a merge join should work well for your query.
Please provide the index details for these tables and the plan generated by your query.
Your query will be executed like an inner join query:
select table2.occupation, first.value as first, last.value as last
from
table2
inner join table1 first
on first.tag = 'first'
and first.id = table2.id
inner join table1 last
on last.tag = 'last'
and table2.id = last.id
I think what you're asking for is a COMMON TABLE EXPRESSION. If your platform doesn't implement them, then a temporary table may be the best alternative.
I'm a little confused. Your query looks okay . . . although it looks better with proper join syntax.
select table2.occupation, firsttable.first, lasttable.last
from table2 join
(select id, value as first from table1 where tag = 'first') firsttable
on table2.id = firsttable.id join
(select id, value as last from table1 where tag = 'last') lasttable
on table2.id = lasttable.id
This query does what you are asking it to do. SQL is a declarative language, not a procedural language. This means that you describe the result set and rely on the database SQL compiler to turn it into the right set of commands. (That said, sometimes how a query is structured does make it easier or harder for some engines to produce efficient query plans.)
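To make the declarative point concrete, here is a sketch that runs the derived-table form and the plain self-join form against the sample data from the question, using Python's sqlite3 module. SQLite is only a stand-in for whatever engine is in use, and the aliases f, l, first_name and last_name are chosen here for the sketch. Both forms describe the same result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, tag TEXT, value TEXT);
    CREATE TABLE table2 (id INTEGER, occupation TEXT);
    INSERT INTO table1 VALUES
        (1, 'first', 'john'),  (1, 'last', 'smith'),
        (2, 'first', 'sally'), (2, 'last', 'smithers');
    INSERT INTO table2 VALUES (1, 'carpenter'), (2, 'homemaker');
""")

# Form 1: one derived table per tag (each must expose id to be joinable).
derived_rows = conn.execute("""
    SELECT t2.occupation, f.first_name, l.last_name
    FROM table2 AS t2
    JOIN (SELECT id, value AS first_name FROM table1 WHERE tag = 'first') AS f
        ON t2.id = f.id
    JOIN (SELECT id, value AS last_name FROM table1 WHERE tag = 'last') AS l
        ON t2.id = l.id
    ORDER BY t2.id
""").fetchall()

# Form 2: join table1 twice directly, with the tag test in each ON clause.
self_join_rows = conn.execute("""
    SELECT t2.occupation, f.value AS first_name, l.value AS last_name
    FROM table2 AS t2
    INNER JOIN table1 AS f ON f.tag = 'first' AND f.id = t2.id
    INNER JOIN table1 AS l ON l.tag = 'last'  AND l.id = t2.id
    ORDER BY t2.id
""").fetchall()

print(derived_rows)
print(self_join_rows)
```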