Which is the best practice to write SQL?

The two queries below give the same results.
I would like to know which one is better in terms of performance.
Query 1:
SELECT N.*
FROM NOTIFICATIONS N
JOIN NOTIFICATION_COMPANY_GROUPS NCG
ON ( N.COMPANY_ID = NCG.COMPANY_ID
AND N.ID = NCG.NOTIFICATION_ID )
JOIN COMPANY_USER_GROUPS CUG
ON ( N.COMPANY_ID = CUG.COMPANY_ID
AND CUG.COMPANY_GROUP_ID = NCG.COMPANY_GROUP_ID )
JOIN NOTIFICATION_PROPERTIES NP ON ( N.COMPANY_ID = NP.COMPANY_ID )
JOIN COMPANY_USER_PROPERTIES CUP
ON ( N.COMPANY_ID = CUP.COMPANY_ID
AND CUP.PROPERTY_ID = NP.PROPERTY_ID )
WHERE N.COMPANY_ID = 2138
AND CUG.COMPANY_USER_ID = 41422
AND CUP.COMPANY_USER_ID = 41422;
Query 2:
SELECT N.*
FROM NOTIFICATIONS N
JOIN NOTIFICATION_COMPANY_GROUPS NCG
ON ( N.COMPANY_ID = 2138
AND N.COMPANY_ID = NCG.COMPANY_ID
AND N.ID = NCG.NOTIFICATION_ID )
JOIN COMPANY_USER_GROUPS CUG
ON ( CUG.COMPANY_USER_ID = 41422
AND N.COMPANY_ID = CUG.COMPANY_ID
AND CUG.COMPANY_GROUP_ID = NCG.COMPANY_GROUP_ID )
JOIN NOTIFICATION_PROPERTIES NP ON ( N.COMPANY_ID = NP.COMPANY_ID )
JOIN COMPANY_USER_PROPERTIES CUP
ON ( CUP.COMPANY_USER_ID = 41422
AND N.COMPANY_ID = CUP.COMPANY_ID
AND CUP.PROPERTY_ID = NP.PROPERTY_ID );

I expect the performance should be the same, but you can use EXPLAIN to verify that the query plan is the same.
However, the first version is the "proper" way to write it. Generally, ON clauses should only contain conditions that relate the tables being joined, while conditions on single tables should be in WHERE clauses.
The only exception to this is in LEFT JOIN clauses, where conditions on the table being joined should be in the ON clause. If you put them in the WHERE clause instead, rows from the main table that have no match in the joined table (and therefore carry NULLs in the joined columns) will be filtered out unless you explicitly check for NULL. As an example:
SELECT ...
FROM T1
LEFT JOIN T2 ON T2.T1_id = T1.id AND T2.someCol = 3
versus
SELECT ...
FROM T1
LEFT JOIN T2 ON T2.T1_id = T1.id
WHERE T2.someCol = 3
In the first version, the test of T2.someCol is done before joining; the result will contain all rows from T1, but the ones with no matching row in T2 will have NULL for all the T2 columns. But the second version won't have any of these non-matching rows, because the join is done first, and then it performs the T2.someCol = 3 test; if there was no matching T2 row, T2.someCol will be NULL, and this test will fail and the row will be filtered out by WHERE.
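The difference can be reproduced on a tiny data set. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database; the three-row contents of T1 and T2 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (id INTEGER PRIMARY KEY);
    CREATE TABLE T2 (T1_id INTEGER, someCol INTEGER);
    INSERT INTO T1 (id) VALUES (1), (2), (3);
    -- id 1 has a T2 row with someCol = 3, id 2 has a T2 row with
    -- someCol = 5, and id 3 has no T2 row at all.
    INSERT INTO T2 (T1_id, someCol) VALUES (1, 3), (2, 5);
""")

# Condition in the ON clause: every T1 row survives.
on_rows = conn.execute("""
    SELECT T1.id, T2.someCol
    FROM T1
    LEFT JOIN T2 ON T2.T1_id = T1.id AND T2.someCol = 3
    ORDER BY T1.id
""").fetchall()

# Condition in the WHERE clause: rows with NULL in T2.someCol are dropped.
where_rows = conn.execute("""
    SELECT T1.id, T2.someCol
    FROM T1
    LEFT JOIN T2 ON T2.T1_id = T1.id
    WHERE T2.someCol = 3
    ORDER BY T1.id
""").fetchall()

print(on_rows)     # [(1, 3), (2, None), (3, None)]
print(where_rows)  # [(1, 3)]
```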
In the case of an inner join, it doesn't matter whether you do the comparison before or after joining; the results are equivalent. The query planner should order these in whichever way takes best advantage of indexes.
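One way to check the inner-join case is to run both forms and compare the results and the plans. The sketch below uses Python's built-in sqlite3 module and a cut-down, hypothetical two-table version of the schema; SQLite only stands in for the real database here, and its plan output will look different from that of other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, assumed versions of two of the tables in the question.
    CREATE TABLE notifications (id INTEGER PRIMARY KEY, company_id INTEGER);
    CREATE TABLE notification_company_groups (
        notification_id INTEGER, company_id INTEGER, company_group_id INTEGER);
    INSERT INTO notifications VALUES (1, 2138), (2, 2138), (3, 7);
    INSERT INTO notification_company_groups VALUES (1, 2138, 10), (3, 7, 11);
""")

filter_in_where = """
    SELECT n.id
    FROM notifications n
    JOIN notification_company_groups ncg
      ON n.company_id = ncg.company_id AND n.id = ncg.notification_id
    WHERE n.company_id = 2138
"""
filter_in_on = """
    SELECT n.id
    FROM notifications n
    JOIN notification_company_groups ncg
      ON n.company_id = 2138
     AND n.company_id = ncg.company_id AND n.id = ncg.notification_id
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

rows_where = sorted(conn.execute(filter_in_where).fetchall())
rows_on = sorted(conn.execute(filter_in_on).fetchall())
print(rows_where, rows_on)

# Print both plans side by side for a manual comparison.
print(plan(filter_in_where))
print(plan(filter_in_on))
```

On the real database, the equivalent step is to run that engine's own EXPLAIN on both queries against realistic data volumes.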

Related

Join or match value from two tables

I have two tables. For each row in the transaction table, I need to check the lock table to see whether it is locked.
In the example below, if the exact combination is present in the lock table I should pick that record; otherwise it should match on 'All' and bring back that record.
Lock Table
Transaction Table
Output
Query tried (it effectively does a cross join; I understand the reason but could not solve it):
SELECT a.GROUP,a.OFFICE,b.LOCK
FROM T_ITEMS a INNER JOIN LOCKED_T b
ON a.ORG=b.ORG
AND (a.OFFICE =b.OFFICE OR b.OFFICE='All')
AND a.GROUP=b.GROUP
What you want to do is match on the office, or use 'All' as a "wildcard". The problem is that for some items both conditions match, so you get two results.
So first do the exact-match join:
SELECT a.GROUP, a.OFFICE, b.LOCK
FROM T_ITEMS a
LEFT JOIN LOCKED_T b ON a.ORG = b.ORG
AND a.OFFICE = b.OFFICE
AND a.GROUP = b.GROUP
Now take those results and try to fill in the missing ones (the missing ones will have a NULL in the lock column):
SELECT
BASE.GROUP, BASE.OFFICE, COALESCE(BASE.LOCK, L.LOCK) AS LOCK
FROM
(SELECT
a.ORG, a.GROUP, a.OFFICE, b.LOCK
FROM
T_ITEMS a
LEFT JOIN
LOCKED_T b ON a.ORG = b.ORG
AND a.OFFICE = b.OFFICE
AND a.GROUP = b.GROUP) BASE
LEFT JOIN
LOCKED_T L ON BASE.ORG = L.ORG
AND L.OFFICE = 'All'
AND base.GROUP = L.GROUP
AND BASE.LOCK IS NULL
I look at this as a "defaulting" problem. That can be solved with two left joins:
SELECT i.GROUP, i.OFFICE,
COALESCE(l.LOCK, l_default.LOCK)
FROM T_ITEMS i LEFT JOIN
LOCKED_T l
ON l.ORG = i.ORG AND l.OFFICE = i.OFFICE LEFT JOIN
LOCKED_T l_default
ON l_default.OFFICE = 'All' AND l_default.GROUP = i.GROUP AND l.ORG IS NULL;
As the number of combinations grows, this gets trickier. So a more generalizable alternative uses a correlated subquery:
SELECT i.*,
(SELECT MAX(l.LOCK) KEEP (DENSE_RANK FIRST ORDER BY NULLIF(l.OFFICE, 'All') NULLS LAST, NULLIF(l.GROUP, 'All') NULLS LAST)
FROM LOCKED_T l
WHERE (l.OFFICE = i.OFFICE OR l.OFFICE = 'All') AND
(l.GROUP = i.GROUP OR l.GROUP = 'All')
) as LOCKED
FROM T_ITEMS i;
Oracle 12c supports lateral joins, so this subquery can actually go in the FROM clause instead.
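The "defaulting" idea can be tried out on a small data set. This is a rough sketch using Python's built-in sqlite3 module, not the Oracle setup from the question. The table names (items, locks), the column names (grp, is_locked) and the sample rows are all assumptions; GROUP and LOCK are avoided as column names because they are reserved words in many dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for T_ITEMS and LOCKED_T.
    CREATE TABLE items (org TEXT, office TEXT, grp TEXT);
    CREATE TABLE locks (org TEXT, office TEXT, grp TEXT, is_locked TEXT);
    INSERT INTO items VALUES ('O1', 'NY', 'G1'), ('O1', 'LA', 'G1');
    -- NY has an exact lock row; LA only has the 'All' fallback.
    INSERT INTO locks VALUES ('O1', 'NY', 'G1', 'Y'), ('O1', 'All', 'G1', 'N');
""")

rows = conn.execute("""
    SELECT i.office, i.grp,
           COALESCE(l.is_locked, d.is_locked) AS is_locked
    FROM items i
    LEFT JOIN locks l              -- exact match on office
           ON l.org = i.org AND l.office = i.office AND l.grp = i.grp
    LEFT JOIN locks d              -- 'All' row used as the default
           ON d.org = i.org AND d.office = 'All' AND d.grp = i.grp
    ORDER BY i.office
""").fetchall()

print(rows)  # [('LA', 'G1', 'N'), ('NY', 'G1', 'Y')]
```

Each item appears once: the exact match wins when it exists, and the 'All' row is only used as a fallback.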

SQL, Self join on 3 joined tables

Tables and requested output
I'm using the National Instruments TestStand default database setup. I've tried to simplify the DB layout in the picture above.
I can manage to get what I want through some rather "complicated" SQL, but it's very slow.
I think there is a better way, and then I stumbled over SELF JOIN. Basically, what I want is to get data values from several different rows for one "serial number".
My problem is combining the self join with the "general" join of my tables.
I'm using an Access database at the moment.
This will give you the output you're aiming for with the sample data:
with x as (
select
row_number() over (partition by t1.Serial order by t1.Serial) as [RN],
t1.Serial,
case when t3.Sub_Test_Name = 'AAA' then t3.Value end as [AAA],
case when t3.Sub_Test_Name = 'BBB' then t3.Value end as [BBB],
case when t3.Sub_Test_Name = 'CCC' then t3.Value end as [CCC],
case when t3.Sub_Test_Name = 'DDD' then t3.Value end as [DDD]
from Table_1 t1
inner join Table_2 t2 on t2.Table_1_Id = t1.Id
inner join Table_3 t3 on t3.Table_2_Id = t2.Id
)
select
x.Serial,
AAA.AAA,
BBB.BBB,
CCC.CCC,
DDD.DDD
from x
left outer join x AAA on AAA.Serial = x.Serial and AAA.RN = x.rn + 0
left outer join x BBB on BBB.Serial = x.Serial and BBB.RN = x.rn + 1
left outer join x CCC on CCC.Serial = x.Serial and CCC.RN = x.rn + 2
left outer join x DDD on DDD.Serial = x.Serial and DDD.RN = x.rn + 3
where x.rn = 1
This uses self joins as you mentioned (where you see x being left joined to itself multiple times in the final select statement).
I've deliberately added extra columns CCC and DDD so it is easier to see how you would build this out for a larger data set, incrementing the row_number offset for each join.
I've tested this in SQL Fiddle and you're welcome to play around with it. If you need to apply additional filters, your where clause should be placed inside the CTE.
Note that you're effectively pivoting the data with this sort of query (except we're not aggregating anything, so we can't use the built-in PIVOT option). The downside of both this method and real pivots is that you have to manually specify every column header, with its own CASE expression in the CTE and a left join in the final select statement. This can get unwieldy for medium to large data sets, so it is best suited to cases where you will have a small number of known column headers in your results.
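For comparison, here is a sketch of a different technique from the one in the answer: conditional aggregation, which collapses the CASE columns with MAX and GROUP BY instead of self joins. It runs on Python's built-in sqlite3 module; the single flattened results table and its sample values are assumptions standing in for the three joined tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened result of joining Table_1, Table_2 and Table_3.
    CREATE TABLE results (serial TEXT, sub_test_name TEXT, value REAL);
    INSERT INTO results VALUES
        ('S1', 'AAA', 1.5), ('S1', 'BBB', 2.5),
        ('S2', 'AAA', 3.0), ('S2', 'BBB', 4.0);
""")

rows = conn.execute("""
    SELECT serial,
           -- MAX ignores the NULLs produced by the non-matching CASE branches.
           MAX(CASE WHEN sub_test_name = 'AAA' THEN value END) AS AAA,
           MAX(CASE WHEN sub_test_name = 'BBB' THEN value END) AS BBB
    FROM results
    GROUP BY serial
    ORDER BY serial
""").fetchall()

print(rows)  # [('S1', 1.5, 2.5), ('S2', 3.0, 4.0)]
```

This still needs one CASE expression per column header, so the same caveat about a small, known set of columns applies.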

Query with combined WHERE clause is slower than two individual WHERE clauses

I'm having a performance problem with a SQL query that is generated by a .NET application.
Basically what the query is doing is:
(query1) left join (query2) right join (queries3 to 30) WHERE (query1.ID IS NULL) OR (query3.ID IS NULL AND query4.ID IS NULL AND… queryN.ID IS NULL)
When the query only does WHERE A (query1.ID), the query is fast.
When the query only does WHERE B (query3 to 30), the query is fast.
When A and B are combined into one WHERE clause with an OR, the query is very slow.
I'm looking for a way to optimize this query without variables or stored procedures.
The query:
SELECT DISTINCT [Table0].[FIELD]
FROM /*8*/ ([Table0] AS [Table0]
INNER JOIN
[XTABLE] AS [XTABLE0]
ON [Table0].ID = [XTABLE0].ID1
AND [XTABLE0].ID3 = 52)
RIGHT OUTER /*10*/ JOIN
[Table1] AS [Table1]
/*21*/ /*11*/ ON [XTABLE0].ID2 = [Table1].ID
AND [XTABLE0].ID3 = 52
LEFT OUTER JOIN
([XTABLE] AS [XTABLE1]
INNER JOIN
[Table2] AS [Table2]
ON [XTABLE1].ID1 = [Table2].ID
AND [XTABLE1].ID3 = 19
/*20a*/ INNER JOIN
[XTABLE] AS [XTABLE2]
ON [Table2].ID = [XTABLE2].ID1
AND [XTABLE2].ID3 = 8
INNER JOIN
[Table3] AS [Table3]
ON [XTABLE2].ID2 = [Table3].ID
AND [XTABLE2].ID3 = 8/*22*/ )
ON [Table1].ID = [XTABLE1].ID2
AND [XTABLE1].ID3 = 19
/*26 */ LEFT OUTER JOIN
([XTABLE] AS [XTABLE3]
... and tens of similar INNER JOIN blocks
WHERE (/*13*/ [XTABLE0].ID IS NULL)
OR (/*25*/ [XTABLE1].ID IS NULL
AND /*27b*/ [XTABLE3].ID IS NULL
AND /*27b*/ [XTABLE5].ID IS NULL
... and tens of similar lines
AND /*27b*/ [XTABLE131].ID IS NULL);
You are OUTER JOINing the queries, so when you put conditions in the WHERE clause on columns that come from the outer-joined table expressions (derived tables in this case), the join will more than likely be treated as an INNER JOIN. You can see that by checking the query plan.
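How a WHERE condition interacts with an outer join depends on the kind of condition. The sketch below uses Python's built-in sqlite3 module with two made-up tables (parent and child, not the tables from the question). It suggests that a value test on the outer-joined table returns the same rows as an inner join, while an IS NULL test like the ones in the question keeps only the unmatched rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: parent 3 has no child row.
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, kind INTEGER);
    INSERT INTO parent VALUES (1), (2), (3);
    INSERT INTO child VALUES (10, 1, 52), (11, 2, 19);
""")

def ids(sql):
    return [row[0] for row in conn.execute(sql)]

# A value test in WHERE rejects the NULL-extended rows ...
left_with_filter = ids("""
    SELECT p.id FROM parent p
    LEFT JOIN child c ON c.parent_id = p.id
    WHERE c.kind = 52 ORDER BY p.id
""")
# ... so it returns the same rows as an inner join.
inner = ids("""
    SELECT p.id FROM parent p
    JOIN child c ON c.parent_id = p.id
    WHERE c.kind = 52 ORDER BY p.id
""")
# An IS NULL test keeps only the parents with no match.
unmatched = ids("""
    SELECT p.id FROM parent p
    LEFT JOIN child c ON c.parent_id = p.id
    WHERE c.id IS NULL ORDER BY p.id
""")

print(left_with_filter, inner, unmatched)  # [1] [1] [3]
```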

Joining two tables on a key and then left outer joining a table on a number of criteria

I'm attempting to join 3 tables together in a single query. The first two have a key so each entry has a matching entry. This joined table will then be joined by a third table that could produce multiple entries for each entry from the first table (the joined ones).
select * from
(select a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession
from trade_monthly a, trade_monthly_second b
where
a.bidentifier = b.jidentifier AND
a.bsession = b.JSession)
left outer join
trade c
on c.symbol = a.symbol
order by a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
There will be more criteria (not just c.symbol = a.symbol) on the left outer join, but for now this should be enough. How can I nest the queries this way? I'm getting an "SQL command not properly ended" error.
Any help is appreciated.
Thanks
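As far as I know, every derived table must be given a name, and its columns must then be referenced through that name rather than through the aliases used inside it. So try something like this: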
SELECT * FROM
(SELECT a.bidentifier, ....
...
a.bsession = b.JSession) t
LEFT JOIN trade c
ON c.symbol = t.symbol
ORDER BY t.bidentifier, ...
Anyway I think you could use a simpler query:
SELECT a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.*
FROM trade_monthly a
INNER JOIN trade_monthly_second b
ON a.bidentifier = b.jidentifier
AND a.bsession = b.JSession
LEFT JOIN trade c
ON c.symbol = a.symbol
ORDER BY a.bidentifier, a.bsession, a.symbol, b.jidentifier, b.JSession, c.symbol
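The derived-table version can be tried on a small data set. This is a minimal sketch on Python's built-in sqlite3 module, not the asker's database; the price column and the sample rows are assumptions added for illustration. The outer query refers to the subquery's columns only through its alias t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trade_monthly (bidentifier INTEGER, bsession INTEGER, symbol TEXT);
    CREATE TABLE trade_monthly_second (jidentifier INTEGER, jsession INTEGER);
    -- price is a hypothetical column; the real trade table is not shown.
    CREATE TABLE trade (symbol TEXT, price REAL);
    INSERT INTO trade_monthly VALUES (1, 1, 'ABC'), (2, 1, 'XYZ');
    INSERT INTO trade_monthly_second VALUES (1, 1), (2, 1);
    INSERT INTO trade VALUES ('ABC', 10.0), ('ABC', 11.0);
""")

rows = conn.execute("""
    SELECT t.bidentifier, t.symbol, c.price
    FROM (SELECT a.bidentifier, a.bsession, a.symbol
          FROM trade_monthly a
          JOIN trade_monthly_second b
            ON a.bidentifier = b.jidentifier AND a.bsession = b.jsession) t
    LEFT JOIN trade c ON c.symbol = t.symbol
    ORDER BY t.bidentifier, c.price
""").fetchall()

# 'ABC' fans out to two trade rows; 'XYZ' has none and keeps a NULL price.
print(rows)  # [(1, 'ABC', 10.0), (1, 'ABC', 11.0), (2, 'XYZ', None)]
```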
Try this:
SELECT
`trade_monthly`.`bidentifier` AS `bidentifier`,
`trade_monthly`.`bsession` AS `bsession`,
`trade_monthly`.`symbol` AS `symbol`,
`trade_monthly_second`.`jidentifier` AS `jidentifier`,
`trade_monthly_second`.`jsession` AS `jsession`
FROM
(
(
`trade_monthly`
JOIN `trade_monthly_second` ON(
(
(
`trade_monthly`.`bidentifier` = `trade_monthly_second`.`jidentifier`
)
AND(
`trade_monthly`.`bsession` = `trade_monthly_second`.`jsession`
)
)
)
)
LEFT JOIN `trade` ON(
(
`trade`.`symbol` = `trade_monthly`.`symbol`
)
)
)
ORDER BY
`trade_monthly`.`bidentifier`,
`trade_monthly`.`bsession`,
`trade_monthly`.`symbol`,
`trade_monthly_second`.`jidentifier`,
`trade_monthly_second`.`jsession`,
`trade`.`symbol`
Why don't you just create a view of the two inner joined tables? Then you can build a query that joins this view to the trade table using the left outer join matching criteria.
In my opinion, views are one of the most overlooked solutions to a lot of complex queries.

Multiple joins in a query

I have this SP with more than 10 tables involved. The table AllData is being joined 3 times because of the FieldName condition in the where clause.
Any suggestions on how to handle this complex query better will be greatly appreciated. Mostly, I would like to avoid joining AllData multiple times (with the aliases adl, adl2, adl3), since this could affect performance.
Here is the SP:
ALTER PROCEDURE [dbo].[StoredProc1]
AS
select case when pd.Show_Photo = '1,1,1'
then i.id
else null
end as thumbimage,
t1.FPId,
'WebProfile' as profiletype,
mmbp.Name as Name,
t1.Age,
t1.Height,
adl.ListValue as AlldataValue1,
adl2.ListValue as AlldataValue2,
adl3.ListValue as AlldataValue3,
c.CName,
ed.ELevel,
ed.EDeg,
NEWID()
from Table2 mmbp, Table3 u
join Table1 t1 on t1.Pid = u.Pid
left join Table4 mmb on t1.Pid= mmb.Pid
join table5 i on t1.Pid = i.Pid
join table6 pd on t1.Pid = pd.Pid
join table7 ed on t1.Pid = ed.Pid
join table8 c on t1.xxx= c.xxx
join AllData adl on t1.xxx = adl.ListKey
join AllData adl2 on b.ms = adl2.ListKey
join AllData adl3 on b.Diet = adl3.ListKey
where adl.FieldName=xxx and
adl2.FieldName='ms' and
adl3.FieldName='Diet' and
------
I note that you appear to have a Cartesian join between Table2 and Table3. Unless one of these tables is very small, this is likely to drastically affect performance. I suggest explicitly joining Table2 to one of the other tables in the query.
One thing you could try is moving the where conditions into the joins:
join AllData ad1 on t1.xxx = ad1.ListKey AND ad1.FieldName = xxx
join AllData ad2 on b.ms = ad2.ListKey AND ad2.FieldName = 'ms'
join AllData ad3 on b.Diet = ad3.ListKey AND ad3.FieldName = 'Diet'
This can help performance, since each join is limited to only the records you want, although for inner joins the optimizer will often produce the same plan either way. To do this all in one join you could use join AllData ad on (t1.xxx = ad.ListKey AND ad.FieldName = xxx) OR (b.ms = ad.ListKey AND ad.FieldName = 'ms').... The issue with this option is that you no longer have distinct columns for ad1, ad2, etc.
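Joining a lookup table once per field, with the FieldName filter in each ON clause, can be sketched on a small data set. The example uses Python's built-in sqlite3 module. The profile table, the ad_ms and ad_diet aliases and the sample values are all hypothetical; only AllData, FieldName, ListKey and ListValue come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- profile is a hypothetical stand-in for the table aliased as b.
    CREATE TABLE profile (pid INTEGER PRIMARY KEY, ms INTEGER, diet INTEGER);
    CREATE TABLE AllData (FieldName TEXT, ListKey INTEGER, ListValue TEXT);
    INSERT INTO profile VALUES (1, 1, 2), (2, 2, 1);
    INSERT INTO AllData VALUES
        ('ms', 1, 'Single'), ('ms', 2, 'Married'),
        ('Diet', 1, 'Vegetarian'), ('Diet', 2, 'Vegan');
""")

rows = conn.execute("""
    SELECT p.pid, ad_ms.ListValue, ad_diet.ListValue
    FROM profile p
    -- Each alias sees only the AllData rows for its own field.
    JOIN AllData ad_ms
      ON ad_ms.ListKey = p.ms AND ad_ms.FieldName = 'ms'
    JOIN AllData ad_diet
      ON ad_diet.ListKey = p.diet AND ad_diet.FieldName = 'Diet'
    ORDER BY p.pid
""").fetchall()

print(rows)  # [(1, 'Single', 'Vegan'), (2, 'Married', 'Vegetarian')]
```

Keeping one alias per field preserves a separate output column for each looked-up value, which the single OR-based join does not.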