Weird SQL Server view definition - sql-server-2005

I've "inherited" an app that is well over 10 years old, and it does show its age at times. Today I stumbled across a really weird view definition that I just can't make sense of. Can you help me? It was originally written for SQL Server 7.0 and has since been migrated to SQL Server 2005, but it has obviously never been refactored or redone.
This is the view definition - based on a bunch of tables and another view:
CREATE VIEW dbo.MyOddView
AS
SELECT
t1.MVOID, t1.SomeOtherColumn,
t2.Number ,
t3.OID, t3.FKOID,
t4.AcctNo,
t5.ShortDesc, t5.ZipCode, t5.City,
t6.BankAcctNo
FROM
dbo.viewFirst vf
INNER JOIN
dbo.Table1 t1 ON vf.MVOID = t1.MVOID AND vf.ValidFrom = t1.ValidFrom
LEFT OUTER JOIN
dbo.Table2 t2
RIGHT OUTER JOIN
dbo.Table3 t3 ON t2.OID = t3.FKOID
LEFT OUTER JOIN
dbo.Table4 t4 ON t3.ZVOID = t4.OID
LEFT OUTER JOIN
dbo.Table5 t5
INNER JOIN
dbo.Table4 t6 ON t5.OID = t6.BCOID
ON t4.ZVOID = t5.OID
ON t2.AddressOID = t4.OID
GO
What I don't get are the two JOINs (for Table2 t2 and Table5 t5) that have no JOIN condition listed next to them, and the two extra ON conditions at the end of the view definition. I can't seem to rip this apart and put it back together in "proper" ANSI JOIN syntax so that my row count stays the same (my original view returns a little over 12,000 rows, and a first attempt at refactoring it returned a single row).
Any ideas? What the heck is this? It looks like totally invalid SQL to me, but it appears to be doing its job (and has been for the past several years). Any thoughts? Pointers?

SELECT ...
FROM dbo.viewFirst vf
    INNER JOIN dbo.Table1 t1
        ON vf.MVOID = t1.MVOID
        AND vf.ValidFrom = t1.ValidFrom
    LEFT OUTER JOIN dbo.Table2 t2
        RIGHT OUTER JOIN dbo.Table3 t3
            ON t2.OID = t3.FKOID
        LEFT OUTER JOIN dbo.Table4 t4
            ON t3.ZVOID = t4.OID
        LEFT OUTER JOIN dbo.Table5 t5
            INNER JOIN dbo.Table4 t6
                ON t5.OID = t6.BCOID
            ON t4.ZVOID = t5.OID
        ON t2.AddressOID = t4.OID
This syntax is covered in chapter 7 of Inside SQL Server 2008 T-SQL Querying; alternatively, see this article by Itzik Ben-Gan and the follow-up letter by Lubor Kollar.
Having the ON clause for t2.AddressOID = t4.OID last, for example, means that the LEFT JOIN that introduces t2 is logically processed last: the joins written between that LEFT OUTER JOIN and its ON clause are logically processed first, and then the LEFT JOIN is applied to the result of those joins.
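The effect of the ON clause position can be reproduced on a toy schema. The following is a minimal sketch, not the original view: it uses Python's built-in sqlite3 module, and the tables a, b and c with their columns are made up for illustration. SQLite does not accept the deferred ON form without parentheses, so the nested join is written with explicit parentheses here, which should express the same grouping. It compares the nested join with a naively "flattened" rewrite, which is the kind of change that can make a row count collapse.

```python
import sqlite3

# Hypothetical toy schema: a <- b <- c (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);
    CREATE TABLE c (id INTEGER PRIMARY KEY, b_id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (10, 1), (20, 2);   -- a.id = 3 has no b row
    INSERT INTO c VALUES (100, 10);          -- b.id = 20 has no c row
""")

# Nested join: b and c are inner joined first, and that result is
# then left joined to a, so every row of a survives.
nested = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN (b INNER JOIN c ON c.b_id = b.id)
           ON b.a_id = a.id
    ORDER BY a.id
""").fetchall()

# Flattened rewrite: the trailing inner join is applied to the result
# of the left join and discards the rows that have no c match.
flat = conn.execute("""
    SELECT a.id, b.id, c.id
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    INNER JOIN c ON c.b_id = b.id
    ORDER BY a.id
""").fetchall()

print(nested)
print(flat)
```

With this data, nested keeps all three rows of a, while flat keeps only the single row that survives the trailing inner join.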

Related

Inner join 2 clauses of different tables?

I'm asking whether it is possible to join using two conditions that refer to different tables, like this:
select t1.x,t1.y, t1.z
from t1
inner join t2 on t1.conditionA = t2.conditionA
inner join t3 on t2.conditionB = t3.conditionB
inner join t4 on t3.conditionC = t4.conditionC and t2.conditionD = t4.conditionD
where....
Is it wrong? How should I do it?
Yes, you can, but it is a clue that your tables are not well designed.

Want to understand a query for a view I'm trying to dissect

I'm confused by a bit of a query I'm working with.
select *
from Table1
inner join Table2
on Table1.id1 = Table2.id1
right outer join Table3
right outer join Table4
inner join Table5
on Table4.id1 = Table5.id1
on Table3.id1 = Table5.id2
on Table1.id2 = Table5.id3
I tried to keep the query as close to what I'm working with as I could.
I don't understand the joins without an ON, followed by the join with multiple ONs.
Are tables 3 and 4 not actually being joined until after table 5 is joined?
The following doesn't work: Table5.id1 and Table5.id2 produce the error 'The multi-part identifier "Table5.id_" could not be bound'.
select *
from Table1
inner join Table2
on Table1.id1 = Table2.id1
right outer join Table3
on Table3.id1 = Table5.id2
right outer join Table4
on Table4.id1 = Table5.id1
inner join Table5
on Table1.id2 = Table5.id3
Additionally, this version does run, because Table5 is joined first, which resolves the binding error, but I receive about 27k more records than wanted:
select *
from Table1
inner join Table2
on Table1.id1 = Table2.id1
inner join Table5
on Table1.id2 = Table5.id3
right outer join Table3
on Table3.id1 = Table5.id2
right outer join Table4
on Table4.id1 = Table5.id1
So at this point it's obvious the original query is built the way it is for a reason, but I still don't understand the logic behind it or what is actually happening.
Any help would be much appreciated.
What you have here are multiple nested joins. Before I explain, I'm going to reformat the query a bit to make it easier to see what's going on.
select *
from Table1
    inner join Table2 on Table1.id1 = Table2.id1
    right outer join Table3                                -- Join B
        right outer join Table4                            -- Join A
            inner join Table5 on Table4.id1 = Table5.id1
        on Table3.id1 = Table5.id2                         -- ON clause for Join A
    on Table1.id2 = Table5.id3                             -- ON clause for Join B
Nested joins let you join two tables together then join that result to another set of records. Initially that doesn't sound terribly useful. That's just a regular join, right? Kinda. The difference is that only if the inner-most join succeeds does it attempt to join that row to the outer table. This isn't really useful at all if all you are doing is using inner joins. It becomes a lot more interesting if you are mixing inner and outer joins (more on that shortly).
I'll attempt to explain what's going on with this query, both in prose and in comments so hopefully between the two it will make sense.
First, the inner-most join here is an inner join between tables 4 and 5. Those tables are joined together first. That will give you a result set where each row in Table4 has at least one matching row in Table5 (according to whatever criteria exists in the on clause, in this case that Table4.id1 = Table5.id1). This implicitly filters out any rows from both Table4 and Table5 that don't have a match in the other table.
That result is then right joined to Table3 (on Table3.id1 = Table5.id2). Because it is a right join, the Table4/5 join set on the right-hand side is the preserved side: you get every row of the Table4/5 set, joined with its matching Table3 rows (if present).
Then we right join that whole result set with the Table1/2 combo (on Table1.id2 = Table5.id3). Again the right-hand side is preserved, so every row built so far is kept, and the Table1/2 columns are filled in only where a match exists.
The ultimate result set is everything from the Table4/5 join, with the matching Table3 rows and the matching Table1/Table2 rows attached where they exist (if Table1 doesn't have a matching Table2 record, neither will be joined). I believe this is correct (too much staring at this without the ability to run the query means I may have confused myself, but the basic idea is correct).
So why this crazy syntax? Alternatives are kind of a pain too. You could use CTEs or apply statements, both of which are their own kind of fun (not necessarily hard, just not your vanilla SQL. I tried converting your query using those and I think I got reasonably close, then I confused myself into a corner because of poor naming of things then I gave up). So why do this? Well it means you can ensure that you can outer join two tables to a third table only if there are matches in the first two tables. Maybe a more concrete example would help?
Say you have 4 tables Person, Order, OrderItem and OrderItemDiscount. You are tasked with getting back a result set that shows every order and to highlight orders that contain a Figlewubbit and where a discount code was used on it. So you write this:
select *
from Person p
    left join [Order] o on o.PersonId = p.PersonId   -- Order is a reserved word, hence the brackets
    left join OrderItem oi on oi.OrderId = o.OrderId
        and oi.ItemName = 'Figlewubbit'
    left join OrderItemDiscount oid on oid.OrderItemId = oi.OrderItemId
Another way to write it would be this:
select *
from Person p
    left join [Order] o on o.PersonId = p.PersonId   -- Order is a reserved word, hence the brackets
    left join OrderItem oi
        inner join OrderItemDiscount oid on oid.OrderItemId = oi.OrderItemId
    on oi.OrderId = o.OrderId
        and oi.ItemName = 'Figlewubbit'
The execution plan here will change. OrderItem and OrderItemDiscount get joined together first, and then that set is fed into the left join to Order. Each joined OrderItem/OrderItemDiscount row is effectively treated as a combined entity by the other joins. You won't get one without the other.
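To see what "you won't get one without the other" means in practice, here is a small sketch of the two queries above run against made-up data with Python's built-in sqlite3 module. Everything about the data is an assumption for illustration: the DiscountCode column, the sample rows, and the alias d (used instead of oid). The nested join is written with explicit parentheses because SQLite requires them. Order 2 contains a Figlewubbit that has no discount, which is where the two forms differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonId INTEGER PRIMARY KEY);
    CREATE TABLE "Order" (OrderId INTEGER PRIMARY KEY, PersonId INTEGER);
    CREATE TABLE OrderItem (OrderItemId INTEGER PRIMARY KEY,
                            OrderId INTEGER, ItemName TEXT);
    -- DiscountCode is a hypothetical column, not taken from the answer.
    CREATE TABLE OrderItemDiscount (OrderItemId INTEGER, DiscountCode TEXT);
    INSERT INTO Person VALUES (1);
    INSERT INTO "Order" VALUES (1, 1), (2, 1);
    INSERT INTO OrderItem VALUES (11, 1, 'Figlewubbit'), (12, 2, 'Figlewubbit');
    INSERT INTO OrderItemDiscount VALUES (11, 'SAVE10');  -- item 12 has no discount
""")

# Chain of left joins: an order item is returned even without a discount.
chained = conn.execute("""
    SELECT o.OrderId, oi.OrderItemId, d.DiscountCode
    FROM Person p
    LEFT JOIN "Order" o ON o.PersonId = p.PersonId
    LEFT JOIN OrderItem oi ON oi.OrderId = o.OrderId
                          AND oi.ItemName = 'Figlewubbit'
    LEFT JOIN OrderItemDiscount d ON d.OrderItemId = oi.OrderItemId
    ORDER BY o.OrderId
""").fetchall()

# Nested join: item and discount are joined first and attached as a unit,
# so an item without a discount does not show up at all.
nested = conn.execute("""
    SELECT o.OrderId, oi.OrderItemId, d.DiscountCode
    FROM Person p
    LEFT JOIN "Order" o ON o.PersonId = p.PersonId
    LEFT JOIN (OrderItem oi
               INNER JOIN OrderItemDiscount d ON d.OrderItemId = oi.OrderItemId)
           ON oi.OrderId = o.OrderId
          AND oi.ItemName = 'Figlewubbit'
    ORDER BY o.OrderId
""").fetchall()

print(chained)
print(nested)
```

Both queries return both orders. For order 2, the chained version shows the item with an empty discount, while the nested version shows neither the item nor the discount.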
(I apologize if this example seems contrived. Nested joins are a weird beast. They have their uses (I've needed them once or twice). But coming up with a simple example that requires their use is quite hard. They are a very specialized tool that usually requires an equally specialized (and complicated) requirement to warrant their use. I highly recommend researching this some more and using simple versions of them first. Combining right joins and multiple nested joins even gives me a headache trying to parse it.)
Actually, tables 4 and 5 are joined first, then table 3 is joined. Here's the execution plan:
[execution plan image not available]

Right join or which type of join?

I have 4 tables.
1 with just sitenames.
3 tables, which contain site names and the number of hits on them for different user types.
I need to make a report on the number of hits on those sites for each user type, like this:
Site Usertype1hits Usertype2hits Usertype3hits
So in the select part I need to cross-check with a table called noanswer
like
select *
from table
where site in (select site from noanswer)
So from what I understand I need to use a join, and in this case a right join?
How do I use a join in this query?
You would use left join:
select sn.*, t1.cnt, t2.cnt, t3.cnt
from sitenames sn left join
table1 t1
on t1.name = sn.name left join
table2 t2
on t2.name = sn.name left join
table3 t3
on t3.name = sn.name;
Your question is vague on the field names.
A left join keeps all rows in the first table and matches rows in the subsequent tables. It is much more commonly used than right join, probably because it is easier to read the logic thinking "I'll keep all of these rows". A series of right joins actually keeps all rows in the last table, so you have to wait to see which rows stay.
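As a rough illustration of "keeps all rows in the first table", the sketch below runs the same left join chain with Python's built-in sqlite3 module. The table and column names follow the query above (which the answer itself notes are guesses), and the sample rows are invented. It assumes each hits table holds at most one row per site; with several rows per site, the rows would multiply and you would aggregate first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sitenames (name TEXT PRIMARY KEY);
    CREATE TABLE table1 (name TEXT, cnt INTEGER);
    CREATE TABLE table2 (name TEXT, cnt INTEGER);
    CREATE TABLE table3 (name TEXT, cnt INTEGER);
    INSERT INTO sitenames VALUES ('alpha'), ('beta');
    INSERT INTO table1 VALUES ('alpha', 5);
    INSERT INTO table2 VALUES ('alpha', 2), ('beta', 7);
    -- table3 is intentionally left empty
""")

# Every site is returned once; missing hit counts come back as NULL (None).
report = conn.execute("""
    SELECT sn.name, t1.cnt, t2.cnt, t3.cnt
    FROM sitenames sn
    LEFT JOIN table1 t1 ON t1.name = sn.name
    LEFT JOIN table2 t2 ON t2.name = sn.name
    LEFT JOIN table3 t3 ON t3.name = sn.name
    ORDER BY sn.name
""").fetchall()

print(report)
```

Sites with no hits for a user type still appear in the report, with None in that column.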
I did it like this. I googled some more, as I couldn't get it to work even with the answer here; joins are totally new to me. The names Internalusers, ExternalUsers and so on are fake, as I can't use the real names here.
select s.Site, s2.Total_hits as 'Internalusers', s3.Total_Hits as 'ExternalUsers',
    s4.Total_hits as 'Company2users'
from Database.Table as S
    left outer join HITAnalyze.dbo.Internal as s2 on s2.site = s.Site
    left outer join HITAnalyze.dbo.External1 as s3 on s3.site = s.Site
    left outer join HITAnalyze.dbo.Company2 as s4 on s4.site = s.Site

Old SQL? multiple tables after FROM

I'm trying to rewrite the following query which was generated in some part by Impromptu. For some reason I cannot get my head around the multiple tables after the FROM and the nest of joins that follows. I have always used a "master" table then joined anything I would need after that. i.e select from, left join (table) on x=x left join (table) on x=x and so on.
I'm having a difficult time with the parentheses etc. in this. What would it look like written in "normal" query style? Thanks so much in advance!
select
T6."dateofservice"
, getdate(), -20000 - (1 - convert(float(53),-2) / abs(-2)) / 2
, T1."pgrp_specialty"
, T1."pgrp_prov_combo"
, T2."patsex"
, T3."restricted"
, T2."patdob"
, T2."patdecdate"
, T2."acctno"
, T2."patno"
from
"acctdemo_t" T3, "transaction_t" T5,
("patdemo_t" T2
LEFT OUTER JOIN ("provcode_t" T8
LEFT OUTER JOIN "provalt_t" T1 on T8."px" = T1."accesspractice" and T8."provcode" = T1."accessprovidercode") on T2."provcode" = T8."provcode")
LEFT OUTER JOIN "insset_t" T4 on T2."acctno" = T4."acctno" and T2."patno" = T4."patno", "charge_t" T6
LEFT OUTER JOIN "poscode_t" T7 on T6."poscode" = T7."poscode"
where
T2."patsex" <> 'U'
and T7."posid" = '3'
and T6."correction" = 'N'
and T5."txtype" = 'C'
and (T4."defaultset" = 'Y' or
(T4."inssetno" = 0 or T4."inssetno" is null)
and T4."defaultset" is null)
and T6."chgno" = T5."chgno"
and T2."patno" = T6."patno"
and T2."acctno" = T6."acctno"
and T2."acctno" = T3."acctno"
Listing multiple tables separated by commas in the FROM clause is an implicit join: the comma itself produces a cross join, and the conditions in the WHERE clause turn it into the equivalent of an INNER JOIN. For example,
select * from table1 a, table2 b
where a.id = b.id
is the same as:
select * from table1 a inner join table2 b on a.id = b.id
So, in your case, this part:
from "acctdemo_t" T3
, "transaction_t" T5
implies an inner join between acctdemo_t and transaction_t.
And the third part of this:
("patdemo_t" T2 LEFT OUTER JOIN
("provcode_t" T8 LEFT OUTER JOIN "provalt_t" T1 on T8."px" = T1."accesspractice" and T8."provcode" = T1."accessprovidercode")
on T2."provcode" = T8."provcode")
is actually a table set being created on the fly with its own join clauses, and it basically acts as a single table in this join. It is also joined with an inner join, since it's added with a comma.
I believe this can definitely be written to be more readable. The inline table set may not be good for performance, but that depends heavily on the number of records you have.
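The comma/WHERE versus INNER JOIN equivalence from the start of this answer is easy to check on a small data set. This is a minimal sketch using Python's built-in sqlite3 module. The table names table1 and table2 come from the example above, while the columns name and val and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, val TEXT);
    INSERT INTO table1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO table2 VALUES (1, 'x'), (3, 'z'), (4, 'w');
""")

# Old-style join: comma in FROM, join condition in WHERE.
implicit_join = conn.execute("""
    SELECT a.id, a.name, b.val
    FROM table1 a, table2 b
    WHERE a.id = b.id
    ORDER BY a.id
""").fetchall()

# Same join written with explicit INNER JOIN ... ON.
explicit_join = conn.execute("""
    SELECT a.id, a.name, b.val
    FROM table1 a
    INNER JOIN table2 b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

print(implicit_join)
print(explicit_join)
```

Both queries return the same two rows (ids 1 and 3), since only those ids exist in both tables.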

SQL style question: INNER JOIN in FROM clause or WHERE clause?

If you are going to join multiple tables in a SQL query, where do you think is a better place to put the join statement: in the FROM clause or the WHERE clause?
If you are going to do it in the FROM clause, how do you format it so that it is clear and readable? (I'm talking about indents, newlines, whitespace in general.)
Are there any advantages/disadvantages to each?
I tend to use the FROM clause, or rather the JOIN clause itself, indenting like this (and using aliases):
SELECT t1.field1, t2.field2, t3.field3
FROM table1 t1
    INNER JOIN table2 t2
        ON t1.id1 = t2.id1
    INNER JOIN table3 t3
        ON t1.id1 = t3.id3
This keeps the join condition close to where the join is made. I find it easier to understand this way than trying to look through the WHERE clause to figure out what exactly is joined how.
When making OUTER JOINs (ANSI-89 or ANSI-92), the location of a filter matters, because criteria specified in the ON clause are applied before the JOIN is made, whereas criteria against an OUTER JOINed table in the WHERE clause are applied after the JOIN is made. This can produce very different result sets.
In comparison, it doesn't matter for INNER JOINs if the criteria is provided in the ON or WHERE clauses -- the result will be the same. That said, I strive to keep the WHERE clause clean -- anything related to JOINed tables will be in their respective ON clause. Saves hunting through the WHERE clause, which is why ANSI-92 syntax is more readable.
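The ON-versus-WHERE difference for outer joins can be sketched as follows, using Python's built-in sqlite3 module. The customers and orders tables, their columns and the sample rows are hypothetical and exist only to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 'open'), (2, 2, 'closed');
""")

def run(sql):
    return conn.execute(sql).fetchall()

# Filter in ON: applied while matching, so Bob is kept with a NULL order.
left_on = run("""
    SELECT c.name, o.id FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id AND o.status = 'open'
    ORDER BY c.id
""")

# Filter in WHERE: applied after the join, so Bob's row is removed.
left_where = run("""
    SELECT c.name, o.id FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    WHERE o.status = 'open'
    ORDER BY c.id
""")

# For inner joins the placement makes no difference to the result.
inner_on = run("""
    SELECT c.name, o.id FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id AND o.status = 'open'
    ORDER BY c.id
""")
inner_where = run("""
    SELECT c.name, o.id FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    WHERE o.status = 'open'
    ORDER BY c.id
""")

print(left_on, left_where, inner_on, inner_where)
```

Only the left join with the filter in the ON clause keeps the customer who has no open order.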
I prefer the FROM clause, if for no other reason than that it distinguishes conditions that merely express foreign key relationships (filtering the Cartesian product) from conditions that are logical restrictions. For example:
SELECT * FROM Products P JOIN ProductPricing PP ON P.Id = PP.ProductId
WHERE PP.Price > 10
As opposed to
SELECT * FROM Products P, ProductPricing PP
WHERE P.Id = PP.ProductID AND Price > 10
I can look at the first one and instantly know that the only logical restriction I'm placing is the price, as opposed to the implicit machinery of joining tables together on the relationship key.
I almost always use the ANSI 92 joins because it makes it clear that these conditions are for JOINING.
Typically I write it this way:
FROM
    foo f
    INNER JOIN bar b
        ON f.id = b.id
Sometimes I write it this way when it's trivial:
FROM
    foo f
    INNER JOIN bar b ON f.id = b.id
    INNER JOIN baz b2 ON b.id = b2.id
When it's not trivial I do it the first way, e.g.
FROM
    foo f
    INNER JOIN bar b
        ON f.id = b.id
        and b.type = 1
or
FROM
    foo f
    INNER JOIN (
        SELECT max(date) date, id
        FROM foo
        GROUP BY
            id) lastF
        ON f.id = lastF.id
        and f.date = lastF.Date
Or the really weird one (not sure if I got the parens right, but it's supposed to be a LEFT join to table bar, where bar needs an inner join to baz):
FROM
    foo f
    LEFT JOIN (bar b
        INNER JOIN baz b2
            ON b.id = b2.id
        ) ON f.id = b.id
You should put joins in Join clauses which means the From clause. A different question could be had about where to put filtering statements.
With respect to indenting, there are many styles. My preference is to indent related joins and keep main clauses like Select, From, Where, Group By, Having and Order By indented at the same level. In addition, I put each of these main attributes and the first line of an On clause on its own line.
Select ..
From Table1
    Join Table2
        On Table2.FK = Table1.PK
            And Table2.OtherCol = '12345'
            And Table2.OtherCol2 = 9876
    Left Join (Table3
        Join Table4
            On Table4.FK = Table3.PK)
        On Table3.FK = Table2.PK
Where ...
Group By ...
Having ...
Order By ...
Use the FROM clause to be compliant with ANSI-92 standards.
This:
select *
from a
inner join b
on a.id = b.id
where a.SomeColumn = 'x'
Not this:
select *
from a, b
where a.id = b.id
and a.SomeColumn = 'x'
I definitely always do my JOINS (of whatever type) in my FROM clause.
The way I indent them is this:
SELECT fields
FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.t1_id
    INNER JOIN table3 t3 ON t1.id = t3.t1_id
        AND
        t2.id = t3.t2_id
In fact, I'll generally go a step farther and move as much of my constraining logic from the WHERE clause to the FROM clause, because this (at least in MS SQL) front-loads the constraint, meaning that it reduces the size of the recordset sooner in the query construction (I've seen documentation that contradicts this, but my execution plans are invariably more efficient when I do it this way).
For example, if I wanted to select only the rows in the above query where t3.id = 3, I could put that in the WHERE clause, or I could do it this way:
SELECT fields
FROM table1 t1
    INNER JOIN table2 t2 ON t1.id = t2.t1_id
    INNER JOIN table3 t3 ON t1.id = t3.t1_id
        AND
        t2.id = t3.t2_id
        AND
        t3.id = 3
I personally find queries laid out in this way to be very readable and maintainable, but this is certainly a matter of personal preference, so YMMV.
Regardless, I hope this helps.
ANSI joins. I omit any optional keywords from the SQL as they only add noise to the equation. There's no such thing as a left inner join, is there? And by default, a simple join is an inner join, so there's no particular point to saying 'inner join'.
Then I column align things as much as possible.
The point is that a large, complex SQL query can be very difficult to comprehend, so the more order that is imposed on it to make it readable, the better. Anybody looking at the query to fix, modify or tune it needs to be able to answer a few things right off the bat:
what tables/views are involved in the query?
what are the criteria for each join? What's the cardinality of each join?
what/how many columns are returned by the query?
I like to write my queries so they look something like this:
select PatientID              = rpt.ipatientid ,
       EventDate              = d.dEvent ,
       Side                   = d.cSide ,
       OutsideHistoryDate     = convert(nchar, d.devent,112) ,
       Outcome                = p.cOvrClass ,
       ProcedureType          = cat.ctype ,
       ProcedureCategoryMajor = cat.cmajor ,
       ProcedureCategoryMinor = cat.cminor
from      dbo.procrpt rpt
join      dbo.procd   d   on d.iprocrptid   = rpt.iprocrptid
join      dbo.proclu  lu  on lu.iprocluid   = d.iprocluid
join      dbo.pathlgy p   on p.iProcID      = d.iprocid
left join dbo.proccat cat on cat.iproccatid = lu.iproccatid
where rpt.ipatientid = @iPatientID