How to convert full outer join query to O-R query? - sql

I'm converting relational database into object-relational in Oracle.
I have a query that uses full outer join in the old one.
Is it possible to write the same query for O-R database without explicitly using full outer join?
For a normal inner join it's simple: I just use dot notation together with ref/deref.
I'm interested in this in general so let's say the relational query is:
select a.attr, b.attr from a full outer join b on (a.fk = b.pk);
I want to know if it's a good idea to do it this way:
select a.attr, b.attr from a_obj a full outer join b_obj b on (a.b_ref = ref(b));

Say I have the entities 'sale history' and 'sale forecast'. For a given product and period, I want to see the actual sale versus the forecast sale. However, any given period may lack a forecast or an actual sale, so I use SQL like this:
SELECT NVL(f.product_id, h.product_id), NVL(f.period, h.period),
f.forecast_sale, h.actual_sale
FROM forecast f
FULL OUTER JOIN history h ON h.product_id = f.product_id and h.period = f.period
Basically I am joining two child tables together on a common key. Full outer joins are rare in relational databases as normalisation would generally merge the entities with common keys. Left and right outer joins are more common as the typical use case is to select the parent and its children while requiring a row even if a parent has no children.
So if you have a full outer join, the first thing to examine is whether the data structure is correct.
The data structure in an object model is fixed or 'pre-joined'. If the data structures within the model don't readily support the production of a certain result set, then you pretty much have to go without (or at least code up a lot of functionality to extract and join the data manually).
If you post some details about the relevant data structures the advice may be more precise.
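To make the forecast/history example concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The sample rows are invented for illustration. Because older SQLite builds lack FULL OUTER JOIN, the sketch emulates it with a LEFT JOIN plus the unmatched history rows, which is a swapped-in technique and not the query from the answer itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forecast (product_id INTEGER, period TEXT, forecast_sale INTEGER);
    CREATE TABLE history  (product_id INTEGER, period TEXT, actual_sale INTEGER);
    -- Invented sample data: only period 2024-02 exists in both tables.
    INSERT INTO forecast VALUES (1, '2024-01', 100), (1, '2024-02', 120);
    INSERT INTO history  VALUES (1, '2024-02', 110), (1, '2024-03', 90);
""")

# Emulated full outer join: every forecast row with its history row (if any),
# followed by every history row that has no forecast at all.
rows = conn.execute("""
    SELECT f.product_id, f.period, f.forecast_sale, h.actual_sale
    FROM forecast f
    LEFT JOIN history h
           ON h.product_id = f.product_id AND h.period = f.period
    UNION ALL
    SELECT h.product_id, h.period, NULL, h.actual_sale
    FROM history h
    WHERE NOT EXISTS (
        SELECT 1 FROM forecast f
        WHERE f.product_id = h.product_id AND f.period = h.period
    )
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

Each period appears exactly once, with NULL (None in Python) standing in for the missing forecast or the missing actual sale.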

Related

PostgreSQL with lots of LEFT JOIN-s

I have a collection which can be of several types: Cat, Dog, Bird.
If it is a Cat then I need to join with Cat-related tables, same with Dog and Bird.
I end up with quite a lot of LEFT JOINs, and when the tables have lots of records, performance suffers.
SELECT Animal.*, CatDetail1.*, CatDetail2.*, DogDetail1.*, DogDetail2.*, BirdDetail1.*, BirdDetail2.*
FROM Animal
LEFT JOIN CatDetail1 on CatDetail1.id = Animal.id
LEFT JOIN CatDetail2 on CatDetail2.id = Animal.id
LEFT JOIN DogDetail1 on DogDetail1.id = Animal.id
LEFT JOIN DogDetail2 on DogDetail2.id = Animal.id
LEFT JOIN BirdDetail1 on BirdDetail1.id = Animal.id
LEFT JOIN BirdDetail2 on BirdDetail2.id = Animal.id
ORDER BY Animal.sequence
I was thinking a View might make it run faster but there is no official documentation supporting that.
Is there a way to reduce the number of LEFT JOINs and use INNER JOINs instead to improve performance?
If you need an outer join, you cannot use an inner join. And you clearly need an outer join here.
There is no better way to write that query, and there is no substantially different way to model a situation where you want different types of objects stored in one table.
If that query is slow, that is not surprising. After all, you want everything from seven tables, and want the result sorted. If the tables are large, that will take a while.
A view won't make any difference here, since a view is just a named SQL statement, and the view name will be replaced by its definition when the query is executed.
The question you should ask yourself is if you really need everything from seven tables. Perhaps you don't need all columns? Perhaps you don't need all rows?
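The point that an inner join cannot stand in for an outer join here can be checked with a small sketch. It uses Python's sqlite3 module instead of PostgreSQL, keeps only two of the detail tables, and invents the columns `kind`, `indoor`, and `breed` purely for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Animal (id INTEGER PRIMARY KEY, kind TEXT, sequence INTEGER);
    CREATE TABLE CatDetail1 (id INTEGER PRIMARY KEY, indoor INTEGER);
    CREATE TABLE DogDetail1 (id INTEGER PRIMARY KEY, breed TEXT);
    INSERT INTO Animal VALUES (1, 'Cat', 2), (2, 'Dog', 1), (3, 'Bird', 3);
    INSERT INTO CatDetail1 VALUES (1, 1);
    INSERT INTO DogDetail1 VALUES (2, 'Beagle');
""")

# Same query text, only the join type differs.
query = """
    SELECT Animal.id, Animal.kind, CatDetail1.indoor, DogDetail1.breed
    FROM Animal
    {join} CatDetail1 ON CatDetail1.id = Animal.id
    {join} DogDetail1 ON DogDetail1.id = Animal.id
    ORDER BY Animal.sequence
"""

left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

print(left_rows)   # every animal, with NULLs for the detail tables that do not apply
print(inner_rows)  # empty: no animal has a row in both detail tables
```

With inner joins, an animal survives only if it has a row in every detail table, which never happens when each animal belongs to exactly one type.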

Is a RIGHT JOIN generally faster than a LEFT JOIN?

I was told in an interview that a right join is typically faster than a left join.
Is this true?
That depends on the RDBMS, of course, but in general there is no reason for that to be true. A right join can easily be rewritten as a left join automatically, so if it were true, even a primitive query optimizer could do that transformation for you.
In any case, the semantics of the query determine which rows you need, so you don't get to pick the join type for performance reasons.
There is one case where this is generally true, though. When you have a data warehouse style query like this:
select aggregates...
from Facts
left join Dim1 on ...
left join Dim2 on ...
left join Dim3 on ...
left join Dim4 on ...
group by ...
You want to get a hash join plan with physical right joins. Physical left joins would use the huge Facts table to build the hash table, which is terrible. You would rather build small hash tables from the dimension inputs and then stream the huge Facts table through them, probing each one as you go.
Of course all good query optimizers do that for you (at least in databases that are meant for DW use).
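The build/probe idea can be sketched in plain Python. This is only an illustration of the principle, not how any particular database implements it; the function name `hash_left_join` and the sample rows are hypothetical:

```python
from collections import defaultdict


def hash_left_join(facts, dim, fact_key, dim_key):
    """Left-join facts to dim, building the hash table on the small dim input."""
    # Build phase: hash the small dimension table once.
    table = defaultdict(list)
    for d in dim:
        table[d[dim_key]].append(d)

    # Probe phase: stream the large fact input through the hash table.
    for f in facts:
        matches = table.get(f[fact_key])
        if matches:
            for d in matches:
                yield f, d
        else:
            # Outer-join semantics: keep the fact row even without a match.
            yield f, None


facts = [
    {"sale_id": 1, "dim_id": 10, "amount": 5},
    {"sale_id": 2, "dim_id": 99, "amount": 7},
    {"sale_id": 3, "dim_id": 10, "amount": 1},
]
dim = [{"dim_id": 10, "name": "north"}]

joined = [
    (f["sale_id"], d["name"] if d else None)
    for f, d in hash_left_join(facts, dim, "dim_id", "dim_id")
]
print(joined)
```

Memory use is proportional to the dimension table, while the fact rows are never held in memory all at once, which is why the build side should be the small input.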

Need Input | SQL Dynamic Query

I have a requirement to build a dynamic query based on user input and return the count of records in the result set.
There are 6 tables that must always be inner joined; the remaining joins depend on user input, and the query needs to perform well.
Here is the requirement
select count(A.A1) from table A
INNER JOIN table B on B.B1=A.A1
INNER JOIN table C on C.C1=B.B1
INNER JOIN table D on D.D1=C.C1
INNER JOIN table E on E.E1=D.D1
INNER JOIN table F on F.F1=E.E1
Now if the user selects some value in the UI, I have to execute the query as
select count(A.A1) from table A
INNER JOIN table B on B.B1=A.A1
INNER JOIN table C on C.C1=B.B1
INNER JOIN table D on D.D1=C.C1
INNER JOIN table E on E.E1=D.D1
INNER JOIN table F on F.F1=E.E1
INNER JOIN table G on G.G1=F.F1
Where G.Name like '%Germany%'
The user can send 1 to 5 choices, and I have to build the query accordingly and return the result.
If I add all the joins first and then add the where clause according to the choices, the query will be easy to write and will serve the purpose, but if the user does not select anything, I am creating unnecessary joins.
So which is the better approach: writing all the joins in advance and then filtering, or adding joins and filters on demand with a dynamic query?
It would be great if someone could provide some input.
When SQL Server executes a query, the first step is planning the query, i.e. deciding on a strategy to get the query result.
If you use inner joins, you generally make it compulsory to include all the tables, because an inner join requires matching rows on both sides of the join, so the query planner can't discard any tables.
However, if you replace the inner joins with left outer joins, matching rows are no longer required on both sides, so the query planner can decide whether or not to include the tables on the right. If you don't select, filter on, or otherwise use any column from the right side of a join, and the join column on that side is known to be unique (so the join cannot multiply rows), the query planner can discard that table when executing the query. That's the easiest way to get rid of your concerns.
On the other hand, if you want to control which tables to include and create a custom query for each case, you can use several techniques:
* Build a graph that describes the table relations, and use a graph manipulation library to work out the necessary tables from it. I did this once, but it is quite hard to achieve if you don't have experience with graphs.
* Use Entity Framework. You must build a simple model including all the tables. Then, to run each query, you can programmatically build the query in LINQ, and EF will take care of generating and executing the SQL query for you.
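A third, simpler option is to assemble the SQL text on demand. The sketch below shows one way this could look in Python with sqlite3 standing in for SQL Server. It is not taken from the answer: the mandatory chain is shortened to two tables, and the optional table `h`, its `status` column, and the names `OPTIONAL_FILTERS` and `build_count_query` are all hypothetical. Join fragments come from a fixed whitelist and user values are passed as bound parameters, so no user input is concatenated into the SQL:

```python
import sqlite3

# Mandatory part of the query (shortened to two tables for the sketch).
BASE_SQL = "SELECT COUNT(a.a1) FROM a INNER JOIN b ON b.b1 = a.a1"

# Hypothetical whitelist: filter name -> (join clause, predicate with placeholder).
OPTIONAL_FILTERS = {
    "country": ("INNER JOIN g ON g.g1 = b.b1", "g.name LIKE ?"),
    "status": ("INNER JOIN h ON h.h1 = b.b1", "h.status = ?"),
}


def build_count_query(choices):
    """Return (sql, params), adding only the joins the user's choices need."""
    joins, predicates, params = [], [], []
    for name, value in choices.items():
        join_sql, predicate = OPTIONAL_FILTERS[name]  # KeyError for unknown filters
        joins.append(join_sql)
        predicates.append(predicate)
        params.append(value)
    sql = " ".join([BASE_SQL, *joins])
    if predicates:
        sql += " WHERE " + " AND ".join(predicates)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a1 INTEGER); CREATE TABLE b (b1 INTEGER);
    CREATE TABLE g (g1 INTEGER, name TEXT); CREATE TABLE h (h1 INTEGER, status TEXT);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1), (2), (3);
    INSERT INTO g VALUES (1, 'Germany'), (2, 'France');
    INSERT INTO h VALUES (1, 'open'), (2, 'open'), (3, 'closed');
""")


def run_count(choices):
    sql, params = build_count_query(choices)
    return conn.execute(sql, params).fetchone()[0]


print(run_count({}))
print(run_count({"country": "%Germany%"}))
print(run_count({"country": "%Germany%", "status": "open"}))
```

When the user selects nothing, the generated statement contains only the mandatory joins, so no unnecessary tables are touched.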

Joining tables or to select from multiple tables

Which is better: joining tables or selecting from multiple tables?
For instance, let's assume the following scenario:
Using join:
SELECT COALESCE(SUM(SALARY),0) FROM X
JOIN Y ON X.X_ID=Y.Y_X_ID
OR
By selecting from multiple tables
SELECT COALESCE(SUM(SALARY),0) FROM X, Y
WHERE X.X_ID=Y.Y_X_ID
Both are joins. The first is an explicit join and the second one is an implicit join and is a SQL antipattern.
The second one is bad because it is easy to get an accidental cross join. It is also bad because, when you do want a cross join, it is not clear whether you intended it or produced it by accident.
Furthermore, in the second style, if you need to convert to an outer join, you have to change all the joins in the query or risk getting incorrect results. So the second style is harder to maintain.
Explicit joins were introduced in the last century (with SQL-92); why anyone is still using error-prone and hard-to-maintain implicit joins is beyond me.
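The accidental cross join is easy to reproduce. The following sketch runs the two queries from the question, plus a third with the WHERE clause forgotten, using Python's sqlite3 module; the sample rows are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (X_ID INTEGER PRIMARY KEY);
    CREATE TABLE Y (Y_X_ID INTEGER, SALARY INTEGER);
    INSERT INTO X VALUES (1), (2);
    INSERT INTO Y VALUES (1, 100), (2, 200);
""")


def total(sql):
    return conn.execute(sql).fetchone()[0]


# Explicit join: the join condition sits right next to the joined table.
explicit = total(
    "SELECT COALESCE(SUM(SALARY), 0) FROM X JOIN Y ON X.X_ID = Y.Y_X_ID"
)

# Implicit join: same result, as long as the WHERE clause is present.
implicit = total(
    "SELECT COALESCE(SUM(SALARY), 0) FROM X, Y WHERE X.X_ID = Y.Y_X_ID"
)

# Implicit join with the condition forgotten: a silent cross join in which
# every X row pairs with every Y row, so each salary is counted twice here.
forgotten = total("SELECT COALESCE(SUM(SALARY), 0) FROM X, Y")

print(explicit, implicit, forgotten)
```

The third query raises no error and returns a plausible-looking number, which is exactly what makes this kind of mistake hard to spot.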
A join is mainly used to retrieve data from multiple tables. SQL offers several types of join:
* Equi join (inner join)
* Outer join: left, right, or full
* Non-equi join
* Self join
* Cross join
You should use the JOIN syntax for a lot of reasons which can be found here.
Moreover, this syntax has the advantage of giving some hints to the query optimizer (during the computation of weights, those derived directly from the facts mentioned in this syntax are weighted more favorably than the others).

When to use SQL natural join instead of join .. on?

I'm studying SQL for a database exam, and the way I've seen SQL written is the way it looks on this page:
http://en.wikipedia.org/wiki/Star_schema
That is, a join written as JOIN <table name> ON <table attribute>, followed by the join condition for the selection. However, my course book and the exercises given to me by the academic institution use only natural join in their examples. So when is it right to use natural join? Should natural join be used if the query can also be written using JOIN .. ON?
Thanks for any answer or comment
A natural join will find columns with the same name in both tables and add one column in the result for each pair found. The inner join lets you specify the comparison you want to make using any column.
IMO, the JOIN ON syntax is much more readable and maintainable than the natural join syntax. Natural joins is a leftover of some old standards, and I try to avoid it like the plague.
The JOIN keyword is used in an SQL statement to query data from two or more tables, based on a relationship between certain columns in these tables.
Different Joins
* JOIN: Return rows when there is at least one match in both tables
* LEFT JOIN: Return all rows from the left table, even if there are no matches in the right table
* RIGHT JOIN: Return all rows from the right table, even if there are no matches in the left table
* FULL JOIN: Return all rows from both tables, matched where possible, with NULLs where the other table has no match
INNER JOIN
http://www.w3schools.com/sql/sql_join_inner.asp
FULL JOIN
http://www.w3schools.com/sql/sql_join_full.asp
A natural join is said to be an abomination because it does not allow qualifying key columns, which makes it confusing: you never know which "common" columns are being used to join two tables simply by looking at the SQL statement.
A NATURAL JOIN matches on any shared column names between the tables, whereas an INNER JOIN only matches on the given ON condition.
The two joins are often interchangeable and usually produce the same results. However, there are some important considerations:
* If a NATURAL JOIN finds no matching columns, it returns the cross product. This could produce disastrous results if the schema is modified. An INNER JOIN, on the other hand, will return a 'column does not exist' error, which fails loudly and is therefore much safer.
* An INNER JOIN self-documents with its ON clause, resulting in a clearer query that describes the table schema to the reader.
* An INNER JOIN results in a maintainable and reusable query in which the column names can be swapped in and out with changes in the use case or table schema.
* The programmer can notice column name mismatches (e.g. item_ID vs itemID) sooner if they are forced to define the ON predicate.
Otherwise, a NATURAL JOIN is still a good choice for a quick, ad-hoc query.
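The schema-sensitivity of NATURAL JOIN can be demonstrated with a short sketch in Python's sqlite3 module. The tables `item`, `orders`, `p`, and `q` and their columns are invented for illustration, and the behavior shown is SQLite's, which follows the standard rule of joining on all shared column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, item_id INTEGER);
    INSERT INTO item VALUES (1, 'pen'), (2, 'ink');
    INSERT INTO orders VALUES (10, 1), (11, 2);
""")

natural_sql = "SELECT order_id FROM orders NATURAL JOIN item ORDER BY order_id"
explicit_sql = """
    SELECT o.order_id FROM orders o
    JOIN item i ON i.item_id = o.item_id
    ORDER BY o.order_id
"""

# Before the schema change, both queries join on item_id only.
natural_before = conn.execute(natural_sql).fetchall()

# An unrelated schema change: orders gains its own `label` column (all NULL).
conn.execute("ALTER TABLE orders ADD COLUMN label TEXT")

# NATURAL JOIN now silently joins on item_id AND label, so nothing matches.
natural_after = conn.execute(natural_sql).fetchall()
# The explicit ON clause is unaffected by the new column.
explicit_after = conn.execute(explicit_sql).fetchall()

# With no shared column names at all, NATURAL JOIN degrades to a cross product.
conn.executescript("""
    CREATE TABLE p (x INTEGER); CREATE TABLE q (y INTEGER);
    INSERT INTO p VALUES (1), (2);
    INSERT INTO q VALUES (1), (2), (3);
""")
cross_count = conn.execute("SELECT COUNT(*) FROM p NATURAL JOIN q").fetchone()[0]

print(natural_before, natural_after, explicit_after, cross_count)
```

The text of the natural-join query never changed, yet its result did, which is the maintenance hazard the answers above describe.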