There are a few questions about the select statement

There are a few questions about the select statement - sql

I have a few questions about the select statement.
First of all, I have normalized 15 tables for this select query.
The problem is invisible because there is not much data right now.
However, since I try to process many tables in one select query, it seems to cause problems later.
So I want to add a few more select statements to divide the tables to search, but I want to know how different it is from doing it at once.
Secondly, if I use join, I will use outer join. If I join multiple tables with outer join, I'm not sure how to use left outer join and right outer join.
The currently created select query refers to 8 tables and one join is linked.
That is, the remaining rest of the tables have obtained data in subqueries and the remaining eight tables are likely to use join.
I would appreciate it if you could let me know the direction of the multiple outer joins.
Let me briefly show you some of the current select queries.
select
a.cal1,a.cal2,a.cal3,...,
(select b.cal1 from b
where a.cal4=b.cal2)
as "bcals",
(select c.cal1 from c
where a.cal5=c.cal2)
as "ccals",
....,
(select e.cal1 from e
where a.caln=e.cal2)
as "ecals",
(select sum(extract(year from age(f.endday,f.startday))
from f
where e.cal1=a.cal1)
as "fcals",
g.cal1,g.cal2,g.cal3,...,
(select h.cal1 from h
where g.cal4=h.cal2)
as "hcals"
from a left outer join g on a.cal1=g.cal5
where a.cal1=?;
Result:
a.cal1|a.cal2|a.cal3|...|hcals
var1 |var2 |var3 |...|varn
After this, I wonder how to join the rest of the tables.
To sum up
If there are many tables that need to be included in a select query statement, what is the difference between performance and performance when this complex query is divided into multiple select statements?
If we write inside a select statement, how should outer join be?
Is there a problem with the query?

Actually your code is correct, but it looks very complex. People will find it difficult to understand it. Using joins you can minimize the lines of code and also make it more readable.
SELECT
TBL1.AMOUNT T1,
TBL2.AMOUNT T2,
TBL3.AMOUNT T3
FROM TBL1
LEFT JOIN TBL2 ON TBL2.ID = TBL1.ID
LEFT JOIN TBL3 ON TBL3.ID = TBL1.ID
In the above code , there are three tables, and two joins. One can easily understand and debug/make changes. Please try this for your code.

Related

the difference between comma and join in sql

what is the difference between using comma or join between two different tables.
such as these two codes:
SELECT studentId, tutorId FROM student, tutor;
SELECT studentId, tutorId FROM student JOIN tutor;

There's no real difference WHEN executing them, but there is a readability, consistency and error mitigating issue at work:
Imagine you had 4 tables
If you used the old fashioned way of doing an INNER JOIN, you would end up with:
SELECT col1, col2
FROM tab1, tab2, tab3,tab4
WHERE tab1.id=tab2.tab1_id
AND tab4.id = tab2.tab3_id
AND tab4.id = tab3.tab4_id;
Using explicit INNER JOINS it would be:
SELECT col1, col2
FROM tab1
INNER JOIN tab2 ON tab1.id = tab2.tab1_id
INNER JOIN tab3 ON tab3.id = tab2.tab3_id
INNER JOIN tab4 ON tab4.id = tab3.tab4_id;
The latter shows you right in front of the table exactly what is it JOINing with. It has improved readability, and much less error prone, since it's harder to forget to put the ON clause, than to add another AND in WHERE or adding a wrong condition altogether (like i did in the query above :).
Additionally, if you are doing other types of JOINS, using the explicit way of writing them, you just need to change the INNER to something else, and the code is consistently constructed.

Based on the specific code you gave, there is no difference at all.
However, using the JOIN operator syntax, you are allowed to specify the join conditions, which is very important when doing LEFT JOINs or RIGHT JOINs

Successive LEFT SQL joins back to original

I need to redo sql statement in legacy Foxpro application and don't understand whether it is meaningful at all. Syntax is a bit specific - it extracts data from temporary table into the same temporary table ( overwriting) with some joins.
SELECT aa.*,b.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
LEFT JOIN job2 ON jobs.job_no=job2.rucjob;
left join jobs b on b.job_no=job2.job_no;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())
Since only one field is added from joined tables ( spa_date ) is there any point in 2 left joins or I am missing something. Isn't it equivalent to
SELECT aa.*,jobs.spa_date FROM (ALIAS()) aa INNER JOIN jobs ON aa.seq=jobs.seq ;
WHERE jobs.qty1<>0 INTO CURSOR (ALIAS())

They are different because b.spa_date come from the second left join. You may be missing filtered rows without both left joins.
You would need to know the intent of the original query and perhaps rewrite it to make more sense but I'd say the two queries are different.

INNER JOIN with complex condition dramatically increases the execution time

I have 2 tables with several identical fields needed to be linked in JOIN condition. E.g. in each table there are fields: P1, P2. I want to write the following join query:
SELECT ... FROM Table1
INNER JOIN
Table2
ON Table1.P1 = Table2.P1
OR Table1.P2 = Table2.P2
OR Table1.P1 = Table2.P2
OR Table1.P2 = Table2.P1
In the case I have huge tables this request is executing a lot of time.
I tried to test how long will be the request of a query with one condition only. First, I have modified the tables in such way all data from P2 & P1 where copied as new rows into Table1 & Table2. So my query is simple:
SELECT ... FROM Table1 INNER JOIN Table2 ON Table1.P = Table2.P
The result was more then surprised: the execution time from many hours (the 1st case) was reduced to 2-3 seconds!
Why is it so different? Does it mean the complex conditions are always reduce performance? How can I improve the issue? May be P1,P2 indexing will help? I want to remain the 1st DB schema and not to move to one field P.

The reason the queries are different is because of the join strategies being used by the optimizer. There are basically four ways that two tables can be joined:
"Hash join": Creates a hash table on one of the tables which it uses to look up the values in the second.
"Merge join": Sorts both tables on the key and then readsthe results sequentially for the join.
"Index lookup": Uses an index to look up values in one table.
"Nested Loop": Compars each value in each table to all the values in the other table.
(And there are variations on these, such as using an index instead of a table, working with partitions, and handling multiple processors.) Unfortunately, in SQL Server Management Studio both (3) and (4) are shown as nested loop joins. If you look more closely, you can tell the difference from the parameters in the node.
In any case, your original join is one of the first three -- and it goes fast. These joins can basically only be used on "equi-joins". That is, when the condition joining the two tables includes an equality operator.
When you switch from a single equality to an "in" or set of "or" conditions, the join condition has changed from an equijoin to a non-equijoin. My observation is that SQL Server does a lousy job of optimization in this case (and, to be fair, I think other databases do pretty much the same thing). Your performance hit is the hit of going from a good join algorithm to the nested loops algorithm.
Without testing, I might suggest some of the following strategies.
Build an index on P1 and P2 in both tables. SQL Server might use the index even for a non-equijoin.
Use the union query suggested in another solution. Each query should be correctly optimized.
Assuming these are 1-1 joins, you can also do this as a set of multiple joins:
from table1 t1 left outer join
table2 t2_11
on t1.p1 = t2_11.p1 left outer join
table2 t2_12
on t1.p1 = t2_12.p2 left outer join
table2 t2_21
on t1.p2 = t2_21.p2 left outer join
table2 t2_22
on t1.p2 = t2_22.p2
And then use case/coalesce logic in the SELECT to get the value that you actually want. Although this may look more complicated, it should be quite efficient.

you can use 4 query and Union there result
SELECT ... FROM Table1
INNER JOIN
Table2
ON Table1.P1 = Table2.P1
UNION
SELECT ... FROM Table1
INNER JOIN
Table2
ON Table1.P1 = Table2.P2
UNION
SELECT ... FROM Table1
INNER JOIN
Table2
ON Table1.P2 = Table2.P1
UNION
SELECT ... FROM Table1
INNER JOIN
Table2
ON Table1.P2 = Table2.P2

Does using CTEs help performance?
;WITH Table1_cte
AS
(
SELECT
...
[P] = P1
FROM Table1
UNION
SELECT
...
[P] = P2
FROM Table1
)
, Table2_cte
AS
(
SELECT
...
[P] = P1
FROM Table2
UNION
SELECT
...
[P] = P2
FROM Table2
)
SELECT ... FROM Table1_cte x
INNER JOIN
Table2_cte y
ON x.P = y.P
I suspect, as far as the processor is concerned, the above is just different syntax for the same complex conditions.

How to get a related row if one (another) row exists?

I'm aware that this question's title might be a little bit inaccurate but I couldn't come up with anything better. Sorry.
I have to fetch 2 different fields, one is always there, the other isn't. That means I'm looking at a LEFT JOIN. Good so far.
But the row I want shown is not the row whose existence is uncertain.
I would like to do something like:
Show name and picture, but only show the picture if that name has a picture_id. Otherwise show nothing for the picture, but I still want the names regardless(left join).
I know this might be a little confusing but there's some clever guys out here so I guess somebody will understand it.
I tried some approaches but I couldn't quite say what I want in SQL.
P.S.: solutions specific to Oracle are good too.
------------------------------------------------------------------------------------------------------------------------------------
EDIT I've tried some queries but the main problem I found is that, inside the ON clause, I am only able to reference the last table mentioned, in other words:
There are four tables from which I'm retrieving data, but I can only mention the last (third table) inside the on clause of the LEFT JOIN(which is the 4th table). I'll describe the tables hopefully that'll help. Try not to delve too much on the names, because they are in Portuguese:
There are 4 tables. The fields I want to retrieve are :TB395.dsclaudo and TB397.dscrecomendacao, for a given TB392.nronip. The tables are as follows:
TB392(laudoid,nronip,codlaudo) // laudoid is PK, references TB395
TB395(codlaudo,dsclaudo) //codlaudo is PK
TB398(laudoid,codrecomendacao) //the pair laudoid,codrecomendacao is PK , references TB397
TB397(codrecomendacao,dscrecomendacao) // codrecomendacao is PK
Fields with the same name are foreign keys.
The problem is that there's no guarantee that, for a given laudoid,there will be one codrecomendacao. But, if there is, I want the dscrecomendacao field returned, that's what I don't know how to do. But even if there isn't a corresponding codrecomendacao for the laudoid, I still want the dsclaudo field, that's why I think a LEFT JOIN applies.

Sounds like you want your primary row source to be the join of TB392 and TB395; then you want an outer join to TB398, and when that gets a match, you want to lookup the corresponding value in TB397.
I would suggest coding the primary join as one inline view; the join between the two extra tables as a second inline view; and then doing an outer join between them. Something like:
SELECT ... FROM
(SELECT ... FROM TB392 JOIN TB395 ON ...) join1
LEFT JOIN
(SELECT ... FROM TB398 JOIN TB397 ON ...) join2
ON ...

It would be nice if you could specify what your tables are, which columns are on which tables, and what columns they join on. Its not clear if you have two tables or only one. I guess you have two tables because you are talking about a LEFT JOIN, and seem to imply that the join is on the name column. So you can use the NVL2 function to accomplish waht you want. So guessing what I can from your question, maybe something like:
SELECT T1.name
, NVL2( T2.picture_id, T1.picture, NULL )
FROM table1 T1
LEFT JOIN
table2 T2
ON T1.name = T2.name
If you only have one table, then its even simpler
SELECT T1.name
, NVL2( T1.picture_id, T1.picture, NULL )
FROM table1 T1

I think you need:
SELECT ...
FROM
TB395
JOIN
TB392
ON ...
LEFT JOIN --- this should be a LEFT JOIN
TB398
ON ...
LEFT JOIN --- and this as well, so the previous is not cancelled
TB397
ON ...
The details may be not accurate:
SELECT
a.dsclaudo
, b.laudoid
, c.codrecomendacao
, d.dscrecomendacao
FROM
TB395 a
JOIN
TB392 b
ON b.codlaudo = a.codlaudo
LEFT JOIN
TB398 c
ON c.laudoid = b.laudoid
LEFT JOIN
TB397 d
ON d.codrecomendacao = c.codrecomendacao

Create two views and then do your left join on the views. For example:
Create View view392_395
as
SELECT
t1.laudoid,
t1.nronip,
t1.codlaudo,
t2.dsclaudo
FROM TB392 t1
INNER JOIN TB395 t2
ON t1.codlaudo
= t2.codlaudo
Create View view398_397
as
SELECT
t1.laudoid,
t1.codrecomendacao,
t2.dscrecomendacao
FROM TB398 t1
INNER JOIN TB397 t2
ON t1.codrecomendacao
= t2.codrecomendacao
SELECT
v1.laudoid,
v1.nronip,
v1.codlaudo,
v1.dsclaudo,
v2.codrecomendacao,
v2.dscrecomendacao
FROM view392_395 v1
LEFT OUTER JOIN view398_397 v2
ON v1.laudoid
= v2.laudoid
In my opinion, views are always under used. Views are your friend. They can simplify some of the most complicated queries.

how to best organize the Inner Joins in (select) statement

let's say i have three tables, each one relates to another,
when i need to get a column from each table, does it make difference how to organize
the (inner joins)??
Select table1.column1,table2.column2,table3.column2
From table1
Inner Join table2 on ..... etc
Inner Join table3 on .....
in another words, can i put (table2) after (From )??
Select table1.column1,table2.column2,table3.column2
From table2
Inner Join table1 on ..... etc
Inner Join table3 on .....

For most queries, order does not matter.
An INNER JOIN is both associative and commutative so table order does not matter
SQL is declarative. That is, how you define the query is not how the optimiser works it out. It does not do it line by line as you wrote it.
That said...
OUTER JOINs are neither associative nor commutative
For complex queries, the optimiser will "best guess" rather than go through all possibilities which "costs" too much. Table order may matter here

The inner join operation has left to right associativity. It doesn't matter much in which order you write the tables, as long as you don't refer to any tables in the ON condition before they have been joined.
When the statement is executed the database engine will determine the best order to execute the join and this may be different from the order they appear in the SQL query.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas