SQL SELECT query performance is very slow with joins

Please help me correct the query below to improve back-end performance. I am using an Oracle database.
Query execution is very slow:
SELECT
A.USER_PROFILE_ID,
B.LAST_NAME||','||B.FIRST_NAME||' - '||B.USER_PROFILE_ID AS EXPR1,
A.DEPARTMENT_CODE_ID,
C.NAME AS EXPR2,
A.EFFECTIVE_DATE,
A.EFFECTIVE_STATUS,
A.INSRT_USER,
A.INSRT_TS,
A.MOD_USER,
A.MOD_TS
FROM
USER_PROFILE_DEPARTMENT A,
USER_PROFILE B, DEPARTMENT_CODE C
WHERE
A.USER_PROFILE_ID = B.USER_PROFILE_ID
AND A.DEPARTMENT_CODE_ID = C.DEPARTMENT_CODE_ID
ORDER BY
EXPR1
I couldn't find a solution myself; please help.

Avoid using the WHERE clause to join tables. Instead, use the JOIN
keyword to explicitly specify the relationship between the tables being joined.
For example, instead of writing:
SELECT *
FROM "USER_PROFILE_DEPARTMENT" A, "USER_PROFILE" B, "DEPARTMENT_CODE" C
WHERE A.USER_PROFILE_ID = B.USER_PROFILE_ID AND A.DEPARTMENT_CODE_ID = C.DEPARTMENT_CODE_ID
You can use:
SELECT *
FROM "USER_PROFILE_DEPARTMENT" A
INNER JOIN "USER_PROFILE" B ON A.USER_PROFILE_ID = B.USER_PROFILE_ID
INNER JOIN "DEPARTMENT_CODE" C ON A.DEPARTMENT_CODE_ID = C.DEPARTMENT_CODE_ID
Some tips when it comes to performance in SQL:
Use appropriate indexes on the columns used in the JOIN and WHERE
clauses to improve the performance of the query. This can speed up
the process of finding matching rows in the tables being joined.
Consider using sub-queries or derived tables to retrieve the data you need, rather than joining multiple large tables. This can improve performance by reducing the amount of data that needs to be scanned and processed.
Limit the number of rows returned if you only need a small subset of the data. Note that Oracle has no LIMIT keyword; use FETCH FIRST n ROWS ONLY (Oracle 12c and later) or a ROWNUM filter instead. Returning fewer rows reduces the amount of data that needs to be processed and transferred, which can improve performance.
If the query is still running slowly despite these changes, inspect the execution plan (EXPLAIN PLAN FOR ... plus DBMS_XPLAN.DISPLAY in Oracle; EXPLAIN or EXPLAIN ANALYZE in databases that support them) to understand where the bottlenecks are and how to optimize the query further.
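The indexing and plan-inspection tips can be illustrated outside Oracle as well. Here is a minimal sketch using Python's built-in sqlite3 module; the table, column, and index names are invented for illustration and only mirror the shape of the question's schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Invented tables mirroring the shape of the question's schema.
cur.execute("CREATE TABLE user_profile (user_profile_id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT)")
cur.execute("CREATE TABLE user_profile_department (user_profile_id INTEGER, department_code_id INTEGER)")

query = """
SELECT b.last_name || ',' || b.first_name
FROM user_profile_department a
JOIN user_profile b ON a.user_profile_id = b.user_profile_id
WHERE a.department_code_id = 1
"""

# Without an index on the JOIN/WHERE columns, the plan scans the whole table.
plan_before = " ".join(row[3] for row in cur.execute("EXPLAIN QUERY PLAN " + query))

# Index the columns used in the JOIN and WHERE clauses.
cur.execute("CREATE INDEX ix_upd ON user_profile_department (department_code_id, user_profile_id)")

plan_after = " ".join(row[3] for row in cur.execute("EXPLAIN QUERY PLAN " + query))
print(plan_before)  # a full scan of the table
print(plan_after)   # now mentions ix_upd
```

The same workflow applies in Oracle with EXPLAIN PLAN FOR and DBMS_XPLAN.DISPLAY: look at the plan before and after creating the index and confirm the access path changed.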


Tuning Oracle Query for slow select

I'm working on an Oracle query that does a SELECT on a huge table, and the joins with other tables seem to cost a lot of processing time.
I'm looking for tips on how to improve this query.
I'm attaching a version of the query and its explain plan.
Query
SELECT
l.gl_date,
l.REST_OF_TABLES,
(
SELECT
MAX(tt.task_id)
FROM
bbb.jeg_pa_tasks tt
WHERE
l.project_id = tt.project_id
AND l.task_number = tt.task_number
) task_id
FROM
aaa.jeg_labor_history l,
bbb.jeg_pa_projects_all p
WHERE
p.org_id = 2165
AND l.project_id = p.project_id
AND p.project_status_code = '1000'
Something to mention:
This query takes data from Oracle to send to a SQL Server database, so it has to stay this big; I can't narrow the scope of the query.
The purpose is to set it up as a SQL Server job with SSIS so it runs periodically.
One obvious suggestion is not to use a sub-query in the SELECT clause.
Instead, you can try to join the tables:
SELECT
l.gl_date,
l.REST_OF_TABLES,
t.task_id
FROM
aaa.jeg_labor_history l
Join bbb.jeg_pa_projects_all p
On (l.project_id = p.project_id)
Left join (SELECT
tt.project_id,
tt.task_number,
MAX(tt.task_id) task_id
FROM
bbb.jeg_pa_tasks tt
Group by tt.project_id, tt.task_number) t
On (l.project_id = t.project_id
AND l.task_number = t.task_number)
WHERE
p.org_id = 2165
AND p.project_status_code = '1000';
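To see that the scalar-subquery shape and the derived-table LEFT JOIN shape return the same rows, here is a minimal sketch with Python's built-in sqlite3 module, using invented miniature versions of the two tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Invented miniature versions of the tables in the question.
cur.execute("CREATE TABLE jeg_labor_history (gl_date TEXT, project_id INTEGER, task_number INTEGER)")
cur.execute("CREATE TABLE jeg_pa_tasks (task_id INTEGER, project_id INTEGER, task_number INTEGER)")
cur.executemany("INSERT INTO jeg_labor_history VALUES (?, ?, ?)",
                [("2024-01-01", 1, 10), ("2024-01-02", 1, 11), ("2024-01-03", 2, 10)])
cur.executemany("INSERT INTO jeg_pa_tasks VALUES (?, ?, ?)",
                [(100, 1, 10), (101, 1, 10), (200, 1, 11), (300, 2, 10)])

# Original shape: a scalar subquery evaluated per output row.
scalar = cur.execute("""
    SELECT l.gl_date,
           (SELECT MAX(tt.task_id) FROM jeg_pa_tasks tt
             WHERE tt.project_id = l.project_id AND tt.task_number = l.task_number) AS task_id
    FROM jeg_labor_history l ORDER BY l.gl_date
""").fetchall()

# Rewritten shape: aggregate once in a derived table, then LEFT JOIN.
joined = cur.execute("""
    SELECT l.gl_date, t.task_id
    FROM jeg_labor_history l
    LEFT JOIN (SELECT project_id, task_number, MAX(task_id) AS task_id
               FROM jeg_pa_tasks GROUP BY project_id, task_number) t
      ON t.project_id = l.project_id AND t.task_number = l.task_number
    ORDER BY l.gl_date
""").fetchall()

print(scalar == joined)  # both shapes return the same rows
```

The derived table computes each (project_id, task_number) maximum once, instead of re-running the aggregate for every row of the big history table.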
Cheers!!
I don't know exactly how many rows this query returns or how many rows the table/view holds, but I can offer a few simple tips that may help query performance:
Check Indexes. There should be indexes on all fields used in the WHERE and JOIN portions of the SQL statement.
Limit the size of your working data set.
Only select columns you need.
Remove unnecessary tables.
Remove calculated columns in JOIN and WHERE clauses.
Use inner join, instead of outer join if possible.
Your view contains a lot of data, so you can also break it down and select only the information you need from it.

Small vs Large and Large vs Small sql joins [duplicate]

I was just tidying up some sql when I came across this query:
SELECT
jm.IMEI ,
jm.MaxSpeedKM ,
jm.MaxAccel ,
jm.MaxDeccel ,
jm.JourneyMaxLeft ,
jm.JourneyMaxRight ,
jm.DistanceKM ,
jm.IdleTimeSeconds ,
jm.WebUserJourneyId ,
jm.lifetime_odo_metres ,
jm.[Descriptor]
FROM dbo.Reporting_WebUsers AS wu WITH (NOLOCK)
INNER JOIN dbo.Reporting_JourneyMaster90 AS jm WITH (NOLOCK) ON wu.WebUsersId = jm.WebUsersId
INNER JOIN dbo.Reporting_Journeys AS j WITH (NOLOCK) ON jm.WebUserJourneyId = j.WebUserJourneyId
WHERE ( wu.isActive = 1 )
AND ( j.JourneyDuration > 2 )
AND ( j.JourneyDuration < 1000 )
AND ( j.JourneyDistance > 0 )
My question is: does the order of the joins make any performance difference? For the above query I would have started with
FROM dbo.Reporting_JourneyMaster90 AS jm
and then joined the other two tables to that one.
Join order on SQL Server 2008 R2 does unquestionably affect query performance, particularly in queries with a large number of table joins and WHERE clauses applied against multiple tables.
Although the join order is changed during optimisation, the optimiser doesn't try all possible join orders. It stops when it finds what it considers a workable solution, as the very act of optimisation uses precious resources.
We have seen queries that were performing like dogs (1 min+ execution time) come down to sub-second performance just by changing the order of the join expressions. Note, however, that these were queries with 12 to 20 joins and WHERE clauses on several of the tables.
The trick is to set your order to help the query optimiser figure out what makes sense. You can use FORCE ORDER, but that can be too rigid. Try to make sure that your join order starts with the tables whose WHERE clauses reduce the data the most.
No, the JOIN by order is changed during optimization.
The only caveat is the Option FORCE ORDER which will force joins to happen in the exact order you have them specified.
I have a clear example of an inner join affecting performance. It is a simple join between two tables: one has 50+ million records, the other 2,000. If I select from the smaller table and join the larger, it takes 5+ minutes.
If I select from the larger table and join the smaller, it takes 2 minutes 30 seconds.
This is with SQL Server 2012.
To me this is counter intuitive since I am using the largest dataset for the initial query.
Usually not. I'm not 100% sure this applies verbatim to SQL Server, but in Postgres the query planner reserves the right to reorder the inner joins as it sees fit. The exception is when you reach a threshold beyond which it is too expensive to investigate changing their order.
JOIN order doesn't matter; the query engine will reorganize the joins based on index statistics and other factors.
To test this, do the following:
select "Include Actual Execution Plan" and run the first query
change the JOIN order and run the query again
compare the execution plans
They should be identical, as the query engine reorganizes the joins according to those other factors.
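The result side of this test can be sketched with Python's built-in sqlite3 module (SQLite's planner also reorders inner joins); the table names and data here are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, small_id INTEGER)")
cur.execute("CREATE TABLE small (id INTEGER PRIMARY KEY, label TEXT)")
cur.executemany("INSERT INTO small VALUES (?, ?)", [(i, f"s{i}") for i in range(10)])
cur.executemany("INSERT INTO big VALUES (?, ?)", [(i, i % 10) for i in range(1000)])

# The same inner join written in both orders.
q1 = "SELECT big.id, small.label FROM big JOIN small ON small.id = big.small_id ORDER BY big.id"
q2 = "SELECT big.id, small.label FROM small JOIN big ON small.id = big.small_id ORDER BY big.id"

r1 = cur.execute(q1).fetchall()
r2 = cur.execute(q2).fetchall()
print(r1 == r2)  # identical results; only the chosen plan may differ
```

Whatever order the planner picks internally, an inner join is commutative, so the two spellings must return the same rows; only the execution plan, and therefore the speed, can differ.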
As commented on the other answer, you could use OPTION (FORCE ORDER) to force exactly the order you want, but it might not be the most efficient one.
As a general rule of thumb, the JOIN order should put the table with the fewest records first and the one with the most records last, as on some DBMS engines the order can make a difference, particularly when FORCE ORDER is used.
Wrong. On SQL Server 2005 it definitely matters, since you are limiting the dataset from the beginning of the FROM clause. If you start with 2,000 records instead of 2 million, the query is faster.


What are the differences between these?

What are the differences between the two queries?
SELECT CountryMaster.Id
FROM Districts INNER JOIN
CountryMaster ON Districts.CountryId = CountryMaster.Id
SELECT CountryMaster.Id
FROM CountryMaster INNER JOIN
Districts ON Districts.CountryId = CountryMaster.Id
I know the output will be the same, but I want to know whether there are any drastic effects if I neglect the positions of tables and columns in complex queries, or in tables holding tons of data, like hundreds of thousands of rows.
No difference whatsoever. The order of the joins is irrelevant. The query optimizer inside the database engine will decide on a merge plan to actually process the records from the two tables based on the stored statistics for the data in those tables.
In fact, in many cases the query optimizer will generate exactly the same plan for a query phrased using joins as for one phrased with a correlated sub-query.
The lesson I have learned here is:
Always start with the syntax, or representation, that most clearly expresses the meaning of the process you are trying to create, and trust the query optimizer to do its job. That said, the query optimizer is not perfect, so if there is a performance issue, compare the query plans of alternate constructions and see if one improves...
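The claim that a join and a correlated sub-query can express the same question can at least be checked for result equivalence. A minimal sketch with Python's built-in sqlite3 module; the data is invented, and the DISTINCT/EXISTS forms are adjusted so both shapes ask exactly the same question ("which countries have at least one district?"):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE CountryMaster (Id INTEGER PRIMARY KEY, Name TEXT)")
cur.execute("CREATE TABLE Districts (Id INTEGER PRIMARY KEY, CountryId INTEGER)")
cur.executemany("INSERT INTO CountryMaster VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
cur.executemany("INSERT INTO Districts VALUES (?, ?)", [(10, 1), (11, 1), (12, 3)])

# The same question asked two ways.
join_rows = cur.execute(
    "SELECT DISTINCT CountryMaster.Id FROM Districts "
    "INNER JOIN CountryMaster ON Districts.CountryId = CountryMaster.Id ORDER BY 1").fetchall()
exists_rows = cur.execute(
    "SELECT Id FROM CountryMaster WHERE EXISTS "
    "(SELECT 1 FROM Districts WHERE Districts.CountryId = CountryMaster.Id) ORDER BY 1").fetchall()

print(join_rows == exists_rows)  # same rows either way
```

A good optimizer is free to compile both spellings to the same plan; when it doesn't, comparing the two plans is exactly the tuning exercise the answer above recommends.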
One quick comment on the performance of inner vs. outer joins: it is simply not true that inner joins are intrinsically faster than outer joins. The relative performance depends entirely on which of the three join processing strategies the query engine uses: a nested loop join, a merge join, or a hash join.
The Nested Loop join, for example, is used when the set of records on one side of the join is very much smaller than on the other side, and the larger set is indexed on the join column[s]. In this case, if the smaller set is the "outer" side, then an outer join will be faster. The reason is that the nested loop join takes the entire set of records from that smaller set, and iterates through each one, finding the records from the larger set that match. An inner join has to perform a second step of removing rows from the smaller side when no matches were found in the larger set. The outer join does not do this second step.
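The nested-loop strategy described above can be sketched in plain Python; the rows and the pre-built index are invented for illustration:

```python
# A left outer nested-loop join: iterate the smaller ("outer") set and
# probe an index on the larger set for matches.
small = [(1, "alice"), (2, "bob"), (3, "carol")]   # outer side
large_index = {1: ["x", "y"], 3: ["z"]}            # larger side, keyed on the join column

def nested_loop_left_join(outer, index):
    out = []
    for key, name in outer:
        matches = index.get(key)
        if matches:
            out.extend((key, name, m) for m in matches)
        else:
            out.append((key, name, None))  # the outer join keeps unmatched rows
    return out

def nested_loop_inner_join(outer, index):
    # The inner join performs the extra step of dropping unmatched outer rows.
    return [row for row in nested_loop_left_join(outer, index) if row[2] is not None]

print(nested_loop_left_join(small, large_index))
print(nested_loop_inner_join(small, large_index))
```

This makes the answer's point concrete: with this plan shape, the outer join is the cheaper operation, because the inner join does everything the outer join does plus the filtering step.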
Each of the three possible join processes has its own characteristic behavior patterns... See Nested Loop Joins, Merge Joins and Hash Joins for the details.
As written they are identical. Excellent answer from Charles.
If you want to know if they will have different execution plans then simply display the execution plan in SSMS.
As for speed have the columns used in the join indexed.
Maintain the indexes - a fragmented index is not nearly as effective.
The query plan will not always be the same.
The query optimizer keeps statistics and as the profile of the data changes the optimal plan may change.
Thousands of rows is not a lot.
Once you get into millions then tune indexes and syntax (with hints).
Some times you have to get into millions before you have enough data to tune.
Sometimes a UNION of simpler queries is equivalent and faster.
The LOOP join hint is not symmetric, so with that hint the query plan differs for the following two queries, even though they return the same results.
If one table is a PK table, I always put it first.
In this case the first query is twice as fast as the second.
select top 10 docSVsys.sID, docMVtext.fieldID
from docSVsys
inner loop join docMVtext
on docMVtext.sID = docSVsys.sID
where docSVsys.sID < 100
order by docSVsys.sID, docMVtext.fieldID
select top 10 docSVsys.sID, docMVtext.fieldID
from docMVtext
inner loop join docSVsys
on docMVtext.sID = docSVsys.sID
where docSVsys.sID < 100
order by docSVsys.sID, docMVtext.fieldID
Advanced Query Tuning Concepts

SQL JOIN and Performance

I have a question about JOINs.
Do SQL JOINs reduce performance in a query?
I have a query with many JOINs in it. Can I say that the bad performance comes from these JOINs? If so, what should I use instead of JOINs?
Here is a piece of my query:
......
FROM (((((tb_Pinnummern
INNER JOIN tb_Fahrzeug ON tb_Pinnummern.SG = tb_Fahrzeug.Motor_SG)
INNER JOIN tb_bauteile ON tb_Pinnummern.Bauteil = tb_bauteile.ID)
LEFT JOIN Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Plus ON Fehlercodes_akt_Liste_FC_Plus.ID = tb_bauteile.[FC_Plus])
LEFT JOIN Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Minus ON Fehlercodes_akt_Liste_FC_Minus.ID = tb_bauteile.[FC_Minus])
LEFT JOIN Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Unterbrechung ON Fehlercodes_akt_Liste_FC_Unterbrechung.ID = tb_bauteile.[FC_Unterbrechung])
LEFT JOIN Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Aderschl ON Fehlercodes_akt_Liste_FC_Aderschl.ID = tb_bauteile.[FC_Aderschl]
WHERE (((tb_Fahrzeug.ID)=[forms]![frm_fahrzeug]![id]));
Yes, it does: increasing the number of records and the number of joins among tables will increase execution time. A LEFT/RIGHT JOIN is absolutely not faster than an INNER JOIN. Indexing the right columns of the tables will improve query performance.
If you have too many joins in your query and its execution frequency is high, take an alternative: create a SQL VIEW or a materialized view (the latter if you are using Oracle).
Well, joins obviously need to be processed, and this processing consumes CPU, memory, and IO.
We also have to consider that joins can perform really, really badly if the right indexes etc. are not in place.
However, an SQL join with the correct supporting indexes will produce the result you require faster than any other method.
Just consider what you would need to do to calculate the same result as your SQL above: read the first table, sort it into the correct order, then read the second table and sort it, then merge the two result sets before proceeding to the third table...
Or read all the rows from the first table and, for each row, issue SQL to retrieve the matching rows.
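The "read, sort, then merge" procedure described above is a sort-merge join, and it can be sketched in plain Python with invented data:

```python
def merge_join(left, right):
    """Inner join of two lists of (key, value) pairs via sort-merge."""
    left = sorted(left)
    right = sorted(right)
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit the cross product of the run of matching keys on the right.
            j2 = j
            while j2 < len(right) and right[j2][0] == lk:
                out.append((lk, left[i][1], right[j2][1]))
                j2 += 1
            i += 1
    return out

rows = merge_join([(2, "b"), (1, "a")], [(1, "x"), (2, "y"), (1, "z")])
print(rows)
```

This is exactly the work the database does for you inside a single JOIN, which is why hand-rolling it (or issuing one query per row) is almost never faster than letting the engine join indexed tables.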
Joins can degrade the performance of the SQL query you are executing.
You should generate the execution plan for the SQL you have written and look at ways to reduce its cost. Any query-analysis tool should help you with that.
From what I understand of the query you have given above, you are fetching all rows from the tables in the inner joins and specific columns (where present) from the tables in the left joins.
That being the case, a query written in the format given below should help:
select (select Fehlercodes_akt_Liste_FC_Plus.column1 from Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Plus where Fehlercodes_akt_Liste_FC_Plus.ID=tb_bauteile.[FC_Plus]),
(select Fehlercodes_akt_Liste_FC_Minus.column2 from Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Minus where Fehlercodes_akt_Liste_FC_Minus.ID=tb_bauteile.[FC_Minus]),
(select Fehlercodes_akt_Liste_FC_Unterbrechung.column3 from Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Unterbrechung where Fehlercodes_akt_Liste_FC_Unterbrechung.ID=tb_bauteile.[FC_Unterbrechung]),
(select Fehlercodes_akt_Liste_FC_Aderschl.column4 from Fehlercodes_akt_Liste AS Fehlercodes_akt_Liste_FC_Aderschl where Fehlercodes_akt_Liste_FC_Aderschl.ID=tb_bauteile.[FC_Aderschl]),
<other columns>
FROM
((tb_Pinnummern INNER JOIN tb_Fahrzeug ON tb_Pinnummern.SG = tb_Fahrzeug.Motor_SG)
INNER JOIN tb_bauteile ON tb_Pinnummern.Bauteil = tb_bauteile.ID) as <aliastablename>
WHERE <aliastablename>.ID=[forms]![frm_fahrzeug]![id];
SQL joins do not reduce performance at all; on the contrary, they very often speed a query up dramatically, assuming of course that the underlying database model is well implemented. Indexes are very important in this matter.