How can I improve this SQL query's performance?

SQL statement:
SELECT COUNT(1)
FROM ACT_CARD_BANK
WHERE CARD_NO IN (SELECT CARD_NO
                  FROM XSHTEST.XSH_CARD_BANK
                  WHERE BIN_NO IN ('731018', '731023', '731024', '731025', '731026', '731027')
                    AND STATUS = '06')
  AND STATUS = '04';
(The execution plan and the definitions of Index 01 and Index 02 were posted as screenshots, not reproduced here.)
Rows of the table:
ACT_CARD_BANK 399187646
XSH_CARD_BANK 228751942
The statistics were re-collected yesterday with the following script:
exec dbms_stats.gather_table_stats(ownname => '$owner',tabname => 'XSH_CARD_BANK',estimate_percent => 0.1,method_opt=> 'for all indexed columns');
Is there anything else I can optimize? Thank you.

Make sure the query is using appropriate indexes.
Use table aliases to make the query easier to read and understand.
Use a JOIN instead of a subquery.
Use bulk operations to reduce the number of round trips to the database.
Use appropriate query hints to force the optimizer to use a more efficient query plan.
Use table partitioning to reduce the amount of data that needs to be scanned.
Use stored procedures to reduce the amount of code that needs to be parsed and optimized.
SELECT COUNT(1)
FROM ACT_CARD_BANK a
INNER JOIN XSHTEST.XSH_CARD_BANK b
    ON a.CARD_NO = b.CARD_NO
WHERE b.BIN_NO IN ('731018', '731023', '731024', '731025', '731026', '731027')
  AND b.STATUS = '06'
  AND a.STATUS = '04';
Note that Oracle does not accept the AS keyword for table aliases, and that if CARD_NO is not unique in XSH_CARD_BANK the join can count ACT_CARD_BANK rows more than once; keep the IN (or an EXISTS) form if you need the original semantics.

Related

SQL select Query performance is very slow with joins

Please help me correct the query below to improve back-end performance. I am using an Oracle database, and query execution is very slow:
SELECT
A.USER_PROFILE_ID,
B.LAST_NAME||','||B.FIRST_NAME||' - '||B.USER_PROFILE_ID AS EXPR1,
A.DEPARTMENT_CODE_ID,
C.NAME AS EXPR2,
A.EFFECTIVE_DATE,
A.EFFECTIVE_STATUS,
A.INSRT_USER,
A.INSRT_TS,
A.MOD_USER,
A.MOD_TS
FROM
"USER_PROFILE_DEPARTMENT" A,
"USER_PROFILE" B, "DEPARTMENT_CODE" C
WHERE
A.USER_PROFILE_ID = B.USER_PROFILE_ID
AND A.DEPARTMENT_CODE_ID = C.DEPARTMENT_CODE_ID
ORDER BY
EXPR1
I couldn't find anything to fix; please help.
Avoid using the WHERE clause to join tables. Instead, use the JOIN
keyword to explicitly specify the relationship between the tables being joined.
For example, instead of writing:
SELECT *
FROM "USER_PROFILE_DEPARTMENT" A, "USER_PROFILE" B, "DEPARTMENT_CODE" C
WHERE A.USER_PROFILE_ID = B.USER_PROFILE_ID AND A.DEPARTMENT_CODE_ID = C.DEPARTMENT_CODE_ID
You can use:
SELECT *
FROM "USER_PROFILE_DEPARTMENT" A
INNER JOIN "USER_PROFILE" B ON A.USER_PROFILE_ID = B.USER_PROFILE_ID
INNER JOIN "DEPARTMENT_CODE" C ON A.DEPARTMENT_CODE_ID = C.DEPARTMENT_CODE_ID
Some tips when it comes to performance in SQL:
Use appropriate indexes on the columns used in the JOIN and WHERE
clauses to improve the performance of the query. This can speed up
the process of finding matching rows in the tables being joined.
Consider using sub-queries or derived tables to retrieve the data you need, rather than joining multiple large tables. This can improve performance by reducing the amount of data that needs to be scanned and processed.
Use the LIMIT keyword (FETCH FIRST n ROWS ONLY in Oracle 12c+, or ROWNUM in earlier versions) to limit the number of rows returned by the query, if you only need a small subset of the data. This reduces the amount of data that has to be processed and returned, which can improve performance.
If the query still runs slowly after these changes, use EXPLAIN PLAN (Oracle) or EXPLAIN / EXPLAIN ANALYZE (other databases) to understand where the bottlenecks are and how to optimize the query further.
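On Oracle (which the asker is using), that EXPLAIN step looks like the sketch below; DBMS_XPLAN.DISPLAY reads the captured plan back out of the plan table. The query shown is a trimmed version of the one above:

```sql
EXPLAIN PLAN FOR
SELECT A.USER_PROFILE_ID, B.LAST_NAME, C.NAME
FROM "USER_PROFILE_DEPARTMENT" A
INNER JOIN "USER_PROFILE" B ON A.USER_PROFILE_ID = B.USER_PROFILE_ID
INNER JOIN "DEPARTMENT_CODE" C ON A.DEPARTMENT_CODE_ID = C.DEPARTMENT_CODE_ID;

-- Display the plan that was just captured.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Look for full table scans on the large tables, and for join methods (nested loops vs. hash join) that don't match the row counts involved.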

Tuning Oracle Query for slow select

I'm working on an Oracle query that does a select on a huge table; however, the joins with other tables seem to cost a lot of processing time.
I'm looking for tips on how to improve the working of this query.
I'm attaching a version of the query and the explain plan of it.
Query
SELECT
l.gl_date,
l.REST_OF_TABLES,
(
SELECT
MAX(tt.task_id)
FROM
bbb.jeg_pa_tasks tt
WHERE
l.project_id = tt.project_id
AND l.task_number = tt.task_number
) task_id
FROM
aaa.jeg_labor_history l,
bbb.jeg_pa_projects_all p
WHERE
p.org_id = 2165
AND l.project_id = p.project_id
AND p.project_status_code = '1000'
Some things to mention:
This query sends data from Oracle to a SQL Server database, so it needs to be this big; I can't narrow the scope of the query.
The purpose is to set it up as a SQL Server job with SSIS so it runs periodically.
One obvious suggestion is not to use a subquery in the SELECT clause.
Instead, you can try joining the tables:
SELECT
    l.gl_date,
    l.REST_OF_TABLES,
    t.task_id
FROM
    aaa.jeg_labor_history l
    JOIN bbb.jeg_pa_projects_all p
        ON l.project_id = p.project_id
    LEFT JOIN (SELECT
                   tt.project_id,
                   tt.task_number,
                   MAX(tt.task_id) task_id
               FROM
                   bbb.jeg_pa_tasks tt
               GROUP BY tt.project_id, tt.task_number) t
        ON l.project_id = t.project_id
       AND l.task_number = t.task_number
WHERE
    p.org_id = 2165
    AND p.project_status_code = '1000';
Cheers!!
Since I don't know exactly how many rows this query returns or how large the underlying tables/views are, I can only offer a few simple tips that might help you get better query performance:
Check Indexes. There should be indexes on all fields used in the WHERE and JOIN portions of the SQL statement.
Limit the size of your working data set.
Only select columns you need.
Remove unnecessary tables.
Remove calculated columns in JOIN and WHERE clauses.
Use inner join, instead of outer join if possible.
Your view contains a lot of data, so you can also break the query down and pull only the information you need from the view.
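To make the derived-table rewrite above cheap, a covering index on the grouped columns is the usual first step. A sketch; the index names and column orders are assumptions, not taken from the original schema:

```sql
-- Assumed: lets MAX(task_id) per (project_id, task_number) be answered from the index alone.
CREATE INDEX JEG_PA_TASKS_PROJ_TASK_IX
    ON bbb.jeg_pa_tasks (project_id, task_number, task_id);

-- Assumed: supports the org_id / project_status_code filter and the join back on project_id.
CREATE INDEX JEG_PA_PROJECTS_ORG_STATUS_IX
    ON bbb.jeg_pa_projects_all (org_id, project_status_code, project_id);
```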

SQL Query Performance Issues Using Subquery

I am having issues with my query's run time. I want the query to automatically pull the max id for a column, because the table is indexed on that column. If I punch the number in manually, it runs in seconds, but I want the query to be more dynamic if possible.
I've tried placing the subquery in different places with no luck.
SELECT *
FROM TABLE A
JOIN TABLE B
ON A.SLD_MENU_ITM_ID = B.SLD_MENU_ITM_ID
AND B.ACTV_FLG = 1
WHERE A.WK_END_THU_ID_NU >= (SELECT DISTINCT MAX (WK_END_THU_ID_NU) FROM TABLE A)
AND A.WK_END_THU_END_YR_NU = YEAR(GETDATE())
AND A.LGCY_NATL_STR_NU IN (7731)
AND B.SLD_MENU_ITM_ID = 4314
I just want this to run faster. Maybe there is a different approach I should be taking?
I would move the subquery to the FROM clause and change the WHERE clause to only refer to A:
SELECT *
FROM A JOIN
     (SELECT MAX(WK_END_THU_ID_NU) as max_wet
      FROM A
     ) am
     ON A.WK_END_THU_ID_NU = am.max_wet JOIN
     B
     ON A.SLD_MENU_ITM_ID = B.SLD_MENU_ITM_ID AND
        B.ACTV_FLG = 1
WHERE A.WK_END_THU_END_YR_NU = YEAR(GETDATE()) AND
      A.LGCY_NATL_STR_NU IN (7731) AND
      A.SLD_MENU_ITM_ID = 4314; -- is the same as B
Then you want indexes. I'm pretty sure you want indexes on:
A(SLD_MENU_ITM_ID, WK_END_THU_END_YR_NU, LGCY_NATL_STR_NU, WK_END_THU_ID_NU)
B(SLD_MENU_ITM_ID, ACTV_FLG)
I will note that moving the subquery to the FROM clause probably does not affect performance, because SQL Server is smart enough to only execute it once. However, I prefer table references in the FROM clause when reasonable. I don't think a window function would actually help in this case.

Calculate year to date duplicate

I am trying to calculate a year-to-date total.
I tried this script; it returns the right values, but it's too slow:
select
T.Delivery_month,
T.[Delivery_Year],
Sales_Organization,
SUM(QTY) as Month_Total,
COALESCE(
(
select SUM(s2.QTY)
FROM stg.Fact_DC_AGG s2
where
s2.Sales_Organization = T.Sales_Organization
and s2.[Delivery_Year]=T.[Delivery_Year]
AND s2.Delivery_month<= T.Delivery_month
),0) as YTD_Total
from stg.Fact_DC_AGG T
group by
T.Delivery_month,
T.[Delivery_Year],
Sales_Organization
ORDER BY
Sales_Organization,T.[Delivery_Year],
T.Delivery_month
I modified it in order to optimize it, but it returns wrong values with duplicates:
select
T.Delivery_month,
T.[Delivery_Year],
Sales_Organization,
SUM(QTY) as Month_Total,
COALESCE(SUM(s2.QTY), 0) as YTD_Total
from stg.Fact_DC_AGG T
INNER JOIN stg.Fact_DC_AGG s2
ON
s2.Sales_Organization = T.Sales_Organization
and s2.[Delivery_Year]=T.[Delivery_Year]
AND s2.Delivery_month<= T.Delivery_month
group by
T.Delivery_month,
T.[Delivery_Year],
Sales_Organization
ORDER BY
Sales_Organization,T.[Delivery_Year],
T.Delivery_month
How can I optimize the first query, or correct the second script?
To optimize the performance of your query, first remove the sub-query and use a join instead. You also need to generate an actual execution plan for your query and identify any required missing indexes.
NOTE: Missing indexes can drag down your SQL Server performance, so be sure to review your actual query execution plans and identify the right indexes.
For more information, please visit the following links:
1) SQL Server Basic Performance Tuning Tips and Tricks
2) Create Missing Index From the Actual Execution Plan
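Aside from the join and index work, on SQL Server 2012 and later the year-to-date total can be computed with a window function over the grouped result, which avoids both the correlated sub-query and the duplicating join. A sketch, assuming the same stg.Fact_DC_AGG columns as in the question:

```sql
SELECT
    Delivery_month,
    [Delivery_Year],
    Sales_Organization,
    SUM(QTY) as Month_Total,
    -- Running total of the monthly sums within each organization and year.
    SUM(SUM(QTY)) OVER (
        PARTITION BY Sales_Organization, [Delivery_Year]
        ORDER BY Delivery_month
        ROWS UNBOUNDED PRECEDING
    ) as YTD_Total
FROM stg.Fact_DC_AGG
GROUP BY Delivery_month, [Delivery_Year], Sales_Organization
ORDER BY Sales_Organization, [Delivery_Year], Delivery_month;
```

Because the table is read once and the running sum is computed in a single pass over the grouped rows, this typically scales far better than re-aggregating the table for every group.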

Can this SQL Query be optimized to run faster?

I have an SQL Query (For SQL Server 2008 R2) that takes a very long time to complete. I was wondering if there was a better way of doing it?
SELECT @count = COUNT(Name)
FROM Table1 t
WHERE t.Name = @name AND t.Code NOT IN (SELECT Code FROM ExcludedCodes)
Table1 has around 90Million rows in it and is indexed by Name and Code.
ExcludedCodes only has around 30 rows in it.
This query is in a stored procedure and gets called around 40k times; the total time it takes the procedure to finish is 27 minutes. I believe this is my biggest bottleneck because of the massive number of rows it queries against and the number of times it does so.
So if you know of a good way to optimize this it would be greatly appreciated! If it cannot be optimized, then I guess I'm stuck with 27 min...
EDIT
I changed the NOT IN to NOT EXISTS and it cut the time down to 10:59, so that alone is a massive gain on my part. I am still going to attempt to do the group by statement as suggested below, but that will require a complete rewrite of the stored procedure and might take some time... (as I said before, I'm not the best at SQL but it is starting to grow on me. ^^)
In addition to workarounds to get the query itself to respond faster, have you considered maintaining a column in the table that tells whether it is in this set or not? It requires a lot of maintenance but if the ExcludedCodes table does not change often, it might be better to do that maintenance. For example you could add a BIT column:
ALTER TABLE dbo.Table1 ADD IsExcluded BIT;
Make it NOT NULL and default to 0. Then you could create a filtered index:
CREATE INDEX n ON dbo.Table1(name)
WHERE IsExcluded = 0;
Now you just have to update the table once:
UPDATE t
SET IsExcluded = 1
FROM dbo.Table1 AS t
INNER JOIN dbo.ExcludedCodes AS x
ON t.Code = x.Code;
And ongoing you'd have to maintain this with triggers on both tables. With this in place, your query becomes:
SELECT @Count = COUNT(Name)
FROM dbo.Table1 WHERE Name = @name AND IsExcluded = 0;
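A minimal sketch of one of those maintenance triggers, for inserts into ExcludedCodes (the trigger name is an assumption, and Table1 would need matching insert/update triggers of its own):

```sql
CREATE TRIGGER trExcludedCodes_Insert
ON dbo.ExcludedCodes
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- Flag any Table1 rows whose Code was just added to the excluded set.
    UPDATE t
    SET IsExcluded = 1
    FROM dbo.Table1 AS t
    INNER JOIN inserted AS i
        ON t.Code = i.Code;
END;
```

Deletes from ExcludedCodes would need the mirror-image trigger that sets IsExcluded back to 0 using the deleted pseudo-table.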
EDIT
As for "NOT IN being slower than LEFT JOIN," here is a simple test I performed on only a few thousand rows; the results were posted as a screenshot and are not reproduced here.
EDIT 2
I'm not sure why this query wouldn't do what you're after, and be far more efficient than your 40K loop:
SELECT src.Name, COUNT(*)
FROM dbo.Table1 AS src
INNER JOIN #temptable AS t
ON src.Name = t.Name
WHERE src.Code NOT IN (SELECT Code FROM dbo.ExcludedCodes)
GROUP BY src.Name;
Or the LEFT JOIN equivalent:
SELECT src.Name, COUNT(*)
FROM dbo.Table1 AS src
INNER JOIN #temptable AS t
ON src.Name = t.Name
LEFT OUTER JOIN dbo.ExcludedCodes AS x
ON src.Code = x.Code
WHERE x.Code IS NULL
GROUP BY src.Name;
I would put money on either of those queries taking less than 27 minutes. I would even suggest that running both queries sequentially will be far faster than your one query that takes 27 minutes.
Finally, you might consider an indexed view. I don't know your table structure and whether your violate any of the restrictions but it is worth investigating IMHO.
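A minimal indexed-view sketch for the count-per-name shape (the view name is an assumption; WITH SCHEMABINDING and COUNT_BIG(*) are hard requirements for indexed views, and constructs like NOT IN or outer joins are not allowed inside one, so the exclusion filter would have to live elsewhere, e.g. in the IsExcluded column suggested above):

```sql
-- The view must be schema-bound and use COUNT_BIG(*) to be indexable.
CREATE VIEW dbo.vNameCounts
WITH SCHEMABINDING
AS
SELECT Name, COUNT_BIG(*) AS NameCount
FROM dbo.Table1
GROUP BY Name;
GO

-- The unique clustered index is what actually materializes the view.
CREATE UNIQUE CLUSTERED INDEX IX_vNameCounts ON dbo.vNameCounts(Name);
```

After that, a per-name count is a single-row seek instead of an aggregation over 90 million rows.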
You say this gets called around 40K times. Why? Is it in a cursor? If so, do you really need a cursor? Couldn't you put the values you want for @name in a temp table, index it, and then join to it?
select t.name, count(t.name)
from table t
join #name n on t.name = n.name
where NOT EXISTS (SELECT Code FROM ExcludedCodes WHERE Code = t.code)
group by t.name
That might get you all your results in one query, and it is almost certainly faster than 40K separate queries. Of course, if you need the count of all the names, it's even simpler:
select t.name, count(t.name)
from table t
where NOT EXISTS (SELECT Code FROM ExcludedCodes WHERE Code = t.code)
group by t.name
NOT EXISTS typically performs better than NOT IN, but you should test it on your system.
SELECT @count = COUNT(Name)
FROM Table1 t
WHERE t.Name = @name AND NOT EXISTS (SELECT 1 FROM ExcludedCodes e WHERE e.Code = t.Code)
Without knowing more about your query it's tough to supply concrete optimization suggestions (i.e. code suitable for copy/paste). Does it really need to run 40,000 times? Sounds like your stored procedure needs reworking, if that's feasible. You could exec the above once at the start of the proc and insert the results in a temp table, which can keep the indexes from Table1, and then join on that instead of running this query.
This particular bit might not even be the bottleneck that makes your query run 27 minutes. For example, are you using a cursor over those 90 million rows, or scalar valued UDFs in your WHERE clauses?
Have you thought about doing the query once and populating the data in a table variable or temp table? Something like
insert into #temp (name, Namecount)
select name, count(name)
from table1
where name not in (select code from excludedcodes)
group by name
And don't forget that you could possibly use a filtered index as long as the excluded codes table is somewhat static.
Start evaluating the execution plan. Which is the heaviest part to compute?
Regarding the relation between the two tables, use a JOIN on indexed columns: indexes will optimize query execution.
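On SQL Server (which this question is about), a quick way to start that evaluation is to turn on per-query I/O and timing statistics before running the statement, as sketched below against the tables from the question; the literal name is a placeholder for the @name parameter:

```sql
-- Print logical reads and CPU/elapsed time for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT COUNT(Name)
FROM Table1 t
WHERE t.Name = 'example name'  -- placeholder for @name
  AND NOT EXISTS (SELECT 1 FROM ExcludedCodes e WHERE e.Code = t.Code);

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The table with the highest logical-read count in the output is usually the heaviest part of the plan, and the first place to look for a missing index.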