I'm running a job and it's taking too long to complete.
I have created a job to update a table based on the values of multiple tables:
UPDATE applicant_scores
SET applicant_scores.Age=2.5
where applicant_scores.Applicant_id in
(select applicantinfo.subebno from applicantinfo
WHERE SUBSTR(applicantinfo.DOB,7,4) ='1985')
This should update a column in about 17,000 rows, but it is taking far too long.
I would recommend using exists and an index:
UPDATE applicant_scores
SET applicant_scores.Age = 2.5
WHERE EXISTS (SELECT 1
FROM applicantinfo ai
WHERE applicant_scores.Applicant_id = ai.subebno AND
SUBSTR(ai.DOB, 7, 4) ='1985'
);
For performance, you want an index on applicantinfo(subebno, DOB).
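As a sketch (the index name is just a placeholder), that index could be created with:
CREATE INDEX idx_applicantinfo_subebno_dob ON applicantinfo (subebno, DOB);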
Note: DOB probably means "date of birth". It should be stored as a date in your database and you should be using proper date functions, such as:
extract(year from dob) = 1985
year(dob) = 1985
dob >= '1985-01-01' and dob < '1986-01-01'
Do not store dates as strings. Do not use string functions on dates.
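As a hedged sketch only, assuming MySQL and a DD/MM/YYYY string layout (consistent with the year starting at position 7 in the SUBSTR() call above; adjust the format if the month comes first), the conversion to a real date column could look like this:
ALTER TABLE applicantinfo ADD dob_date DATE;

-- '%d/%m/%Y' is an assumption about how the strings are currently laid out
UPDATE applicantinfo
SET dob_date = STR_TO_DATE(DOB, '%d/%m/%Y');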
Your problem is this:
WHERE SUBSTR(applicantinfo.DOB,7,4) ='1985'
The database doesn't have a way to quickly find the rows that match that criterion. There's no index, so it has to check every row in the table to find the ones that match that expression.
One solution is to add another column to your table, maybe called dob_year, that holds just the year from that date. Then you CREATE INDEX applicantinfo_dob_year ON applicantinfo(dob_year) and change your WHERE clause to WHERE dob_year = '1985'.
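A rough sketch of that approach (the extra column and index are assumptions; the SUBSTR() positions are taken from your existing query):
ALTER TABLE applicantinfo ADD dob_year CHAR(4);

-- populate the new column once from the existing string column
UPDATE applicantinfo
SET dob_year = SUBSTR(DOB, 7, 4);

CREATE INDEX applicantinfo_dob_year ON applicantinfo (dob_year);

UPDATE applicant_scores
SET Age = 2.5
WHERE Applicant_id IN
  (SELECT subebno FROM applicantinfo WHERE dob_year = '1985');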
https://use-the-index-luke.com/ is a great site for learning about database indexes and how to use them properly to make your queries fast.
I have a database with a table which I use as master and which is being updated and extended on a daily basis by a table with the same layout. Before I update almost the whole master with daily data, I want to test if the values from a specific column changed during the daily update. Usually this column only contains Null or an "X".
As a prototype I only compared the specific column of Table A and Table B, and if there was a difference, set a value with more than one character into the column (here yesterday's date).
This is the code which worked as a prototype:
UPDATE ReiseMaster
INNER JOIN Update_Import
ON (ReiseMaster.Col3 <> Update_Import.Col3)
SET ReiseMaster.Col3 = Date() - 1
Now, the column contains Null, "X" or a date in the master. For the next update I have to make sure that the previously updated values that contain a date as a string are excluded (otherwise ReiseMaster.Col3 <> Update_Import.Col3 will always be true for them in the future and the date will keep being overwritten, which is not intended).
My approach was to exclude all records from the master table where the value in that column is longer than one character.
Now here is my problem:
Running the SQL code makes MS Access stop responding and the whole program crashes. Can somebody advise me on what could be wrong with the following code?
UPDATE ReiseMaster
INNER JOIN ReiseMaster_Import
ON(ReiseMaster.`Attachment Indicator` <> ReiseMaster_Import.`Attachment Indicator` AND LEN(ReiseMaster.`Attachment Indicator`) <= 1)
SET ReiseMaster.`Attachment Indicator` = Date() - 1
Additional info: I use Access VBA to run the code, including the SQL statements, which are stored in a string. As for why I write a date once I observe a change: I want to use the dates as a reference for when the value changed for the first time, so I can do further analysis with them at a later stage.
Avoid using complex joins in update queries! Since the entire recordset needs to be updateable, Access tends to have problems with it.
Instead, use a WHERE clause:
UPDATE ReiseMaster
INNER JOIN ReiseMaster_Import
ON(ReiseMaster.[Attachment Indicator] <> ReiseMaster_Import.[Attachment Indicator])
SET ReiseMaster.[Attachment Indicator] = Date() - 1
WHERE LEN(ReiseMaster.[Attachment Indicator]) <= 1
Also, Access uses brackets to escape spaces in column names.
Note that if you're not using any information from the joined table and are only using it to select records, you should use an EXISTS clause instead:
UPDATE ReiseMaster
SET ReiseMaster.[Attachment Indicator] = Date() - 1
WHERE EXISTS(SELECT 1 FROM ReiseMaster_Import WHERE ReiseMaster.[Attachment Indicator] <> ReiseMaster_Import.[Attachment Indicator])
AND LEN(ReiseMaster.[Attachment Indicator]) <= 1
I am using SQL Server 2008 R2 and I have two databases: one has 11,000 records and the other has just 3,000. When I run this query:
SELECT Right(rtrim(tbltransac.No_Faktur),6) as NoUrut,
tbltransac.No_Faktur,
tbltransac.No_FakturP,
tbltransac.Kd_Plg,
Tblcust.Nm_Plg,
GRANDTOTAL AS Total_Faktur,
tbltransac.Nm_Pajak,
tbltransac.Tgl_Faktur,
tbltransac.Tgl_FakturP,
tbltransac.Total_Distribusi
FROM Tblcust
INNER JOIN ViewGrandtotal AS tbltransac ON Tblcust.Kd_Plg = tbltransac.Kd_Plg
WHERE tbltransac.Kd_Trn = 'J'
and year(tbltransac.tgl_faktur)=2015
And ISNULL(tbltransac.No_OPJ,'') <> 'SHOP'
Order by Right(rtrim(tbltransac.No_Faktur),6) Desc
It takes 1 minute 30 seconds on my server with 3,000 records (querying from SQL Server Management Studio), but only 3 seconds on my other server, which has 11,000 records. What is wrong with my database?
I have already tried backing up the 3,000-record database and restoring it on the 11,000-record server; it is faster there, taking 30 seconds for the query, but that is still annoying compared to the 11,000-record database. The servers have the same specs.
How did this happen? What should I check? I have looked at Event Viewer, Resource Monitor and the SQL Server Management logs, and I could not find any error or blocked connection. There is no routing problem either.
Please help. This only started a week ago; before that it was fine, and I have not touched the server in more than a month.
As already mentioned, you have three issues in your query.
Just as an example, change the query to this one:
SELECT Right(rtrim(tbltransac.No_Faktur),6) as NoUrut,
tbltransac.No_Faktur,
tbltransac.No_FakturP,
tbltransac.Kd_Plg,
Tblcust.Nm_Plg,
GRANDTOTAL AS Total_Faktur,
tbltransac.Nm_Pajak,
tbltransac.Tgl_Faktur,
tbltransac.Tgl_FakturP,
tbltransac.Total_Distribusi
FROM Tblcust
INNER JOIN ViewGrandtotal AS tbltransac ON Tblcust.Kd_Plg = tbltransac.Kd_Plg
WHERE tbltransac.Kd_Trn = 'J'
and tbltransac.tgl_faktur BETWEEN '20150101' AND '20151231'
And tbltransac.No_OPJ <> 'SHOP'
Order by NoUrut Desc --Only if you need a sorted output in the datalayer
Another idea, if your ViewGrandtotal is quite large, could be to pre-filter that table before you join it. Sometimes SQL Server doesn't come up with a good plan on its own and needs a gentle push in the right direction.
Maybe this one:
SELECT Right(rtrim(vgt.No_Faktur),6) as NoUrut,
vgt.No_Faktur,
vgt.No_FakturP,
vgt.Kd_Plg,
tc.Nm_Plg,
vgt.Total_Faktur,
vgt.Nm_Pajak,
vgt.Tgl_Faktur,
vgt.Tgl_FakturP,
vgt.Total_Distribusi
FROM (SELECT Kd_Plg, Nm_Plg FROM Tblcust GROUP BY Kd_Plg, Nm_Plg) as tc -- Pre-filter on just the needed columns, de-duplicated.
INNER JOIN (
-- Pre filter viewGrandTotal
SELECT DISTINCT vgt.No_Faktur, vgt.No_FakturP, vgt.Kd_Plg, vgt.GRANDTOTAL AS Total_Faktur, vgt.Nm_Pajak,
vgt.Tgl_Faktur, vgt.Tgl_FakturP, vgt.Total_Distribusi
FROM ViewGrandtotal AS vgt
WHERE vgt.Kd_Trn = 'J'
and vgt.tgl_faktur BETWEEN '20150101' AND '20151231'
And vgt.No_OPJ <> 'SHOP'
) as vgt
ON tc.Kd_Plg = vgt.Kd_Plg
Order by NoUrut Desc --Only if you need a sorted output in the datalayer
The pre-filtering can help the optimizer generate a better plan.
Another issue could simply be multi-threading. Maybe one query gets a parallel plan because it reaches the cost threshold for parallelism with its 11,000 rows, while the other query gets a serial plan due to its lower row count. You can take a look at the generated plans by including the actual execution plan in your SSMS query.
Maybe you can compare those plans to get a clue. If this doesn't help, you can post them here to get some feedback from me.
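If you want to test the parallelism theory, one hedged experiment is to force a serial plan with a MAXDOP query hint and compare timings, for example:
SELECT COUNT(*)                     -- any representative query shape is enough for the comparison
FROM Tblcust
INNER JOIN ViewGrandtotal AS tbltransac ON Tblcust.Kd_Plg = tbltransac.Kd_Plg
WHERE tbltransac.Kd_Trn = 'J'
  AND tbltransac.tgl_faktur BETWEEN '20150101' AND '20151231'
OPTION (MAXDOP 1);                  -- forces a serial plan for this statement only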
I hope this helps. Not quite easy to give you good hints without knowing table structures, table sizes, performance counters, etc. :-)
Best regards,
Ionic
Note: first of all, you should avoid wrapping columns in functions in the WHERE clause, like this one:
year(tbltransac.tgl_faktur)=2015
Here is Aaron Bertrand on how to work with dates in a WHERE clause:
"In order to make best possible use of indexes, and to avoid capturing too few or too many rows, the best possible way to achieve the above query is":
SELECT COUNT(*)
FROM dbo.SomeLogTable
WHERE DateColumn >= '20091011'
AND DateColumn < '20091012';
And I can't understand your logic in this piece of code, but it is a bad part of your query too:
ISNULL(tbltransac.No_OPJ,'') <> 'SHOP'
Actually NULL is not equal to 'SHOP' in this case, so why are you replacing it with ''?
Thanks and good luck
Here are some recommendations:
year(tbltransac.tgl_faktur)=2015 replace this with tbltransac.tgl_faktur >= '20150101' and tbltransac.tgl_faktur < '20160101'
ISNULL(tbltransac.No_OPJ,'') <> 'SHOP' replace this with tbltransac.No_OPJ <> 'SHOP' because NULL <> 'SHOP'.
Order by Right(rtrim(tbltransac.No_Faktur),6) Desc remove this, because ordering should be done in the presentation layer rather than in the data layer.
Read about SARG arguments and predicates:
What makes a SQL statement sargable?
To write an appropriate SARG, you must ensure that a column that has
an index on it appears in the predicate alone, not as a function
parameter. SARGs must take the form of column inclusive_operator <constant or variable>
or <constant or variable> inclusive_operator column. The column name is alone
on one side of the expression, and the constant or calculated value
appears on the other side. Inclusive operators include the operators
=, >, <, >=, <=, BETWEEN, and LIKE. However, the LIKE operator is inclusive only if you do not use a wildcard % or _ at the beginning of
the string you are comparing the column to.
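As a small illustration of the LIKE rule in that quote, using Nm_Plg from the query above and assuming it is indexed:
SELECT Kd_Plg, Nm_Plg
FROM Tblcust
WHERE Nm_Plg LIKE 'ABC%';   -- sargable: no leading wildcard, so an index on Nm_Plg can be used for a seek

SELECT Kd_Plg, Nm_Plg
FROM Tblcust
WHERE Nm_Plg LIKE '%ABC';   -- not sargable: the leading wildcard forces a scan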
I have a column with data of the following structure:
aaa5644988
aaa4898494
aaa5642185
aaa5482312
aaa4648848
I have a range that can be anything, like 100-30000 for example. I want to get all values that end in a number within that range.
I tried
like '%[100-30000]'
but this doesn't work apparently.
I have seen a lot of similar questions but none of them solved my problem.
Edit: I'm using SQL Server 2008.
Example:
Value
aaa45645695
aaa28568720
aaa65818450
8789212
6566700
For the range 600-1200, I want to retrieve rows 1, 2 and 5, because they end with a number in that range.
In SQL, LIKE matches character patterns, not numeric ranges; in a pattern, [100-30000] is treated as a class matching a single character, which is why like '%[100-30000]' doesn't work.
Depending on your use case, there are two possible solutions to this problem:
If you only need to run this query two or three times (and don't care how long it takes), or the dataset is not very big, you can select all the data from this column and do the filtering in another programming language.
Take Ruby, for example; you could do:
column_data = @connection.execute("select your_column from your_table")
result = column_data.map{|x| x.gsub(/^.*[^\d]/, '').to_i }.select{|x| x > 100 && x < 30000}
If you need to run this query regularly, I'd suggest adding a new column to the table that holds only the numeric part of the current column, so queries become much faster.
SELECT *
FROM your_table
WHERE number_column BETWEEN 100 AND 30000
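A rough sketch of how that extra column could be built on SQL Server 2008 (your_table, value_column and the index name are placeholders; the REVERSE/PATINDEX trick pulls out the trailing run of digits):
ALTER TABLE your_table ADD number_column INT NULL;
GO

UPDATE your_table
SET number_column = CAST(
        CASE
            WHEN value_column NOT LIKE '%[0-9]' THEN NULL                 -- no trailing digits at all
            WHEN PATINDEX('%[^0-9]%', REVERSE(value_column)) = 0
                THEN value_column                                         -- the whole value is numeric
            ELSE RIGHT(value_column,
                       PATINDEX('%[^0-9]%', REVERSE(value_column)) - 1)  -- trailing digit run only
        END AS INT);

CREATE INDEX IX_your_table_number_column ON your_table (number_column);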
I have a table with actions that are due in the future. I have a second table that holds all the cases, including the due date of the case. And I have a third table that holds numbers.
The problem is as follows. Our system automatically populates our table with future actions. For some clients, however, we need to change these dates. I wanted to create an update query for this and have it run through our scheduler. However, I am kind of stuck at the moment.
What I have on code so far is this:
UPDATE proxima_gestion p
SET fecha = (SELECT To_char(d.f_ult_vencim + c.hrem01, 'yyyyMMdd')
FROM deuda d,
c4u_activity_dates c,
proxima_gestion p
WHERE d.codigo_cliente = c.codigo_cliente
AND p.n_expediente = d.n_expediente
AND d.saldo > 1000
AND p.tipo_gestion_id = 914
AND p.codigo_oficina = 33
AND d.f_ult_vencim > sysdate)
WHERE EXISTS (SELECT *
FROM proxima_gestion p,
deuda d
WHERE p.n_expediente = d.n_expediente
AND d.saldo > 1000
AND p.tipo_gestion_id = 914
AND p.codigo_oficina = 33
AND d.f_ult_vencim > sysdate)
The field fecha is the current action date. Unfortunately, this is saved as a char instead of date. That is why I need to convert the date back to a char. F_ult_vencim is the due date, and hrem01 is the number of days the actions should be placed away from the due date. (for example, this could be 10, making the new date 10 days after the due date)
Apart from that, there are a few more criteria when we need to change the date (certain creditors, certain departments, only for future cases and starting from a certain amount, only for a certain action type.)
However, when I try and run this query, I get the error message
ORA-01427: single-row subquery returns more than one row
If I run both subqueries separately, I get 2 results from both. What I am trying to accomplish is for the update to connect these two queries and set the field to the new value. This value will be different for every case, as every due date is different.
Is this even possible? And if so, how?
You're getting the error because the first SELECT is returning more than one row for each row in the table being updated.
The first thing I see is that the alias for the table in UPDATE is the same as the alias in both SELECTs (p). So all of the references to p in the subqueries are referencing proxima_gestion in the subquery rather than the outer query. That is, the subquery is not dependent on the outer query, which is required for an UPDATE.
Try removing "proxima_gestion p" from FROM in both subqueries. The references to p, then, will be to the outer UPDATE query.
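A hedged sketch of what that could look like (the filters on p are moved to the outer WHERE clause because they refer to the row being updated; whether the SET subquery now returns exactly one row per case still depends on deuda and c4u_activity_dates having one matching row each):
UPDATE proxima_gestion p
SET fecha = (SELECT To_char(d.f_ult_vencim + c.hrem01, 'yyyyMMdd')
             FROM deuda d,
                  c4u_activity_dates c
             WHERE d.codigo_cliente = c.codigo_cliente
               AND p.n_expediente = d.n_expediente
               AND d.saldo > 1000
               AND d.f_ult_vencim > sysdate)
WHERE p.tipo_gestion_id = 914
  AND p.codigo_oficina = 33
  AND EXISTS (SELECT *
              FROM deuda d
              WHERE p.n_expediente = d.n_expediente
                AND d.saldo > 1000
                AND d.f_ult_vencim > sysdate)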
SELECT MAX(verification_id)
FROM VERIFICATION_TABLE
WHERE head = 687422
AND mbr = 23102
AND RTRIM(LTRIM(lname)) = '.iq bzw'
AND TO_CHAR(dob,'MM/DD/YYYY')= '08/10/2004'
AND system_code = 'M';
This query is taking 153 seconds to run. There are millions of rows in VERIFICATION_TABLE.
I think the query is taking long because of the functions in the WHERE clause. However, I need to do LTRIM/RTRIM on the column, and the date has to be matched in MM/DD/YYYY format. How can I optimize this query?
Explain plan:
SELECT STATEMENT, GOAL = ALL_ROWS 80604 1 59
SORT AGGREGATE 1 59
TABLE ACCESS FULL P181 VERIFICATION_TABLE 80604 1 59
Primary key:
VRFTN_PK Primary VERIFICATION_ID
Indexes:
N_VRFTN_IDX2 head, mbr, dob, lname, verification_id
N_VRFTN_IDX3 last_update_date
N_VRFTN_IDX4 mbr, lname, dob, verification_id
N_VRFTN_IDX4 verification_id
Though, in the explain plan I don't see the indexes or primary key being used. Is that the problem?
Try this:
SELECT MAX(verification_id)
FROM VERIFICATION_TABLE
WHERE head = 687422
AND mbr = 23102
AND TRIM(lname) = '.iq bzw'
AND TRUNC(dob) = TO_DATE('08/10/2004', 'MM/DD/YYYY')
AND system_code = 'M';
Remove that TRUNC() if dob doesn't have a time component already; from the looks of it (date of birth?) it may not. Past that, you need some indexing work. If you're querying this much in this style, I'd index mbr and head in a two-column index; if you said what the columns mean, it would help determine the best indexing here.
The only index that is a possible candidate for use in your query is N_VRFTN_IDX2, because it indexes four of the columns you use in your WHERE clause: HEAD, MBR, DOB and LNAME.
However, because you apply functions to both DOB and LNAME they are ineligible for consideration. The optimizer may then decide not to use that index because it thinks HEAD+MBR on their own are an insufficiently selective combination. If you removed the TO_CHAR() call from DOB then you have three leading columns on N_VRFTN_IDX2 which might make it more attractive to the optimizer. Likewise, is it necessary to TRIM() LNAME?
The other thing is, the need to look up SYSTEM_CODE means the query has to read from the table (because that column is not indexed). If N_VRFTN_IDX2 has a poor clustering factor, the optimizer may decide to go for a FULL TABLE SCAN because the indexed reads are an overhead. Whereas if you added SYSTEM_CODE to the index, the entire query could be satisfied by an INDEX RANGE SCAN, which would be a lot faster.
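As a hedged sketch only (the index name is made up; the column order mirrors the existing N_VRFTN_IDX2), such a covering index could look like:
CREATE INDEX n_vrftn_idx_covering
    ON verification_table (head, mbr, dob, lname, verification_id, system_code);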
Finally, how fresh are your statistics? If your statistics are stale, that might lead the optimizer to make a duff decision. For instance, more accurate statistics might lead the optimizer to use the compound index even with just the two leading columns.
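If the statistics do turn out to be stale, a sketch of refreshing them (assuming the table lives in your own schema):
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,               -- assumption: the table is in your own schema
    tabname => 'VERIFICATION_TABLE',
    cascade => TRUE);              -- also refresh the index statistics
END;
/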
You should turn the literal into a DATE and not the column into a VARCHAR2 like this:
AND dob = TO_DATE('08/10/2004','MM/DD/YYYY')
Or use the preferable ANSI date literal syntax:
AND dob = DATE '2004-08-10'
If the dob column contains time (a date of birth doesn't usually, except presumably in a hospital!) then you can do:
AND dob >= DATE '2004-08-10'
AND dob < DATE '2004-08-11'
Check the datatypes for HEAD and MBR.
The values "687422 and 23102" have the 'feel' of being quite selective. That is, if you have hundreds of thousands of values for head and millions of records in the table, it would seem that HEAD is quite selective. [That could be totally misleading though.]
Anyway, you may find that HEAD and/or MBR are actually stored as VARCHAR2 or CHAR fields rather than NUMBER. If so, comparing the character column to a number would prevent the use of the index. Try the following (I've also converted the dob predicate to a date, but with an explicit format mask).
SELECT MAX(verification_id)
FROM VERIFICATION_TABLE
WHERE head = '687422'
AND mbr = '23102'
AND RTRIM(LTRIM(lname)) = '.iq bzw'
AND TRUNC(dob) = TO_DATE('08/10/2004','MM/DD/YYYY')
AND system_code = 'M';
Please provide an EXPLAIN output on this query so we know where the slow-down occurs. Two thoughts:
change
AND TO_CHAR(dob,'MM/DD/YYYY')= '08/10/2004'
to
AND dob = <date here, not sure which oracle str2date function you need>
and use a function-based index on
RTRIM(LTRIM(lname))
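For example, a function-based index matching that expression might be created like this (the index name is a placeholder):
CREATE INDEX n_vrftn_lname_fbi ON verification_table (RTRIM(LTRIM(lname)));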
Try this:
SELECT MAX(verification_id)
FROM VERIFICATION_TABLE
WHERE head = 687422
AND mbr = 23102
AND TRIM(lname) = '.iq bzw'
AND dob between TO_DATE('08/10/2004', 'MM/DD/YYYY') and TO_DATE('08/11/2004', 'MM/DD/YYYY')
AND system_code = 'M';
This way a possible index on dob will be used.