Optimizing SQL Stored Procedure Query for better performance

Is there a way to optimize the query below? It takes quite a while to retrieve the massive number of records from the tables (T_STU_School_Class) and (T_STU_School). I have created indexes on Name as well as SchoolCode for T_STU_School. In addition, a temp table was also created.
SELECT Distinct (S.SchoolCode) As Code, Name from T_STU_School AS S
LEFT JOIN T_STU_School_Class AS SC ON S.SchoolCode = SC.SchoolCode
WHERE S.SchoolCode IN
(SELECT SchoolCode FROM #MainLevelCodeTemp)
AND [Status] = 'A'
AND Name LIKE @Keyword
AND (@AcademicCode = '' OR SC.AcademicLevel IN (@AcademicCode))
Order BY Name ASC;

All the imperative logic in the sproc is a waste; you're just forcing SQL Server to scan T_STU_School multiple times. All that logic should just be added to the WHERE clause:
SELECT Distinct (S.SchoolCode) As Code, Name from T_STU_School AS S
LEFT JOIN T_STU_School_Class AS SC ON S.SchoolCode = SC.SchoolCode
WHERE ((@MainLevelCode LIKE '%J%' AND S.MixLevelType IN ('T1','T2','T6'))
OR (@MainLevelCode LIKE '%S%' AND S.MixLevelType IN ('T1','T2','T5','T6'))
OR (@MainLevelCode LIKE '%P%' AND S.MixLevelType IN ('T1','T2','T6'))
OR (MainLevelCode IN (SELECT Item FROM [dbo].[SplitString](@MainLevelCode, ',')))
OR @MainLevelCode = '')
AND [Status] = 'A'
AND (@Keyword = '' OR Name LIKE @Keyword)
AND (@AcademicCode = '' OR SC.AcademicLevel IN (@AcademicCode))
Order BY Name ASC;
The reason both tables are still being scanned per your execution plan, even though you've created indexes on Name and SchoolCode, is that there are no criteria on SchoolCode that would reduce the result set to less than the whole table, and likewise with Name whenever it is blank or starts with a "%". To prevent the full table scans you should create indexes on:
T_STU_School (Status, Name)
T_STU_School_Class (MixLevelType, SchoolCode)
T_STU_School_Class (MainLevelCode, SchoolCode)
Also, any time you have stuff like (y = '' OR x = y) in the WHERE clause, it's a good idea to add an OPTION (RECOMPILE) to the bottom to avoid the eventual bad-plan-cache nightmare.
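For example, a trimmed-down sketch of the query with the hint appended:
SELECT DISTINCT S.SchoolCode AS Code, Name
FROM T_STU_School AS S
LEFT JOIN T_STU_School_Class AS SC ON S.SchoolCode = SC.SchoolCode
WHERE (@Keyword = '' OR Name LIKE @Keyword)
AND [Status] = 'A'
ORDER BY Name ASC
OPTION (RECOMPILE);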
Also, this line is probably a bug:
AND (@AcademicCode = '' OR SC.AcademicLevel IN (@AcademicCode))
IN won't parse @AcademicCode as a list, so this condition is equivalent to SC.AcademicLevel = @AcademicCode.
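If @AcademicCode is meant to hold a comma-separated list, one way to handle it (a sketch that reuses the SplitString function from above) would be:
AND (@AcademicCode = '' OR SC.AcademicLevel IN (SELECT Item FROM [dbo].[SplitString](@AcademicCode, ',')))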

You definitely need an index on T_STU_SCHOOL.SchoolCode. Your query plan shows that 65% of the query time is taken from the index scan that results from the join. An index on the SchoolCode column should turn that into an index seek, which will be much faster.
The Name index is not currently being used, probably because you're passing in values for @Keyword that start with a wildcard. Given that Name is on the T_STU_School table, which has a small number of rows, you can probably afford a table scan there in order to use wildcards the way you want to. So you should be able to drop the Name index.

TSQL Improving performance of Update cross apply like statement

I have a client with a stored procedure that currently takes 25 minutes to run. I have narrowed the cause of this down to the following statement (column and table names changed):
UPDATE #customer_emails_tmp
SET #customer_emails_tmp.Possible_Project_Ref = cp.order_project_no,
#customer_emails_tmp.Possible_Project_id = cp.order_uid
FROM #customer_emails_tmp e
CROSS APPLY (
SELECT TOP 1 p.order_project_no, p.order_uid
FROM [order] p
WHERE e.Subject LIKE '%' + p.order_title + '%'
AND p.order_date < e.timestamp
ORDER BY p.order_date DESC
) as cp
WHERE e.Possible_Project_Ref IS NULL;
There are 3 slightly different versions of the above, each joining to one of three tables. The issue is the CROSS APPLY with LIKE '%' + p.order_title + '%'. I have tried looking into CONTAINS() and FREETEXT(), but as far as my testing and investigation goes, you cannot do CONTAINS(e.title, p.title) or FREETEXT(e.title, p.title).
Have I miss read something or is there a better way to write the above query?
Any help on this is much appreciated.
EDIT
Updated query to actual query used. Execution plan:
https://www.brentozar.com/pastetheplan/?id=B1YPbJiX5
Tmp table has the following indexes:
CREATE NONCLUSTERED INDEX ix_tmp_customer_emails_first_recipient ON #customer_emails_tmp (First_Recipient);
CREATE NONCLUSTERED INDEX ix_tmp_customer_emails_first_recipient_domain_name ON #customer_emails_tmp (First_Recipient_Domain_Name);
CREATE NONCLUSTERED INDEX ix_tmp_customer_emails_client_id ON #customer_emails_tmp (customer_emails_client_id);
CREATE NONCLUSTERED INDEX ix_tmp_customer_emails_subject ON #customer_emails_tmp ([subject]);
There is no index on the [order] table for column order_title
Edit 2
The purpose of this SP is to link orders (amongst others) to sent emails. This is done via multiple UPDATE statements; all other update statements are less than a second in length; however, this one ( and 2 others exactly the same but looking at 2 other tables) take an extraordinary amount of time.
I cannot remove the filter on Possible_Project_Ref IS NULL as we only want to update the ones that are null.
Also, I cannot change WHERE e.Subject LIKE '%' + p.order_title + '%' to WHERE e.Subject LIKE p.order_title + '%' because the subject line may not start with the p.order_title, for example it could start with FW: or RE:
Reviewing your execution plan, I think the main issue is you're reading a lot of data from the order table. You are reading 27,447,044 rows just to match up to find 783 rows. Your 20k row temp table is probably nothing by comparison.
Without knowing your data or desired business logic, here's a couple things I'd consider:
Updating First Round of Exact Matches
I know you need to keep your %SearchTerm% parameters, but some data might have exact matches. So if you run an initial update for exact matches, it will reduce the ones you have to search with %SearchTerm%
Run something like this before your current update
/*Recommended index for this update*/
CREATE INDEX ix_test ON [order](order_title,order_date) INCLUDE (order_project_no, order_uid)
UPDATE #customer_emails_tmp
SET Possible_Project_Ref = cp.order_project_no
,Possible_Project_id = cp.order_uid
FROM #customer_emails_tmp e
CROSS APPLY (
SELECT TOP 1 p.order_project_no, p.order_uid
FROM [order] p
WHERE e.Subject = p.order_title
AND p.order_date < e.timestamp
ORDER BY p.order_date DESC
) as cp
WHERE e.Possible_Project_Ref IS NULL;
Narrowing Search Range
This will technically change your matching criteria, but there are probably certain logical assumptions you can make that won't impact the final results. Here are a couple of ideas for you to consider, to get you thinking this way, but only you know your business. The end goal should be to narrow the data read from the order table
Is there a customer id you can match on? Something like e.customerID = p.customerID? Do you really match any email to any order?
Can you narrow your search date range to something like x days before timestamp? Do you really need to search all historical orders for all of time? Would you even want a match if an email matches to an order from 5 years ago? For this, try updating your APPLY date filter to something like p.order_date BETWEEN DATEADD(dd,-30,e.[timestamp]) AND e.[timestamp]
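As a rough sketch (assuming a 30-day window is acceptable for your business rules), the APPLY would become:
CROSS APPLY (
SELECT TOP 1 p.order_project_no, p.order_uid
FROM [order] p
WHERE e.Subject LIKE '%' + p.order_title + '%'
AND p.order_date BETWEEN DATEADD(dd, -30, e.[timestamp]) AND e.[timestamp]
ORDER BY p.order_date DESC
) as cp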
Other Miscellaneous Notes
If I'm understanding this correctly, you are trying to link emails to some sort of project number. Ideally, when the emails are generated, they would be linked to a project immediately. I know this is not always possible resource/time-wise, but the clean solution is to calculate this at the beginning of the process, not afterwards. Generally, any time you have to use fuzzy string matching, you will have data issues. I know business always wants results "yesterday" and always pushes for the shortcut, and nobody ever wants to update legacy processes, but sometimes you need to if you want clean data.
I'd review your indexes on the temp table. Generally I find the cost to create the indexes and for SQL Server to maintain them as I update the temp table is not worth it. So 9 times out of 10, I leave the temp table as a plain heap with 0 indexes
First, filter the NULLs when you create #customer_emails_tmp, not after. Then you can lose:
WHERE e.Possible_Project_Ref IS NULL. This way you are only bringing in rows you need instead of retrieving rows you don't need, then filtering them.
Next, use this for your WHERE clause:
WHERE EXISTS (SELECT 1 FROM [order] AS p WHERE p.order_date < e.timestamp)
If an order date doesn't have any later timestamps in e, none of the rows in e will be considered.
Next remove the timestamp filter from your APPLY subquery. Now your subquery looks like this:
SELECT TOP 1 p.order_project_no, p.order_uid
FROM [order] AS p
WHERE e.Subject LIKE '%' + p.order_title + '%'
ORDER BY p.order_date DESC
This way you are applying your "Subject Like" filter to a much smaller set of rows. The final query would look like this:
UPDATE #customer_emails_tmp
SET #customer_emails_tmp.Possible_Project_Ref = cp.order_project_no,
#customer_emails_tmp.Possible_Project_id = cp.order_uid
FROM #customer_emails_tmp e
CROSS APPLY (
SELECT TOP 1 p.order_project_no, p.order_uid
FROM [order] p
WHERE e.Subject LIKE '%' + p.order_title + '%'
ORDER BY p.order_date DESC
) as cp
WHERE EXISTS (SELECT 1 FROM [order] AS p WHERE p.order_date < e.timestamp);

Bad performance when not selecting a specific column from a view

Using SQL Server 2016 SP1. I have a view Users that goes like
SELECT
ROW_NUMBER() OVER (ORDER BY ID) AS DataModelID, *
FROM
(Some query) AS tbl
I then select from it
SELECT
U1.ID UserId, U1.IdentityNumber IdentityNumber,
U1.ArabicFirstName, U1.ArabicSecondName
FROM
USERS U1
LEFT JOIN
USERS U2 ON U1.IdentityNumber = U2.IdentityNumber
AND U1.ID <> U2.ID
AND U1.RoleId = 2
WHERE
U2.ID IS NOT NULL
AND U1.IdentityNumber <> ''
AND PATINDEX('[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]', U1.IdentityNumber) = 1
The thing here is that when the above query selects * or includes the DataModelID column, it runs in 3 seconds, but when it selects any columns without this one it runs in more than 2 minutes.
Why is this happening, running faster when including a column?
I tried everything to clear the cache, and ran it multiple times with the same results.
Without seeing the actual execution plan there is no way to say for sure, but as @mvisser mentioned, the likely cause is that the optimizer is choosing a better index when you do a
SELECT * or include the DataModelID column than when you don't. There are a number of solutions here; one suggestion would be to look at the execution plan for the queries that run in 3 seconds, note what index is being used, and use an index hint (see section G) to force the optimizer to use that index in your queries that don't reference those columns. I would not suggest this though - there are too many unanswered variables to consider this a viable option.
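For reference, the general shape of an index hint looks like this (just a sketch; the table and index names are placeholders, and whether a hint is even applicable here depends on the base tables behind the view):
SELECT col1, col2
FROM dbo.SomeTable WITH (INDEX (IX_SomeTable_SomeIndex)) -- hypothetical index name
WHERE col1 = 'some value';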
Here's what I recommend:
First, as @Lukasz Szozda mentioned, this is not SARGable:
AND PATINDEX( '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]',U1.IdentityNumber) = 1
But this is:
U1.IdentityNumber LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
So I'd fix that first. Next, the fastest, most sure-fire way to resolve this is to simply include DataModelID in your queries even if you don't need it. You can either filter that column out at the application level, or create a stored proc that populates a temp table and then, for the final result set, retrieve your results from that temp table excluding DataModelID.
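A rough sketch of that stored-procedure approach (the procedure and temp-table names here are made up for illustration):
CREATE PROCEDURE dbo.usp_GetDuplicateIdentityUsers -- hypothetical name
AS
BEGIN
    SET NOCOUNT ON;

    -- include DataModelID so the fast plan is used
    SELECT U1.ID AS UserId, U1.IdentityNumber, U1.ArabicFirstName,
           U1.ArabicSecondName, U1.DataModelID
    INTO #tmpUsers
    FROM USERS U1
    LEFT JOIN USERS U2 ON U1.IdentityNumber = U2.IdentityNumber
        AND U1.ID <> U2.ID AND U1.RoleId = 2
    WHERE U2.ID IS NOT NULL
      AND U1.IdentityNumber <> ''
      AND U1.IdentityNumber LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]';

    -- final result set: everything except DataModelID
    SELECT UserId, IdentityNumber, ArabicFirstName, ArabicSecondName
    FROM #tmpUsers;
END;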
OPTION #2
You can create an Indexed View over your USERS data that looks something like this:
CREATE VIEW dbo.vwUSERS_clean
WITH SCHEMABINDING
AS
SELECT U1.ID, U1.IdentityNumber, U1.ArabicFirstName,
U1.ArabicSecondName, U1.RoleId
FROM dbo.USERS U1
WHERE U1.IdentityNumber <> ''
AND U1.IdentityNumber LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]';
GO
Then create a unique, clustered index on it. Next you would change the query that you posted to reference your indexed view (e.g. change both references to USERS to dbo.vwUSERS_clean WITH (NOEXPAND)).
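A sketch of what that might look like once the view and its clustered index exist (assuming the index is created on ID, and the duplicate-IdentityNumber self-join now happens against the view):
CREATE UNIQUE CLUSTERED INDEX IX_vwUSERS_clean ON dbo.vwUSERS_clean (ID);
GO
SELECT U1.ID AS UserId, U1.IdentityNumber, U1.ArabicFirstName, U1.ArabicSecondName
FROM dbo.vwUSERS_clean U1 WITH (NOEXPAND)
LEFT JOIN dbo.vwUSERS_clean U2 WITH (NOEXPAND)
    ON U1.IdentityNumber = U2.IdentityNumber
    AND U1.ID <> U2.ID
    AND U1.RoleId = 2
WHERE U2.ID IS NOT NULL;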
Note that ROW_NUMBER is not allowed in indexed views but, if you make ID your clustered index (or the first column in a composite clustered index), then there will be no cost to adding ROW_NUMBER() OVER (ORDER BY ID) to queries that reference that indexed view.

Clarification on Indexes

To illustrate my question, I will use the following example:
CREATE INDEX supplier_idx
ON supplier (supplier_name);
Will the searching on this table only be sped up if the supplier_name column is specified in the SELECT clause? What if we select the supplier_name column as well as other columns in the SELECT clause? Is searching sped up if this column is used in a WHERE clause, even if it is not in the SELECT clause?
Do the same rules apply to the following index as well:
CREATE INDEX supplier_idx
ON supplier (supplier_name, city);
Indexes can be complex, so a full explanation would take a lot of writing. There are many resources on the internet. (Helpful link here to Oracle indexes)
However, I can just answer your questions simply.
CREATE INDEX supplier_idx
ON supplier (supplier_name);
This means that any joins (and similar operations) using the supplier_name column, and any WHERE clause conditions on supplier_name, will benefit from the index.
For example
SELECT * FROM SomeTable
WHERE supplier_name = 'Smith'
But simply using the supplier_name column in a SELECT clause will not benefit from having an index (unless you add complexity to the SELECT clause, which I will cover...). For example - this will not benefit from an Index on supplier_name
SELECT
supplier_name
FROM SomeTable WHERE ID = 1
However, if you added some complexity to your SELECT statement, your index could indeed speed it up...For example:
SELECT
supplier_name -- no index benefit
,(SELECT TOP 1 somedata FROM Table2 WHERE supplier_name = Table2.name) AS SomeValue
-- the line above uses the index as supplier_name is used in WHERE
, CASE WHEN supplier_name = 'Best Supplier'
THEN 'Best'
ELSE 'Worst'
END AS FindBestSupplier
-- Also the CASE statement will use the index on supplier_name
FROM SomeTable WHERE ID = 1
(The 'complexity' above still basically shows that if the field supplier_name is used in a CASE or WHERE clause, as well as in JOINs and aggregations, then the index is very beneficial... The example above is a combination of many clauses wrapped into one SELECT statement.)
But your composite index
CREATE INDEX supplier_idx
ON supplier (supplier_name, city);
would be beneficial in specific and important cases (Eg: where the city is in the SELECT clause and the supplier_name is used in the WHERE clause), for example
SELECT
city
FROM SomeTable WHERE supplier_name = 'Smith'
The reason is that city is stored alongside the supplier_name index values, so when the index finds the supplier_name value, it immediately has a copy of the city value (stored in the index) and does not need to hit the database files to find any more data. (If city was not in the index, it would have to hit the database to pull the city value out, as it does with most data required in the SELECT statement usually)
The joins will benefit from an index also, with the example:
SELECT
* FROM SomeTable T1
LEFT JOIN AnotherTable T2
ON T1.supplier_name = T2.supplier_name_2
AND T1.city = T2.city_2
So in summary, if you use the field in any comparison expression like a WHERE clause or a JOIN , or a GROUP BY clause (and the aggregations SUM, MIN, MAX etc)...then an Index is very beneficial for Tables with over a few thousand rows...
(Usually only makes a big difference when you have at least 10,000 rows in a Table, but this can vary depending on your complexity)
SQL Server (for example) will sometimes build a temporary index on the fly when a useful index is missing (an index spool) and then discard it afterwards. So if you do not create the correct indexes manually, the system can slow down as it redoes that work each time the query needs it. (SQL Server will also show you hints about what indexes it thinks you need for a certain query.)
Indexes can slow down UPDATEs or INSERTs, so they must be used with a little wisdom and balance... (Sometimes indexes are dropped before a batch of UPDATEs is performed and then re-created afterwards, although this is somewhat extreme.)
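A rough sketch of that drop-and-recreate pattern (generic syntax; in SQL Server the DROP form would be DROP INDEX supplier_idx ON supplier):
DROP INDEX supplier_idx;
-- ...run the large batch of UPDATEs / INSERTs...
CREATE INDEX supplier_idx ON supplier (supplier_name, city);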

Can this SQL Query be optimized to run faster?

I have an SQL Query (For SQL Server 2008 R2) that takes a very long time to complete. I was wondering if there was a better way of doing it?
SELECT @count = COUNT(Name)
FROM Table1 t
WHERE t.Name = @name AND t.Code NOT IN (SELECT Code FROM ExcludedCodes)
Table1 has around 90Million rows in it and is indexed by Name and Code.
ExcludedCodes only has around 30 rows in it.
This query is in a stored procedure and gets called around 40k times; the total time it takes the procedure to finish is 27 minutes. I believe this is my biggest bottleneck because of the massive number of rows it queries against and the number of times it does it.
So if you know of a good way to optimize this it would be greatly appreciated! If it cannot be optimized then I guess I'm stuck with 27 min...
EDIT
I changed the NOT IN to NOT EXISTS and it cut the time down to 10:59, so that alone is a massive gain on my part. I am still going to attempt to do the GROUP BY statement as suggested below, but that will require a complete rewrite of the stored procedure and might take some time... (as I said before, I'm not the best at SQL but it is starting to grow on me. ^^)
In addition to workarounds to get the query itself to respond faster, have you considered maintaining a column in the table that tells whether it is in this set or not? It requires a lot of maintenance but if the ExcludedCodes table does not change often, it might be better to do that maintenance. For example you could add a BIT column:
ALTER TABLE dbo.Table1 ADD IsExcluded BIT;
Make it NOT NULL and default to 0. Then you could create a filtered index:
CREATE INDEX n ON dbo.Table1(name)
WHERE IsExcluded = 0;
Now you just have to update the table once:
UPDATE t
SET IsExcluded = 1
FROM dbo.Table1 AS t
INNER JOIN dbo.ExcludedCodes AS x
ON t.Code = x.Code;
And ongoing you'd have to maintain this with triggers on both tables. With this in place, your query becomes:
SELECT @Count = COUNT(Name)
FROM dbo.Table1 WHERE Name = @Name AND IsExcluded = 0;
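As for maintaining the flag with triggers, here's a rough sketch of the ExcludedCodes side (table and trigger names follow the example above; a similar AFTER INSERT trigger on dbo.Table1 would set IsExcluded for newly inserted rows):
CREATE TRIGGER trg_ExcludedCodes_Sync
ON dbo.ExcludedCodes
AFTER INSERT, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- codes newly added to the exclusion list: flag matching rows
    UPDATE t SET IsExcluded = 1
    FROM dbo.Table1 AS t
    INNER JOIN inserted AS i ON t.Code = i.Code;

    -- codes removed from the exclusion list: clear the flag
    UPDATE t SET IsExcluded = 0
    FROM dbo.Table1 AS t
    INNER JOIN deleted AS d ON t.Code = d.Code;
END;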
EDIT
As for "NOT IN being slower than LEFT JOIN" here is a simple test I performed on only a few thousand rows:
EDIT 2
I'm not sure why this query wouldn't do what you're after, and be far more efficient than your 40K loop:
SELECT src.Name, COUNT(*)
FROM dbo.Table1 AS src
INNER JOIN #temptable AS t
ON src.Name = t.Name
WHERE src.Code NOT IN (SELECT Code FROM dbo.ExcludedCodes)
GROUP BY src.Name;
Or the LEFT JOIN equivalent:
SELECT src.Name, COUNT(*)
FROM dbo.Table1 AS src
INNER JOIN #temptable AS t
ON src.Name = t.Name
LEFT OUTER JOIN dbo.ExcludedCodes AS x
ON src.Code = x.Code
WHERE x.Code IS NULL
GROUP BY src.Name;
I would put money on either of those queries taking less than 27 minutes. I would even suggest that running both queries sequentially will be far faster than your one query that takes 27 minutes.
Finally, you might consider an indexed view. I don't know your table structure or whether you violate any of the restrictions, but it is worth investigating IMHO.
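One possible shape for the indexed-view idea (just a sketch; it assumes dbo.Table1 is a base table and pre-aggregates counts per Name and Code, which you can then sum while excluding codes):
CREATE VIEW dbo.vw_NameCodeCounts
WITH SCHEMABINDING
AS
SELECT Name, Code, COUNT_BIG(*) AS RowCnt
FROM dbo.Table1
GROUP BY Name, Code;
GO
CREATE UNIQUE CLUSTERED INDEX IX_vw_NameCodeCounts ON dbo.vw_NameCodeCounts (Name, Code);
GO
-- then, per name:
SELECT @count = ISNULL(SUM(RowCnt), 0)
FROM dbo.vw_NameCodeCounts WITH (NOEXPAND)
WHERE Name = @name
AND Code NOT IN (SELECT Code FROM dbo.ExcludedCodes);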
You say this gets called around 40K times. Why? Is it in a cursor? If so, do you really need a cursor? Couldn't you put the values you want for @name in a temp table, index it, and then join to it?
select t.name, count(t.name)
from Table1 t
join #name n on t.name = n.name
where NOT EXISTS (SELECT Code FROM ExcludedCodes WHERE Code = t.code)
group by t.name
That might get you all your results in one query and is almost certainly faster than 40K separate queries. Of course, if you need the count of all the names, it's even simpler:
select t.name, count(t.name)
from Table1 t
where NOT EXISTS (SELECT Code FROM ExcludedCodes WHERE Code = t.Code)
group by t.name
NOT EXISTS typically performs better than NOT IN, but you should test it on your system.
SELECT @count = COUNT(Name)
FROM Table1 t
WHERE t.Name = @name AND NOT EXISTS (SELECT 1 FROM ExcludedCodes e WHERE e.Code = t.Code)
Without knowing more about your query it's tough to supply concrete optimization suggestions (i.e. code suitable for copy/paste). Does it really need to run 40,000 times? Sounds like your stored procedure needs reworking, if that's feasible. You could exec the above once at the start of the proc and insert the results in a temp table, which can keep the indexes from Table1, and then join on that instead of running this query.
This particular bit might not even be the bottleneck that makes your query run 27 minutes. For example, are you using a cursor over those 90 million rows, or scalar valued UDFs in your WHERE clauses?
Have you thought about doing the query once and populating the data in a table variable or temp table? Something like
insert into #temp (Name, NameCount)
select Name, count(Name)
from Table1
where Code not in (select Code from ExcludedCodes)
group by Name
And don't forget that you could possibly use a filtered index as long as the excluded codes table is somewhat static.
Start evaluating the execution plan. Which is the heaviest part to compute?
Regarding the relation between the two tables, use a JOIN on indexed columns: indexes will optimize query execution.

How do I go about optimizing an Oracle query?

I was given a SQL query, saying that I have to optimize this query.
I came across EXPLAIN PLAN. So, in SQL Developer, I ran EXPLAIN PLAN FOR the query.
It divided the query into different parts and showed the cost for each of them.
How do I go about optimizing the query? What do I look for? Elements with high costs?
I am a bit new to DB, so if you need more information, please ask me, and I will try to get it.
I am trying to understand the process rather than just posting the query itself and getting the answer.
The query in question:
SELECT cr.client_app_id,
cr.personal_flg,
r.requestor_type_id
FROM credit_request cr,
requestor r,
evaluator e
WHERE cr.evaluator_id = 96 AND
cr.request_id = r.request_id AND
cr.evaluator_id = e.evaluator_id AND
cr.request_id != 143462 AND
((r.soc_sec_num_txt = 'xxxxxxxxx' AND
r.soc_sec_num_txt IS NOT NULL) OR
(lower(r.first_name_txt) = 'test' AND
lower(r.last_name_txt) = 'newprogram' AND
to_char(r.birth_dt, 'MM/DD/YYYY') = '01/02/1960' AND
r.last_name_txt IS NOT NULL AND
r.first_name_txt IS NOT NULL AND
r.birth_dt IS NOT NULL))
The EXPLAIN PLAN output is below:
OPERATION OBJECT_NAME OPTIONS COST
SELECT STATEMENT 15
NESTED LOOPS
NESTED LOOPS 15
HASH JOIN 12
Access Predicates
CR.EVALUATOR_ID=E.EVALUATOR_ID
INDEX EVALUATOR_PK UNIQUE SCAN 0
Access Predicates
E.EVALUATOR_ID=96
TABLE ACCESS CREDIT_REQUEST BY INDEX ROWID 11
INDEX CRDRQ_DONE_EVAL_TASK_REQ_NDX SKIP SCAN 10
Access Predicates
CR.EVALUATOR_ID=96
Filter Predicates
AND
CR.EVALUATOR_ID=96
CR.REQUEST_ID<>143462
INDEX REQUESTOR_PK RANGE SCAN 1
Access Predicates
CR.REQUEST_ID=R.REQUEST_ID
Filter Predicates
R.REQUEST_ID<>143462
TABLE ACCESS REQUESTOR BY INDEX ROWID 3
Filter Predicates
OR
R.SOC_SEC_NUM_TXT='XXXXXXXX'
AND
R.BIRTH_DT IS NOT NULL
R.LAST_NAME_TXT IS NOT NULL
R.FIRST_NAME_TXT IS NOT NULL
LOWER(R.FIRST_NAME_TXT)='test'
LOWER(R.LAST_NAME_TXT)='newprogram'
TO_CHAR(INTERNAL_FUNCTION(R.BIRTH_DT),'MM/DD/YYYY')='01/02/1960'
As a quick update to your query, you're going to want to refactor it to something like this:
SELECT
cr.client_app_id,
cr.personal_flg,
r.requestor_type_id
FROM
credit_request cr
inner join requestor r on
cr.request_id = r.request_id
inner join evaluator e on
cr.evaluator_id = e.evaluator_id
WHERE
cr.evaluator_id = 96
and cr.request_id != 143462
and (r.soc_sec_num_txt = 'xxxxxxxxx'
or (
lower(r.first_name_txt) = 'test'
and lower(r.last_name_txt) = 'newprogram'
and r.birth_dt = date '1960-01-02'
)
)
Firstly, joining by commas creates a cross join, which you want to avoid. Luckily, Oracle's smart enough to do it as an inner join since you specified join conditions, but you want to be explicit so you don't accidentally miss something.
Secondly, your IS NOT NULL checks are pointless: if a column is null, any = check you do will return false for that row. In fact, any comparison with a null column, even null = null, does not evaluate to true. You can try this with select 1 where null = null and select 1 where null is null. Only the second one returns.
Thirdly, Oracle's smart enough to compare dates with the ISO format (at least the last time I used it, it was). You can just do r.birth_dt = date '1960-01-02' and avoid doing a string format on that column.
That being said, your query isn't exactly poorly written in terms of egregious performance mistakes. What you want to look for are indexes. Does evaluator have one on evaluator_id? Does credit_request? What types are they? Typically, evaluator will have one on the PK evaluator_id, and credit_request will have one for that column as well. The same goes for requestor and the request_id columns.
Other indices you may want to consider are all the fields you're using to filter. In this case, soc_sec_num_txt, first_name_txt, last_name_txt, birth_dt. Consider putting a multi-column index on the latter three, and a single column index on the soc_sec_num_txt column.
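A sketch of those index definitions (the names are made up; since the query compares lower(first_name_txt) and lower(last_name_txt), those two columns would need to be function-based index expressions in Oracle if you keep the lower() comparisons):
CREATE INDEX requestor_name_dob_idx
  ON requestor (lower(first_name_txt), lower(last_name_txt), birth_dt);

CREATE INDEX requestor_ssn_idx
  ON requestor (soc_sec_num_txt);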
After refactoring the query comes indexes, so following on from @eric's post:
credit_request:
You're joining this onto requestor on request_id, which I hope is unique. In your WHERE clause you then have a condition on evaluator_id, and you select client_app_id and personal_flg in the query. So you probably need a unique index on credit_request of (request_id, evaluator_id, client_app_id, personal_flg).
By putting the columns you're selecting into the index you avoid the BY INDEX ROWID step, which means that you have selected your values from the index and then re-entered the table to pick up more information. If this information is already in the index then there's no need.
You're joining it onto evaluator on evaluator_id, which is included in the first index.
requestor:
This is being joined to on request_id, and your WHERE clause includes soc_sec_num_txt, lower(first_name_txt), lower(last_name_txt) and birth_dt. So you need a unique, if possible, index on (request_id, soc_sec_num_txt); because of the OR this is further complicated, since you should really have an index on as many of the conditions as possible. You're also selecting requestor_type_id.
In this case, to avoid a functional index with many columns, I'd index on (request_id, soc_sec_num_txt, birth_dt). If you have the space, time and inclination, then adding lower(first_name_txt) etc. to this may improve the speed, depending on how selective the column is. This means that if there are far more distinct values in, for instance, first_name_txt than birth_dt, you'd be better off putting it in front of birth_dt in the index, so your query has less to scan if it's a non-unique index.
You notice that I haven't added the selected column into this index as you're already going to have to go into the table so you gain nothing by adding it.
evaluator:
This is only being joined on evaluator_id so you need a unique, if possible, index on this column.
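Putting that together, the indexes described above would look something like this (a sketch; the names are illustrative and whether they can be unique depends on your actual data):
CREATE UNIQUE INDEX credit_request_idx
  ON credit_request (request_id, evaluator_id, client_app_id, personal_flg);

CREATE INDEX requestor_idx
  ON requestor (request_id, soc_sec_num_txt, birth_dt);

CREATE UNIQUE INDEX evaluator_idx
  ON evaluator (evaluator_id);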