Optimize query using NOT EXISTS - sql

I want to optimize the following query using NOT EXISTS. How can I do so? Please also explain its execution plan.
Select I_Ftn, I_Col, count(c.i_id_num) cnt
From DSCL_ALL.W_CALENDER c
Where c.UNIT_CODE= '01'
AND c.i_g_vill = '45'
and c.i_g_code = '1'
and c.survey_year = '2012-2013'
and c.i_number not in (select m.m_indent from w_mill_pur m where m.unit_code = c.unit_code and m.m_vill = c.i_g_vill and m.m_grow = c.i_g_code)
Group By I_Ftn, I_Col
ORDER BY I_ftn, I_col

You may want to try this:
Select I_Ftn, I_Col, count(c.i_id_num) cnt
From DSCL_ALL.W_CALENDER c
Where c.UNIT_CODE= '01'
AND c.i_g_vill = '45'
and c.i_g_code = '1'
and c.survey_year = '2012-2013'
and not exists (select 1 from w_mill_pur m where m.unit_code = c.unit_code and m.m_vill = c.i_g_vill and m.m_grow = c.i_g_code and m.m_indent = c.i_number)
Group By I_Ftn, I_Col
ORDER BY I_ftn, I_col
It should be more efficient because of the added where clause: Oracle is able to run a more tightly filtered subquery and then just test whether the result set is empty or not.
You may also want to check that w_mill_pur has an index on (unit_code, m_vill, m_grow, m_indent).
The "not in" way requires one more join in the main query (the subquery result set against the main one).

There is a difference between "NOT IN" and "NOT EXISTS": if the subquery can return a NULL, "NOT IN" returns no rows at all, while "NOT EXISTS" is unaffected. See this AskTom thread:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:442029737684
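The NULL behaviour can be reproduced in a few lines. The sketch below uses Python's built-in sqlite3 module rather than Oracle, and the two tables are cut-down, hypothetical stand-ins for W_CALENDER and w_mill_pur. It only illustrates the semantics, not the execution plan.

```python
import sqlite3

# Hypothetical, cut-down stand-ins for W_CALENDER and w_mill_pur.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (i_number INTEGER);
    CREATE TABLE mill_pur (m_indent INTEGER);
    INSERT INTO calendar VALUES (1), (2), (3);
    INSERT INTO mill_pur VALUES (1), (NULL);  -- note the NULL indent
""")

# NOT IN: "2 NOT IN (1, NULL)" is unknown, so rows 2 and 3 are filtered out.
not_in = conn.execute(
    "SELECT c.i_number FROM calendar c "
    "WHERE c.i_number NOT IN (SELECT m.m_indent FROM mill_pur m) "
    "ORDER BY c.i_number"
).fetchall()

# NOT EXISTS: the NULL row simply never matches, so rows 2 and 3 survive.
not_exists = conn.execute(
    "SELECT c.i_number FROM calendar c "
    "WHERE NOT EXISTS (SELECT 1 FROM mill_pur m WHERE m.m_indent = c.i_number) "
    "ORDER BY c.i_number"
).fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```

So the two forms are only interchangeable when the subquery column cannot be NULL.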

Related

SQL query to filter by specific date criteria

SQL Server Management Studio V17.7
Question: I am looking for guidance related to a view on how to select records where the Start_Date falls between a defined date range when either the Admit_Status = 1 or Admit_Status = 0 as shown below:
Criteria should be something like: if admit status = 1 then dbo.PT_ASSIGNMENT.START_DATE >= referral_date to ifnull dbo.PT_ADMISSION.TERMINATION_DATE then now() else dbo.PT_ADMISSION.TERMINATION_DATE or if admit status = 0 then start_date >= referral_date to ifnull dbo.PT_ADMISSION.PROSPECT_TERM_DATE then now() else dbo.PT_ADMISSION.PROSPECT_TERM_DATE
My SQL query (View) excluding the above question:
SELECT dbo.RES_BASIC.RESOURCE_ID,
dbo.PT_ADMISSION.ADMISSION_ID,
dbo.PT_ASSIGNMENT.START_DATE,
(CASE WHEN PT_ADMISSION.PROSPECT_ADMIT_DATE IS NULL THEN PT_ADMISSION.ADMIT_DATE ELSE PT_ADMISSION.PROSPECT_ADMIT_DATE END) AS REFERRAL_DATE,
dbo.PT_ADMISSION.ADMIT_DATE,
dbo.PT_ADMISSION.PROSPECT_ADMIT_DATE,
dbo.PT_ADMISSION.PROSPECT_TERM_DATE,
dbo.PT_ADMISSION.TERMINATION_DATE,
CASE WHEN PT_ADMISSION.ADMIT_DATE IS NOT NULL THEN 1 ELSE 0 END AS ADMIT_STATUS,
dbo.PT_BASIC.PATIENT_CODE,
dbo.RES_BASIC.NAME_FULL
FROM dbo.PT_BASIC
INNER JOIN dbo.PT_STATUS
ON dbo.PT_BASIC.PATIENT_ID = dbo.PT_STATUS.PATIENT_ID
INNER JOIN dbo.A_PATIENT_STATUS
ON dbo.PT_STATUS.ADMIN_SET_ID = dbo.A_PATIENT_STATUS.ADMIN_SET_ID
AND dbo.PT_STATUS.STATUS_CODE = dbo.A_PATIENT_STATUS.STATUS_CODE
INNER JOIN dbo.O_DATASET
ON dbo.PT_BASIC.DATASET_ID = dbo.O_DATASET.DATASET_ID
INNER JOIN dbo.PT_ADMISSION
ON dbo.PT_BASIC.PATIENT_ID = dbo.PT_ADMISSION.PATIENT_ID
AND dbo.PT_STATUS.ADMISSION_ID = dbo.PT_ADMISSION.ADMISSION_ID
INNER JOIN dbo.PT_ASSIGNMENT
ON dbo.PT_BASIC.PATIENT_ID = dbo.PT_ASSIGNMENT.PATIENT_ID
INNER JOIN dbo.A_ASSIGNMENT_TYPE
ON dbo.PT_ASSIGNMENT.ADMIN_SET_ID = dbo.A_ASSIGNMENT_TYPE.ADMIN_SET_ID
AND dbo.PT_ASSIGNMENT.ASSIGNMENT_TYPE = dbo.A_ASSIGNMENT_TYPE.TYPE_ID
INNER JOIN dbo.RES_BASIC
ON dbo.PT_ASSIGNMENT.RESOURCE_ID = dbo.RES_BASIC.RESOURCE_ID
WHERE (dbo.O_DATASET.DATASET_NAME = 'XXXXXXXXXX')
AND (dbo.A_ASSIGNMENT_TYPE.DESCRIPTION = 'REFERRING PHYSICIAN')
GROUP BY dbo.RES_BASIC.NAME_FIRST + ' ' + dbo.RES_BASIC.NAME_LAST,
dbo.RES_BASIC.RESOURCE_ID,
dbo.PT_ADMISSION.ADMISSION_ID,
dbo.PT_BASIC.PATIENT_CODE,
dbo.PT_ASSIGNMENT.START_DATE,
dbo.PT_ADMISSION.PROSPECT_TERM_DATE,
dbo.PT_ADMISSION.PROSPECT_ADMIT_DATE,
dbo.PT_ADMISSION.TERMINATION_DATE,
dbo.PT_ADMISSION.ADMIT_DATE,
dbo.RES_BASIC.NAME_FULL
You can put pretty much any "if" into a condition with simple AND and OR use.
I am having a hard time mentally parsing your "something like" portion, but here is a generic example:
IF A THEN B ELSE C
can be translated to (A AND B) OR (NOT A AND C)
Note: Bill Braskey's "comment" is also worth considering. If the conditional logic gets complicated enough, it can be less work for the database to UNION queries with simpler conditions. You'd still need the condition in one to be A AND B and the other to be NOT A AND C to apply the conditions appropriately, but you'd be simplifying from the overall condition (especially when you consider that "C" could itself be a translation of IF D THEN E ELSE F).
Maybe just take the easy way out and write two queries with the separate requirements and do a union.
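As a rough illustration of the (A AND B) OR (NOT A AND C) pattern, here is a minimal sketch using Python's sqlite3 module. The single admission table, its lowercase column names, the sample rows, and the fixed "today" value are all hypothetical. The date logic is only one possible reading of the criteria in the question: START_DATE must fall between the referral date and the termination date for admitted rows, or the prospect termination date otherwise, with a missing end date meaning "now".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE admission (
        id INTEGER, admit_status INTEGER, start_date TEXT,
        referral_date TEXT, termination_date TEXT, prospect_term_date TEXT
    );
    INSERT INTO admission VALUES
        (1, 1, '2018-03-10', '2018-03-01', '2018-03-31', NULL),
        (2, 1, '2018-05-01', '2018-03-01', '2018-03-31', NULL),
        (3, 0, '2018-03-10', '2018-03-01', NULL,         NULL),
        (4, 0, '2018-02-01', '2018-03-01', NULL,         '2018-04-01'),
        (5, 1, '2018-03-15', '2018-03-01', NULL,         '2018-03-05');
""")

# A = (admit_status = 1); B and C are the two date-range checks.
# COALESCE supplies "today" when the relevant end date is NULL.
sql = """
    SELECT id FROM admission
    WHERE (admit_status = 1
           AND start_date BETWEEN referral_date
                              AND COALESCE(termination_date, :today))
       OR (admit_status = 0
           AND start_date BETWEEN referral_date
                              AND COALESCE(prospect_term_date, :today))
    ORDER BY id
"""
ids = [row[0] for row in conn.execute(sql, {"today": "2018-06-01"})]
print(ids)  # [1, 3, 5]
```

Row 5 shows why the branch test matters: its prospect end date has passed, but because the row is admitted only the termination-date branch applies.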

SQL GROUP BY function returning incorrect SUM amount

I've been working on this problem, researching what I could be doing wrong but I can't seem to find an answer or fault in the code that I've written. I'm currently extracting data from a MS SQL Server database, with a WHERE clause successfully filtering the results to what I want. I get roughly 4 rows per employee, and want to add together a value column. The moment I add the GROUP BY clause against the employee ID, and put a SUM against the value, I'm getting a number that is completely wrong. I suspect the SQL code is ignoring my WHERE clause.
Below is a small selection of data:
hr_empl_code hr_doll_paid
1 20.5
1 51.25
1 102.49
1 560
I expect that a GROUP BY and SUM clause would give me the value of 734.24. The value I'm given is 211461.12. Through troubleshooting, I added a COUNT(*) column to my query to work out how many lines it's running against, and it's giving a result of 1152, which further reinforces my belief that it's ignoring my WHERE clause.
My SQL code is as below. Most of it has been generated by the front-end application that I'm running it from, so there is some additional code in there that I believe does assist the query.
SELECT DISTINCT
T000.hr_empl_code,
SUM(T175.hr_doll_paid)
FROM
hrtempnm T000,
qmvempms T001,
hrtmspay T166,
hrtpaytp T175,
hrtptype T177
WHERE 1 = 1
AND T000.hr_empl_code = T001.hr_empl_code
AND T001.hr_empl_code = T166.hr_empl_code
AND T001.hr_empl_code = T175.hr_empl_code
AND T001.hr_ploy_ment = T166.hr_ploy_ment
AND T001.hr_ploy_ment = T175.hr_ploy_ment
AND T175.hr_paym_code = T177.hr_paym_code
AND T166.hr_pyrl_code = 'f' AND T166.hr_paid_dati = 20180404
AND (T175.hr_paym_type = 'd' OR T175.hr_paym_type = 't')
GROUP BY T000.hr_empl_code
ORDER BY hr_empl_code
I'm really lost as to where it could be going wrong. I have stripped out the additional WHERE AND and brought it down to just T166.hr_empl_code = T175.hr_empl_code, but it doesn't make a difference.
By no means am I an expert in SQL Server and queries, but I have a decent grasp of the technology. Any help would be very appreciated!
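Group by is not wrong; how you are using it is wrong. The joins repeat each payment row many times before SUM runs, so aggregate the payments on their own first: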
SELECT
T000.hr_empl_code,
T.totpaid
FROM
hrtempnm T000
inner join (SELECT
hr_empl_code,
SUM(hr_doll_paid) as totPaid
FROM
hrtpaytp T175
where hr_paym_type = 'd' OR hr_paym_type = 't'
GROUP BY hr_empl_code
) T on t.hr_empl_code = T000.hr_empl_code
where exists
(select * from qmvempms T001,
hrtmspay T166,
hrtpaytp T175,
hrtptype T177
WHERE T000.hr_empl_code = T001.hr_empl_code
AND T001.hr_empl_code = T166.hr_empl_code
AND T001.hr_empl_code = T175.hr_empl_code
AND T001.hr_ploy_ment = T166.hr_ploy_ment
AND T001.hr_ploy_ment = T175.hr_ploy_ment
AND T175.hr_paym_code = T177.hr_paym_code
AND T166.hr_pyrl_code = 'f' AND T166.hr_paid_dati = 20180404
)
ORDER BY hr_empl_code
Note: It would be clearer if you used explicit joins instead of old-style joins in the where clause.
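The figures in the question fit a join fan-out rather than an ignored WHERE clause: 1152 rows / 4 expected rows = 288, and 288 x 734.24 = 211461.12. The sketch below reproduces the effect on a small scale with Python's sqlite3 module. The pay and periods tables are hypothetical stand-ins, with three matching rows on the joined side instead of 288.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pay (emp INTEGER, amount REAL);
    CREATE TABLE periods (emp INTEGER, period TEXT);
    INSERT INTO pay VALUES (1, 20.5), (1, 51.25), (1, 102.49), (1, 560);
    INSERT INTO periods VALUES (1, 'p1'), (1, 'p2'), (1, 'p3');
""")

# Joining on emp alone pairs every pay row with every period row,
# so each amount is summed three times.
inflated = conn.execute(
    "SELECT p.emp, SUM(p.amount), COUNT(*) "
    "FROM pay p JOIN periods r ON r.emp = p.emp "
    "GROUP BY p.emp"
).fetchone()

# Aggregate first, then use EXISTS as a filter that cannot multiply rows.
fixed = conn.execute(
    "SELECT t.emp, t.total "
    "FROM (SELECT emp, SUM(amount) AS total FROM pay GROUP BY emp) t "
    "WHERE EXISTS (SELECT 1 FROM periods r WHERE r.emp = t.emp)"
).fetchone()

print(inflated[2], round(inflated[1], 2))  # 12 2202.72
print(round(fixed[1], 2))                  # 734.24
```

Adding COUNT(*) next to the SUM, as the question already did, is a quick way to spot this kind of multiplication.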

SQL query running slow after including the OR checking inside WHERE clause

This is my query. If I comment out the part OR (NITransactionStatus = 'SUCCESS'), there is no slowness.
How can I modify this query so that it is not slow while still including both conditions in the WHERE clause?
SELECT COUNT(*)
FROM IN_CTD D
INNER JOIN IN_CTC C
ON C.InwardCustFileId = D.InwardCustFileId
WHERE (D.CurrentStatusId = 30) OR (NITransactionStatus = 'SUCCESS')
To put it briefly:
When I execute this query, it takes too long to complete. If I comment out the OR check, there is no slowness. The IN_CTD table contains 2830539 records and the IN_CTC table has 1965 records. How can I modify this query, keeping the OR check, so that it doesn't take so long to execute?
Maybe you can try a subquery. If the NITransactionStatus column is on the IN_CTD table, try this:
SELECT COUNT(*)
FROM IN_CTC C
INNER JOIN (SELECT InwardCustFileId FROM IN_CTD WHERE (CurrentStatusId = 30) OR (NITransactionStatus = 'SUCCESS')) AS D
ON C.InwardCustFileId = D.InwardCustFileId
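If the NITransactionStatus column is on the IN_CTC table, the OR spans both tables, so filtering IN_CTD first and then applying WHERE NITransactionStatus = 'SUCCESS' would turn the OR into an AND. Split the OR into two branches instead:
SELECT COUNT(*)
FROM (SELECT D.InwardCustFileId FROM IN_CTD D
INNER JOIN IN_CTC C ON C.InwardCustFileId = D.InwardCustFileId
WHERE D.CurrentStatusId = 30
UNION ALL
SELECT D.InwardCustFileId FROM IN_CTD D
INNER JOIN IN_CTC C ON C.InwardCustFileId = D.InwardCustFileId
WHERE C.NITransactionStatus = 'SUCCESS'
AND (D.CurrentStatusId <> 30 OR D.CurrentStatusId IS NULL)) AS X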
Hope it can help.
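When the two sides of an OR filter different tables, one rewrite that is sometimes tried is to split the query into two UNION ALL branches. The second branch excludes rows already counted by the first. The sketch below checks that this gives the same count as the original OR, using Python's sqlite3 module with hypothetical, cut-down versions of IN_CTC and IN_CTD. It says nothing about speed, which depends on the indexes and the optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ctc (file_id INTEGER, ni_status TEXT);
    CREATE TABLE ctd (id INTEGER, file_id INTEGER, status_id INTEGER);
    INSERT INTO ctc VALUES (1, 'SUCCESS'), (2, 'FAILED');
    INSERT INTO ctd VALUES (1, 1, 30), (2, 1, 10), (3, 2, 30),
                           (4, 2, 10), (5, 2, NULL);
""")

# Original shape: one join, OR across columns of both tables.
with_or = conn.execute(
    "SELECT COUNT(*) FROM ctd d JOIN ctc c ON c.file_id = d.file_id "
    "WHERE d.status_id = 30 OR c.ni_status = 'SUCCESS'"
).fetchone()[0]

# Split shape: each branch filters on one table; the second branch skips
# rows the first branch already returned (including NULL status_id rows).
split = conn.execute(
    "SELECT COUNT(*) FROM ("
    "  SELECT d.id FROM ctd d JOIN ctc c ON c.file_id = d.file_id "
    "  WHERE d.status_id = 30 "
    "  UNION ALL "
    "  SELECT d.id FROM ctd d JOIN ctc c ON c.file_id = d.file_id "
    "  WHERE c.ni_status = 'SUCCESS' "
    "    AND (d.status_id <> 30 OR d.status_id IS NULL)"
    ") x"
).fetchone()[0]

print(with_or, split)  # 3 3
```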

group by clause issue

I wrote a query in Access and am now trying to write the same query in SQL Server. I am getting the following error:
Msg 164, Level 15, State 1, Procedure OQRY_STEP_1_1, Line 15
Each GROUP BY expression must contain at least one column that is not an outer reference.
My SQL Query is as follows:
SELECT
ns11.SYS_ID,
ns11.SUB_NET_ID,
ns11.TEMP_ID,
ns11.EQ_ID,
ns11.NODE_NAME,
ns11.EQ_NAME,
ns11.VAR_NAME,
ns11.VAR_SET,
ns11.VAR_SUBSET,
ns11.EQ_TYPE,
ns11.RHS_RELN,
ns11.RHS_OBJECT,
ns11.EQ_TP_OFFSET,
ns11.RHS_TP_OFFSET,
ns11.RETAIN,
nmte.RHS_VAR_SET,
nmte.RHS_VAR_SUBSET,
nmte.RHS_VAR_NAME,
0 AS RHS_VAR_TYPE,
CASE
WHEN [asp].[VALUE] = NULL THEN 0
ELSE [asp].[VALUE]
END RHS_VALUE
INTO ##OT_STEP_1_1
FROM (##NT_STEP_1_1 ns11
INNER JOIN ##NT_MASTER_TEMP_EQUATION nmte
ON (ns11.SYS_ID = nmte.SYS_ID)
AND (ns11.SUB_NET_ID = nmte.SUB_NET_ID)
AND (ns11.TEMP_ID = nmte.TEMP_ID)
AND (ns11.EQ_ID = nmte.EQ_ID)
AND (ns11.NODE_NAME = nmte.NODE_NAME)
AND (nmte.SYS_ID = ns11.SYS_ID)
AND (nmte.SUB_NET_ID = ns11.SUB_NET_ID))
LEFT JOIN AMST_SIM_PAR asp ON
(nmte.SYS_ID = asp.SYS_ID)
AND (nmte.SUB_NET_ID = ns11.SUB_NET_ID)
AND (nmte.RHS_VAR_NAME = asp.VAR_NAME)
GROUP BY
ns11.SYS_ID,
ns11.SUB_NET_ID,
ns11.TEMP_ID,
ns11.EQ_ID,
ns11.NODE_NAME,
ns11.EQ_NAME,
ns11.VAR_NAME,
ns11.VAR_SET,
ns11.VAR_SUBSET,
ns11.EQ_TYPE,
ns11.RHS_RELN,
ns11.RHS_OBJECT,
ns11.EQ_TP_OFFSET,
ns11.RHS_TP_OFFSET,
ns11.RETAIN,
nmte.RHS_VAR_SET,
nmte.RHS_VAR_SUBSET,
nmte.RHS_VAR_NAME,
0,
CASE
WHEN [asp].[VALUE] = NULL THEN 0
ELSE [asp].[VALUE]
END
ORDER BY
CASE
WHEN [asp].[VALUE] = NULL THEN 0
ELSE [asp].[VALUE]
END;
I am not sure why it is not taking 0 in the group by clause?
I think the GROUP BY ..., 0, ... is the issue here. Try removing that 0 from there. There is no point grouping by a constant.
Sidenote:
CASE WHEN [AMST_SIM_PAR].[VALUE] = NULL
THEN 0
ELSE [AMST_SIM_PAR].[VALUE]
END
should be written with IS NULL instead of = NULL or as:
COALESCE( [AMST_SIM_PAR].[VALUE], 0 )
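The difference between the three spellings is easy to see on a two-row table. This is a minimal sketch using Python's sqlite3 module, with a hypothetical sim_par table and val column. SQLite follows the standard NULL comparison rules, which is also how SQL Server behaves with ANSI_NULLS ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sim_par (var_name TEXT, val REAL);
    INSERT INTO sim_par VALUES ('a', 1.5), ('b', NULL);
""")

def column(expr):
    """Evaluate one expression per row, ordered by var_name."""
    sql = f"SELECT var_name, {expr} FROM sim_par ORDER BY var_name"
    return conn.execute(sql).fetchall()

# "= NULL" is never true, so the ELSE branch always runs and NULL leaks out.
eq_null = column("CASE WHEN val = NULL THEN 0 ELSE val END")

# "IS NULL" is the test that actually detects the missing value.
is_null = column("CASE WHEN val IS NULL THEN 0 ELSE val END")

# COALESCE is the shorter spelling of the same thing.
coalesced = column("COALESCE(val, 0)")

print(eq_null)    # [('a', 1.5), ('b', None)]
print(is_null)    # [('a', 1.5), ('b', 0)]
print(coalesced)  # [('a', 1.5), ('b', 0)]
```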
I think the constant '0' in your group by is the problem.
Are you using ANSI_NULLS? Under SQL-92, a comparison written as "= NULL" or "<> NULL" never evaluates to true. Try changing "= NULL" to "IS NULL".
Also, your left join has a criterion that doesn't reference the outer-joined table. The inner join already links SUB_NET_ID on those two tables, so you can remove it from your left join.
Since you are not taking any aggregates, why not just use DISTINCT instead of repeating all that noise in the GROUP BY? Also the ORDER BY is not very useful because you are using SELECT INTO, which creates a new table, which by definition is an unordered set of rows. In order to get the data out of that table in the right "order" you should use an ORDER BY when you eventually select out of it. If you want the data optimized for joins or what have you after the table is created, create a clustered index after the SELECT INTO. Finally, why are you using ##global temp tables? You know that two users can't execute this code at the same time, right?
All that said, here is a much simpler and easier to read version:
SELECT DISTINCT
n.SYS_ID,
n.SUB_NET_ID,
n.TEMP_ID,
n.EQ_ID,
n.NODE_NAME,
n.EQ_NAME,
n.VAR_NAME,
n.VAR_SET,
n.VAR_SUBSET,
n.EQ_TYPE,
n.RHS_RELN,
n.RHS_OBJECT,
n.EQ_TP_OFFSET,
n.RHS_TP_OFFSET,
n.RETAIN,
te.RHS_VAR_SET,
te.RHS_VAR_SUBSET,
te.RHS_VAR_NAME,
RHS_VAR_TYPE = 0,
RHS_VALUE = COALESCE(a.VALUE, 0)
INTO ##OT_STEP_1_1
FROM ##NT_STEP_1_1 AS n
INNER JOIN ##NT_MASTER_TEMP_EQUATION AS te
ON n.SYS_ID = te.SYS_ID
AND n.SUB_NET_ID = te.SUB_NET_ID
AND n.TEMP_ID = te.TEMP_ID
AND n.EQ_ID = te.EQ_ID
AND n.NODE_NAME = te.NODE_NAME
AND te.SYS_ID = n.SYS_ID
AND te.SUB_NET_ID = n.SUB_NET_ID
LEFT OUTER JOIN dbo.AMST_SIM_PAR AS a
ON te.SYS_ID = a.SYS_ID
AND te.SUB_NET_ID = n.SUB_NET_ID
AND te.RHS_VAR_NAME = a.VAR_NAME;

Problem with adding custom sql to finder condition

I am trying to add the following custom SQL to a finder condition and there is something not quite right. I am not an SQL expert but worked this out with a friend who is (though they are not familiar with Ruby on Rails, ActiveRecord, or finders).
status_search = "select p.*
from policies p
where exists
(select 0 from status_changes sc
where sc.policy_id = p.id
and sc.status_id = '"+search[:status_id].to_s+"'
and sc.created_at between "+status_date_start.to_s+" and "+status_date_end.to_s+")
or exists
(select 0 from status_changes sc
where sc.created_at =
(select max(sc2.created_at)
from status_changes sc2
where sc2.policy_id = p.id
and sc2.created_at < "+status_date_start.to_s+")
and sc.status_id = '"+search[:status_id].to_s+"'
and sc.policy_id = p.id)" unless search[:status_id].blank?
My find statement:
Policy.find(:all,:include=>[{:client=>[:agent,:source_id,:source_code]},{:status_changes=>:status}],
:conditions=>[status_search])
and I am getting this error message in my log:
ActiveRecord::StatementInvalid (Mysql::Error: Operand should contain 1 column(s): SELECT DISTINCT `policies`.id FROM `policies` LEFT OUTER JOIN `clients` ON `clients`.id = `policies`.client_id WHERE ((((policies.created_at BETWEEN '2009-01-01' AND '2009-03-10' OR policies.created_at = '2009-01-01' OR policies.created_at = '2009-03-10')))) AND (select p.*
from policies p
where exists
(select 0 from status_changes sc
where sc.policy_id = p.id
and sc.status_id = '2'
and sc.created_at between 2009-03-10 and 2009-03-10)
or exists
(select 0 from status_changes sc
where sc.created_at =
(select max(sc2.created_at)
from status_changes sc2
where sc2.policy_id = p.id
and sc2.created_at < 2009-03-10)
and sc.status_id = '2'
and sc.policy_id = p.id)) ORDER BY clients.created_at DESC LIMIT 0, 25):
What is the major malfunction here? Why is it complaining about the columns?
The conditions modifier is expecting a condition (i.e. a boolean expression that could go in a where clause) and you are passing it an entire query (a select statement). That is why MySQL complains about columns: your "select p.*" ends up as an operand of AND, where a single value is expected, but it returns several columns.
It looks as if you are trying to do too much in one go here, and should break it down into smaller steps. A few suggestions:
Use the query with find_by_sql and don't mess with the conditions.
Use the Rails finders and filter the records in the Rails code.
Also, note that constructing a query this way isn't secure if the values like status_date_start can come from users. Look up "sql injection attacks" to see what the problem is, and read the rails documentation & examples for find_by_sql to see how to avoid them.
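To make the injection risk concrete, here is a small sketch in Python with the built-in sqlite3 module rather than Rails. The table and the hostile input are hypothetical. The idea carries over: pass the values separately from the SQL text (in Rails, the array form with ? placeholders) instead of concatenating them into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status_changes (policy_id INTEGER, status_id TEXT);
    INSERT INTO status_changes VALUES (1, '2'), (2, '3');
""")

# Hypothetical hostile value arriving from a search form.
user_input = "2' OR '1'='1"

# Concatenation: the input becomes part of the SQL and widens the filter.
unsafe_sql = (
    "SELECT policy_id FROM status_changes "
    "WHERE status_id = '" + user_input + "' ORDER BY policy_id"
)
unsafe = conn.execute(unsafe_sql).fetchall()

# Placeholder: the driver binds the input as a plain value, never as SQL.
safe = conn.execute(
    "SELECT policy_id FROM status_changes "
    "WHERE status_id = ? ORDER BY policy_id",
    (user_input,),
).fetchall()

print(unsafe)  # [(1,), (2,)]
print(safe)    # []
```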
OK, I've managed to retool this so it is friendlier to a conditions modifier, and I think it is running the SQL query correctly. However, for the policies it returns, the current status (policy.status_change.last.status) shows as the same status used in the query, which is not correct.
Here is my updated condition string:
status_search = "status_changes.created_at between ? and ? and status_changes.status_id = ?) or
(status_changes.created_at = (SELECT MAX(sc2.created_at) FROM status_changes sc2
WHERE sc2.policy_id = policies.id and sc2.created_at < ?) and status_changes.status_id = ?"
Is there something obvious here that explains why it does not return all of the remaining associated status changes once it finds the one in the query?
Here is the updated find:
Policy.find(:all,:include=>[{:client=>[:agent,:source_id,:source_code]},:status_changes],
:conditions=>[status_search,status_date_start,status_date_end,search[:status_id].to_s,status_date_start,search[:status_id].to_s])