NOT IN query not working, SQL Server 2008 - sql

The first part of the query before not in runs and gives me a list of 100 records. The second query runs and gives me a list of 75 records. The query I'm trying to write using not in is to get the records that are in one result set, but not the other. The error I get is incorrect syntax near the word not.
SELECT distinct Patient.patientid
FROM Patient INNER JOIN
patientICD ON Patient.patientid = patientICD.patientid AND Patient.admissiondate = patientICD.admissiondate AND
Patient.dischargedate = patientICD.dischargedate INNER JOIN
tblICD ON patientICD.primarycode = tblICD.ICD_ID
WHERE (tblICD.descrip LIKE N'%diabetes%') and not in
(
SELECT distinct Patient.patientid
FROM Patient INNER JOIN
patientICD ON Patient.patientid = patientICD.patientid AND Patient.admissiondate = patientICD.admissiondate AND
Patient.dischargedate = patientICD.dischargedate INNER JOIN
tblICD ON patientICD.primarycode = tblICD.ICD_ID
WHERE (tblICD.icd_id LIKE N'25000')
)
Is it ever allowed to write a query with expression AND NOT IN (select query?

You need to specify what field is not in the second query
and Patient.patientid not in

Did you mean to write this?
WHERE (tblICD.descrip LIKE N'%diabetes%') and Patient.patientid not in
UPDATE
Would it be possible to rewrite the entire thing as this?
SELECT distinct Patient.patientid
FROM Patient INNER JOIN
patientICD ON Patient.patientid = patientICD.patientid AND Patient.admissiondate = patientICD.admissiondate AND
Patient.dischargedate = patientICD.dischargedate INNER JOIN
tblICD ON patientICD.primarycode = tblICD.ICD_ID
WHERE tblICD.descrip LIKE N'%diabetes%' AND tblICD.icd_id NOT LIKE N'25000'

You forgot a field before NOT.

I believe you need to specify the column that the not in is looking at. So according to your script I think you would want and Patient.patientid not in

Purely stylistic: you can "squeeze out" the patientICD*tblICD product, and put it into a CTE, and reference that twice, like: (untested)
WITH zzz AS (
SELECT pic.patientid , pic.admissiondate , pic.dischargedate
, tab.ICD_ID , tab.descrip
FROM patientICD pic
JOIN tblICD tab ON pic.primarycode = tab.ICD_ID
)
SELECT DISTINCT p.patientid
FROM Patient p
JOIN zzz one ON one.patientid = p.patientid
AND one.admissiondate = p.admissiondate
AND one.dischargedate = p.dischargedate
WHERE one.descrip LIKE N'%diabetes%'
AND p.patientid NOT IN (
SELECT two.patientid
FROM zzz two
WHERE two.admissiondate = p.admissiondate
AND two.dischargedate = p.dischargedate
AND two.icd_id LIKE N'25000'
);
NOTE: I don't like the LIKE N'25000'. I think an exact match would be fine. And the icd_id-field should be numeric, probably. And the {admissiondate,dischargedate} pair should be modelled out, too; possibly by using a diagnosis_id or incident_id.

Related

Oracle compare query results with multiple joins against a table

I need to compare query results against a table. I have the following query.
select
i.person_id,
a.appellant_first_name,
a.appellant_middle_name,
a.appellant_last_name,
s.*
from CWLEGAL.individuals i inner join CWLEGAL.tblappealsdatarevisionone a
on i.casenm = a.D_N_NUMBER1 and
i.first_name = a.appellant_first_name and
i.last_name = a.appellant_last_name
inner join CWLEGAL.tblappealstosupremecourt s
on a.DATABASEIDNUMBER = s.DBIDNUMBER
order by orclid21;
I need to see what orclid21's in cwlegal.tblappealstosupremecourt don't appear in the above query.
I was able to get this to work.
select
i.person_id,
a.appellant_first_name,
a.appellant_middle_name,
a.appellant_last_name,
s.*
from CWLEGAL.tblappealstosupremecourt s
join CWLEGAL.tblappealsdatarevisionone a
on a.DATABASEIDNUMBER = s.DBIDNUMBER
left outer join CWLEGAL.individuals i on
i.casenm = a.D_N_NUMBER1 and
i.first_name = a.appellant_first_name and
i.last_name = a.appellant_last_name
where person_id is null
order by orclid21
You are making the first inner join between i and a, the result of which you're joining with s.
Now, if you want to see which records won't join, that's known as anti-join, and in whatever database you're querying it, it may be achieved by either selecting a null result or taking those records as a new result.
Examples, with taking your query (the whole code in the question) as q, assuming you've kept all the needed keys in it:
Example 1:
with your_query as q
select s.orclid21 from q
left join CWLEGAL.tblappealstosupremecourt s
on q.DATABASEIDNUMBER = s.DBIDNUMBER
and s.orclid21 is null
Example 2:
with your_query as q
select s.orclid21 from q
right join CWLEGAL.tblappealstosupremecourt s
on q.DATABASEIDNUMBER != s.DBIDNUMBER
Example 3:
with your_query as q
select s.orclid21 from CWLEGAL.tblappealstosupremecourt s
where s.DBIDNUMBER not in (select distinct q.DATABASEIDNUMBER from q)

How do I fix the syntax of a sub query with joins?

I have the following query:
SELECT tours_atp.NAME_T, today_atp.TOUR, today_atp.ID1, odds_atp.K1, today_atp.ID2, odds_atp.K2
FROM (players_atp INNER JOIN (players_atp AS players_atp_1 INNER JOIN (today_atp INNER JOIN odds_atp ON (today_atp.TOUR = odds_atp.ID_T_O) AND (today_atp.ID1 = odds_atp.ID1_O) AND (today_atp.ID2 = odds_atp.ID2_O) AND (today_atp.ROUND = odds_atp.ID_R_O)) ON players_atp_1.ID_P = today_atp.ID2) ON players_atp.ID_P = today_atp.ID1) INNER JOIN tours_atp ON today_atp.TOUR = tours_atp.ID_T
WHERE (((tours_atp.RANK_T) Between 1 And 4) AND ((today_atp.RESULT)="") AND ((players_atp.NAME_P) Not Like "*/*") AND ((players_atp_1.NAME_P) Not Like "*/*") AND ((odds_atp.ID_B_O)=2))
ORDER BY tours_atp.NAME_T;
I'd like to add a field to this query that provides me with the sum of a field in another table (FS) with a few criteria applied.
I've been able to build a stand alone query to get the sum of FS by ID_T as follows:
SELECT tbl_Ts_base_atp.ID_T, Sum(tbl_Ts_mkv_atp.FS) AS SumOfFS
FROM tbl_Ts_base_atp INNER JOIN tbl_Ts_mkv_atp ON tbl_Ts_base_atp.ID_Ts = tbl_Ts_mkv_atp.ID_Ts
WHERE (((tbl_Ts_base_atp.DATE_T)>Date()-2000 And (tbl_Ts_base_atp.DATE_T)<Date()))
GROUP BY tbl_Ts_base_atp.ID_T, tbl_Ts_mkv_atp.ID_Ts;
I now want to match up the sum of FS from the second query to the records of the first query by ID_T. I realise I need to do this using a sub query. I'm confident using these when there's only one table but I consistently get 'syntax errors' when there are joins.
I simplified the first query down to remove all the WHERE conditions so it was easier for me to try and error check but no luck. I guess the resulting SQL will also be easier for you guys to follow:
SELECT today_atp.TOUR, (SELECT Sum(tbl_Ts_mkv_atp.FS)
FROM tbl_Ts_mkv_atp INNER JOIN (tbl_Ts_base_atp INNER JOIN today_atp ON tbl_Ts_base_atp.ID_T = today_atp.TOUR) ON tbl_Ts_mkv_atp.ID_Ts = tbl_Ts_base_atp.ID_Ts AS tt
WHERE tt.DATE_T>Date()-2000 And tt.DATE_T<Date() AND tt.TOUR=today_atp.TOUR
ORDER BY tt.DATE_T) AS SumOfFS
FROM today_atp
Can you spot where I'm going wrong? My hunch is that the issue is in the FROM line of the sub query but I'm not sure. Thanks in advance.
It's difficult to advise an appropriate solution without knowledge of how the database tables relate to one another, but assuming that I've correctly understood what you are looking to achieve, you might wish to try the following solution:
select
tours_atp.name_t,
today_atp.tour,
today_atp.id1,
odds_atp.k1,
today_atp.id2,
odds_atp.k2,
subq.sumoffs
from
(
(
(
(
today_atp inner join odds_atp on
today_atp.tour = odds_atp.id_t_o and
today_atp.id1 = odds_atp.id1_o and
today_atp.id2 = odds_atp.id2_o and
today_atp.round = odds_atp.id_r_o
)
inner join players_atp as players_atp_1 on
players_atp_1.id_p = today_atp.id2
)
inner join players_atp on
players_atp.id_p = today_atp.id1
)
inner join tours_atp on
today_atp.tour = tours_atp.id_t
)
inner join
(
select
tbl_ts_base_atp.id_t,
sum(tbl_ts_mkv_atp.fs) as sumoffs
from
tbl_ts_base_atp inner join tbl_ts_mkv_atp on
tbl_ts_base_atp.id_ts = tbl_ts_mkv_atp.id_ts
where
tbl_ts_base_atp.date_t > date()-2000 and tbl_ts_base_atp.date_t < date()
group by
tbl_ts_base_atp.id_t
) subq on
tours_atp.tour = subq.id_t
where
(tours_atp.rank_t between 1 and 4) and
today_atp.result = "" and
players_atp.name_p not like "*/*" and
players_atp_1.name_p not like "*/*" and
odds_atp.id_b_o = 2
order by
tours_atp.name_t;

Access: Integrating Subquery into excisting Query

The "Last" function in the query below (line 4 & 5)that I'm using is not exactly what I'm after. The last function finds the last record in that table.
What i need find is the most recent record in the table according to a date field.
SELECT
tblinmate.statusid,
tblinmate.activedate,
Last(tblclassificationhistory.classificationid) AS LastOfclassificationID,
Last(tblsquadhistory.squadid) AS LastOfsquadID,
tblperson.firstname,
tblperson.middlename,
tblperson.lastname,
tblinmate.prisonnumber,
tblinmate.droppeddate,
tblinmate.personid,
tblinmate.inmateid
FROM tblsquad
INNER JOIN (tblperson
INNER JOIN ((tblinmate
INNER JOIN (tblclassification
INNER JOIN tblclassificationhistory
ON tblclassification.classificationid =
tblclassificationhistory.classificationid)
ON tblinmate.inmateid =
tblclassificationhistory.inmateid)
INNER JOIN tblsquadhistory
ON tblinmate.inmateid =
tblsquadhistory.inmateid)
ON tblperson.personid = tblinmate.personid)
ON tblsquad.squadid = tblsquadhistory.squadid
GROUP BY tblinmate.statusid,
tblinmate.activedate,
tblperson.firstname,
tblperson.middlename,
tblperson.lastname,
tblinmate.prisonnumber,
tblinmate.droppeddate,
tblinmate.personid,
tblinmate.inmateid;
This query below does just that, finds the most recent record in a table according to a date field.
my problem is i dont know how to integrate this Query into the above to replace the "Last" function
SELECT a.inmateID,
a.classificationID,
b.max_date
FROM (
SELECT tblClassificationHistory.inmateID,
tblClassificationHistory.classificationID,
tblClassificationHistory.reclassificationDate
FROM tblinmate
INNER JOIN tblClassificationHistory
ON tblinmate.inmateID = tblClassificationHistory.inmateID
) a
INNER JOIN (
SELECT tblClassificationHistory.inmateID,
MAX(tblClassificationHistory.reclassificationDate) as max_date
FROM tblinmate
INNER JOIN tblClassificationHistory
ON tblinmate.inmateID = tblClassificationHistory.inmateID
GROUP BY tblClassificationHistory.inmateID
) b
ON a.inmateID = b.inmateID
AND a.reclassificationDate = b.max_date
ORDER BY a.inmateID;
I got a tip from another forum to combine queries like this
SELECT qryMainTemp.*, qrySquad.*, qryClassification.*
FROM (qryMainTemp INNER JOIN qrySquad ON qryMainTemp.inmateID = qrySquad.inmateID) INNER JOIN qryClassification ON qryMainTemp.inmateID = qryClassification.inmateID;
and it worked :) i separated the first query into the two queries it was made of and then combined the three like shown above.
Sadly this made another problem arise the query is now not up-datable..working on a solution for this

Subquery with multiple joins involved

Still trying to get used to writing queries and I've ran into a problem.
Select count(region)
where (regionTable.A=1) in
(
select jxn.id, count(jxn.id) as counts, regionTable.A
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
group by jxn.id, regionTable.A
)
The inner query gives an ID number in one column, the amount of times they appear in the table, and then a bit attribute if they are in region A. The outer query works but the error I get is incorrect syntax near the keyword IN. Of the inner query, I would like a number of how many of them are in region A
You must specify table name in query before where
Select count(region)
from table
where (regionTable.A=1) in
And you must choose one of them.
where regionTable.A = 1
or
where regionTable.A in (..)
Your query has several syntax errors. Based on your comments, I think there is no need for a subquery and you want this:
select jxn.id, count(jxn.id) as counts, regionTable.A
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
where regionTable.A = 1
group by jxn.id, regionTable.A
which can be further simplified to:
select jxn.id, count(jxn.id) as counts
, 1 as A --- you can even omit this line
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
where regionTable.A = 1
group by jxn.id
You are getting the error because of this line:
where (regionTable.A=1)
You cannot specify a condition in a where in clause, it should only be column name
Something like this may be what you want:
SELECT COUNT(*)
FROM
(
select jxn.id, count(jxn.id) as counts, regionTable.A
from
jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
group by jxn.id, regionTable.A
) sq
WHERE sq.a = 1

Super Slow Query - sped up, but not perfect... Please help

I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.